Computing Reviews
Scratchpad-memory management for multi-threaded applications on many-core architectures
Venkataramani V., Chan M., Mitra T.  ACM Transactions on Embedded Computing Systems 18 (1): 1-28, 2019. Type: Article
Date Reviewed: Oct 7 2020

This paper focuses on improving many-core architectures via software-programmable, or scratchpad, memory (SPM):

An SPM contains an array of [static random-access memory, SRAM] cells. A portion of the memory address space is dedicated to the SPM. Any address that falls within this dedicated address space can directly index into the SPM to access the corresponding data.

Thus, because the SPM occupies a dedicated address range, “coherency among multiple SPMs” can be handled at the software level. This use of software-level access to the data “thereby eliminat[es] the hardware area/power required for cache coherence,” as well as the overhead of cache lookups. In a many-core environment, data accesses from many cores can drastically reduce performance because of coherency issues and the long delays involved in reaching data held by other cores.
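The address-mapped access described in the quoted passage can be illustrated with a minimal sketch. The base address, the window size, and the `decode` helper are hypothetical values chosen for illustration; they are not taken from the paper or from any particular chip.

```python
# Toy model of a memory map with a dedicated scratchpad (SPM) window.
# SPM_BASE and SPM_SIZE are made-up values, not taken from the paper.
SPM_BASE = 0x0000_8000
SPM_SIZE = 32 * 1024  # hypothetical 32 KB scratchpad


def decode(address):
    """Return ("spm", offset) if the address lies inside the SPM window,
    otherwise ("off_chip", address)."""
    if SPM_BASE <= address < SPM_BASE + SPM_SIZE:
        # The offset indexes the SRAM array directly: no tag check, no miss.
        return ("spm", address - SPM_BASE)
    return ("off_chip", address)


print(decode(0x0000_8010))  # an address inside the window
print(decode(0x0010_0000))  # an address outside the window
```

Unlike a cache, nothing here decides at run time whether the data is present; software (for example, a compiler) is assumed to have placed it in the window beforehand.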

In a many-core, multithreaded architecture, on-chip and off-chip data accesses can be nonuniform, irregular, and slow. To overcome these difficulties, the paper proposes “a compile-time, coordinated data management framework called CDM, for many-core SPMs.” The paper targets the Epiphany platform: “the 16-core Epiphany SoC consists of an array of simple RISC processors (eCores) programmable in C connected together in a 2D-mesh NOC and supporting a single shared address space.” Because a Xilinx Zynq system on chip (SoC) hosts these eCores on the same development board, the platform is more energy efficient than one that relies on traditional cache memory. The eCores can access not only their local memory but also remote memory.

Several kernels from embedded, multithreaded benchmarks are used in the evaluation, including two for the decryption and encryption of data (AESD and AESE) and three long-term evolution (LTE) benchmarks (PHY_ACI, PHY_DEMAP, and PHY_MICF). The authors use a GREEDY approach as their baseline. Of the two approaches compared against it, SNAP-S allows only one copy of data and SNAP-M uses a replication mechanism. According to the results, “the SNAP-M approach provides an average speed-up of 1.84x and an energy reduction of 1.83x when compared to the GREEDY strategy.” The SNAP-S approach “provides an average speed-up and energy reduction of 1.09x.” Thus, both approaches improve speed and reduce energy usage, owing to the absence of cache-like memory, which consumes more power on each data access.
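The difference between keeping one copy of shared data and replicating it can be illustrated with a toy cost model. This is not the paper's CDM or SNAP algorithm: the 4x4 mesh, the cost of one unit plus one unit per Manhattan hop, and the access counts are all assumptions made for illustration, and the model ignores SPM capacity and any cost of keeping replicas consistent.

```python
# Toy cost model: a 4x4 mesh where one access costs 1 + (Manhattan hops).
# Illustration only; this is NOT the paper's CDM/SNAP algorithm.
MESH = 4


def hops(a, b):
    """Manhattan distance between core ids a and b on a MESH x MESH grid."""
    ar, ac = divmod(a, MESH)
    br, bc = divmod(b, MESH)
    return abs(ar - br) + abs(ac - bc)


def total_cost(readers, home):
    """Cost of all accesses when the single copy lives on core `home`."""
    return sum(n * (1 + hops(core, home)) for core, n in readers.items())


def single_copy_cost(readers):
    """Best home core and its total cost when only one copy may exist."""
    best = min(range(MESH * MESH), key=lambda home: total_cost(readers, home))
    return best, total_cost(readers, best)


def replicated_cost(readers):
    """Total cost when every reader holds its own local copy (read-only data)."""
    return sum(readers.values())


# Hypothetical access counts per core id: the four corner cores read the data.
readers = {0: 100, 3: 100, 12: 100, 15: 100}
home, single = single_copy_cost(readers)
print(home, single, replicated_cost(readers))
```

With these made-up counts, every placement of the single copy leaves some readers far away, so the single-copy total is 1600 units against 400 units with replication. The numbers only show the direction of the effect and say nothing about the magnitudes reported in the paper.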

The authors benefit from bringing off-chip data into on-chip memory and from avoiding cache-like memory; the use of an SoC reduces energy consumption. However, a new type of memory is on the rise that can drastically reduce power consumption and is faster than DRAM and cache. When such memory comes into use, this paper will be obsolete. The overhead of bringing off-chip data into on-chip memory must also be considered. Moreover, the SNAP-S speed-up over the GREEDY strategy is not significant; significant improvement is observed only when the data is replicated. One would expect a significant improvement from the SNAP-S strategy as well, because it too turns remote memory accesses into local ones; however, that is not seen in the experimental results.

Reviewer:  J. Arul Review #: CR147077 (2102-0038)
Real-Time And Embedded Systems (C.3 ... )
