A Data Prefetch and Reuse Strategy for Coarse-Grained Reconfigurable Architectures
Abstract
Coarse-Grained Reconfigurable Architectures (CGRAs) have been proposed as a way to increase parallel processing capability. Data transfer throughput between the Reconfigurable Cell Array (RCA) and on-chip local memory is usually the main performance bottleneck of CGRAs. To relieve this bottleneck, we propose a novel data transfer strategy called Heuristic Data Prefetch and Reuse (HDPR), the first of its kind for explicit CGRAs. The HDPR strategy provides both a flexible data access schedule and the high data throughput needed to realize fast pipelined implementations of various loop kernels. To improve data utilization efficiency, a dual-bank cache-like data reuse structure is proposed. In addition, heuristic data prefetching is introduced to reduce data access latency. Experimental results demonstrate that, compared with conventional explicit data transfer strategies, our work achieves an average speedup of 1.73 times at the cost of only a 5.86% increase in area.
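The abstract names two mechanisms, a dual-bank reuse structure and prefetching, without giving details. The following is a minimal Python sketch of the general idea only, not the HDPR design itself: one bank is read by the compute step while the other is filled for the next loop iteration, and any word already held in the active bank is copied instead of being fetched again. The class `DualBankBuffer`, the function `sliding_window_sums`, and the sliding-window kernel are all hypothetical names and choices made for illustration.

```python
class DualBankBuffer:
    """Toy two-bank buffer (hypothetical; not the structure from the paper)."""

    def __init__(self):
        self.banks = [{}, {}]   # each bank maps address -> value
        self.active = 0         # index of the bank the compute step reads
        self.fetches = 0        # number of words actually read from memory

    def prefetch(self, memory, addrs):
        """Fill the idle bank, reusing words already in the active bank."""
        current = self.banks[self.active]
        incoming = {}
        for a in addrs:
            if a in current:
                incoming[a] = current[a]      # reuse: no memory access
            else:
                incoming[a] = memory[a]       # miss: fetch from memory
                self.fetches += 1
        self.banks[1 - self.active] = incoming

    def swap(self):
        """Make the freshly filled bank the active one."""
        self.active = 1 - self.active

    def read(self, addr):
        return self.banks[self.active][addr]


def sliding_window_sums(memory, width, stride):
    """Sum each window of `width` words, prefetching the next window
    into the idle bank before computing on the current one."""
    starts = list(range(0, len(memory) - width + 1, stride))
    buf = DualBankBuffer()
    sums = []
    if not starts:
        return sums, buf.fetches
    # The first window cannot be overlapped with any computation.
    buf.prefetch(memory, range(starts[0], starts[0] + width))
    buf.swap()
    for i, s in enumerate(starts):
        if i + 1 < len(starts):
            nxt = starts[i + 1]
            buf.prefetch(memory, range(nxt, nxt + width))
        sums.append(sum(buf.read(a) for a in range(s, s + width)))
        buf.swap()
    return sums, buf.fetches


def serial_cycles(n_tiles, load, compute):
    """Cycle count when every load waits for the previous compute."""
    return n_tiles * (load + compute)


def pipelined_cycles(n_tiles, load, compute):
    """Cycle count when each load overlaps the previous compute."""
    if n_tiles == 0:
        return 0
    return load + (n_tiles - 1) * max(load, compute) + compute
```

Under these assumptions, `sliding_window_sums(list(range(10)), 4, 2)` processes four overlapping windows with 10 memory fetches rather than the 16 a buffer without reuse would need, and `pipelined_cycles(4, 3, 5)` gives 23 cycles against 32 for `serial_cycles(4, 3, 5)`. These figures describe the toy model only and say nothing about the 1.73 times speedup reported in the abstract.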
Authors
- Shi Longxing, National ASIC Center, Southeast University
- GE Wei, National ASIC System Engineering Research Center, Southeast University
- QI Zhi, National ASIC System Engineering Research Center, Southeast University
- DU Yue, National ASIC System Engineering Research Center, Southeast University
- MA Lu, National ASIC System Engineering Research Center, Southeast University
Related papers
- Current reused Colpitts VCO and frequency divider with quadrature outputs
- Memory-Efficient and High-Performance Two-Dimensional Discrete Wavelet Transform Architecture Based on Decomposed Lifting Algorithm
- A Harmonic-Free All Digital Delay-Locked Loop Using an Improved Fast-Locking Successive Approximation Register-Controlled Scheme
- Integrated Current Sensing Technique Suitable for Step-Down Switch-Mode Power Converters
- Data Flow Optimization of Dynamically Coarse Grain Reconfigurable Architecture for Multimedia Applications
- An optimized QFP structure for use in radio frequency multi-chip module applications
- Fast AdaBoost-Based Face Detection System on a Dynamically Coarse Grain Reconfigurable Architecture
- Reconfiguration Process Optimization of Dynamically Coarse Grain Reconfigurable Architecture for Multimedia Applications
- Handling Deafness Problem of Scheduled Multi-Channel Polling MACs
- Parallelism Analysis of H.264 Decoder and Realization on a Coarse-Grained Reconfigurable SoC
- Hardware Software Co-design of H.264 Baseline Encoder on Coarse-Grained Dynamically Reconfigurable Computing System-on-Chip
- A Data Prefetch and Reuse Strategy for Coarse-Grained Reconfigurable Architectures
- A novel DC-12GHz variable gain amplifier in InGaP/GaAs HBT technology
- A wide-range and ultra fast-locking all-digital SAR DLL without harmonic-locking