Affine Transformations for Communication and Reconfiguration Optimization of Mapping Loop Nests on CGRAs
Abstract
A coarse-grained reconfigurable architecture (CGRA) is typically a hybrid architecture composed of a reconfigurable processing unit (RPU) and a host microprocessor. Computation-intensive kernels (e.g., loop nests) are often mapped onto the RPU to speed up program execution, so optimizing the mapping of loop nests is important for CGRA performance. Processing element (PE) utilization rate, communication volume, and reconfiguration cost are three crucial factors in RPU performance. Loop transformations can greatly affect all three, and are therefore significant when mapping loops onto RPUs. This paper proposes a joint loop transformation approach for RPUs that considers PE utilization rate, communication cost, and reconfiguration cost together. The approach could be integrated into compilers for CGRAs to improve performance. Experimental results show that, compared with the communication-minimal approach, the scheme improves execution time by 5.8% and 13.6% on motion estimation (ME) and partial differential equation (PDE) solver kernels, respectively. The run-time complexity is also acceptable for practical cases.
Authors
- Yin Shouyi, Institute of Microelectronics, Tsinghua University
- Liu Leibo, Institute of Microelectronics, Tsinghua University
- Wei Shaojun, Institute of Microelectronics, Tsinghua University
- Liu Dajiang, Institute of Microelectronics, Tsinghua University
Related papers
- Compiler Framework for Reconfigurable Computing Architecture
- A Cycle-Accurate Simulator for a Reconfigurable Multi-Media System
- Parallelization of Computing-Intensive Tasks of the H.264 High Profile Decoding Algorithm on a Reconfigurable Multimedia System
- CropNET: A Wireless Multimedia Sensor Network for Agricultural Monitoring
- Configuration Context Reduction for Coarse-Grained Reconfigurable Architecture
- Hybrid Wired/Wireless On-Chip Network Design for Application-Specific SoC
- Multi-Battery Scheduling for Battery-Powered DVS Systems
- Reconfiguration Process Optimization of Dynamically Coarse Grain Reconfigurable Architecture for Multimedia Applications
- Mapping Optimization of Affine Loop Nests for Reconfigurable Computing Architecture
- Hardware Software Co-design of H.264 Baseline Encoder on Coarse-Grained Dynamically Reconfigurable Computing System-on-Chip
- Affine Transformations for Communication and Reconfiguration Optimization of Mapping Loop Nests on CGRAs
- Parallelization of Computing-Intensive Tasks of SIFT Algorithm on a Reconfigurable Architecture System
- An Inductive-Coupling Interconnected Application-Specific 3D NoC Design
- Battery-Aware Task Mapping for Coarse-Grained Reconfigurable Architecture
- Concurrent Detection and Recognition of Individual Object Based on Colour and p-SIFT Features