Decoupled Iteration Mapping: Improving Dependency-loop Performance on SIMD Processors
Abstract
Wide Single Instruction Multiple Data (SIMD) architectures are important for compute-intensive applications, but they are less efficient for applications containing loops with cross-iteration dependencies, which are difficult to parallelize and vectorize. This paper introduces Decoupled Iteration Mapping (DIM), a technique that dynamically maps a cross-iteration dependency loop onto an improved SIMD architecture to achieve multicore-like thread-parallel performance. The modification to the baseline architecture is minor and consists of a Prefetch Unit & Instruction Buffer Array (PU&IBA), a Loop Control Unit & Instruction Dispatch Unit (LCU&IDU), and a Data Buffer Chain (DBC). Experimental results show that the proposed DIM scheme achieves an average speedup of 3.04x at a cost of only 6.44% area overhead.
- Paper of The Institute of Electronics, Information and Communication Engineers
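To make the term "cross-iteration dependency loop" concrete, the following minimal sketch contrasts a loop whose iterations are independent with one where each iteration consumes the result of the previous one. It only illustrates the kind of loop the abstract refers to and does not model DIM or its hardware units; the function names and the recurrence used are hypothetical, chosen for illustration.

```python
def independent_loop(a, b):
    """Each iteration uses only its own inputs, so iterations could run
    in any order (or side by side in SIMD lanes)."""
    return [x + y for x, y in zip(a, b)]


def dependent_loop(a, seed=0):
    """Iteration i needs the value produced by iteration i-1, which is a
    cross-iteration (loop-carried) dependency. The recurrence below is an
    illustrative assumption, not taken from the paper."""
    out = []
    prev = seed
    for x in a:
        prev = prev * 2 + x  # reads the previous iteration's result
        out.append(prev)
    return out


if __name__ == "__main__":
    print(independent_loop([1, 2, 3], [10, 20, 30]))
    print(dependent_loop([1, 2, 3]))
```

In the second function the chain through `prev` forces the iterations to be evaluated in order, which is why such loops resist straightforward vectorization.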
Authors
-
Yang Hui
College of Materials Science and Chemical Engineering, Zhejiang University, Hangzhou 310027, China
-
Chen Shuming
College of Computer, National University of Defense Technology
-
Wan Jianghua
College of Computer, National University of Defense Technology
-
Dai Huanyao
Luoyang Electronic Equipment Test Center of China
-
Yang Hui
College of Computer, National University of Defense Technology
Related papers
- A New System of Low Temperature Sintering ZnO–SiO2 Dielectric Ceramics
- Decoupled Iteration Mapping: Improving Dependency-loop Performance on SIMD Processors