Design of High-Performance Asynchronous Pipeline Using Synchronizing Logic Gates
Abstract
This paper introduces a novel design method for an asynchronous pipeline based on dual-rail dynamic logic. Constructing a reliable critical datapath greatly reduces the overhead of the handshake control logic, which gives the pipeline high throughput as well as low power consumption. Synchronizing Logic Gates (SLGs), which have no data dependency problem, are used to construct the reliable critical datapath. The design targets latch-free, extremely fine-grain (gate-level) pipelines, in which every pipeline stage is only one dual-rail dynamic logic gate deep. HSPICE simulation results in a 65 nm design technology indicate that, in 4-bit-wide FIFOs, the proposed design increases throughput by 120% and decreases power consumption by 54% compared with PS0, a classic dual-rail asynchronous pipeline implementation style. The method is also applied to the design of an array-style multiplier, where the proposed design reduces power by 37.9% compared with a classic synchronous design at a workload of 55%. A chip with a 4×4 multiplier function has been fabricated; it works well at 2.16 G data-sets/s (post-layout simulation).
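As background for the dual-rail logic mentioned in the abstract, the following is a minimal behavioral sketch of generic dual-rail encoding and completion detection. It is not the paper's SLG circuit or its handshake scheme. The function names (`encode`, `decode`, `dual_rail_and`, `word_complete`), the choice of `(1, 0)` for logical 1, and the rule that the gate waits for both inputs are all assumptions made for illustration.

```python
# Behavioral model of generic dual-rail signalling (illustrative only;
# this does not model the paper's Synchronizing Logic Gates).
# Each logical bit travels on two wires (t, f):
#   (0, 0) = spacer (precharged, no data yet)
#   (1, 0) = logical 1, (0, 1) = logical 0, (1, 1) = illegal
SPACER = (0, 0)


def encode(bit):
    """Map a logical bit to its assumed dual-rail codeword."""
    return (1, 0) if bit else (0, 1)


def is_valid(rail):
    """A codeword carries data when exactly one of its rails is high."""
    t, f = rail
    return (t ^ f) == 1


def decode(rail):
    """Recover the logical bit from a valid codeword."""
    if not is_valid(rail):
        raise ValueError("rail pair is a spacer or an illegal state")
    return rail[0]


def dual_rail_and(a, b):
    """AND gate that stays at the spacer until both inputs are valid (assumed)."""
    at, af = a
    bt, bf = b
    out_t = at & bt                               # both inputs are 1
    out_f = (af & bt) | (at & bf) | (af & bf)     # at least one input is 0
    return (out_t, out_f)


def word_complete(word):
    """Completion detection: every bit of the word holds valid data."""
    return all(is_valid(rail) for rail in word)
```

In this sketch, `dual_rail_and(SPACER, encode(0))` stays at the spacer, so validity of the output itself signals that the computation has finished. Dual-rail asynchronous pipelines generally rely on this property to detect completion without a clock.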
- 2012-08-01
Authors
- HARIYAMA Masanori (the Graduate School of Information Sciences, Tohoku University)
- KAMEYAMA Michitaka (the Graduate School of Information Sciences, Tohoku University)
- Kameyama Michitaka (The Graduate School Of Information Sciences And Also With The Faculty Of Engineering Tohoku Universi)
- XIA Zhengfan (the Graduate School of Information Sciences, Tohoku University)
- ISHIHARA Shota (the Graduate School of Information Sciences, Tohoku University)