A Fast Selector-Based Subtract-Multiplication Unit and Its Application to Butterfly Unit
スポンサーリンク
概要
- 論文の詳細を見る
Large-scale network and multimedia application LSIs include application specific arithmetic units. A multiply-accumulator unit or a MAC unit which is one of these optimized units arranges partial products and decreases carry propagations. However, there is no method similar to MAC to execute "subtract-multiplication". In this paper, we propose a high-speed subtract-multiplication unit that decreases latency of a subtract operation by bit-level transformation using selector logics. By using bit-level transformation, its partial products are calculated directly. The proposed subtract-multiplication units can be applied to any types of systems using subtract-multiplications and a butterfly operation in FFT is one of their suitable applications. We apply them effectively to Radix-2 butterfly units and Radix-4 butterfly units. Experimental results show that our proposed operation units using selector logics improves the performance by up to 13.92%, compared to a conventional approach.
- 一般社団法人情報処理学会の論文
- 2011-02-08
著者
-
Yanagisawa Masao
Department Of Computer Science Waseda University
-
Nozomu Togawa
Department of Computer Science and Engineering, Waseda University
-
Tatsuo Ohtsuki
Department of Computer Science and Engineering, Waseda University
-
Masao Yanagisawa
Department of Electronic and Photonic Systems, Waseda University
-
Youhei Tsukamoto
Department of Computer Science and Engineering, Waseda University
関連論文
- FPGA-Based Reconfigurable Adaptive FEC(System Level Design)(VLSI Design and CAD Algorithms)
- Floorplan-Aware High-Level Synthesis for Generalized Distributed-Register Architectures
- Selective Low-Care Coding : A Means for Test Data Compression in Circuits with Multiple Scan Chains(Selected Papers from the 18th Workshop on Circuits and Systems in Karuizawa)
- A Fast Elliptic Curve Cryptosystem LSI Embedding Word-Based Montgomery Multiplier (System LSIs and Microprocessors, VLSI Design Technology in the Sub-100nm Era)
- A SIMD Instruction Set and Functional Unit Synthesis Algorithm with SIMD Operation Decomposition(Programmable Logic, VLSI, CAD and Layout, Recent Advances in Circuits and Systems-Part 1)
- Sub-operation Parallelism Optimization in SIMD Processor Core Synthesis(Selected Papers from the 17th Workshop on Circuits and Systems in Karuizawa)
- High-Level Power Optimization Based on Thread Partitioning(System Level Design)(VLSI Design and CAD Algorithms)
- A Hardware/Software Cosynthesis Algorithm for Processors with Heterogeneous Datapaths(Selected Papers from the 16th Workshop on Circuits and Systems in Karuizawa)
- A Hardware/Software Partitioning Algorithm for Processor Cores with Packed SIMD-Type Instructions(Design Methodology)(VLSI Design and CAD Algorithms)
- A Retargetable Simulator Generator for DSP Processor Cores with Packed SIMD-type Instructions(Simulation Acceletor)(VLSI Design and CAD Algorithms)
- A Retargetable Simulator Generator for DSP Processor Cores with Packed SIMD-type Instructions
- A Hardware/Software Cosynthesis System for Processor Cores with Content Addressable Memories
- A High-Level Energy-Optimizing Algorithm for System VLSIs Based on Area/Time/Power Estimation(Special Section on VLSI Design and CAD Algorithms)
- An Algorithm and a Flexible Architecture for Fast Block-Matching Motion Estimation(Special Section on VLSI Design and CAD Algorithms)
- C-5 A Software/Hardware Codesign for MPEG Encoder
- High-Level Area/Delay/Power Estimation for Low Power System VLSIs with Gated Clocks(Special Section of Selected Papers from the 14th Workshop on Circuits and Systems in Karuizawa)
- A New Hardware/Software Partitioning Algorithm for DSP Processor Cores with Two Types of Register Files(Special Section on VLSI Design and CAD Algorithms)
- Area and Delay Estimation in Hardware/Software Cosynthesis for Digital Signal Processor Cores(Special Section on VLSI Design and CAD Algorithms)
- An Area/Time Optimizing Algorithm in High-Level Synthesis of Control-Based Hardwares (Special Section on Discrete Mathematics and Its Applications)
- CAM Processor Synthesis Based on Behavioral Descriptions (Special Section on VLSI Design and CAD Algorithms)
- A Hardware / Software Cosynthesis System for Digital Signal Processor Cores with Two Types of Register Files (Special Section of Selected Papers from the 12th Workshop on Circuit and Systems in Karuizawa)
- A Two-Level Cache Design Space Exploration System for Embedded Applications
- An L1 Cache Design Space Exploration System for Embedded Applications
- A Built-in Reseeding Technique for LFSR-Based Test Pattern Generation(Timing Verification and Test Generation)(VLSI Design and CAD Algorithms)
- A Selective Scan Chain Reconfiguration through Run-Length Coding for Test Data Compression and Scan Power Reduction(Test)(VLSI Design and CAD Algorithms)
- A Hybrid Dictionary Test Data Compression for Multiscan-Based Designs(Test)(VLSI Design and CAD Algorithms)
- A Scan-Based Attack Based on Discriminators for AES Cryptosystems
- X-Handling for Current X-Tolerant Compactors with More Unknowns and Maximal Compaction
- Unified Dual-Radix Architecture for Scalable Montgomery Multiplications in GF(P) and GF(2^n)
- A Unified Test Compression Technique for Scan Stimulus and Unknown Masking Data with No Test Loss
- A Secure Test Technique for Pipelined Advanced Encryption Standard
- Scan-Based Side-Channel Attack against RSA Cryptosystems Using Scan Signatures
- A Hardware/Software Cosynthesis System for Digital Signal Processor Cores (Special Section on VLSI Design and CAD Algorithms)
- A Depth-Constrained Technology Mapping Algorithm for Logic-Blocks Composed of Tree-Structured LUTs (Special Section on Selected Papers from the 11th Workshop on Circuits and Systems in Karuizawa)
- A Fast Scheduling Algorithm Based on Gradual Time-Frame Reduction for Datapath Synthesis
- An FPGA Layout Reconfiguration Algorithm Based on Global Routes for Engineering Changes in System Design Specifications(Special Section on Discrete Mathematics and Its Applications)
- Greedy Optimization Algorithm for the Power/Ground Network Design to Satisfy the Voltage Drop Constraint
- Integrating Wearable Sensor Technology into Project-management Process
- Greedy Algorithm for the On-Chip Decoupling Capacitance Optimization to Satisfy the Voltage Drop Constraint
- Scan-based Attack against DES and Triple DES Cryptosystems Using Scan Signatures (Preprint)
- Energy-efficient High-level Synthesis for HDR Architectures
- Scan Vulnerability in Elliptic Curve Cryptosystems
- A Fault-Secure High-Level Synthesis Algorithm for RDR Architectures
- A Fast Selector-Based Subtract-Multiplication Unit and Its Application to Butterfly Unit
- Floorplan-Driven High-Level Synthesis for Distributed/Shared-Register Architectures
- A Fast Weighted Adder by Reducing Partial Product for Reconstruction in Super-Resolution
- Exact, Fast and Flexible L1 Cache Configuration Simulation for Embedded Systems