The Optimal Architecture Design of Two-Dimension Matrix Multiplication Jumping Systolic Array
Abstract
This paper proposes an efficient method for constructing systolic arrays that yields an optimal planar systolic design for matrix multiplication. By adjusting the connection network among the processing elements (PEs) of the systolic array, input and output data jump within the array as the multiplication operations require. Various 2-D systolic array topologies, such as the square topology and the hexagonal topology, are studied to construct appropriate systolic array configurations and realize high-performance matrix multiplication. Building on the traditional Kung-Leiserson systolic architecture, the proposed “Jumping Systolic Array (JSA)” algorithm increases matrix multiplication speed while using fewer processing elements and few attached data registers. New systolic arrays, such as the square jumping array, the redundant dummy latency jumping hexagonal array, and the compact parallel flow jumping hexagonal array, are also proposed to improve the efficiency of concurrent system operation. Experimental results show that the JSA algorithm realizes fully concurrent operation and outperforms other systolic architectures on specific systolic array system characteristics, such as bandwidth, matrix complexity, or expansion capability.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-04-01
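For readers unfamiliar with systolic matrix multiplication, the sketch below simulates a conventional square, output-stationary systolic mesh. It is only a generic baseline for illustration: it is neither the Kung-Leiserson hexagonal design nor the JSA proposed in the paper, whose interconnections are not specified in this abstract. The function name `systolic_matmul` and the input skewing scheme are assumptions made for this example.

```python
def systolic_matmul(A, B):
    """Simulate an n x n output-stationary systolic mesh computing C = A * B.

    Illustrative baseline only (not the paper's JSA). Assumes square
    n x n inputs given as lists of lists.
    """
    n = len(A)
    C = [[0] * n for _ in range(n)]       # each PE(i, j) accumulates C[i][j]
    a_reg = [[0] * n for _ in range(n)]   # A values, flowing left to right
    b_reg = [[0] * n for _ in range(n)]   # B values, flowing top to bottom
    steps = 3 * n - 2                     # cycles until the last PE finishes

    for t in range(steps):
        new_a = [[0] * n for _ in range(n)]
        new_b = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if j == 0:
                    # Row i of A enters the left edge, delayed by i cycles.
                    k = t - i
                    new_a[i][j] = A[i][k] if 0 <= k < n else 0
                else:
                    new_a[i][j] = a_reg[i][j - 1]
                if i == 0:
                    # Column j of B enters the top edge, delayed by j cycles.
                    k = t - j
                    new_b[i][j] = B[k][j] if 0 <= k < n else 0
                else:
                    new_b[i][j] = b_reg[i - 1][j]
        a_reg, b_reg = new_a, new_b
        # Every PE performs one multiply-accumulate per cycle.
        for i in range(n):
            for j in range(n):
                C[i][j] += a_reg[i][j] * b_reg[i][j]
    return C, steps


C, steps = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(C, steps)  # [[19, 22], [43, 50]] 4
```

In this baseline, n² PEs need 3n − 2 cycles, and many PEs are idle while data fills and drains the mesh. The abstract's claims about speed, PE count, and concurrency concern exactly this kind of overhead.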
Authors
- KIMURA Shinji, Graduate School of Information Science, Nara Institute of Science and Technology
- Yang Yun, Graduate School of Information Production and Systems, Waseda University
- Kimura Shinji, Graduate School of Information Production and Systems, Waseda University
- Yang Yun, Waseda Univ. Kitakyushu-shi Jpn
- Kimura Shinji, Graduate School of Engineering, Nagoya University
Related papers
- Exact Minimization of Free BDDs and Its Application to Pass-Transistor Logic Optimization (Special Section on VLSI Design and CAD Algorithms)
- Hardware Synthesis from C Programs with Estimation of Bit Length of Variables (Special Section on VLSI Design and CAD Algorithms)
- Timing Verification of Sequential Logic Circuits Based on Controlled Multi-Clock Path Analysis (Special Section on VLSI Design and CAD Algorithms)
- Selective Low-Care Coding : A Means for Test Data Compression in Circuits with Multiple Scan Chains(Selected Papers from the 18th Workshop on Circuits and Systems in Karuizawa)
- The Optimal Architecture Design of Two-Dimension Matrix Multiplication Jumping Systolic Array
- Fine-Grained Power Gating Based on the Controlling Value of Logic Elements
- Fine-grained power gating based on the controlling value of logic gates (VLSI Design Technologies)
- Fine-grained power gating based on the controlling value of logic gates (System LSI Design Technology)
- Behavioral Circuit Macromodeling and Analog LSI Implementation for Automobile Engine Intake System(Selected Papers from the 19th Workshop on Circuits and Systems in Karuizawa)
- Formula-Based Method for Capacitance Extraction of Interconnects with Dummy Fills(Selected Papers from the 18th Workshop on Circuits and Systems in Karuizawa)
- Modeling the Influence of Input-to-Output Coupling Capacitance on CMOS Inverter Delay(Selected Papers from the 18th Workshop on Circuits and Systems in Karuizawa)
- Second-Order Polynomial Expressions for On-Chip Interconnect Capacitance(Interconnect, VLSI Design and CAD Algorithms)
- Finite Input-Memory Automaton Based Checker Synthesis of System Verilog Assertions for FPGA Prototyping
- Issue Mechanism for Embedded Simultaneous Multithreading Processor
- Multi-Cycle Path Detection Based on Propositional Satisfiability with CNF Simplification Using Adaptive Variable Insertion (Special Section on VLSI Design and CAD Algorithms)
- Bit Length Optimization of Fractional Part on Floating to Fixed Point Conversion for High-Level Synthesis(Logic and High Synthesis)(VLSI Design and CAD Algorithms)
- Look Up Table Compaction Based on Folding of Logic Functions(Special Section on VLSI Design and CAD Algorithms)
- A Built-in Reseeding Technique for LFSR-Based Test Pattern Generation(Timing Verification and Test Generation)(VLSI Design and CAD Algorithms)
- A Built-in Reseeding Technique for LFSR-Based Test Pattern Generation
- RAY-SPACE CODING USING SINUSOIDAL STRUCTURE IN CIRCULAR CAMERA ARRANGEMENT(International Workshop on Advanced Image Technology 2006)
- Bit-Length Optimization Method for High-Level Synthesis Based on Non-linear Programming Technique(System Level Design,VLSI Design and CAD Algorithms)
- A Selective Scan Chain Reconfiguration through Run-Length Coding for Test Data Compression and Scan Power Reduction(Test)(VLSI Design and CAD Algorithms)
- A Hybrid Dictionary Test Data Compression for Multiscan-Based Designs(Test)(VLSI Design and CAD Algorithms)
- Efficient Large Scale Integration Power/Ground Network Optimization Based on Grid Genetic Algorithm(Power/Ground Network, VLSI Design and CAD Algorithms)
- Unified Dual-Radix Architecture for Scalable Montgomery Multiplications in GF(P) and GF(2^n)
- Optimizing Controlling-Value-Based Power Gating with Gate Count and Switching Activity
- Coverage Estimation Using Transition Perturbation for Symbolic Model Checking in Hardware Verification(Simulation and Verification,VLSI Design and CAD Algorithms)
- Structural Coverage of Traversed Transitions for Symbolic Model Checking
- Power Optimization of Sequential Circuits Using Switching Activity Based Clock Gating
- Checker circuit generation for System Verilog Assertions in prototyping verification (System LSI Design Technology)
- Efficient Hybrid Grid Synthesis Method Based on Genetic Algorithm for Power/Ground Network Optimization with Dynamic Signal Consideration
- Automatic Multi-Stage Clock Gating Optimization Using ILP Formulation
- Multi-Operand Adder Synthesis Targeting FPGAs
- On Gate Level Power Optimization of Combinational Circuits Using Pseudo Power Gating
- Write Control Method for Nonvolatile Flip-Flops Based on State Transition Analysis
- An Exact Approach for GPC-Based Compressor Tree Synthesis
- Dual-Stage Pseudo Power Gating with Advanced Clustering Algorithm for Gate Level Power Optimization