Transpose-free Variable-Size FFT Accelerator Based On-chip SRAM
Abstract
This paper presents a transpose-free, variable-size fast Fourier transform (FFT) accelerator on a digital signal processing (DSP) chip. Several parallel schemes are used to compute batches of small-size FFTs with high performance and throughput. For medium- and large-size FFTs, we propose a transpose-free Cooley-Tukey scheme that exploits the random-access capability of on-chip SRAM to avoid column-wise DDR accesses to the data matrix, which improves DDR bandwidth utilization. Experimental results show that our FFT accelerator, implemented with a 65 nm library and running at 500 MHz, improves energy efficiency by two orders of magnitude over an Intel Xeon CPU and achieves a performance improvement of more than 50x over a TI TMS320C64X DSP chip.
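The general idea behind a transpose-free Cooley-Tukey decomposition can be illustrated in software. The following is a minimal sketch under stated assumptions, not the accelerator's actual datapath: a length-N input with N = n1 * n2 is treated as an n1-by-n2 row-major matrix, a plain Python list named `buf` stands in for the on-chip SRAM, and the column pass uses strided reads from that buffer instead of building a transposed copy. The function names `dft` and `fft_2d_decomposed` are hypothetical, and the naive DFT is only a placeholder for a small-size FFT kernel.

```python
import cmath


def dft(seq):
    """Naive O(n^2) DFT; a stand-in for a small-size FFT kernel."""
    n = len(seq)
    return [
        sum(seq[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_2d_decomposed(x, n1, n2):
    """Length n1*n2 DFT via a two-pass Cooley-Tukey split, with no explicit transpose.

    Index mapping (assumed here): input n = n2*r + c, output k = k1 + n1*k2.
    """
    n = n1 * n2
    assert len(x) == n

    # Assumption: `buf` models on-chip SRAM filled by sequential (row-wise) reads.
    buf = list(x)

    # Column pass: each column is fetched with a strided read from the buffer,
    # transformed, scaled by its twiddle factors, and written back in place.
    for c in range(n2):
        col = dft(buf[c::n2])  # n1 samples, stride n2
        for k1 in range(n1):
            twiddle = cmath.exp(-2j * cmath.pi * c * k1 / n)
            buf[k1 * n2 + c] = col[k1] * twiddle

    # Row pass: rows are contiguous in the buffer; results are scattered
    # to their final positions k1 + n1*k2 in the output.
    out = [0j] * n
    for k1 in range(n1):
        row = dft(buf[k1 * n2:(k1 + 1) * n2])
        for k2 in range(n2):
            out[k1 + n1 * k2] = row[k2]
    return out
```

In this sketch, only the buffer sees strided accesses; the source array is read once in order, which mirrors the motivation of keeping column-wise traffic off external memory.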
Authors
- Lei Yuanwu, National Laboratory for Parallel and Distributed Processing, National University of Defense Technology
- Zhou Jie, National Laboratory for Parallel and Distributed Processing, National University of Defense Technology
- TANG Yuhua, National Laboratory for Parallel and Distributed Processing, National University of Defense Technology
- GUO Lei, National Laboratory for Parallel and Distributed Processing, National University of Defense Technology
- Dou Yong, National Laboratory for Parallel and Distributed Processing, National University of Defense Technology
Related papers
- FPGA-Specific Custom VLIW Architecture for Arbitrary Precision Floating-Point Arithmetic
- High performance sparse matrix-vector multiplication on FPGA
- Window Memory Layout Scheme for Alternate Row-Wise/Column-Wise Matrix Access