Task Allocation with Algorithm Transformation for Reducing Data-Transfer Bottlenecks in Heterogeneous Multi-Core Processors : A Case Study of HOG Descriptor Computation
Abstract
Heterogeneous multi-core processors are attractive for media processing applications because they can draw on the strengths of different cores to improve overall performance. However, data-transfer bottlenecks and limitations in task allocation caused by accelerator-incompatible operations prevent us from realizing the full potential of heterogeneous multi-core processors. This paper presents a task allocation method based on algorithm transformation that increases the freedom of task allocation. We use approximation methods such as CORDIC algorithms to map accelerator-incompatible operations onto accelerator cores. According to experimental results using HOG descriptor computation, the proposed task allocation method reduces the data-transfer time by more than 82% and the total processing time by more than 79% compared to the conventional task allocation method.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2010-12-01
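The abstract names CORDIC as one approximation method for moving accelerator-incompatible operations onto accelerator cores. The following is a minimal sketch of CORDIC in vectoring mode, which approximates the magnitude and angle of a vector using only add, subtract, and scale-by-power-of-two steps. It assumes, without confirmation from the record, that the operations in question resemble the square root and arctangent of a gradient computation; the names `cordic_vectoring`, `N_ITER`, `ANGLES`, and `GAIN` are hypothetical, and floating point is used here only for readability where hardware would use fixed-point shifts.

```python
import math

# Hypothetical parameters for this sketch (not taken from the paper).
N_ITER = 16  # number of CORDIC iterations; more iterations give a finer angle

# Elementary rotation angles atan(2^-i); on hardware this is a small lookup table.
ANGLES = [math.atan(2.0 ** -i) for i in range(N_ITER)]

# Each iteration stretches the vector by sqrt(1 + 2^-2i); divide by the total once.
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(N_ITER))


def cordic_vectoring(gx, gy):
    """Approximate (magnitude, angle in radians) of the vector (gx, gy)."""
    x, y = float(gx), float(gy)
    angle = 0.0

    # CORDIC converges only for vectors in the right half-plane,
    # so pre-rotate by 180 degrees when x is negative.
    if x < 0:
        angle = math.pi if y >= 0 else -math.pi
        x, y = -x, -y

    # Rotate the vector toward the x-axis, accumulating the applied angle.
    for i in range(N_ITER):
        shift = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        if y > 0:
            x, y = x + y * shift, y - x * shift
            angle += ANGLES[i]
        else:
            x, y = x - y * shift, y + x * shift
            angle -= ANGLES[i]

    # After the loop y is close to 0 and x holds GAIN * magnitude.
    return x / GAIN, angle


if __name__ == "__main__":
    mag, ang = cordic_vectoring(3, 4)
    print(mag, ang, math.hypot(3, 4), math.atan2(4, 3))
```

With 16 iterations the residual angle is bounded by roughly atan(2^-15), so the result agrees with `math.hypot` and `math.atan2` to about four decimal places; fewer iterations trade accuracy for fewer shift-add steps.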
Authors
- Hariyama Masanori, Graduate School Of Information Sciences Tohoku University
- Hariyama Masanori, The Department Of Computer And Mathematical Sciences Graduate School Of Information Sciences Tohoku
- Waidyasooriya Hasitha, Graduate School Of Information Sciences Tohoku University
- Kameyama M, Graduate School Of Information Sciences Tohoku University
- Kameyama Michitaka, Graduate School Of Information Science Tohoku University
- Hariyama M, Graduate School Of Information Sciences Tohoku University
- OKUMURA Daisuke, Graduate School of Information Sciences, Tohoku University
- Okumura Daisuke, Graduate School Of Information Sciences Tohoku University
Related papers
- Network coding-based reliable multicast scheme in wireless networks (Wireless Communication Systems)
- Adaptive Group-Based Job Scheduling for High Performance and Reliable Volunteer Computing
- Design and Evaluation of Fine-Grain Field-Programmable VLSI Based on Multiple-Valued Source-Coupled Logic
- FPGA Implementation of a Stereo Matching Processor Based on Window-Parallel-and-Pixel-Parallel Architecture(VLSI Architecture, VLSI Design and CAD Algorithms)
- Architecture of a Stereo Matching VLSI Processor Based on Hierarchically Parallel Memory Access(Digital Circuits and Computer Arithmetic, Recent Advances in Circuits and Systems-Part 1)
- C-12-8 Design of a Very Compact Cell for a Multiple-Valued Fine-Grain Reconfigurable VLSI
- Implementation of a DRAM-Cell-Based Multiple-Valued Logic-in-Memory Circuit
- Dynamic-Storage-Based Logic-in-Memory Circuit and Its Application to a Fine-Grain Pipelined System(Special Issue on High-Performance and Low-Power Microprocessors)
- Group Testing Based Detection of Web Service DDoS Attackers
- A Multi-Context FPGA Using Floating-Gate-MOS Functional Pass-Gates(Novel Device Architectures and System Integration Technologies)
- Architecture of a Fine-Grain Field-Programmable VLSI Based on Multiple-Valued Source-Coupled Logic(New System Paradigms for Integrated Electronics)
- Design of Highly Parallel Linear Digital System for ULSI Processors (Special Issue on New Architecture LSIs)
- Code Assignment Algorithm for Highly Parallel Multiple-Valued Combinational Circuits Based on Partition Theory (Special Issue on Multiple-Valued Logic)
- Advanced VLSI Architecture for Intelligent Integrated Systems(Plenary Session,AWAD2006)
- Design of a Trinocular-Stereo-Vision VLSI Processor Based on Optimal Scheduling
- Minimizing Energy Consumption Based on Dual-Supply-Voltage Assignment and Interconnection Simplification(Novel Device Architectures and System Integration Technologies)
- Low-Power Field-Programmable VLSI Using Multiple Supply Voltages(Low Power Methodology, VLSI Design and CAD Algorithms)
- C-12-4 Low Power Field Programmable VLSI Processor Using Multiple Supply Voltages
- Field-Programmable VLSI Based on a Bit-Serial Fine-Grain Architecture(New System Paradigms for Integrated Electronics)
- Highly-Parallel Stereo Vision VLSI Processor Based on an Optimal Parallel Memory Access Scheme
- An FPGA-Oriented Motion-Stereo Processor with a Simple Interconnection Network for Parallel Memory Access
- Architecture of a high-performance stereo vision VLSI processor
- Collision Detection VLSI Processor for Highly-Safe Intelligent Vehicles Using a Multiport Content-Addressable Memory
- A Three-Dimensional Instrumentation VLSI Processor Based on a Concurrent Memory-Access Scheme
- A VLSI-Oriented Model-Based Robot Vision Processor for 3-D Instrumentation and Object Recognition (Special Issue on Super Chip for Intelligent Integrated Systems)
- Generalized Hough Transform VLSI Processor for Model-Based Edge Detection
- Fine-Grain Multiple-Valued Reconfigurable VLSI Using Series-Gating Differential-Pair Circuits and Its Evaluation
- Implementation of a Partially Reconfigurable Multi-Context FPGA Based on Asynchronous Architecture
- Evaluation of Interconnect-Complexity-Aware Low-Power VLSI Design Using Multiple Supply and Threshold Voltages
- Memory Allocation for Multi-Resolution Image Processing
- Evaluation of a Field-Programmable VLSI Based on an Asynchronous Bit-Serial Architecture
- Multi-Context FPGA Using Fine-Grained Interconnection Blocks and Its CAD Environment
- Design of a Reconfigurable Parallel Processor for Digital Control Using FPGAs (Special Issue on Super Chip for Intelligent Integrated Systems)
- Special Section on VLSI Technology toward Frontiers of New Market
- A Minimum-Latency Linear Array FFT Processor for Robotics
- Pixel-Serial and Window-Parallel VLSI Processor for Stereo Matching Using a Variable Window Size
- Multiple-Valued Code Assignment Algorithm for VLSI-Oriented Highly Parallel K-Ary Operation Circuits (Special Issue on New Architecture LSIs)
- Multiple-Valued Programmable Logic Array Based on a Resonant-Tunneling Diode Model
- Design of a CAM-Based Collision Detection VLSI Processor for Robotics (Special Issue on Super Chip for Intelligent Integrated Systems)
- A Collision Detection Processor for Intelligent Vehicles (Special Issue on ASICs for Automotive Electronics)
- Design Methodology for Human-Oriented Intelligent Integrated Systems
- Design and Evaluation of a 4-Valued Universal-Literal CAM for Cellular Logic Image Processing (Special Issue on New Concept Device and Novel Architecture LSIs)
- An Asynchronous FPGA Based on LEDR/4-Phase-Dual-Rail Hybrid Architecture
- A Switch Block Architecture for Multi-Context FPGAs Based on a Ferroelectric-Capacitor Functional Pass-Gate Using Multiple/Binary Valued Hybrid Signals
- Memory Allocation for Window-Based Image Processing on Multiple Memory Modules with Simple Addressing Functions
- Task Allocation with Algorithm Transformation for Reducing Data-Transfer Bottlenecks in Heterogeneous Multi-Core Processors : A Case Study of HOG Descriptor Computation
- Logic-In-Control-Architecture-Based Reconfigurable VLSI Using Multiple-Valued Differential-Pair Circuits
- FOREWORD
- Code Assignment Algorithm for Highly Parallel Multiple-Valued k-Ary Operation Circuits Using Partition Theory
- Design of a Rule-Based Highly-Safe Intelligent Vehicle Using a Content-Addressable Memory
- Implementation of a Low-Power FPGA Based on Synchronous/Asynchronous Hybrid Architecture
- Memory-Access-Driven Context Partitioning for Window-Based Image Processing on Heterogeneous Multicore Processors
- Acceleration of Block Matching on a Low-Power Heterogeneous Multi-Core Processor Based on DTU Data-Transfer with Data Re-Allocation
- Machine Learning Based Adaptive Contour Detection Using Algorithm Selection and Image Splitting
- A Multiple-Valued Reconfigurable VLSI Architecture Using Binary-Controlled Differential-Pair Circuits
- Platform and Mapping Methodology for Heterogeneous Multicore Processors
- Evaluation of an FPGA-Based Heterogeneous Multicore Platform with SIMD/MIMD Custom Accelerators
- Machine Learning Based Adaptive Contour Detection Using Algorithm Selection and Image Splitting (Fundamental Aspects and Recent Developments in Multimedia and VLSI Systems)
- Platform and Mapping Methodology for Heterogeneous Multicore Processors (Fundamental Aspects and Recent Developments in Multimedia and VLSI Systems)
- Multiple-Valued Fine-Grain Reconfigurable VLSI Using a Global Tree Local X-Net Network