A Sub-100mW Dual-Core HOG Accelerator VLSI for Parallel Feature Extraction Processing for HDTV Resolution Video
Abstract
This paper describes a Histogram of Oriented Gradients (HOG) feature extraction accelerator with three main features: a VLSI-oriented HOG algorithm with early classification in the Support Vector Machine (SVM) stage; a dual-core architecture for parallel feature extraction and multiple-object detection; and a detection-window-size scalable architecture with a reconfigurable MAC array for processing objects of several shapes. To achieve the low power consumption needed for mobile applications, early classification efficiently reduces the amount of computation in SVM classification with no accuracy degradation. The dual-core architecture enables parallel feature extraction within one frame for high-speed or low-power computing, and it enables simultaneous detection of multiple objects at low power through HOG feature sharing. Because the two cores cooperate, objects of several shapes can be detected: vertically long, horizontally long, and square. The proposed methods provide processing capability for HDTV resolution video (1920×1080 pixels) at 30 frames per second (fps). The test chip, fabricated in 65 nm CMOS technology, occupies 4.2 × 2.1 mm² and contains 502 Kgates and 1.22 Mbit of on-chip SRAM. Simulated data show 99.5 mW power consumption at 42.9 MHz and 1.1 V.
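The abstract does not say how early classification is implemented, so the following is only a minimal sketch of one way a linear SVM evaluation can stop early without changing its result. It assumes integer (fixed-point) weights and HOG features quantized to the range 0 to `FEATURE_MAX`; the names `FEATURE_MAX`, `remaining_bounds`, `classify_early`, and `classify_full`, and the chunk size, are hypothetical and not taken from the paper. The idea is that once the partial sum plus the largest (or smallest) value the remaining terms could still contribute lies entirely on one side of zero, the sign of the final score is already decided.

```python
from typing import List, Sequence, Tuple

FEATURE_MAX = 255  # assumption: features are quantized to 8-bit unsigned values


def remaining_bounds(weights: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Suffix bounds on sum(w[i] * x[i] for i >= k) when 0 <= x[i] <= FEATURE_MAX."""
    n = len(weights)
    upper = [0] * (n + 1)
    lower = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        # A positive weight can add at most w * FEATURE_MAX; a negative one adds at most 0.
        upper[i] = upper[i + 1] + max(weights[i], 0) * FEATURE_MAX
        lower[i] = lower[i + 1] + min(weights[i], 0) * FEATURE_MAX
    return upper, lower


def classify_full(weights: Sequence[int], bias: int, features: Sequence[int]) -> bool:
    """Reference linear SVM decision: sign of bias + w . x."""
    return bias + sum(w * x for w, x in zip(weights, features)) >= 0


def classify_early(
    weights: Sequence[int], bias: int, features: Sequence[int], chunk: int = 4
) -> Tuple[bool, int]:
    """Return (decision, number of features consumed), stopping once the sign is fixed."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    if not all(0 <= x <= FEATURE_MAX for x in features):
        raise ValueError("features must lie in [0, FEATURE_MAX]")
    upper, lower = remaining_bounds(weights)
    score = bias
    used = 0
    n = len(weights)
    while used < n:
        end = min(used + chunk, n)
        for i in range(used, end):
            score += weights[i] * features[i]
        used = end
        if score + upper[used] < 0:
            return False, used  # even the best case for the rest stays negative
        if score + lower[used] >= 0:
            return True, used  # even the worst case for the rest stays non-negative
    return score >= 0, used


if __name__ == "__main__":
    w = [-5, 3, -2, 1, -4, 2, -1, 1]
    x = [200, 10, 100, 0, 50, 20, 30, 5]
    print(classify_early(w, -10, x, chunk=2))  # prints (False, 4)
```

In this example the window is rejected after four of the eight features, and the decision matches `classify_full` because the bounds are exact in integer arithmetic. In a detector most windows contain no object, so a rejection test of this kind would be expected to skip a large share of the multiply-accumulate work, though the actual savings depend on the trained weights.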
Authors
- Kawaguchi Hiroshi (Kobe University, Kobe-shi, Japan)
- YOSHIMOTO Masahiko (Kobe University)
- MIZUNO Kosuke (Kobe University)
- TERACHI Yosuke (Kobe University)
- IZUMI Shintaro (Kobe University)
- TAKAGI Kenta (Kobe University)
Related papers
- Cross-Layer Design for Low-Power Wireless Sensor Node Using Wave Clock
- A VGA 30-fps Realtime Optical-Flow Processor Core for Moving Picture Recognition
- A Dependable SRAM with 7T/14T Memory Cells
- A 10T Non-precharge Two-Port SRAM Reducing Readout Power for Video Processing
- Area Comparison between 6T and 8T SRAM Cells in Dual-Vdd Scheme and DVS Scheme (Memory Design and Test, VLSI Design and CAD Algorithms)
- Area Optimization in 6T and 8T SRAM Cells Considering Vth Variation in Future Processes (Next-Generation Memory for SoC, VLSI Technology toward Frontiers of New Market)
- An Energy-Harvesting Wireless-Interface SoC for Short-Range Data Communication
- A 58-μW Single-Chip Sensor Node Processor with Communication Centric Design
- A 433-MHz Rail-to-Rail Voltage Amplifier with Carrier Sensing Function for Wireless Sensor Networks
- Counter-Based Broadcasting with Hop Count Aware Random Assessment Delay Extension for Wireless Sensor Networks
- A Sub 100mW H.264 MP@L4.1 Integer-Pel Motion Estimation Processor Core for MBAFF Encoding with Reconfigurable Ring-Connected Systolic Array and Segmentation-Free, Rectangle-Access Search-Window Buffer
- Data Transmission Scheduling Based on RTS/CTS Exchange for Periodic Data Gathering Sensor Networks(Ubiquitous Sensor Networks)
- Aggregation Efficiency-Aware Greedy Incremental Tree Routing for Wireless Sensor Networks(Mobile Multimedia Communications)
- A 50% Power Reduction in H.264/AVC HDTV Video Decoder LSI by Dynamic Voltage Scaling in Elastic Pipeline(VLSI Architecture,VLSI Design and CAD Algorithms)
- A sub-mW H.264 baseline-profile motion estimation processor core with a VLSI-oriented block partitioning strategy and SIMD/systolic-array architecture
- A Power- and Area-Efficient SRAM Core Architecture with Segmentation-Free and Horizontal/Vertical Accessibility for Super-Parallel Video Processing(Novel Device Architectures and System Integration Technologies)
- Service Interval Optimization with Delay Bound Guarantee for HCCA in IEEE802.11e WLANs(Network)
- A New Scheduler to Guarantee Delay Bound with Bandwidth Optimization for HCCA in IEEE 802.11e WLANs (QoS and Traffic Management (2), Ubiquitous Networks, Mobile Networks, and General)
- Low-Power Low-Leakage FPGA Design Using Zigzag Power Gating, Dual-VTH/VDD and Micro-VDD-Hopping (Low Power Techniques, VLSI Design Technology in the Sub-100nm Era)
- Trends of On-Chip Interconnects in Deep Sub-Micron VLSI (Interconnect Technique, VLSI Design Technology in the Sub-100nm Era)
- A 0.3-V operating, Vth-variation-tolerant SRAM under DVS environment for memory-rich SoC in 90-nm technology era and beyond
- Low-Power High-Speed Reduced-Clock-Swing Flip-Flops Based on Contention Reduction Techniques
- A Low-Power Real-Time SIFT Descriptor Generation Engine for Full-HDTV Video Recognition
- VLSI Architecture of GMM Processing and Viterbi Decoder for 60,000-Word Real-Time Continuous Speech Recognition
- 0.5-V 4-MB Variation-Aware Cache Architecture Using 7T/14T SRAM and Its Testing Scheme (System LSI Design Methodology Vol.5)
- A Low-Power Multi Resolution Spectrum Sensing Architecture for a Wireless Sensor Network with Cognitive Radio
- Divided Static Random Access Memory for Data Aggregation in Wireless Sensor Nodes
- A Low-Power Multi-Phase Oscillator with Transfer Gate Phase Coupler Enabling Even-Numbered Phase Output
- 7T SRAM Enabling Low-Energy Instantaneous Block Copy and Its Application to Transactional Memory
- Multiple-Bit-Upset and Single-Bit-Upset Resilient 8T SRAM Bitcell Layout with Divided Wordline Structure
- A 0.15-µm FD-SOI Substrate Bias Control SRAM with Inter-Die Variability Compensation Scheme
- A 40-nm 0.5-V 12.9-pJ/Access 8T SRAM Using Low-Energy Disturb Mitigation Scheme
- A Process-Variation-Adaptive Network-on-Chip with Variable-Cycle Routers and Variable-Cycle Pipeline Adaptive Routing
- A 128-bit Chip Identification Generating Scheme Exploiting Load Transistors' Variation in SRAM Bitcells
- A 168-mW 2.4×-Real-Time 60-kWord Continuous Speech Recognition Processor VLSI
- Multiple-Cell-Upset Tolerant 6T SRAM Using NMOS-Centered Cell Layout
- Bit-Error and Soft-Error Resilient 7T/14T SRAM with 150-nm FD-SOI Process
- Soft-Error Resilient and Margin-Enhanced N-P Reversed 6T SRAM Bitcell