A Binary-Tree Hierarchical Multiple-Chip Architecture for Real-Time Large-Scale Learning Processor Systems
Abstract
A binary-tree hierarchical multiple-chip processor architecture leveraging the $K$-means clustering algorithm has been developed for real-time learning of large amounts of sample data. To improve the computational speed, an embedded memory configuration has been used to store all sample data on the same chip for massively parallel processing. As a solution to the problem of the maximum sample data size limited by the chip area, a multiple-chip architecture has been developed, in which dedicated processor chips are connected in a binary-tree hierarchical structure. As a result, the system has been made extendible to any larger scale depending on the application needs. Furthermore, the pipeline calculation flow has been introduced to compensate for the interchip data communication delay. A proof-of-concept chip was designed using a 0.18 μm five-metal complementary metal–oxide–semiconductor (CMOS) technology. The chip operation was verified by NanoSim simulation, and pipeline calculation flow was demonstrated by test chip measurement.
- Paper published by the Japan Society of Applied Physics through the Institute of Pure and Applied Physics
- 2010-04-25
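The binary-tree organization described in the abstract can be illustrated in software. The following is a minimal sketch, not the chip design itself: it assumes each leaf "chip" holds its own slice of the sample data, computes per-centroid partial sums and counts for one $K$-means iteration, and that these partial results are merged pairwise up a binary tree. The function names (`leaf_partial`, `merge`, `tree_reduce`, `kmeans_step`) and the toy data are hypothetical, and the pipelining that hides interchip communication delay is not modeled.

```python
from typing import List, Tuple

Vector = List[float]
# Per-centroid coordinate sums and per-centroid sample counts.
Partial = Tuple[List[Vector], List[int]]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def leaf_partial(samples: List[Vector], centroids: List[Vector]) -> Partial:
    """One hypothetical leaf chip: assign local samples, accumulate sums."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for s in samples:
        k = min(range(len(centroids)), key=lambda i: sq_dist(s, centroids[i]))
        counts[k] += 1
        sums[k] = [a + b for a, b in zip(sums[k], s)]
    return sums, counts


def merge(left: Partial, right: Partial) -> Partial:
    """Combine the partial results of two child nodes (assumed merge rule)."""
    sums = [[a + b for a, b in zip(ls, rs)] for ls, rs in zip(left[0], right[0])]
    counts = [a + b for a, b in zip(left[1], right[1])]
    return sums, counts


def tree_reduce(partials: List[Partial]) -> Partial:
    """Merge pairwise, level by level, as a binary tree of nodes would."""
    level = partials
    while len(level) > 1:
        nxt = [merge(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an unpaired node passes through to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


def kmeans_step(chips: List[List[Vector]], centroids: List[Vector]) -> List[Vector]:
    """One K-means update: leaf partials, tree merge, then new centroids."""
    sums, counts = tree_reduce([leaf_partial(c, centroids) for c in chips])
    return [
        [v / n for v in s] if n else old  # keep an empty cluster's old centroid
        for s, n, old in zip(sums, counts, centroids)
    ]


# Toy data: six 2-D samples spread over four hypothetical leaf chips.
chips = [
    [[0.0, 0.0], [0.0, 2.0]],
    [[10.0, 10.0]],
    [[2.0, 0.0], [2.0, 2.0]],
    [[10.0, 12.0]],
]
centroids = [[1.0, 0.0], [9.0, 9.0]]
for _ in range(5):
    centroids = kmeans_step(chips, centroids)
print(centroids)  # [[1.0, 1.0], [10.0, 11.0]]
```

Because the merge only adds sums and counts, doubling the number of leaves adds one level to the tree rather than changing the per-leaf work, which is the sense in which such a structure scales with the sample count.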
Authors
-
Shibata Tadashi
Department of Electrical Engineering and Information Systems, The University of Tokyo, Bunkyo, Tokyo 113-8656, Japan
-
Yitao Ma
Department of Electrical Engineering and Information Systems, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo 113-8656, Japan
Related papers
- Electron Spin Resonance in One-Dimensional Antiferromagnet CuGeO_3
- Fully-Parallel VLSI Implementation of Vector Quantization Processor Using Neuron-MOS Technology (Special Issue on Integrated Electronics and New System Paradigms)
- A Comparative Examination of Ion Implanted n^+p Junctions Annealed at 1000℃ and 450℃
- Effect of Substrate Boron Concentration on the Integrity of 450℃-Annealed Ion-Implanted Junctions
- Reducing Reverse-Bias Current in 450℃-Annealed n^+p Junction by Hydrogen Radical Sintering
- Oscillatory High-Field Magnetization in LaP Doped with Ce
- An Ego-Motion Detection System Employing Directional-Edge-Based Motion Field Representations
- A Compact Memory-Merged Vector-Matching Circuitry for Neuron-MOS Associative Processor (Special Issue on Integrated Electronics and New System Paradigms)
- Low Power Neuron-MOS Technology for High-Functionality Logic Gate Synthesis (Special Issue on New Concept Device and Novel Architecture LSIs)
- Minimizing Wafer Surface Damage and Chamber Material Contamination in New Plasma Processing Equipment
- An Analog Edge-Filtering Processor Employing Only-Nearest-Neighbor Interconnects
- Hot-Carrier-Immunity Degradation in Metal Oxide Semiconductor Field Effect Transistors Caused by Ion-Bombardment Processes
- New compact and power-efficient implementations of rank-order-filters and sorting engines using time-domain computation technique (Image Engineering)
- New compact and power-efficient implementations of rank-order-filters and sorting engines using time-domain computation technique (Signal Processing)
- New compact and power-efficient implementations of rank-order-filters and sorting engines using time-domain computation technique (Integrated Circuits)
- A Right-Brain/Left-Brain Integrated Associative Processor Employing Convertible MIMD Elements
- A High-Performance Ramp-Voltage-Scan Winner-Take-All Circuit in an Open Loop Architecture
- A High-Performance Time-Domain Winner-Take-All Circuit Employing OR-Tree Architecture
- Automatic Defect Pattern Detection on LSI Wafers Using Image Processing Techniques
- Optimizing Vector-Quantization Processor Architecture for Intelligent Query-Search Applications
- Optimizing Associative Processor Architecture for Intelligent Internet Search Applications
- A Compact and Power-Efficient Implementation of Rank Order Filters Using Time-Domain Digital Computation Technique
- Superior Generalization Capability of Hardware-Learning Algorithm Developed for Self-Learning Neuron-MOS Neural Networks
- Functionality Enhancement in Elemental Devices for Implementing Intelligence on Integrated Circuits (Special Issue on New Concept Device and Novel Architecture LSIs)
- A Moving-Object-Localization Hardware Algorithm Employing OR-Amplification of Pixel Activities
- An Edge Cache Memory Architecture for Early Visual Processing VLSIs
- Fully Parallel Self-Learning Analog Support Vector Machine Employing Compact Gaussian Generation Circuits (Special Issue : Solid State Devices and Materials (2))
- Efficient Image-Vector-Generation Processor for Edge-Based Complementary Feature Representations
- A Hardware-Implementation-Friendly Pulse-Coupled Neural Network Algorithm for Analog Image-Feature-Generation Circuits
- A Simple Random Noise Generator Employing Metal-Oxide-Semiconductor-Field-Effect-Transistor Channel $kT/C$ Noise and Low-Capacitance Loading Buffer
- Hardware Architecture for Pseudo-Two-Dimensional Hidden-Markov-Model-Based Face Recognition Systems Employing Laplace Distribution Functions
- Moving-Object-Localization Hardware Algorithm Employing OR-Amplification of Pixel Activities
- Real-Time Very Large-Scale Integration Recognition System with an On-Chip Adaptive K-Means Learning Algorithm
- Compact and Power-Efficient Implementation of Rank-Order Filters Using Time-Domain Digital Computation Technique
- A Digital-Pixel-Sensor-Based Global Feature Extraction Processor for Real-Time Object Recognition
- Right-Brain/Left-Brain Integrated Associative Processor Employing Convertible Multiple-Instruction-Stream Multiple-Data-Stream Elements
- A Binary-Tree Hierarchical Multiple-Chip Architecture for Real-Time Large-Scale Learning Processor Systems