A K-Means-Based Multi-Prototype High-Speed Learning System with FPGA-Implemented Coprocessor for 1-NN Searching
Abstract
In this paper, we propose a hardware solution to the high computational demands of a nearest neighbor (NN) based multi-prototype learning system. The multiple prototypes are obtained by a high-speed K-means clustering algorithm built on software-hardware cooperation, which combines the flexibility of software with the efficiency of hardware. A one-nearest-neighbor (1-NN) classifier recognizes an object by searching for the prototype at the smallest Euclidean distance from it. The major deficiency of conventional implementations of both K-means and 1-NN is the high computational cost of the nearest neighbor search. We resolve this deficiency with an FPGA-implemented coprocessor, a VLSI circuit that searches for the nearest Euclidean distance. The coprocessor requires 12.9% of the logic elements and 58% of the block memory bits of an Altera Stratix III E110 FPGA device. The hardware communicates with the software through a PCI Express (×4) local-bus-compatible interface. We benchmark our learning system on handwritten digit recognition, a popular task for which abundant previous work is available for comparison. On the MNIST database, we attain the most efficient accuracy rate of 97.91% with 930 prototypes, a learning speed of 1.3×10⁻⁴ s/sample, and a classification speed of 3.94×10⁻⁸ s/character.
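The multi-prototype scheme described in the abstract can be illustrated with a small software-only sketch. It is not the authors' implementation: it assumes K-means is run separately on each class with a fixed number of prototypes per class, and the function names (`sq_dist`, `nearest`, `kmeans`, `train`, `classify`) are hypothetical. The `nearest` function is the step that the paper offloads to the FPGA coprocessor.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance; the square root is not needed for ranking."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(prototypes, x):
    """Index of the prototype closest to x (the 1-NN search step)."""
    return min(range(len(prototypes)), key=lambda i: sq_dist(prototypes[i], x))


def kmeans(samples, k, iters=20, seed=0):
    """Plain Lloyd-style K-means; returns k prototype vectors."""
    rng = random.Random(seed)
    centers = [list(s) for s in rng.sample(samples, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in samples:
            groups[nearest(centers, s)].append(s)
        for i, group in enumerate(groups):
            if group:  # keep the previous center if a cluster is empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def train(data_by_label, k):
    """Assumption: k prototypes are learned independently for each class."""
    model = []
    for label, samples in data_by_label.items():
        for center in kmeans(samples, k):
            model.append((center, label))
    return model


def classify(model, x):
    """Return the label of the prototype nearest to x."""
    prototypes = [p for p, _ in model]
    return model[nearest(prototypes, x)][1]
```

Both `kmeans` and `classify` spend nearly all of their time in `nearest`, which is why accelerating that one search speeds up learning and recognition together.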
Authors
-
Mattausch Hans
Research Center For Nanodevices And Systems Hiroshima University
-
Koide Tetsushi
Research Center For Nanodevices And Systems Hiroshima University
-
MATTAUSCH Hans
Research Institute for Nanodevice and Bio Systems, Hiroshima University
-
KOIDE Tetsushi
Research Institute for Nanodevice and Bio Systems, Hiroshima University
-
AN Fengwei
Research Institute for Nanodevice and Bio Systems, Hiroshima University
Related papers
- 4-Port Unified Data/Instruction Cache Design with Distributed Crossbar and Interleaved Cache-Line Words(Integrated Electronics)
- Acceleration of DCT Processing with Massive-Parallel Memory-Embedded SIMD Matrix Processor(Image Processing and Video Processing)
- Realization of K-Nearest-Matches Search Capability in Fully-Parallel Associative Memories(VLSI Design Technology and CAD)
- A Performance-Driven Floorplanning Method with Interconnect Performance Estimation(Special Section on VLSI Design and CAD Algorithms)
- Circuit-Simulation Model of C_ Changes in Small-Size MOSFETs Due to High Channel-Field Gradients(the IEEE International Conference on SISPAD '02)
- A Compact Model of the Pinch-off Region of 100nm MOSFETs Based on the Surface-Potential(Semiconductor Materials and Devices)
- 1/f-Noise Characteristics in 100 nm-MOSFETs and Its Modeling for Circuit Simulation(Semiconductor Materials and Devices)
- Quantum Effect in Sub-0.1μm MOSFET with Pocket Technologies and Its Relevance for the On-Current Condition
- Circuit Simulation Models for Coming MOSFET Generations(Special Section of Selected Papers from the 14th Workshop on Circuits and Systems in Karuizawa)
- Integration Architecture of Content Addressable Memory and Massive-Parallel Memory-Embedded SIMD Matrix for Versatile Multimedia Processor
- Scalable FPGA/ASIC Implementation Architecture for Parallel Table-Lookup-Coding Using Multi-Ported Content Addressable Memory(Image Processing and Video Processing)
- Real-Time Huffman Encoder with Pipelined CAM-Based Data Path and Code-Word-Table Optimizer(Image Processing and Video Processing)
- A CAM-Based Signature-Matching Co-processor with Application-Driven Power-Reduction Features(Integrated Electronics)
- Digital Low-Power Real-Time Video Segmentation by Region Growing
- 100 nm-MOSFET Model for Circuit Simulation : Challenges and Solutions(Devices and Circuits for Next Generation Multi-Media Communication Systems)
- Analysis of Within-Die Complementary Metal–Oxide–Semiconductor Process Variation with Reconfigurable Ring Oscillator Arrays Using HiSIM
- MOSFET Harmonic Distortion up to the Cutoff Frequency : Measurement and Theoretical Analysis
- 100 nm-MOSFET Model for Circuit Simulation : Challenges and Solutions
- Fast and Compact Central Arbiter for High Access-Bit-Rate Multi-Port Caches
- Bank-Type Associative Memory for High-Speed Nearest Manhattan Distance Search in Large Reference-Pattern Space
- Efficient Video-Picture Segmentation Algorithm for Cell-Network-Based Digital CMOS Implementation(Image Processing, Image Pattern Recognition)
- High Speed Frequency-Mapping-Based Associative Memory Using Compact Multi-Bit Encoders and a Path-Selecting Scheme (Special Issue : Solid State Devices and Materials (2))
- Low-Power Silicon-Area-Efficient Image Segmentation Based on a Pixel-Block Scanning Architecture
- Software-Based Parallel Cryptographic Solution with Massive-Parallel Memory-Embedded SIMD Matrix Architecture for Data-Storage Systems
- A K-Means-Based Multi-Prototype High-Speed Learning System with FPGA-Implemented Coprocessor for 1-NN Searching
- Experimental Analysis of Within-Die Process Variation in 65 and 180 nm Complementary Metal–Oxide–Semiconductor Technology Including Its Distance Dependences
- Mixed Digital–Analog Associative Memory Enabling Fully-Parallel Nearest Euclidean Distance Search
- Automatic Pattern-Learning Architecture Based on Associative Memory and Short/Long Term Storage Concept