Initial evaluation of the drivers' Japanese speech corpus in a car environment (Speech) -- (International Workshop "Asian workshop on speech science and technology")
Abstract
Car navigation systems are becoming increasingly popular, and many of them are equipped with a speech recognition system as a hands-free interface. However, the speech input interface is not widely used because its recognition performance is insufficient. To improve recognition performance and make the speech interface more practical, a real-car-environment speech corpus, "Drivers' Japanese Speech Corpus in a Car Environment", is under construction in a project supported by the Japanese Ministry of Economy, Trade and Industry. In this study, we used the command task portion of the corpus, recorded under three conditions: idling, driving in a city, and driving on a highway. We used the data from the corpus only as a test set and built a recognition system by optimally combining several existing corpora with several noise robustness techniques. Experimental results show that an HMM trained on multiple conditions, combined with spectral subtraction, performs best for car noise. Recognition performance improved substantially, and more than 90% word accuracy was achieved under all recording conditions. In particular, an absolute improvement in accuracy of more than 50% was observed for speech uttered by female speakers while driving on a highway.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-03-13
Authors
-
SHINODA Koichi
Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Instit
-
FURUI Sadaoki
Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Instit
-
HIRAKI Kousuke
Department of Computer Science, Tokyo Institute of Technology
-
SHINOZAKI Takahiro
Department of Computer Science, Tokyo Institute of Technology
-
BETKOWSKA Agnieszka
Department of Computer Science, Tokyo Institute of Technology
-
IWANO Koji
Department of Computer Science, Tokyo Institute of Technology
Related papers
- Gait-based Person Identification Robust against Speed Variation using CHLAC features and HMMs
- Tree-Structured Clustering Methods for Piecewise Linear-Transformation-Based Noise Adaptation(Speech and Hearing)
- Robust Scene Extraction Using Multi-Stream HMMs for Baseball Broadcast(Image Processing and Video Processing)
- Automatic recognition of Indonesian declarative questions and statements using polynomial coefficients of the pitch contours
- Initial evaluation of the drivers' Japanese speech corpus in a car environment (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Accent analysis for Mandarin large vocabulary continuous speech recognition (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Evaluation of a Noise-Robust Multi-Stream Speaker Verification Method Using F_0 Information
- Topic Extraction Based on Continuous Speech Recognition in Broadcast News Speech
- Cervical Plexus Block Helps in Diagnosis of Orofacial Pain Originating from Cervical Structures
- Robust Acoustic Modeling for Speech Recognition
- Invited: Robust Acoustic Modeling for Speech Recognition (International Workshop "Beyond HMM")
- Noise Robust Speech Recognition Using F_0 Contour Information(Speech Dynamics by Ear, Eye, Mouth and Machine)
- Initial evaluation of the drivers' Japanese speech corpus in a car environment
- Phonology and Morphology Modeling in a Very Large Vocabulary Hungarian Dictation System(Speech and Hearing)
- Language Models for Continuous Speech Recognition
- Dynamic Bayesian Network-Based Acoustic Models Incorporating Speaking Rate Effects(Speech and Hearing)
- Neural-network-based HMM adaptation for noisy speech recognition
- Nonlinear Normalization Using g-Logarithm for Robust Speech Recognition
- Subject Adaptation and Adaptive Training for Gait-based Person Identification (Speech)
- Subject Adaptation and Adaptive Training for Gait-based Person Identification (Pattern Recognition and Media Understanding)
- Speaker Verification Using MMAP Adaptation (Natural Language Understanding and Communication)
- Speaker Verification Using MMAP Adaptation (Speech)
- Subject Adaptation and Adaptive Training for Gait-based Person Identification
- Two-pass Approach for Recognizing Code-Switching Speech
- Online Speaker Clustering Using Incremental Learning of an Ergodic Hidden Markov Model
- A video watermarking method to objects robust against various attacks
- Speaker Verification Using MMAP Adaptation