Committee-Based Active Learning for Speech Recognition
Abstract
We propose a committee-based active learning method for large vocabulary continuous speech recognition. In this approach, multiple recognizers are trained, and their recognition results are used to select utterances. The utterances whose recognition results differ the most among the recognizers are selected and transcribed. Progressive alignment and voting entropy are used to measure the degree of disagreement among the recognizers on each recognition result. The method was evaluated on 191 hours of speech data from the Corpus of Spontaneous Japanese. It proved to be significantly better than random selection: it required only 63 h of data to achieve a word accuracy of 74%, whereas standard training (i.e., random selection) required 103 h. It also proved to be significantly better than conventional uncertainty sampling using word posterior probabilities.
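The selection criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made here for illustration: the committee's hypotheses are taken to be already aligned into equal-length token sequences (the progressive alignment step itself is not shown), the `GAP` token and all function names are hypothetical, and the per-utterance score is taken to be the mean of the position-wise vote entropies, normalized by the log of the committee size. The abstract does not specify these details.

```python
import math
from collections import Counter

GAP = "<gap>"  # hypothetical placeholder for an alignment gap


def vote_entropy(votes):
    """Entropy of the committee's votes at one aligned position.

    Normalized by log(committee size) so the value lies in [0, 1]
    (this normalization is an assumption of the sketch).
    """
    k = len(votes)
    if k < 2:
        return 0.0
    counts = Counter(votes)
    entropy = sum((c / k) * math.log(k / c) for c in counts.values())
    return entropy / math.log(k)


def utterance_disagreement(aligned_hyps):
    """Mean vote entropy over all aligned positions of one utterance.

    aligned_hyps: one token list per recognizer, all of equal length,
    with GAP inserted where a recognizer has no word at that position.
    """
    length = len(aligned_hyps[0])
    if any(len(hyp) != length for hyp in aligned_hyps):
        raise ValueError("hypotheses must be aligned to equal length")
    if length == 0:
        return 0.0
    columns = zip(*aligned_hyps)  # one tuple of votes per position
    return sum(vote_entropy(col) for col in columns) / length


def select_for_transcription(pool, n):
    """Return the n utterance IDs with the highest committee disagreement.

    pool: dict mapping utterance ID -> aligned hypotheses.
    """
    ranked = sorted(pool, key=lambda u: utterance_disagreement(pool[u]),
                    reverse=True)
    return ranked[:n]
```

Calling `select_for_transcription(pool, n)` on a pool of untranscribed utterances returns the candidates to send for manual transcription. Utterances on which all recognizers agree score 0 and are selected last.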
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2011-10-01
Authors
- SHINODA Koichi, Tokyo Institute of Technology
- FURUI Sadaoki, Tokyo Institute of Technology
- HAMANAKA Yuzo, Tokyo Institute of Technology
- TSUTAOKA Takuya, Tokyo Institute of Technology
- EMORI Tadashi, NEC Corporation
- KOSHINAKA Takafumi, NEC Corporation
Related papers
- Acoustic Model Adaptation for Speech Recognition
- Tree-Structured Clustering Methods for Piecewise Linear-Transformation-Based Noise Adaptation(Speech and Hearing)
- Recent Progress in Corpus-Based Spontaneous Speech Recognition (Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- THE USE OF FINITE-STATE TRANSDUCERS FOR MODELING PHONOLOGICAL AND MORPHOLOGICAL CONSTRAINTS IN AUTOMATIC SPEECH RECOGNITION
- Adaptation to Pronunciation Variations in Indonesian Spoken Query-Based Information Retrieval
- Robust Gait-Based Person Identification against Walking Speed Variations
- Selected Topics from LVCSR Research for Asian Languages at Tokyo Tech
- Active Learning Using Phone-Error Distribution for Speech Modeling
- Distance-based Factor Graph Linearization and Sampled Max-sum Algorithm for Efficient 3D Potential Decoding of Macromolecules
- Spectral Subtraction Based on Non-extensive Statistics for Speech Recognition