The Use of Transformed Normal Speech Data in Acoustic Model Training for Non-Audible Murmur Recognition
Abstract
This paper presents a novel approach to acoustic model training for Non-Audible Murmur (NAM) recognition that uses normal speech data transformed into NAM data. NAM is an extremely soft murmur, so quiet that people around the speaker can hardly hear it. NAM recognition is a promising silent speech interface for man-machine speech communication. Our previous work showed that Speaker Adaptive Training (SAT) based on Constrained Maximum Likelihood Linear Regression (CMLLR) is effective for NAM acoustic model training. However, because the amount of available NAM data is still small, the effect of SAT is limited. In this paper, we propose modified SAT methods that can use a larger amount of normal speech data by transforming it into NAM data. The normal speech data are transformed with CMLLR adaptation. Experimental results demonstrate that the proposed methods yield an absolute increase of around 2% in word accuracy compared with the conventional method.
- 2011-01-28
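As a rough illustration of the transformation step described in the abstract: CMLLR is commonly applied as an affine transform in feature space, mapping each feature vector x to A x + b. The sketch below shows only how such a transform, once estimated, could be applied to normal speech frames. The function name `apply_cmllr`, the 2-dimensional features, and all numeric values are hypothetical and chosen for illustration; the maximum-likelihood estimation of A and b, and the authors' actual training procedure, are not shown.

```python
def apply_cmllr(frames, A, b):
    """Apply an affine feature-space transform x' = A x + b to each frame.

    frames: list of feature vectors (lists of floats)
    A:      square matrix as a list of rows (hypothetical, assumed already estimated)
    b:      bias vector (hypothetical, assumed already estimated)
    """
    transformed = []
    for x in frames:
        # One output dimension per row of A: dot(row, x) + bias term.
        y = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
             for row, b_i in zip(A, b)]
        transformed.append(y)
    return transformed


# Toy values, made up for illustration only.
A = [[0.5, 0.0],
     [0.0, 2.0]]
b = [1.0, -1.0]
normal_speech_frames = [[2.0, 1.0], [4.0, 0.0]]

# Normal speech frames mapped toward the (hypothetical) NAM feature space.
pseudo_nam_frames = apply_cmllr(normal_speech_frames, A, b)
```

In a setting like the one the abstract describes, the transformed frames would be pooled with real NAM data during training; how that pooling interacts with SAT is specific to the paper and is not modeled here.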
Authors
- Denis Babani, Nara Institute of Science and Technology
- Tomoki Toda, Nara Institute of Science and Technology
- Hiroshi Saruwatari, Nara Institute of Science and Technology
- Kiyohiro Shikano, Nara Institute of Science and Technology
Related papers
- Stacked Generalization for Topic Classification of Spoken Inquiries
- The Use of Transformed Normal Speech Data in Acoustic Model Training for Non-Audible Murmur Recognition
- An Evaluation of Discriminative Training for Hidden Markov Models in a Real-Environment Speech-Oriented Guidance System
- Inquiry Classification in a Speech-Oriented Guidance System Using Discriminative Learning
- Comparison of Methods for Topic Classification of Spoken Inquiries (Preprint)