Robust Speech Recognition Using Discrete-Mixture HMMs(Speech and Hearing)
Abstract
This paper introduces new methods for robust speech recognition using discrete-mixture HMMs (DMHMMs). The aim of this work is to develop speech recognition that is robust in adverse conditions containing both stationary and non-stationary noise. In particular, we focus on impulsive noise, which is a major problem in practical speech recognition systems. Two strategies are considered for this problem. In the first strategy, adverse conditions are represented by an acoustic model. In this case, a large amount of training data and accurate acoustic models are required to represent a variety of acoustic environments. This strategy is suitable for recognition in stationary or slowly varying noise conditions. The second strategy is to compensate corrupted frames so as to reduce their adverse effect. Because impulsive noise has a wide variety of features and is difficult to model, the second strategy is employed for it. To realize these strategies, we propose two methods, both based on the DMHMM framework, which is one type of discrete HMM (DHMM). The first is a maximum a posteriori (MAP) estimation method for DMHMM parameters, which aims to improve trainability. The second is a method that compensates the observation probabilities of DMHMMs with a threshold to reduce the adverse effect of outlier values. Observation probabilities of impulsive noise tend to be much smaller than those of normal speech. The motivation for this approach is that flooring the observation probability reduces the adverse effect caused by impulsive noise. Experimental evaluations on Japanese LVCSR for read newspaper speech showed that the proposed method achieved an average error rate reduction of 48.5% in impulsive noise conditions. In addition, experiments in adverse conditions containing both stationary and impulsive noise showed that the proposed method achieved an average error rate reduction of 28.1%.
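The flooring idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say at which level the threshold is applied or how it is chosen, so the sketch assumes a single floor on the per-frame log observation probability. The function names, the probabilities, and the floor value are all hypothetical.

```python
import math


def floor_log_probs(log_probs, log_floor):
    """Clamp each per-frame log observation probability from below.

    Assumption: the floor is applied per frame; the abstract does not
    specify the level at which the threshold acts.
    """
    return [max(lp, log_floor) for lp in log_probs]


def path_score(log_probs, log_floor=None):
    """Sum the per-frame log probabilities along one state path."""
    if log_floor is not None:
        log_probs = floor_log_probs(log_probs, log_floor)
    return sum(log_probs)


# Hypothetical per-frame observation probabilities for two competing paths.
# Frame 3 is hit by an impulse, so both paths score it very poorly, and the
# otherwise better-matching path happens to be penalized far more.
correct_path = [math.log(p) for p in (0.20, 0.25, 1e-12, 0.22)]
wrong_path = [math.log(p) for p in (0.05, 0.06, 1e-6, 0.05)]

log_floor = math.log(1e-4)  # hypothetical threshold

print("no floor  :", path_score(correct_path), path_score(wrong_path))
print("with floor:", path_score(correct_path, log_floor),
      path_score(wrong_path, log_floor))
```

In this toy example the single corrupted frame makes the wrong path win when no floor is used. With the floor, the outlier frame contributes the same bounded penalty to both paths, and the remaining frames decide the comparison.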
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2005-12-01
Authors
-
Kohda Masaki
Faculty Of Engineering Yamagata University
-
Kohda Masaki
Graduate School Of Science And Engineering Yamagata University
-
KOSAKA Tetsuo
Graduate School of Science and Engineering, Yamagata University
-
KOSAKA Tetsuo
Faculty of Engineering, Yamagata University
-
KATOH Masaharu
Faculty of Engineering, Yamagata University
-
Katoh Masaharu
Graduate School Of Science And Engineering Yamagata University
Related papers
- Fast optimization of language model weight and insertion penalty from n-best candidates
- Lecture Speech Recognition Using Discrete-Mixture HMMs
- Unsupervised Speaker Adaptation Using Speaker-Class Models for Lecture Speech Recognition
- Histogram equalization for noise-robust speech recognition using discrete-mixture HMMs
- Robust Speech Recognition Using Discrete-Mixture HMMs(Speech and Hearing)