Duration Modeling with Decreased Intra-Group Temporal Variation for HMM-Based Phoneme Recognition
Abstract
A new clustering method was proposed to increase the effect of duration modeling in HMM-based phoneme recognition. Close observation of the temporal correspondences between a phoneme HMM whose output probabilities were modeled by single Gaussians and its training data revealed two extreme cases: phoneme classes with several types of correspondence that differed completely from one another, and classes with only one type of correspondence. Although duration modeling is commonly used to incorporate temporal information into HMMs, it could not model the former case well. Further observation of phoneme HMMs with Gaussian mixture output probabilities showed that some HMMs still had multiple temporal correspondences, although there were fewer such phonemes than with single Gaussian modeling. Conventional methods, in which the duration distribution of each HMM state is represented by a distribution function, cannot yield appropriate duration models for these phoneme HMMs. To cope with this problem, the proposed method clusters phoneme classes that have multiple types of temporal correspondence into sub-classes, so as to reduce the variation of temporal correspondences within each sub-class. After the clustering, an HMM was constructed for each sub-class. Using the proposed method, speaker-dependent recognition experiments were performed on phonemes segmented from isolated words. The recognition rate increased by a few percent, an improvement that was not obtained by another method based on duration modeling with a Gaussian mixture.
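The sub-class clustering described in the abstract can be illustrated with a small sketch. The abstract does not say which clustering algorithm or which representation of a "temporal correspondence" the authors used, so everything here is an assumption: each phoneme token is represented by the number of frames aligned to each HMM state (for example from a Viterbi alignment), those counts are normalized to fractions, and plain k-means with a deterministic farthest-point initialization stands in for the paper's clustering. The function names (`cluster_alignments`, `within_variation`) are hypothetical, and training one HMM per sub-class is not shown.

```python
from typing import List, Sequence


def normalize(durations: Sequence[int]) -> List[float]:
    """Convert per-state frame counts into fractions of the token length."""
    total = sum(durations)
    return [d / total for d in durations]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cluster_alignments(tokens: Sequence[Sequence[int]], k: int = 2,
                       iterations: int = 20) -> List[int]:
    """Assign each token to one of k sub-classes with k-means.

    Assumption: k-means is a stand-in, not the method from the paper.
    """
    points = [normalize(t) for t in tokens]
    # Deterministic farthest-point initialization of the centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points,
                       key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(farthest)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every token.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: recompute each non-empty centroid.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels


def within_variation(tokens: Sequence[Sequence[int]],
                     labels: Sequence[int]) -> float:
    """Sum of squared distances of tokens to their sub-class mean."""
    points = [normalize(t) for t in tokens]
    total = 0.0
    for j in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == j]
        center = mean_vector(members)
        total += sum(sq_dist(p, center) for p in members)
    return total


if __name__ == "__main__":
    # Hypothetical frame counts for a 3-state phoneme HMM: three tokens
    # dwell mostly in the first state, three mostly in the last state.
    tokens = [[8, 2, 2], [7, 3, 2], [9, 2, 1],
              [2, 2, 8], [2, 3, 7], [1, 2, 9]]
    labels = cluster_alignments(tokens, k=2)
    print("sub-class labels:", labels)
    print("variation, one class:", within_variation(tokens, [0] * len(tokens)))
    print("variation, sub-classes:", within_variation(tokens, labels))
```

In this toy data the intra-group variation drops once the two alignment patterns are separated, which is the effect the abstract attributes to the sub-class split. The real features, distance measure, and number of sub-classes would have to come from the paper itself.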
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 1995-06-25
Authors
- Hirose Keikichi, Faculty of Engineering, The University of Tokyo
- Minematsu N, Univ. Tokyo Tokyo Jpn
- Minematsu Nobuaki, Faculty of Engineering, The University of Tokyo
Related papers
- Duration Modeling with Decreased Intra-Group Temporal Variation for HMM-Based Phoneme Recognition
- Tone Recognition of Chinese Dissyllables Using Hidden Markov Models
- A Dialogue Processing System for Speech Response with High Adaptability to Dialogue Topics (Special Issue on Speech and Discourse Processing in Dialogue Systems)