Automatic Generation of Non-uniform and Context-Dependent HMMs Based on the Variational Bayesian Approach(Feature Extraction and Acoustic Modelings, <Special Section>Corpus-Based Speech Technologies)
Abstract
We propose a new method both for automatically creating non-uniform, context-dependent HMM topologies and for selecting the number of mixture components, based on the Variational Bayesian (VB) approach. Although the Maximum Likelihood (ML) criterion is generally used to create HMM topologies, it suffers from over-fitting. To avoid this problem, the VB approach has recently been applied to create acoustic models for speech recognition. We introduce the VB approach into the Successive State Splitting (SSS) algorithm, which can create both contextual and temporal variations for HMMs. Experimental results indicate that the proposed method can automatically create a more efficient model than the original method. We also evaluated a method that increases the number of mixture components by using the VB approach and considering temporal structures. The VB approach obtained almost the same performance with a smaller number of mixture components than ML-based methods.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2005-03-01
Authors
-
NAKAMURA Satoshi
Spoken Language Communication Group, Knowledge Creating Communication Research Center, National Inst
-
Jitsuhiro Takatoshi
Spoken Language Translation Research Laboratories Advanced Telecommunications Research Institute Int
-
Nakamura Satoshi
Spoken Language Communication Group Knowledge Creating Communication Research Center National Institute Of Information And Communications Technology
Related papers
- An Improved Greedy Search Algorithm for the Development of a Phonetically Rich Speech Corpus
- Using Mutual Information Criterion to Design an Efficient Phoneme Set for Chinese Speech Recognition
- Automatic Generation of Non-uniform and Context-Dependent HMMs Based on the Variational Bayesian Approach(Feature Extraction and Acoustic Modelings, Corpus-Based Speech Technologies)
- Language Modeling Using Patterns Extracted from Parse Trees for Speech Recognition (Special Issue on Speech Information Processing)
- Automatic Generation of Non-uniform HMM Topologies Based on the MDL Criterion(Speech and Hearing)
- Iterative mapping function estimation and environment structure refinement in the online phase of the ESSEM approach (Speech)
- An Unsupervised Model of Redundancy for Answer Validation