Production-Oriented Models for Speech Recognition (Speech Recognition, <Special Section> Statistical Modeling for Speech Processing)
Abstract
Acoustic modeling in speech recognition uses very little knowledge of the speech production process. At many levels our models continue to model speech as a surface phenomenon. Typically, hidden Markov model (HMM) parameters operate primarily in the acoustic space or in a linear transformation thereof; state-to-state evolution is modeled only crudely, with no explicit relationship between states, such as would be afforded by the use of phonetic features commonly used by linguists to describe speech phenomena, or by the continuity and smoothness of the production parameters governing speech. This survey article attempts to provide an overview of proposals by several researchers for improving acoustic modeling in these regards. Such topics as the controversial Motor Theory of Speech Perception, work by Hogden explicitly using a continuity constraint in a pseudo-articulatory domain, the Kalman filter based Hidden Dynamic Model, and work by many groups showing the benefits of using articulatory features instead of phones as the underlying units of speech, will be covered.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2006-03-01
Authors
- Nakamura Atsushi
  NTT Communication Science Laboratories, NTT Corporation
- McDermott Erik
  NTT Communication Science Laboratories, NTT Corporation
Related papers
- Improved Sequential Dependency Analysis Integrating Labeling-Based Sentence Boundary Detection
- Efficient discriminative training of error corrective models using high-WER competitors (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Efficient discriminative training of error corrective models using high-WER competitors
- Speech Recognition Based on Student's t-Distribution Derived from Total Bayesian Framework (Speech Recognition, Statistical Modeling for Speech Processing)
- Selection of Shared-State Hidden Markov Model Structure Using Bayesian Criterion (the 2003 IEICE Excellent Paper Award)
- Efficient Combination of Likelihood Recycling and Batch Calculation for Fast Acoustic Likelihood Calculation
- Production-Oriented Models for Speech Recognition (Speech Recognition, Statistical Modeling for Speech Processing)
- Model Shrinkage for Discriminative Language Models