A Survey on Automatic Speech Recognition(<特集>Special Issue on the 2000 IEICE Excellent Paper Award)
スポンサーリンク
概要
- 論文の詳細を見る
In this paper, we describe the recent trend in automatic speech recognition. First, we should point out that the current art of speech recognition by machines is admittedly inferior to the ability of human beings. In particular, we assert that the improvement of acoustic models is necessary. Second, we describe robust feature parameters for noisy environments, which are important in practical usage. Then, we indicate that much training data in the same environment as the recognition stage are useful from the viewpoints of information theory and pattern recognition. Third, we discuss acoustic models and language models which are central issues in speech recognition techniques. Then the principle and limitations of the hidden Markov model (HMM) and recent extended models are discussed. The role of language models is to eliminate improbable candidate words, that is, to reduce the search space. In other words, language models having smaller entropy are preferable. From this standpoint, we survey stochastic language models. Finally, we state some points which deserve attention when constructing speech recognition systems.
- 社団法人電子情報通信学会の論文
- 2002-03-01
著者
-
Nakagawa Seiichi
Toyohashi Univ. Technol. Toyohashi‐shi Jpn
-
Nakagawa Seiichi
Toyohashi Univ. Of Technol. Toyohashi‐shi Jpn
-
Nakagawa Seiichi
Toyohashi University Of Technology
関連論文
- Topic dependent language model based on on-line voting (言語理解とコミュニケーション)
- A transitive translation for Indonesian-Japanese CLQA (自然言語処理)
- A Machine Learning Approach for an Indonesian-English Cross Language Question Answering System(Natural Language Processing)
- Indonesian-Japanese Transitive Translation using English for CLIR
- Auditory perception versus automatic estimation of location and orientation of an acoustic source in a real environment
- Topic dependent language model based on on-line voting (音声)
- Topic dependent language model based on clustering of noun word history
- Word and class dependency of N-gram language model (音声言語情報処理)
- Word and class dependency of N-gram language model (言語理解とコミュニケーション・第9回音声言語シンポジウム)
- Word and class dependency of N-gram language model (音声・第9回音声言語シンポジウム)
- TEXT-INDEPENDENT SPEAKER IDENTIFICATION ON TIMIT DATABASE
- Text-Independent/Text-Prompted Speaker Recognition by Combining Speaker-Specific GMM with Speaker Adapted Syllable-Based HMM(Speaker Recognition, Statistical Modeling for Speech Processing)
- Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training
- LVCSR based on context-dependent syllable acoustic models (Speech) -- (国際ワークショップ"Asian workshop on speech science and technology")
- Robust distant speech recognition by combining variable-term spectrum based position-dependent CMN with conventional CMN (Speech) -- (国際ワークショップ"Asian workshop on speech science and technology")
- Linear Discriminant Analysis Using a Generalized Mean of Class Covariances and Its Application to Speech Recognition
- Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN
- LVCSR based on context-dependent syllable acoustic models
- Robust distant speech recognition by combining variable-term spectrum based position-dependent CMN with conventional CMN
- Noisy Speech Recognition Based on Integration/Selection of Multiple Noise Suppression Methods Using Noise GMMs
- A Survey on Automatic Speech Recognition(Special Issue on the 2000 IEICE Excellent Paper Award)
- Speaker Recognition by Combining MFCC and Phase Information in Noisy Conditions
- Distant Speech Recognition Using a Microphone Array Network
- Auditory perception versus automatic estimation of location and orientation of an acoustic source in a real environment
- Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm
- Response Timing Detection Using Prosodic and Linguistic Information for Human-friendly Spoken Dialog Systems
- INVESTIGATIONS ON TEXT-INDEPENDENT SPEAKER IDENTIFICATION
- Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription
- Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription