Speech Recognition Using Function-Word N-Grams and Content-Word N-Grams
Abstract
This paper proposes a new stochastic language model for speech recognition based on function-word N-grams and content-word N-grams. Conventional word N-gram models are effective for speech recognition, but they represent only local constraints within a few successive words and cannot capture global syntactic or semantic relationships between words. To represent more global constraints, the proposed language model gives the N-gram probabilities of word sequences with attention given only to function words or only to content words. The sequences of function words and of content words are expected to represent syntactic and semantic constraints, respectively. Probabilities of function-word bigrams and content-word bigrams were estimated from a 10,000-sentence text database, and analysis using an information-theoretic measure showed that the expected constraints were extracted appropriately. As an application of this model to speech recognition, a post-processor was constructed to select the optimum sentence candidate from a phrase lattice obtained by a phrase recognition system. The phrase candidate sequence with the highest total acoustic and linguistic score was sought by dynamic programming. Experiments on the utterances of 12 speakers showed that the proposed method is more accurate than a CFG-based method, demonstrating its effectiveness in improving speech recognition performance.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 1995-06-25
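The two steps described in the abstract, estimating separate function-word and content-word bigrams and then searching candidate phrases by dynamic programming, can be illustrated with a small Python sketch. Everything concrete here is an assumption made for illustration: the English toy corpus, the `FUNCTION_WORDS` list, add-one smoothing, the plain sum of the two log-probabilities with a hypothetical `lm_weight`, and the simplified lattice (a fixed sequence of slots, each holding alternative phrases with an acoustic log-score). The abstract does not specify the word classes, smoothing, score combination, or lattice structure used in the paper.

```python
import math
from collections import Counter

# Toy English function-word list (an assumption for illustration only).
FUNCTION_WORDS = {"the", "a", "of", "to", "in", "is", "on", "and"}

def split_streams(words):
    """Split a sentence into its function-word and content-word subsequences."""
    func = [w for w in words if w in FUNCTION_WORDS]
    cont = [w for w in words if w not in FUNCTION_WORDS]
    return func, cont

def train_bigrams(sequences):
    """Count bigrams, with boundary markers, over an iterable of word lists."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab

def bigram_logp(prev, cur, model):
    """Add-one smoothed log P(cur | prev); smoothing choice is an assumption."""
    bigrams, contexts, vocab = model
    v = len(vocab) + 1  # one extra slot reserves mass for unseen words
    return math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + v))

def best_sentence(slots, func_model, cont_model, lm_weight=1.0):
    """Dynamic programming over slots of (phrase_words, acoustic_log_score).

    The DP state is (last function word, last content word), which is all
    that the two bigram models need to score the next phrase.
    """
    states = {("<s>", "<s>"): (0.0, [])}
    for candidates in slots:
        nxt = {}
        for (last_f, last_c), (score, hist) in states.items():
            for words, acoustic in candidates:
                s, f, c = score + acoustic, last_f, last_c
                for w in words:
                    if w in FUNCTION_WORDS:
                        s += lm_weight * bigram_logp(f, w, func_model)
                        f = w
                    else:
                        s += lm_weight * bigram_logp(c, w, cont_model)
                        c = w
                if (f, c) not in nxt or s > nxt[(f, c)][0]:
                    nxt[(f, c)] = (s, hist + list(words))
        states = nxt
    # Close both word streams with the end-of-sentence marker.
    finals = [
        (s + lm_weight * (bigram_logp(f, "</s>", func_model)
                          + bigram_logp(c, "</s>", cont_model)), hist)
        for (f, c), (s, hist) in states.items()
    ]
    return max(finals, key=lambda item: item[0])

# Toy training text standing in for the text database.
corpus = [s.split() for s in (
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat is on the mat",
)]
streams = [split_streams(s) for s in corpus]
func_model = train_bigrams(f for f, _ in streams)
cont_model = train_bigrams(c for _, c in streams)

# Each slot lists competing phrase candidates with acoustic log-scores.
slots = [
    [(["the", "cat"], -1.0), (["the", "cap"], -0.9)],
    [(["sat"], -1.0), (["sad"], -1.0)],
    [(["on", "the", "mat"], -1.0), (["in", "the", "mat"], -1.0)],
]
score, words = best_sentence(slots, func_model, cont_model)
print(" ".join(words), round(score, 3))
```

In this toy setup the content-word stream links "cat", "sat" and "mat" even though function words sit between them, which is the kind of longer-range constraint the abstract attributes to the model. Keeping only the best hypothesis per (last function word, last content word) state is what makes the search a dynamic program rather than an exhaustive enumeration.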
Authors
-
Sagayama Shigeki
NTT Human Interface Laboratories
-
Matsunaga S
NTT Corp., Yokosuka-shi, Japan
-
Matsunaga Shoichi
ATR Interpreting Telecommunications Research Laboratories
-
Isotani Ryosuke
ATR Interpreting Telecommunications Research Laboratories
Related papers
- Spoken Sentence Recognition Based on HMM-LR with Hybrid Language Modeling (Special Issue on Natural Language Processing and Understanding)
- LR Parsing with a Category Reachability Test Applied to Speech Recognition (Special Issue on Speech and Discourse Processing in Dialogue Systems)
- Speaker-Consistent Parsing for Speaker-Independent Continuous Speech Recognition
- Automatic Determination of the Number of Mixture Components for Continuous HMMs Based on a Uniform Variance Criterion
- Unsupervised Speaker Adaptation Using All-Phoneme Ergodic Hidden Markov Network
- Speech Recognition Using Function-Word N-Grams and Content-Word N-Grams
- Discriminative Training Based on Minimum Classification Error for a Small Amount of Data Enhanced by Vector-Field-Smoothed Bayesian Learning