Utterance Verification Using State-Level Log-Likelihood Ratio with Frame and State Selection
Abstract
This paper proposes an utterance verification system that uses a state-level log-likelihood ratio with frame and state selection. We use hidden Markov models (HMMs) as the acoustic models and anti-phone models for speech recognition and utterance verification. Each HMM has three states, and each state represents different characteristics of a phone. We therefore propose an algorithm that computes a state-level log-likelihood ratio and weights the states to obtain a more reliable confidence measure for recognized phones. Additionally, we propose a frame selection algorithm that computes the confidence measure only on those frames of the input that contain proper speech. In general, the phone segmentation information obtained from a speaker-independent speech recognition system is not accurate, because triphone-based acoustic models are difficult to train effectively enough to cover diverse pronunciations and coarticulation effects. Finding correctly matched states is even more difficult when obtaining state segmentation information, so we also propose a state selection algorithm for finding valid states. The proposed method, using the state-level log-likelihood ratio with frame and state selection, achieves a relative reduction in equal error rate of 18.1% compared to the baseline system that uses simple phone-level log-likelihood ratios.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2010-03-01
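The abstract does not give the exact formulas, so the following is only a minimal sketch of how a weighted state-level log-likelihood ratio with frame and state selection could be computed for one recognized phone. Everything in it is an assumption made for illustration: the function name `state_level_llr`, the per-state averaging, the boolean `frame_mask` standing in for frame selection, and the `min_frames` rule standing in for state selection are hypothetical and are not taken from the paper.

```python
import numpy as np


def state_level_llr(target_ll, anti_ll, state_ids, frame_mask=None,
                    state_weights=(1 / 3, 1 / 3, 1 / 3), min_frames=1):
    """Illustrative confidence score for one recognized phone.

    target_ll, anti_ll: per-frame log-likelihoods under the phone model
        and under the anti-phone model (hypothetical inputs).
    state_ids: per-frame HMM state index (0, 1, 2) from the alignment.
    frame_mask: True for frames kept by frame selection (assumed to be
        supplied by some speech/non-speech decision).
    state_weights: one weight per HMM state (assumed values).
    min_frames: a state with fewer kept frames is treated as invalid.
    """
    target_ll = np.asarray(target_ll, dtype=float)
    anti_ll = np.asarray(anti_ll, dtype=float)
    state_ids = np.asarray(state_ids)
    if frame_mask is None:
        frame_mask = np.ones(len(target_ll), dtype=bool)
    frame_mask = np.asarray(frame_mask, dtype=bool)

    # Frame-level log-likelihood ratio: phone model vs. anti-phone model.
    frame_llr = target_ll - anti_ll

    score, weight_sum = 0.0, 0.0
    for state, weight in enumerate(state_weights):
        keep = (state_ids == state) & frame_mask
        if keep.sum() < min_frames:
            continue  # state selection: skip states without enough frames
        score += weight * frame_llr[keep].mean()
        weight_sum += weight

    if weight_sum == 0.0:
        return None  # no valid state, so no confidence can be computed
    # Renormalize so that skipped states do not shrink the score.
    return score / weight_sum
```

In a verification system, a score like this would typically be compared against a threshold to accept or reject the recognized phone or utterance. Renormalizing by the weights of the surviving states is a design choice of this sketch that keeps scores comparable when some states are dropped.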
Authors
-
Kwon Suk-bong
Korea Advanced Institute of Science and Technology (KAIST)
-
Kim Hoirin
Korea Advanced Institute of Science and Technology (KAIST)
-
Kim Hoirin
Korea Advanced Institute of Science and Technology, Daejeon, Korea
Related Papers
- Utterance Verification Using State-Level Log-Likelihood Ratio with Frame and State Selection
- Noise Robust Speaker Identification Using Sub-Band Weighting in Multi-Band Approach (Speech and Hearing)
- Text-Independent Speaker Identification in a Distant-Talking Multi-Microphone Environment (Speech and Hearing)
- Response Time Reduction of Speech Recognizers Using Single Gaussians (Speech and Hearing)