A Novel Discriminative Method for Pronunciation Quality Assessment
Abstract
In this paper, we present a novel method for automatic pronunciation quality assessment. Unlike the popular "Goodness of Pronunciation" (GOP) method, this method does not map decoding confidence to a pronunciation quality score; instead, it directly discriminates between utterances of different pronunciation quality. The student's utterance is decoded twice. The first pass obtains the time boundaries of each phone in the utterance by forced alignment with a conventionally trained acoustic model (AM). The second pass determines the pronunciation quality of each triphone using a specially trained AM in which triphones of different pronunciation quality are modeled as distinct units. This model is trained discriminatively so that it best separates triphones that share the same name but have different pronunciation quality scores. Because the decoding network in the second pass includes the triphones of every pronunciation quality, phone-level scores can be read directly from the decoding result. The phone-level scores are then combined into sentence-level scores using the maximum entropy criterion. Experimental results show that scoring performance improves significantly over the GOP method, especially at the sentence level.
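The second decoding pass can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the first pass has already produced fixed phone segments, that quality-tagged units are named with a hypothetical `#level` suffix, and that a hypothetical `loglik` callable stands in for the specially trained AM. It also picks the best variant for each segment independently, whereas a real decoder would search a decoding network; the sentence-level combination step is not covered.

```python
from typing import Callable, Dict, List, Tuple

# (triphone name, start time in seconds, end time in seconds), as a forced
# alignment from the first pass might provide. The layout is an assumption.
Segment = Tuple[str, float, float]


def expand_units(triphone: str, levels: List[int]) -> List[str]:
    """Return the quality-tagged variants of one triphone.

    The '#level' naming is hypothetical, chosen only for this sketch.
    """
    return [f"{triphone}#{q}" for q in levels]


def score_phones(
    segments: List[Segment],
    levels: List[int],
    loglik: Callable[[str, float, float], float],
) -> List[Tuple[str, int]]:
    """For each aligned segment, keep the quality level whose tagged unit
    scores highest under the (stand-in) acoustic model."""
    scores = []
    for name, start, end in segments:
        best_unit = max(
            expand_units(name, levels),
            key=lambda unit: loglik(unit, start, end),
        )
        scores.append((name, int(best_unit.rsplit("#", 1)[1])))
    return scores


# Toy stand-in for the acoustic model: made-up log-likelihoods per unit.
TOY_SCORES: Dict[str, float] = {
    "k-ae+t#1": -42.0, "k-ae+t#2": -37.5, "k-ae+t#3": -31.0,
    "ae-t+sil#1": -20.0, "ae-t+sil#2": -26.0, "ae-t+sil#3": -33.0,
}


def toy_loglik(unit: str, start: float, end: float) -> float:
    # Ignores the time span; a real model would score the audio frames in it.
    return TOY_SCORES.get(unit, float("-inf"))


segments = [("k-ae+t", 0.00, 0.10), ("ae-t+sil", 0.10, 0.25)]
result = score_phones(segments, levels=[1, 2, 3], loglik=toy_loglik)
print(result)
```

With the made-up scores above, the first segment is assigned level 3 and the second level 1, so the phone-level scores come straight from which tagged unit wins in each segment.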
Authors
- ZHAO Qingwei, Key Laboratory of Speech Acoustics and Content Understanding, Chinese Academy of Sciences
- YAN Yonghong, Key Laboratory of Speech Acoustics and Content Understanding
- PAN Fuping, Key Laboratory of Speech Acoustics and Content Understanding
- ZHANG Junbo, Key Laboratory of Speech Acoustics and Content Understanding
- DONG Bin, Key Laboratory of Speech Acoustics and Content Understanding
Related papers
- Factor Analysis of Neighborhood-Preserving Embedding for Speaker Verification
- Logarithmic Adaptive Quantization Projection for Audio Watermarking
- Noise Robust Feature Scheme for Automatic Speech Recognition Based on Auditory Perceptual Mechanisms
- A Forced Alignment Based Approach for English Passage Reading Assessment
- A Novel Discriminative Method for Pronunciation Quality Assessment
- Discriminative Approach to Build Hybrid Vocabulary for Conversational Telephone Speech Recognition of Agglutinative Languages
- Smoothing Method for Improved Minimum Phone Error Linear Regression