A Hybrid Acoustic and Pronunciation Model Adaptation Approach for Non-native Speech Recognition
Abstract
In this paper, we propose a hybrid model adaptation approach in which pronunciation and acoustic models are adapted by incorporating the pronunciation and acoustic variabilities of non-native speech, in order to improve the performance of non-native automatic speech recognition (ASR). Specifically, the proposed hybrid model adaptation can be performed at either the state-tying level or the triphone-modeling level, depending on the level at which acoustic model adaptation is performed. In both methods, we first analyze the pronunciation variant rules of non-native speakers and then classify each rule as either a pronunciation variant or an acoustic variant. The state-tying level hybrid method then adapts the pronunciation models by accommodating the pronunciation variants in the pronunciation dictionary, and adapts the acoustic models by clustering the states of the triphone acoustic models using the acoustic variants. The triphone-modeling level hybrid method, on the other hand, adapts the pronunciation models in the same way as the state-tying level hybrid method; for acoustic model adaptation, however, the triphone acoustic models are re-estimated based on the adapted pronunciation models, and the states of the re-estimated triphone acoustic models are then clustered using the acoustic variants. Speech recognition experiments on English spoken by Korean speakers show that ASR systems employing the state-tying level and triphone-modeling level adaptation methods achieve relative reductions in average word error rate (WER) of 17.1% and 22.1%, respectively, for non-native speech, compared to a baseline ASR system.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2010-09-01
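The abstract reports its gains as relative reductions in WER rather than absolute ones. The following is a minimal sketch of how WER and a relative reduction are commonly computed. It is not taken from the paper: the function names `word_error_rate` and `relative_reduction` are hypothetical, and the abstract states neither the baseline nor the adapted absolute WERs, so any numbers passed in are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Computed as the word-level Levenshtein distance between the reference
    transcript and the recognizer output, divided by the reference length.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Fraction of the baseline WER removed by the adapted system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer
```

Under this definition, a relative reduction of 17.1% means the adapted system's WER is 82.9% of the baseline WER, whatever the baseline happens to be; it does not mean the WER dropped by 17.1 percentage points.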
Authors
-
KIM Hong
School of Information and Communications, Gwangju Institute of Science and Technology (GIST)
-
Oh Yoo
School of Information and Communications, Gwangju Institute of Science and Technology (GIST)
-
Kim Hong
School of Electronic and Information Engineering, Cheongju University, 36 Naedok-Dong Sangdang-Gu, Cheongju, Chungbuk 360-764, Korea
Related papers
- Correlation between organic chemical reaction and chemical shift in carbon-doped silicon oxide film (Electron devices: 15th Asia-Pacific Workshop on Fundamentals and Applications of Advanced Semiconductor Devices (AWAD2007))
- Correlation between organic chemical reaction and chemical shift in carbon-doped silicon oxide film (Silicon devices and materials: 15th Asia-Pacific Workshop on Fundamentals and Applications of Advanced Semiconductor Devices (AWAD2007))
- A Hybrid Acoustic and Pronunciation Model Adaptation Approach for Non-native Speech Recognition
- A Statistical Approach to Error Compensation in Spectral Quantization(Speech and Hearing)
- Bandwidth-Scalable Stereo Audio Coding Based on a Layered Structure
- A205 NUMERICAL STUDY ON TRIBRACHIAL FLAME PROPAGATION IN A 2-D MIXING LAYER(Laminar flame-1)
- Phonetically Balanced Text Corpus Design Using a Similarity Measure for a Stereo Super-Wideband Speech Database
- Correlation of Grain Size of Pentacene-Deposited Surface and Carbon Content Analyzed by X-ray Photoelectron Spectroscopy