Regularized Maximum Likelihood Linear Regression Adaptation for Computer-Assisted Language Learning Systems
Abstract
This study focuses on speaker adaptation techniques for Computer-Assisted Language Learning (CALL). We first investigate the effects and problems of Maximum Likelihood Linear Regression (MLLR) speaker adaptation when used in pronunciation evaluation. Automatic scoring and error detection experiments are conducted on two publicly available databases of Japanese learners' English pronunciation. As expected, over-adaptation causes misjudgment of pronunciation accuracy. Following this analysis, we propose a novel method, Regularized Maximum Likelihood Linear Regression (Regularized-MLLR) adaptation, to counter the adverse effects of MLLR adaptation. This method uses data from a group of teachers to regularize learners' transformation matrices so that erroneous pronunciations are not mistakenly transformed into correct ones. We implement this idea in two ways: one uses the average of the teachers' transformation matrices as a constraint on MLLR, and the other represents learners' transformations as linear combinations of the teachers' matrices. Experimental results show that the proposed methods can better utilize MLLR adaptation and avoid over-adaptation.
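The two variants described in the abstract can be illustrated with a simplified least-squares sketch. This is not the formulation used in the paper: it assumes identity covariances and one observation per model mean, so the maximum-likelihood estimate reduces to least squares. The function names, the penalty weight `lam`, and the Frobenius-norm penalty are all assumptions made for illustration.

```python
import numpy as np


def regularized_transform(xi, obs, w_prior, lam):
    """Least-squares transform shrunk toward a prior matrix (hypothetical helper).

    Minimizes sum_i ||obs_i - W xi_i||^2 + lam * ||W - w_prior||_F^2.
    xi:      (n, d+1) extended mean vectors [1, mu]
    obs:     (n, d)   adaptation targets, one per mean (a simplifying assumption)
    w_prior: (d, d+1) e.g. the average of the teachers' matrices
    """
    gram = xi.T @ xi + lam * np.eye(xi.shape[1])  # (d+1, d+1), symmetric
    rhs = obs.T @ xi + lam * w_prior              # (d, d+1)
    # Solve W @ gram = rhs; gram is symmetric, so transpose both sides.
    return np.linalg.solve(gram, rhs.T).T


def combination_transform(xi, obs, teacher_ws):
    """Transform restricted to a linear combination of teacher matrices."""
    # Column k holds teacher k's prediction for every frame, flattened.
    basis = np.stack([(xi @ w.T).ravel() for w in teacher_ws], axis=1)
    weights, *_ = np.linalg.lstsq(basis, obs.ravel(), rcond=None)
    combined = np.tensordot(weights, np.stack(teacher_ws), axes=1)
    return combined, weights
```

In this sketch, `lam = 0` gives the unconstrained least-squares transform, while a large `lam` pulls the learner's matrix toward the teachers' average. The combination variant can only produce transforms that lie in the span of the teachers' matrices.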
Authors
-
LUO Dean
Graduate School of Engineering, The University of Tokyo
-
QIAO Yu
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences and The Chinese University of Hong Kong
-
MINEMATSU Nobuaki
Graduate School of Information Science and Technology, The University of Tokyo
-
HIROSE Keikichi
Graduate School of Information Science and Technology, The University of Tokyo
-
HIROSE Keikichi
Graduate School of Frontier Sciences, The University of Tokyo
-
MINEMATSU Nobuaki
Graduate School of Frontier Sciences, The University of Tokyo
-
QIAO Yu
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences and The Chinese University of Hong Kong
Related papers
- Regularized Maximum Likelihood Linear Regression Adaptation for Computer-Assisted Language Learning Systems
- Speaker Verification in Realistic Noisy Environment in Forensic Science
- Automatic Estimation of Accentual Attribute Values of Words for Accent Sandhi Rules of Japanese Text-to-Speech Conversion (Special Issue on Speech Information Processing)
- Prosodic Analysis and Modeling of Nagauta Singing to Generate Prosodic Contours from Standard Scores (Speech Dynamics by Ear, Eye, Mouth and Machine)
- Applying generation process model constraint to fundamental frequency contours generated by hidden-Markov-model-based speech synthesis