Accent analysis for Mandarin large vocabulary continuous speech recognition (Speech) -- (International Workshop "Asian workshop on speech science and technology")
Abstract
This paper presents our work on accent issues in Mandarin large vocabulary continuous speech recognition. Because Mandarin is spoken across a vast region by a huge population, many accented varieties exist in China, caused mainly by speakers' native dialects. We address two questions: do accents significantly degrade speech recognition, and how can the problem be mitigated? For the first question, we focus on three types of mispronunciation that dominate accented Mandarin and analyze their effect on recognition for each speaker. For the second question, we perform maximum likelihood linear regression (MLLR) adaptation for each speaker and analyze the recognition results. Experimental results show that up to 45% of the accent-related errors are corrected for accented speakers, whereas standard speakers show no such improvement. We conclude that accent is a serious problem in Mandarin speech recognition and that MLLR adaptation is effective in reducing the mismatch caused by accents.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-03-13
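The abstract does not state which MLLR variant the authors used. As a rough illustration of the idea only, the sketch below assumes the simplest case: a single global regression class, diagonal covariances, and a diagonal transform, so each feature dimension gets one scale `a` and one bias `b` with adapted mean `a * mu + b`. The function names and the input layout (`means`, `adapt_means`, `counts`, `variances`) are hypothetical and are not taken from the paper.

```python
def estimate_diagonal_mllr(means, adapt_means, counts, variances):
    """Estimate one (a, b) pair per feature dimension for mu' = a * mu + b.

    means[m][d]       speaker-independent mean of Gaussian m
    adapt_means[m][d] average of the adaptation frames aligned to Gaussian m
    counts[m]         occupation count of Gaussian m on the adaptation data
    variances[m][d]   diagonal variance of Gaussian m

    Assumption: one global transform shared by all Gaussians.
    """
    dims = len(means[0])
    transform = []
    for d in range(dims):
        # Accumulate the weighted sums of the 2x2 normal equations.
        s_w = s_mu = s_mumu = s_x = s_mux = 0.0
        for mu, xbar, n, var in zip(means, adapt_means, counts, variances):
            w = n / var[d]  # frames count more when the variance is small
            s_w += w
            s_mu += w * mu[d]
            s_mumu += w * mu[d] * mu[d]
            s_x += w * xbar[d]
            s_mux += w * mu[d] * xbar[d]
        det = s_w * s_mumu - s_mu * s_mu
        if s_w == 0.0:
            a, b = 1.0, 0.0  # no adaptation data: keep the original means
        elif abs(det) < 1e-12:
            a, b = 1.0, (s_x - s_mu) / s_w  # too little data: bias only
        else:
            a = (s_w * s_mux - s_mu * s_x) / det
            b = (s_mumu * s_x - s_mu * s_mux) / det
        transform.append((a, b))
    return transform


def apply_transform(means, transform):
    """Return the adapted means a * mu + b for every Gaussian."""
    return [[a * mu_d + b for mu_d, (a, b) in zip(mu, transform)]
            for mu in means]
```

Practical MLLR systems usually estimate a full transformation matrix per regression class instead of a diagonal one. The diagonal form is used here only because it keeps the maximum likelihood solution to a closed-form 2x2 system per dimension.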
Authors
-
FURUI Sadaoki
Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Instit
-
YANG Dong
Department of Obstetrics & Gynecology, Second Affiliated Hospital of Sun Yat-Sen University
-
Yang Dong
Department of Computer Science, Tokyo Institute of Technology
-
IWANO Koji
Department of Computer Science, Tokyo Institute of Technology