An Unsupervised Speaker Adaptation Method for Lecture-Style Spontaneous Speech Recognition Using Multiple Recognition Systems (Spoken Language Systems, Special Section: Corpus-Based Speech Technologies)
Abstract
This paper describes an accurate unsupervised speaker adaptation method for lecture-style spontaneous speech recognition that uses multiple LVCSR systems. In unsupervised speaker adaptation, the gain in recognition performance from adapting the acoustic models depends strongly on the accuracy of the labels, such as phonemes and syllables. Extracting the adaptation data under the guidance of a confidence measure is therefore effective. In this paper, we identify high-confidence portions based on the agreement between two LVCSR systems, adapt the acoustic models using those portions and their highly accurate labels, and thereby improve recognition accuracy. Applied to the Corpus of Spontaneous Japanese (CSJ), the method improved the recognition rate by about 2.1% compared with a traditional method.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2005-03-01
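The agreement-based selection described in the abstract can be illustrated with a minimal sketch. It assumes each recognizer outputs a flat list of syllable labels and treats a run of identical labels in both outputs as a high-confidence portion. The function name `agreed_segments`, the `min_len` threshold, and the use of Python's `difflib` for alignment are assumptions made for illustration; the paper's actual alignment and confidence criteria are not given in this record.

```python
from difflib import SequenceMatcher


def agreed_segments(hyp_a, hyp_b, min_len=2):
    """Return (start_index_in_hyp_a, labels) for runs where both hypotheses agree.

    hyp_a, hyp_b: label sequences (e.g. syllables) from two recognizers.
    min_len: hypothetical threshold; shorter agreeing runs are discarded.
    """
    # autojunk=False disables difflib's heuristic that skips very frequent
    # items in long sequences.
    matcher = SequenceMatcher(a=hyp_a, b=hyp_b, autojunk=False)
    segments = []
    for block in matcher.get_matching_blocks():
        # The final block returned by difflib is a zero-length sentinel,
        # which the length check below also filters out.
        if block.size >= min_len:
            segments.append((block.a, hyp_a[block.a:block.a + block.size]))
    return segments


# Toy outputs of two recognizers that disagree on one syllable.
hyp_a = ["ko", "n", "ni", "chi", "wa", "se", "ka", "i"]
hyp_b = ["ko", "n", "ni", "ti", "wa", "se", "ka", "i"]
segments = agreed_segments(hyp_a, hyp_b)
```

In this toy case the disputed syllable is excluded and the two agreeing runs on either side of it would be kept as adaptation data with their labels.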
Authors
-
Nakagawa Seiichi
Department of Information and Computer Sciences, Toyohashi University of Technology
-
Nakagawa S
Toyohashi Univ. Technol. Toyohashi Jpn
-
Nakagawa Seiichi
Department Of Information And Computer Sciences Toyohashi University
-
WATANABE Tomohiro
Department of Gastroenterology and Hepatology, Kyoto University Graduate School of Medicine
-
Utsuro Takehito
Graduate School Of Informatics Kyoto University
-
NISHIZAKI Hiromitsu
Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi
-
Watanabe Tomohiro
Department Of Information And Computer Sciences Toyohashi University Of Technology
-
Nakagawa Seiichi
Department of Computer Science and Engineering, Toyohashi University of Technology
-
WATANABE Tomohiro
Department of Applied Chemistry and Biotechnology, Faculty of Engineering, Chiba University
Related papers
- Topic dependent language model based on on-line voting (Language Understanding and Communication)
- A transitive translation for Indonesian-Japanese CLQA (Natural Language Processing)
- A Machine Learning Approach for an Indonesian-English Cross Language Question Answering System(Natural Language Processing)
- Indonesian-Japanese Transitive Translation using English for CLIR
- Molecular mechanisms of portal vein tolerance
- Scintigraphic study of regenerative nodules due to fulminant hepatic failure
- CD4 T cells monospecific to ovalbumin produced by Escherichia coli can induce colitis upon transfer to BALB/c and SCID mice
- Massive rectal bleeding due to ileal tuberculosis
- A Comparative Study of Output Probability Functions in HMMs
- Topic dependent language model based on on-line voting (Speech)
- Topic dependent language model based on clustering of noun word history
- Word and class dependency of N-gram language model (Spoken Language Information Processing)
- Word and class dependency of N-gram language model (Language Understanding and Communication, 9th Spoken Language Symposium)
- Word and class dependency of N-gram language model (Speech, 9th Spoken Language Symposium)
- Text-Independent/Text-Prompted Speaker Recognition by Combining Speaker-Specific GMM with Speaker Adapted Syllable-Based HMM(Speaker Recognition, Statistical Modeling for Speech Processing)
- Evaluation of Combinational Use of Discriminant Analysis-Based Acoustic Feature Transformation and Discriminative Training
- LVCSR based on context-dependent syllable acoustic models (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Robust distant speech recognition by combining variable-term spectrum based position-dependent CMN with conventional CMN (Speech) -- (International Workshop "Asian workshop on speech science and technology")
- Linear Discriminant Analysis Using a Generalized Mean of Class Covariances and Its Application to Speech Recognition
- Robust Speech Recognition by Combining Short-Term and Long-Term Spectrum Based Position-Dependent CMN with Conventional CMN
- Robust distant speech recognition by combining variable-term spectrum based position-dependent CMN with conventional CMN
- Improving Keyword Recognition of Spoken Queries by Combining Multiple Speech Recognizer's Outputs for Speech-driven WEB Retrieval Task(Spoken Language Systems, Corpus-Based Speech Technologies)
- An Unsupervised Speaker Adaptation Method for Lecture-Style Spontaneous Speech Recognition Using Multiple Recognition Systems(Spoken Language Systems, Corpus-Based Speech Technologies)
- Speaker Change Detection and Speaker Clustering Using VQ Distortion Measure
- Succeeding Word Prediction for Speech Recognition Based on Stochastic Language Model
- Relationship among Recognition Rate, Rejection Rate and False Alarm Rate in a Spoken Word Recognition System
- Distant Speech Recognition Using a Microphone Array Network
- Auditory perception versus automatic estimation of location and orientation of an acoustic source in a real environment
- Continuous Speech Recognition Using an On-Line Speaker Adaptation Method Based on Automatic Speaker Clustering (Special Issue on Speech Information Processing)
- Distant-Talking Speech Recognition Based on Spectral Subtraction by Multi-Channel LMS Algorithm
- A Spoken Dialog System for Spontaneous Conversations Considering Response Timing and Response Type
- Class-Based N-Gram Language Model for New Words Using Out-of-Vocabulary to In-Vocabulary Similarity
- Fabrication and Characterization of Ferroelectric Poly(vinylidene fluoride–tetrafluoroethylene) Gate Field-Effect Transistor Memories
- Spoken Term Detection Using Phoneme Transition Network from Multiple Speech Recognizers' Outputs
- Spontaneous Rupture of Liver Plasmacytoma Mimicking Hepatocellular Carcinoma
- Small Bowel Anisakiasis with Self-limiting Clinical Course
- Odd-Even Effect of Dopant Molecules on Clearing Temperatures of Nematic Liquid-crystal Phases