Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions (Spoken Language Systems, <Special Section> Corpus-Based Speech Technologies)
Abstract
Appropriate language modeling is one of the major issues in automatic transcription of spontaneous speech. We propose an adaptation method for statistical language models based on both topic and speaker characteristics. The approach is applied to automatic transcription of meetings and panel discussions, in which multiple participants speak on a given topic, each in their own speaking style. The baseline language model is a mixture of two models, trained on different corpora that cover various topics and various speakers, respectively. Probabilistic latent semantic analysis (PLSA) is then performed on each of these corpora together with the initial ASR result, yielding two sets of unigram probabilities conditioned on the input speech: one reflecting topics and one reflecting speaker characteristics. Finally, the baseline model is adapted by scaling its N-gram probabilities with these unigram probabilities. For speaker adaptation, we use a portion of the Corpus of Spontaneous Japanese (CSJ) in which a large number of speakers gave talks on given topics. Experimental evaluation on real discussions showed that both topic and speaker adaptation reduced test-set perplexity, with an average overall reduction of 8.5%. The proposed adaptation method also improved word accuracy.
- 2005-03-01
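The scaling step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so this assumes the common unigram-rescaling form, in which each baseline N-gram probability is multiplied by the ratio of the adapted unigram to the baseline unigram and then renormalized. The function name `rescale_ngram`, the exponent `beta`, and all probability values below are hypothetical and serve only as illustration.

```python
from typing import Dict, List


def rescale_ngram(base_ngram: Dict[str, float],
                  base_unigram: Dict[str, float],
                  adapted_unigrams: List[Dict[str, float]],
                  beta: float = 1.0) -> Dict[str, float]:
    """Rescale P(w | h) for one fixed history h and renormalize.

    Assumed form (not taken from the paper):
        P'(w | h) is proportional to
        P(w | h) * product over k of (P_k(w) / P_base(w)) ** beta
    where each P_k is an adapted unigram distribution (e.g. topic, speaker).
    """
    scaled = {}
    for word, prob in base_ngram.items():
        factor = 1.0
        for adapted in adapted_unigrams:
            factor *= (adapted[word] / base_unigram[word]) ** beta
        scaled[word] = prob * factor
    total = sum(scaled.values())
    # Renormalize so the adapted distribution sums to one.
    return {word: prob / total for word, prob in scaled.items()}


# Toy three-word vocabulary; every number here is made up.
base_ngram = {"budget": 0.2, "um": 0.3, "the": 0.5}       # P(w | h), baseline
base_unigram = {"budget": 0.1, "um": 0.2, "the": 0.7}     # P(w), baseline
topic_unigram = {"budget": 0.3, "um": 0.1, "the": 0.6}    # stand-in for topic PLSA output
speaker_unigram = {"budget": 0.1, "um": 0.4, "the": 0.5}  # stand-in for speaker PLSA output

adapted = rescale_ngram(base_ngram, base_unigram,
                        [topic_unigram, speaker_unigram])
```

In this toy setup the word "budget" gains probability mass because the topic unigram favors it relative to the baseline, while the result remains a proper distribution after renormalization. In a real system the adapted unigrams would come from PLSA applied to the initial ASR result, and the renormalization would be carried out per history.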
Authors
-
Kawahara Tatsuya
School of Informatics, Kyoto University
-
AKITA Yuya
School of Informatics, Kyoto University
Related papers
- Modeling and automatic detection of English sentence stress for computer-assisted English prosody learning system
- Voice Activity Detection Based on High Order Statistics and Online EM Algorithm
- Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions (Spoken Language Systems, Corpus-Based Speech Technologies)
- Dialogue Speech Recognition by Combining Hierarchical Topic Classification and Language Model Switching (Spoken Language Systems, Corpus-Based Speech Technologies)
- Difference of acoustic modeling for read speech and dialogue speech