Spoken Term Detection Using Phoneme Transition Network from Multiple Speech Recognizers' Outputs
Abstract
Spoken term detection (STD) that addresses the out-of-vocabulary (OOV) problem has attracted significant interest in the field of spoken document processing. This study describes STD with false detection control using phoneme transition networks (PTNs) derived from the outputs of multiple speech recognizers. PTNs are similar to subword-based confusion networks (CNs), which are originally derived from a single speech recognizer. Because a PTN-formed index is built from the outputs of multiple speech recognizers, it is expected to be more robust to recognition errors in an STD task than a CN-formed index built from a single speech recognition system. Our PTN-formed index was evaluated on a test collection. The experiment showed that the PTN-based approach effectively detected OOV terms and improved the F-measure from 0.370 to 0.639 compared with a baseline approach. Furthermore, we applied two false detection control parameters to the calculation of the detection score: one based on a majority voting scheme, and the other a measure of the ambiguity of the CN. Introducing these parameters improved STD performance further, raising the F-measure from 0.639 (without the parameters) to 0.736.
- Paper from the Information and Media Technologies Editorial Board
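The majority-voting idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the recognizer outputs are already aligned to equal length (the alignment step is not shown), and the names `build_ptn`, `detect`, and `min_score`, as well as the scoring formula, are hypothetical choices made for illustration only.

```python
from collections import Counter


def build_ptn(aligned_outputs):
    """Build a toy PTN: one Counter of phoneme votes per aligned position.

    Assumption: all recognizer outputs are pre-aligned to the same length.
    """
    length = len(aligned_outputs[0])
    if any(len(seq) != length for seq in aligned_outputs):
        raise ValueError("outputs must be pre-aligned to the same length")
    return [Counter(seq[i] for seq in aligned_outputs) for i in range(length)]


def detect(ptn, query, n_recognizers, min_score=0.5):
    """Slide the query over the PTN and return (start, score) for each hit.

    The score is the mean fraction of recognizers that voted for each
    matched phoneme (an illustrative majority-voting measure, not the
    paper's formula).
    """
    hits = []
    for start in range(len(ptn) - len(query) + 1):
        votes = [ptn[start + k][ph] for k, ph in enumerate(query)]
        if 0 in votes:
            continue  # a query phoneme is absent from the network here
        score = sum(votes) / (n_recognizers * len(query))
        if score >= min_score:
            hits.append((start, score))
    return hits


# Three hypothetical recognizers; only the second one got "kasa" fully right.
outputs = [
    ["k", "o", "s", "a"],
    ["k", "a", "s", "a"],
    ["k", "a", "t", "a"],
]
ptn = build_ptn(outputs)

true_hits = detect(ptn, ["k", "a", "s", "a"], n_recognizers=3)
# A term that no recognizer produced still matches the merged network,
# so a stricter vote threshold is used to suppress it.
loose_hits = detect(ptn, ["k", "o", "t", "a"], n_recognizers=3)
strict_hits = detect(ptn, ["k", "o", "t", "a"], n_recognizers=3, min_score=0.7)
```

In this sketch the merged network finds the correct term even though two of the three recognizers made an error, while the vote-based threshold rejects a spurious match that the network alone would accept. That trade-off is the motivation for a false detection control parameter.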
Authors
- Nishizaki Hiromitsu
  Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi
- Natori Satoshi
  Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi
- Furuya Yuto
  Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi
- Sekiguchi Yoshihiro
  Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi
Related papers
- Improving Keyword Recognition of Spoken Queries by Combining Multiple Speech Recognizer's Outputs for Speech-driven WEB Retrieval Task(Spoken Language Systems, Corpus-Based Speech Technologies)
- An Unsupervised Speaker Adaptation Method for Lecture-Style Spontaneous Speech Recognition Using Multiple Recognition Systems(Spoken Language Systems, Corpus-Based Speech Technologies)
- Spoken Term Detection Using Phoneme Transition Network from Multiple Speech Recognizers' Outputs