Construction of a Test Collection for Spoken Document Retrieval from Lecture Audio Data
Abstract
Lectures are one of the most valuable genres of audiovisual data. Although spoken document processing is a promising technology for making use of lectures in various ways, it is difficult to evaluate because evaluation requires subjective judgment and/or the verification of large quantities of evaluation data. This paper reports a test collection for the evaluation of spoken lecture retrieval. The test collection consists of the target spoken documents of about 2,700 lectures (604 hours) taken from the Corpus of Spontaneous Japanese (CSJ), 39 retrieval queries, the relevant passages in the target documents for each query, and the automatic transcription of the target speech data. This paper also reports the retrieval performance obtained on the constructed test collection with a standard spoken document retrieval (SDR) method, which serves as a baseline for forthcoming SDR studies that use the test collection.
- Paper published by the Information Processing Society of Japan
- 2009-02-15
Authors
- Tatsuya Kawahara, Kyoto University
- Kiyoaki Aikawa, Tokyo University of Technology
- Yoichi Yamashita, Ritsumeikan University
- Katunobu Itou, Hosei University
- Tomoyosi Akiba, Toyohashi University of Technology
- Yoshiaki Itoh, Iwate Prefectural University
- Hiroaki Nanjo, Ryukoku University
- Hiromitsu Nishizaki, University of Yamanashi
- Norihito Yasuda, Nippon Telegraph and Telephone Corporation
Related papers
- Construction of a Test Collection for Spoken Document Retrieval from Lecture Audio Data
- Joint Phrase Alignment and Extraction for Statistical Machine Translation
- Comparison of Discriminative Models for Lexicon Optimization for ASR of Agglutinative Language
- Partial and Synchronized Caption Generation to Enhance the Listening Comprehension Skills of Second Language Learners
- Classifier-based Data Selection for Lightly-Supervised Training of Acoustic Model for Lecture Transcription