SVM-Based Multi-Document Summarization Integrating Sentence Extraction with Bunsetsu Elimination (Special Issue on Text Processing for Information Access)
Abstract
In this paper, we propose a machine learning-based method of multi-document summarization that integrates sentence extraction with bunsetsu elimination. We employ Support Vector Machines (SVMs) for both modules. To evaluate the effect of bunsetsu elimination, we participated in the multi-document summarization task at TSC-2 with the following two approaches: (1) sentence extraction only, and (2) sentence extraction plus bunsetsu elimination. The results of the subjective evaluation at TSC-2 show that both approaches are superior to the Lead-based method in terms of information coverage. In addition, we made extracts from the given abstracts to quantitatively examine the effectiveness of bunsetsu elimination. The experimental results show that our bunsetsu elimination makes summaries more informative. Moreover, we found that extraction based on SVMs trained on short extracts is better than the Lead-based method, but that extraction based on SVMs trained on long extracts is not.
- Published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2003-09-01
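The two-stage idea in the abstract (pick sentences, then drop chunks inside the picked sentences) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the representation of a sentence as a list of bunsetsu-like chunks, and the plain scoring callables that stand in for trained SVM decision values. It is not the authors' implementation, and `lead_extract` is only one simple reading of a Lead-style baseline.

```python
from typing import Callable, List

# Hypothetical representation: a sentence is a list of bunsetsu-like chunks.
Sentence = List[str]


def lead_extract(sentences: List[Sentence], k: int) -> List[Sentence]:
    """Lead-style baseline (assumed form): keep the first k sentences."""
    return list(sentences[:k])


def score_extract(
    sentences: List[Sentence],
    score: Callable[[Sentence], float],
    k: int,
) -> List[Sentence]:
    """Keep the k highest-scoring sentences, restored to document order.

    `score` stands in for a trained classifier's decision value.
    """
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score(sentences[i]),
        reverse=True,
    )
    keep = sorted(ranked[:k])  # back to original order
    return [sentences[i] for i in keep]


def eliminate_chunks(
    sentence: Sentence,
    keep_chunk: Callable[[str], bool],
) -> Sentence:
    """Drop the chunks that a second (stand-in) classifier rejects."""
    return [chunk for chunk in sentence if keep_chunk(chunk)]


# Toy data and toy scorers, purely illustrative.
doc = [["A", "b"], ["C"], ["D", "e", "f"]]
lead = lead_extract(doc, 2)
extracted = score_extract(doc, lambda s: float(len(s)), 2)
compressed = [eliminate_chunks(s, lambda c: c != "e") for s in extracted]
```

In this sketch, the second stage shortens each extracted sentence, which is the sense in which chunk elimination could free space for more content under a fixed length budget; the actual features and training data used with SVMs are not specified in the abstract.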
Authors
- MAEDA Eisaku, NTT Communication Science Laboratories, NTT Corporation
- SASAKI Yutaka, NTT Communication Science Laboratories, NTT Corporation
- HIRAO Tsutomu, NTT Communication Science Laboratories, NTT Corporation
- TAKEUCHI Kazuhiro, Graduate School of Information Science, Nara Institute of Science and Technology
- ISOZAKI Hideki, NTT Communication Science Laboratories, NTT Corporation
Related papers
- Question Answering as Abduction: A Feasibility Study at NTCIR QAC1 (Special Issue on Text Processing for Information Access)
- SVM-Based Multi-Document Summarization Integrating Sentence Extraction with Bunsetsu Elimination (Special Issue on Text Processing for Information Access)
- On the Necessity of Special Mechanisms for Handling Types in Inductive Logic Programming
- SemiCCA: Efficient Semi-supervised Learning of Canonical Correlations
- Lagrangian Relaxation for Scalable Text Summarization while Maximizing Multiple Objectives
- Named Entity Recognition from Speech Using Discriminative Models and Speech Recognition Confidence
- Effects of Conversational Agents on Activation of Communication in Thought-Evoking Multi-Party Dialogues