Machine Learning Approach to Multi-Document Summarization.
Abstract
Due to the rapid growth of the Internet and the emergence of low-price, large-capacity storage devices, the number of online documents is exploding. Automatic summarization is the key to handling this situation. Because manual summarization is costly, we need to be able to automatically summarize a set of documents related to a given event. This paper proposes a method of extracting important sentences from document sets. The method is based on Support Vector Machines, a technology that is attracting attention in the field of natural language processing. We conducted experiments using three document sets formed from twelve events reported in the MAINICHI newspaper in 1999. These sets were manually processed by newspaper editors. Tests using this corpus show that our method performs better than either the Lead-based method or the TF-IDF method. Moreover, we show that reducing redundancy is not always effective for extracting important sentences from a set of multiple documents taken from a single source.
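The abstract names two baselines, the Lead-based method and the TF-IDF method, without defining them. As a rough illustration only, the sketch below shows one common way such baselines are formulated: Lead keeps the opening sentences of each document, and TF-IDF ranks sentences by the summed TF-IDF weight of their words. The function names, the `per_doc` and `k` parameters, the input layout (a document as a list of tokenized sentences), and the exact weighting formula are all assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def lead_baseline(documents, per_doc=1):
    """Lead baseline (assumed form): keep the first `per_doc` sentences of every document."""
    return [sent for doc in documents for sent in doc[:per_doc]]


def tfidf_baseline(documents, k=2):
    """TF-IDF baseline (assumed form): rank sentences by the summed TF-IDF weight of their words."""
    # Each document is a list of sentences; each sentence is a list of tokens.
    doc_tokens = [[tok for sent in doc for tok in sent] for doc in documents]
    n_docs = len(documents)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for toks in doc_tokens for tok in set(toks))
    scored = []
    for d, doc in enumerate(documents):
        tf = Counter(doc_tokens[d])  # term frequency within this document
        for s, sent in enumerate(doc):
            score = sum(tf[tok] * math.log(n_docs / df[tok]) for tok in sent)
            scored.append((score, d, s))
    # Highest score first; ties are broken by document order, then sentence order.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return [documents[d][s] for _, d, s in scored[:k]]
```

A supervised extractor such as the SVM-based method described above would replace these fixed scoring rules with a classifier trained on sentences that human editors marked as important.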
Association for Natural Language Processing | Papers
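- An Efficient Method for Determining Field Association Terms of Compound Words
- Construction of a Paraphrase Corpus Using a Class-Oriented Example Collection Method
- Large-Scale Annotation of Examples for a Verb Argument Structure Dictionary
- Research Trends in Paraphrasing Technology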
- Morpho-Syntactic Rules for Detecting Japanese Term Variation: Establishment and Evaluation