Text Segmentation with Multiple Surface Linguistic Cues
Abstract
In general, a text consists of multiple sentences that are semantically related to one another. A certain range of sentences in a text is widely assumed to form a coherent unit, usually called a discourse segment. Just as the sentences within a segment are semantically related, the segments of a discourse are also related to each other. The global discourse structure of a text can be constructed by relating the segments to each other. Identifying the segment boundaries is therefore a first step toward recognizing the structure of a text. Many surface linguistic cues help in identifying segment boundaries in a text. In this paper, we describe a method for identifying segment boundaries of a Japanese text with the aid of multiple surface linguistic cues, though our experiments might be small-scale. We calculate a weighted sum of the scores for all cues, which reflects each cue's contribution to identifying the correct segment boundaries. We also present a method of training the weights for multiple linguistic cues automatically while avoiding the overfitting problem.
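The weighted-sum scoring mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the cue names, the weight values, the threshold rule, and the function names `boundary_score` and `pick_boundaries` are all hypothetical, chosen only to show how per-cue scores at a candidate boundary might be combined.

```python
from typing import Dict, List

def boundary_score(cue_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of cue scores for one candidate boundary.

    Cues without a weight contribute nothing (weight 0.0).
    """
    return sum(weights.get(name, 0.0) * value for name, value in cue_scores.items())

def pick_boundaries(
    candidates: List[Dict[str, float]],
    weights: Dict[str, float],
    threshold: float,
) -> List[int]:
    """Return the indices of candidates whose weighted score exceeds the threshold.

    The threshold rule is an assumption for illustration; the abstract does not
    say how boundaries are selected from the scores.
    """
    return [
        i for i, cues in enumerate(candidates)
        if boundary_score(cues, weights) > threshold
    ]

# Hypothetical cue scores at three sentence gaps (candidate boundaries).
candidates = [
    {"conjunction": 1.0, "topic_marker": 0.0},
    {"conjunction": 0.0, "topic_marker": 1.0, "lexical_chain_end": 1.0},
    {"conjunction": 0.0},
]

# Hypothetical weights; in the paper these are trained automatically.
weights = {"conjunction": 0.2, "topic_marker": 0.5, "lexical_chain_end": 0.4}

selected = pick_boundaries(candidates, weights, threshold=0.5)
print(selected)
```

In this sketch a cue with a larger weight has more influence on whether a gap is selected as a segment boundary, which is the role the abstract assigns to the trained weights.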
- Papers from the Association for Natural Language Processing (言語処理学会)
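- An Efficient Method for Determining Field-Associated Compound Words (複合語の分野連想語の効率的決定法)
- Building a Paraphrase Corpus with a Class-Oriented Example Collection Method (クラス指向事例収集手法による言い換えコーパスの構築)
- Large-Scale Annotation of Examples in a Verb Argument Structure Dictionary (動詞項構造辞書への大規模用例付与)
- Research Trends in Paraphrasing Technology (言い換え技術に関する研究動向)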
- Morpho-Syntactic Rules for Detecting Japanese Term Variation: Establishment and Evaluation