Abstract
This paper describes a system for extracting named entities. The system is based on a maximum entropy (ME) model and transformation rules. IREX-NE defines eight types of named entities; each named entity either consists of one or more morphemes or includes a substring of a morpheme. We define 40 named entity labels, which mark the beginning, the middle, or the end of a named entity, and we extract named entities that consist of one or more morphemes by estimating these labels with the ME model. The trained ME model captures the relationship between features and the named entity labels assigned to morphemes. The features are the clues used for estimating labels. As features, we use the lexical item and part-of-speech of the target morpheme, together with the lexical items and parts-of-speech of four surrounding morphemes: two to the left and two to the right of the target. After estimating the named entity labels with the ME model, we use transformation rules to extract named entities that include a substring of a morpheme. These rules are acquired automatically by examining the differences between the named entity labels in a tagged corpus and the labels our system extracts from the same corpus with the tags removed. Through several comparative experiments, this paper also evaluates how accuracy relates to the transformation rules, to the features, and to the amount of training data.
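The feature window described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function name `window_features`, the `<PAD>` placeholder for positions outside the sentence, the feature key format, and the part-of-speech tags in the usage example are all assumptions made for illustration. It only shows how lexical and part-of-speech features might be gathered from the target morpheme and the two morphemes on each side before being passed to an ME classifier.

```python
from typing import Dict, List, Tuple

# A morpheme is represented here as (lexical item, part-of-speech).
# This representation is an assumption for the sketch.
Morpheme = Tuple[str, str]

# Hypothetical placeholder used when the window extends past the sentence.
PAD: Morpheme = ("<PAD>", "<PAD>")


def window_features(morphemes: List[Morpheme], i: int, width: int = 2) -> Dict[str, str]:
    """Collect lexical and POS features for the morpheme at index i
    and for `width` morphemes on each side of it."""
    feats: Dict[str, str] = {}
    for offset in range(-width, width + 1):
        j = i + offset
        lex, pos = morphemes[j] if 0 <= j < len(morphemes) else PAD
        # Keys look like "lex[-2]", "pos[+0]", "lex[+1]" (hypothetical format).
        feats[f"lex[{offset:+d}]"] = lex
        feats[f"pos[{offset:+d}]"] = pos
    return feats


# Usage: features for the first morpheme of a three-morpheme sentence.
sentence = [("東京", "noun"), ("大学", "noun"), ("に", "particle")]
feats = window_features(sentence, 0)
```

With `width=2` each morpheme yields ten features (five positions, two feature kinds each). A label classifier would then score each candidate named entity label given this feature dictionary.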
- Papers from the Association for Natural Language Processing (言語処理学会)
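- An Efficient Method for Determining Field-Associated Terms of Compound Words
- Building a Paraphrase Corpus with a Class-Oriented Example Collection Method
- Large-Scale Assignment of Usage Examples to a Verb Argument Structure Dictionary
- Research Trends in Paraphrasing Technology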
- Morpho-Syntactic Rules for Detecting Japanese Term Variation: Establishment and Evaluation