Grammar Acquisition and Statistical Parsing by Exploiting Local Contextual Information
Abstract
This paper presents a method for inducing a context-sensitive conditional probability context-free grammar from an unlabeled bracketed corpus using local contextual information, and describes a natural language parsing model that uses a probability-based scoring function of the grammar to rank the parses of a sentence. The method uses clustering techniques to group the brackets in a corpus into a number of similar bracket groups based on their local contextual information. From these groups, the corpus is automatically labeled with nonterminal labels, and a grammar with conditional probabilities is thereby acquired. Based on these conditional probabilities, the statistical parsing model provides a framework for finding the most likely parse of a sentence. Experiments were conducted on the EDR corpus and the Wall Street Journal corpus. The results show that the approach achieves relatively high accuracy: 88% recall, 72% precision and 0.7 crossing brackets per sentence for sentences shorter than 10 words, and 71% recall, 51% precision and 3.4 crossing brackets for sentences of 10 to 19 words. These results support the assumption that local contextual statistics obtained from an unlabeled bracketed corpus are effective for learning a useful grammar and for parsing.
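The reported figures (bracket recall, bracket precision, crossing brackets) can be illustrated with a small sketch. The abstract does not state the exact scoring conventions used in the paper, so the code below assumes the common unlabeled bracket-matching definitions: a bracket is a (start, end) word span, a predicted bracket is correct if the same span appears in the reference parse, and a predicted bracket "crosses" if it partially overlaps a reference bracket. The function names and the toy spans are hypothetical and are not taken from the paper.

```python
from typing import Set, Tuple

# A bracket is an unlabeled word span: (start, end), end exclusive.
Span = Tuple[int, int]


def crosses(a: Span, b: Span) -> bool:
    """Return True if the spans overlap but neither contains the other."""
    return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]


def bracket_scores(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float, int]:
    """Compute (recall, precision, crossing-bracket count) for one sentence.

    Assumed definitions (not confirmed by the abstract):
      recall    = matched brackets / brackets in the reference parse
      precision = matched brackets / brackets in the predicted parse
      crossing  = predicted brackets that cross at least one reference bracket
    """
    matched = len(predicted & gold)
    recall = matched / len(gold) if gold else 0.0
    precision = matched / len(predicted) if predicted else 0.0
    crossing = sum(1 for p in predicted if any(crosses(p, g) for g in gold))
    return recall, precision, crossing


# Toy five-word sentence; the spans are made up for illustration.
gold = {(0, 5), (0, 2), (2, 5), (3, 5)}
predicted = {(0, 5), (0, 2), (1, 3), (3, 5)}

recall, precision, crossing = bracket_scores(predicted, gold)
print(recall, precision, crossing)
```

In this toy case three of the four predicted spans match the reference, and the span (1, 3) crosses the reference span (0, 2). Corpus-level figures such as those quoted above would be obtained by aggregating over all test sentences; how the paper aggregates them is not stated in the abstract.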
- Papers from the Association for Natural Language Processing (言語処理学会)
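- 複合語の分野連想語の効率的決定法 (An Efficient Method for Determining Field-Association Words for Compound Words)
- クラス指向事例収集手法による言い換えコーパスの構築 (Building a Paraphrase Corpus with a Class-Oriented Example Collection Method)
- 動詞項構造辞書への大規模用例付与 (Large-Scale Addition of Usage Examples to a Verb Argument Structure Dictionary)
- 言い換え技術に関する研究動向 (Research Trends in Paraphrasing Technology)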
- Morpho-Syntactic Rules for Detecting Japanese Term Variation: Establishment and Evaluation