A Framework of Integrating Syntactic and Lexical Statistics in Statistical Parsing
Abstract
In this paper, we propose a new framework for statistical language modeling that integrates syntactic statistics and lexical statistics. Our model consists of two submodels, the syntactic model and the lexical model. The syntactic model reflects syntactic statistics, such as structural preferences, whereas the lexical model reflects lexical statistics, such as the occurrence of each word and word collocations. One characteristic of our model is that it learns the two types of statistics separately, whereas many previous models learn them simultaneously. Learning each submodel separately allows us to use a different language source for each submodel and makes each submodel's behavior much easier to understand. We conducted a preliminary experiment in which our model was applied to the disambiguation of dependency structures of Japanese sentences. The syntactic model achieved 73.38% in Bunsetu phrase accuracy, which is 11.70 points above the baseline, and incorporating the lexical model with the syntactic model yielded a further gain of 10.96 points, to 84.34%. Thus the contribution of lexical statistics to disambiguation is as great as that of syntactic statistics in our framework.
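As a rough illustration of how two separately trained submodels could be combined for disambiguation, the sketch below ranks candidate analyses by the sum of a syntactic log-score and a lexical log-score. The abstract does not state the combination rule, so the additive log-score (a product of probabilities) is an assumption, and every name and number here (`pick_best`, `SYNTACTIC_LOGPROB`, `LEXICAL_LOGPROB`, the candidate labels "A" and "B") is hypothetical toy data, not taken from the paper.

```python
import math
from typing import Callable, Optional, Sequence

# Toy scores for two hypothetical candidate dependency structures, "A" and "B".
# In a real system each table would be replaced by a submodel trained
# separately, possibly on a different language source (assumption).
SYNTACTIC_LOGPROB = {"A": math.log(0.6), "B": math.log(0.4)}
LEXICAL_LOGPROB = {"A": math.log(0.2), "B": math.log(0.7)}


def pick_best(
    candidates: Sequence[str],
    syntactic: Callable[[str], float],
    lexical: Optional[Callable[[str], float]] = None,
) -> str:
    """Return the candidate with the highest combined log-score.

    If no lexical scorer is given, only the syntactic score is used.
    Summing log-scores is an assumed combination rule for illustration.
    """
    def score(candidate: str) -> float:
        total = syntactic(candidate)
        if lexical is not None:
            total += lexical(candidate)
        return total

    return max(candidates, key=score)


candidates = ["A", "B"]

# Syntactic statistics alone prefer "A" (0.6 > 0.4).
best_syntax_only = pick_best(candidates, SYNTACTIC_LOGPROB.__getitem__)

# Adding lexical statistics flips the choice to "B" (0.4 * 0.7 > 0.6 * 0.2).
best_combined = pick_best(
    candidates, SYNTACTIC_LOGPROB.__getitem__, LEXICAL_LOGPROB.__getitem__
)
```

Because each scorer is passed in as an independent function, either submodel can be retrained or inspected on its own, which mirrors the separation the abstract describes.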
- Papers from the Association for Natural Language Processing (言語処理学会)
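- An Efficient Method for Determining Field Association Terms of Compound Words
- Constructing a Paraphrase Corpus Using a Class-Oriented Example Collection Method
- Attaching Large-Scale Examples to a Verb Argument Structure Dictionary
- Research Trends in Paraphrasing Technology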
- Morpho-Syntactic Rules for Detecting Japanese Term Variation: Establishment and Evaluation