Untitled

Abstract

In this paper, we propose a method for acquiring word order from corpora. We define word order as the order of modifiers, that is, the order of bunsetsus that depend on the same modifiee. The method uses a model that automatically discovers word order tendencies in Japanese from various kinds of information in and around the target bunsetsus. The model shows how much each piece of information contributes to deciding the word order, and which word order tends to be selected when several kinds of information conflict. The contribution rate of each piece of information is learned efficiently within a maximum entropy (ME) framework. The performance of the trained model can be evaluated by checking how many of the word orders it selects agree with those in the original text. Because text in a corpus is already in the correct word order, the model can be trained on a raw corpus rather than a tagged one, provided the raw corpus is first analyzed by a parser. We show that this is indeed possible.
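The evaluation idea above (counting how often the model's selected order agrees with the original text) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the pairwise decomposition of the ordering score, the two toy features, the hand-set weights in `WEIGHTS`, and the romanized example bunsetsus. In the paper the weights would be learned within the ME framework rather than set by hand.

```python
from itertools import combinations, permutations
from math import exp, log

# Hypothetical hand-set weights; a trained ME model would learn these from a corpus.
WEIGHTS = {"longer_first": 1.0, "topic_first": 2.0}


def pair_features(first, second):
    """Toy antisymmetric features for the event 'first precedes second' (assumed)."""
    return {
        # +1 if the first bunsetsu is longer, -1 if shorter, 0 if equal.
        "longer_first": float(len(first) > len(second)) - float(len(first) < len(second)),
        # +1 if only the first carries the topic marker "-wa", -1 if only the second.
        "topic_first": float(first.endswith("-wa")) - float(second.endswith("-wa")),
    }


def pair_prob(weights, first, second):
    """P(first precedes second) under a two-class log-linear (ME) model."""
    score = sum(weights.get(name, 0.0) * value
                for name, value in pair_features(first, second).items())
    return 1.0 / (1.0 + exp(-score))


def select_order(weights, modifiers):
    """Return the permutation with the highest sum of pairwise log-probabilities."""
    def log_score(order):
        # combinations() yields every pair (a, b) with a placed before b.
        return sum(log(pair_prob(weights, a, b)) for a, b in combinations(order, 2))

    # Sorting first makes the result independent of the input order.
    return max(permutations(sorted(modifiers)), key=log_score)


def agreement_rate(weights, instances):
    """Fraction of instances whose selected order equals the original order."""
    hits = sum(select_order(weights, inst) == tuple(inst) for inst in instances)
    return hits / len(instances)
```

Calling `agreement_rate(WEIGHTS, corpus)` on a list of modifier sequences, each given in its original order, returns the share of sequences that the toy model reproduces exactly. Enumerating all permutations is only practical for the small number of modifiers that typically share one modifiee.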
Paper from the Association for Natural Language Processing (言語処理学会)