The MSLR Parser Tool Kit for Natural Language Analysis

Abstract

This paper describes the features and functions of the MSLR parser tool kit, a set of tools for natural language analysis that we currently make publicly available. The MSLR parser extends the generalized LR parsing algorithm so that it performs morphological and syntactic analysis at the same time on sentences that are not segmented into words, such as Japanese sentences. To parse with it, one first uses the LR table generator to build an LR table from a context-free grammar and a connection matrix. The generator builds the adjacency constraints between parts of speech described in the connection matrix into the LR table. The resulting table therefore does not accept analyses that violate those constraints, and it is also considerably smaller. Next, using the generated LR table and a lexicon, the MSLR parser carries out word segmentation by dictionary lookup and syntactic analysis simultaneously, and outputs parse trees. In addition, the parser accepts input sentences in which pairs of brackets specify partial constraints on dependency structure, and outputs only the parse trees that satisfy those constraints. It can also be trained according to the probabilistic generalized LR (PGLR) model, a language model that reflects context sensitivity to a limited degree; it then computes a PGLR-based generation probability for each parse tree and ranks the analyses accordingly.
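The effect of the connection matrix on unsegmented input can be illustrated with a small sketch. This is not the MSLR implementation: the tool kit compiles the constraints into the LR table ahead of time, whereas the sketch below simply checks them during a dictionary-lookup search. The lexicon, the tag names, the `BOS`/`EOS` boundary tags, and the function `segment` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

Token = Tuple[str, str]  # (surface form, part of speech)

# Hypothetical toy lexicon: surface form -> possible parts of speech.
LEXICON: Dict[str, List[str]] = {
    "the": ["det"],
    "cat": ["noun"],
    "cats": ["noun"],
    "sleeps": ["noun", "verb"],
}

# Hypothetical connection matrix, stored as the set of allowed
# (left tag, right tag) pairs. BOS/EOS mark the sentence boundaries.
ALLOWED: Set[Tuple[str, str]] = {
    ("BOS", "det"),
    ("det", "noun"),
    ("noun", "verb"),
    ("verb", "EOS"),
}


def segment(text: str,
            lexicon: Dict[str, List[str]],
            allowed: Set[Tuple[str, str]]) -> List[List[Token]]:
    """Enumerate every segmentation of `text` into lexicon words whose
    adjacent part-of-speech pairs are all permitted by `allowed`."""
    results: List[List[Token]] = []

    def extend(pos: int, prev_tag: str, path: List[Token]) -> None:
        if pos == len(text):
            # The last word must also be allowed to end the sentence.
            if (prev_tag, "EOS") in allowed:
                results.append(list(path))
            return
        # Dictionary lookup: try every substring starting at `pos`.
        for stop in range(pos + 1, len(text) + 1):
            word = text[pos:stop]
            for tag in lexicon.get(word, []):
                # Prune candidates that violate the connection matrix.
                if (prev_tag, tag) in allowed:
                    path.append((word, tag))
                    extend(stop, tag, path)
                    path.pop()

    extend(0, "BOS", [])
    return results
```

Calling `segment("thecatsleeps", LEXICON, ALLOWED)` keeps only the reading in which `sleeps` is a verb. If every tag pair is allowed instead, the noun reading of `sleeps` survives as well, which is the kind of locally implausible analysis the connection constraints are meant to exclude.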
Paper of the Association for Natural Language Processing (言語処理学会)
2000-11-10
