Using Constituent Boundary Parsing for Multi-lingual Spoken-language Translation
Abstract
We propose a method called constituent boundary parsing, which uses pattern matching on the surface form. The new version of Transfer-Driven Machine Translation (TDMT), which combines constituent boundary parsing with example-based processing, is effective for multi-lingual spoken-language translation. Constituent boundary parsing describes the syntactic structures of various expressions consistently, using surface patterns that consist of variables and constituent boundaries. Input words are read from left to right, and the best syntactic structure is built up efficiently with a chart-parsing algorithm while local structures are disambiguated. Introducing constituent boundary parsing solves problems of the earlier version of TDMT, such as the limited descriptive power for syntactic structures and the explosion of structural ambiguity. Also, because constituent boundary parsing and example-based processing are simple and language-independent, TDMT's applicability to multi-lingual spoken-language translation has been enhanced. We have evaluated a TDMT system that translates in both directions between Japanese and English and between Japanese and Korean in the domain of travel conversations. Experimental results show that the proposed TDMT can translate a wide range of sentences in the domain into understandable output in real time.
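The parsing idea in the abstract can be illustrated with a minimal sketch. It is not the TDMT implementation: the pattern form "X <boundary> Y", the boundary words, the head rule, and the distance table standing in for example-based scoring are all hypothetical choices made for illustration. The sketch reads the input left to right, fills a chart of spans, and keeps the lowest-cost structure for each span.

```python
# Hypothetical surface words that act as constituent boundaries in patterns
# of the form "X <boundary> Y" (X and Y are variables). Illustrative only.
BOUNDARIES = {"to", "at", "of"}

# Hypothetical stand-in for example-based scoring: distance of
# (boundary, head of X, head of Y) to the closest stored example.
EXAMPLE_DISTANCE = {
    ("to", "ticket", "Kyoto"): 0.0,
    ("at", "ticket", "station"): 0.2,
    ("at", "Kyoto", "station"): 0.9,
}
DEFAULT_DISTANCE = 1.0  # assumed cost when no example is stored


def head(tree):
    """Head word: last word of a leaf chunk, or the head of X in (boundary, X, Y)."""
    if isinstance(tree, str):
        return tree.split()[-1]
    return head(tree[1])


def parse(words):
    """Return (cost, tree) for the lowest-cost structure, or None if none exists."""
    n = len(words)
    chart = {}  # (start, end) -> (cost, tree): best analysis of words[start:end]
    for end in range(1, n + 1):  # consume the input left to right
        for start in range(end - 1, -1, -1):
            span = words[start:end]
            if not any(w in BOUNDARIES for w in span):
                chart[start, end] = (0.0, " ".join(span))  # plain chunk, no pattern
                continue
            best = None
            # A boundary word needs material on both sides to match "X b Y".
            for k in range(start + 1, end - 1):
                if words[k] not in BOUNDARIES:
                    continue
                left = chart.get((start, k))
                right = chart.get((k + 1, end))
                if left is None or right is None:
                    continue
                key = (words[k], head(left[1]), head(right[1]))
                cost = left[0] + right[0] + EXAMPLE_DISTANCE.get(key, DEFAULT_DISTANCE)
                if best is None or cost < best[0]:
                    best = (cost, (words[k], left[1], right[1]))
            if best is not None:
                chart[start, end] = best
    return chart.get((0, n))


cost, tree = parse("a ticket to Kyoto at the station".split())
print(cost, tree)
```

For the sample phrase, two structures compete: attaching "at the station" to "a ticket to Kyoto" or to "Kyoto" alone. Under the assumed distances, the first is cheaper, so the chart keeps it and discards the other locally rather than enumerating every full parse.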
Association for Natural Language Processing | Papers
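- An Efficient Method for Determining Field-Associated Terms of Compound Words
- Building a Paraphrase Corpus with a Class-Oriented Example Collection Method
- Large-Scale Addition of Usage Examples to a Verb Argument Structure Dictionary
- Research Trends in Paraphrasing Technology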
- Morpho-Syntactic Rules for Detecting Japanese Term Variation: Establishment and Evaluation