Imposing Constraints from the Source Tree on ITG Constraints for SMT
Abstract
In current statistical machine translation (SMT), erroneous word reordering is one of the most serious problems. To address it, many word-reordering constraint techniques have been proposed, and inversion transduction grammar (ITG) constraints are one of them. Under ITG constraints, the target-side word order is obtained by rotating nodes of a source-side binary tree. These node rotations, however, do not take the actual source binary tree instance into account. Stronger reordering constraints can therefore be obtained by imposing further constraints derived from the source tree on the ITG constraints. For example, for the source word sequence { a b c d }, ITG constraints allow a total of twenty-two target word orderings. However, when the source binary tree instance ((a b) (c d)) is given, our proposed “imposing source tree on ITG” (IST-ITG) constraints allow only eight word orderings. This reduction in the number of permitted word-order permutations efficiently suppresses erroneous word orderings. In our experiments with IST-ITG on the NIST MT08 English-to-Chinese translation track data, the proposed method yielded a 1.8-point improvement in character BLEU-4 (35.2 to 37.0) and a 6.2-point reduction in CER (74.1% to 67.9%) compared with our baseline condition.
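The counts quoted in the abstract (twenty-two orderings under ITG, eight under IST-ITG for the tree ((a b) (c d))) can be checked with a small brute-force enumeration. The sketch below is only an illustration of the constraint sets, not the decoding procedure used in the paper; the function names `ist_itg_orders`, `binary_trees`, and `itg_orders` are hypothetical, and trees are assumed to be nested 2-tuples with strings as leaves.

```python
from itertools import permutations


def ist_itg_orders(tree):
    """Orders reachable by keeping or swapping the children of each node
    of one fixed binary tree (assumed encoding: nested 2-tuples, str leaves)."""
    if isinstance(tree, str):
        return {(tree,)}
    left, right = tree
    orders = set()
    for l in ist_itg_orders(left):
        for r in ist_itg_orders(right):
            orders.add(l + r)  # straight: keep child order
            orders.add(r + l)  # inverted: rotate the node
    return orders


def binary_trees(words):
    """Yield every binary bracketing of a tuple of words."""
    if len(words) == 1:
        yield words[0]
        return
    for i in range(1, len(words)):
        for left in binary_trees(words[:i]):
            for right in binary_trees(words[i:]):
                yield (left, right)


def itg_orders(words):
    """Plain ITG: the source tree is not fixed, so take the union of the
    orders reachable from every possible binary tree."""
    orders = set()
    for tree in binary_trees(tuple(words)):
        orders |= ist_itg_orders(tree)
    return orders


words = ("a", "b", "c", "d")
itg = itg_orders(words)
ist = ist_itg_orders((("a", "b"), ("c", "d")))

print(len(itg))  # 22
print(len(ist))  # 8
# The two orderings that even plain ITG excludes:
print(sorted(set(permutations(words)) - itg))
# [('b', 'd', 'a', 'c'), ('c', 'a', 'd', 'b')]
```

With the tree fixed, each of its three internal nodes is either kept or rotated, which gives 2 × 2 × 2 = 8 orderings. Plain ITG ranges over all five bracketings of four words and therefore reaches 22 of the 24 permutations.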
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2009-09-01
Authors
-
SUMITA Eiichiro
National Institute of Information and Communications Technology
-
Sumita Eiichiro
National Inst. Communications Technol. Kyoto‐fu Jpn
-
OKUMA Hideo
National Institute of Information and Communications Technology
-
Okuma Hideo
National Institute Of Communications Technology
-
Yamamoto Hirofumi
Atr Spoken Language Translation Res. Lab. Kyoto‐fu Jpn
-
YAMAMOTO Hirofumi
School of Science and Engineering, Dept. Informatics, Kinki University
Related papers
- Constraining a Generative Word Alignment Model with Discriminative Output
- A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation
- Splitting Input for Machine Translation Using N-gram Language Model Together with Utterance Similarity(Natural Language Processing)
- E_019 Achilles : A Chinese Morphological Analyzer
- Imposing Constraints from the Source Tree on ITG Constraints for SMT
- Introducing a Translation Dictionary into Phrase-Based SMT
- Training Set Selection for Building Compact and Efficient Language Models
- Bilingual Cluster Based Models for Statistical Machine Translation
- Statistical Language Model Adaptation with Additional Text Generated by Machine Translation
- Paraphrase Lattice for Statistical Machine Translation
- An Empirical Comparison of Parsers in Constraining Reordering for E-J Patent Machine Translation
- Japanese Argument Reordering Based on Dependency Structure for Statistical Machine Translation
- Database of Human Evaluations of Machine Translation Systems for Patent Translation
- How to Translate Dialects: A Segmentation-Centric Pivot Translation Approach
- Joint Phrase Alignment and Extraction for Statistical Machine Translation