Joint Phrase Alignment and Extraction for Statistical Machine Translation
Abstract
The phrase table, a scored list of bilingual phrases, lies at the center of phrase-based machine translation systems. We present a method to learn this phrase table directly from a parallel corpus of sentences that are not aligned at the word level. The key contribution of this work is that, while previous methods have generally modeled phrases at only one level of granularity, the proposed method includes phrases of many granularities directly in the model. This allows for the direct learning of a phrase table that achieves competitive accuracy without the complicated multi-step process of word alignment and phrase extraction used in previous research. The model is built using non-parametric Bayesian methods and inversion transduction grammars (ITGs), a variety of synchronous context-free grammars (SCFGs). Experiments on several language pairs demonstrate that the proposed model matches the accuracy of the more traditional two-step word alignment/phrase extraction approach while reducing the phrase table to a fraction of its original size.
- Paper published by the Information and Media Technologies Editorial Board
Authors
- Kawahara Tatsuya, Graduate School of Informatics, Kyoto University
- Watanabe Taro, National Institute of Information and Communications Technology
- Sumita Eiichiro, National Inst. Communications Technol. Kyoto‐fu Jpn
- Neubig Graham, Graduate School of Informatics, Kyoto University
- Mori Shinsuke, Graduate School of Informatics, Kyoto University
Related papers
- Constraining a Generative Word Alignment Model with Discriminative Output
- Modeling and automatic detection of English sentence stress for computer-assisted English prosody learning system
- Formant structure estimation using vocal tract length normalization for CALL systems
- A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation
- Splitting Input for Machine Translation Using N-gram Language Model Together with Utterance Similarity (Natural Language Processing)
- E_019 Achilles: A Chinese Morphological Analyzer
- Numerical Analysis of Carbon Isotope Separation by Plasma Chemical Reactions in Carbon Monoxide Glow Discharge
- Carbon and Oxygen Isotope Separation by Plasma Chemical Reactions in Carbon Monoxide Glow Discharge
- Imposing Constraints from the Source Tree on ITG Constraints for SMT
- Introducing a Translation Dictionary into Phrase-Based SMT
- Lecture Speech Recognition Using Large Corpus of Spontaneous Japanese
- Bilingual Cluster Based Models for Statistical Machine Translation
- Effective Prediction of Errors by Non-native Speakers Using Decision Tree for Speech Recognition-Based CALL System
- Paraphrase Lattice for Statistical Machine Translation
- Bayesian Learning of a Language Model from Continuous Speech
- An Empirical Comparison of Parsers in Constraining Reordering for E-J Patent Machine Translation
- A Pointwise Approach to Training Dependency Parsers from Partially Annotated Corpora
- Japanese Argument Reordering Based on Dependency Structure for Statistical Machine Translation
- Database of Human Evaluations of Machine Translation Systems for Patent Translation
- How to Translate Dialects: A Segmentation-Centric Pivot Translation Approach
- Joint Phrase Alignment and Extraction for Statistical Machine Translation