How to Translate Dialects: A Segmentation-Centric Pivot Translation Approach
Abstract
Recent research on multilingual statistical machine translation (SMT) focuses on using pivot languages to overcome resource limitations for certain language pairs. This paper proposes a new method for translating a dialect into a foreign language by integrating transliteration approaches based on Bayesian alignment (BA) models with pivot-based SMT approaches. The proposed method has three advantages over standard SMT approaches: (1) it uses the standard language as the pivot language and automatically acquires knowledge about the relation between the dialect and the standard language; (2) it avoids segmentation mismatches between the input and the translation model by mapping character sequences of the dialect to the word segmentation of the standard language; and (3) it reduces the complexity of the translation task by using monotone decoding techniques. Experimental results for translating five Japanese dialects (Kumamoto, Kyoto, Nagoya, Okinawa, Osaka) into four Indo-European languages (English, German, Russian, Hindi) and two Asian languages (Chinese, Korean) show that the proposed method improves the quality of dialect translation and outperforms standard pivot translation approaches that concatenate SMT engines for the majority of the investigated language pairs.
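As a rough illustration of points (2) and (3), the sketch below maps a character string onto a word sequence strictly left to right, with no reordering, using a small dynamic program. It is not the paper's Bayesian alignment model: the lookup table `TOY_TABLE`, its placeholder entries and scores, and the function name `monotone_decode` are all hypothetical and chosen only to show the shape of a monotone character-to-word mapping.

```python
import math

# Hypothetical toy table: dialect character sequence -> list of
# (standard-language word, log-probability). All entries are invented
# placeholders for illustration; none come from the paper.
TOY_TABLE = {
    "ab": [("X", math.log(0.6))],
    "a": [("Y", math.log(0.5))],
    "b": [("Z", math.log(0.5))],
    "c": [("W", math.log(0.9))],
}


def monotone_decode(chars, table, max_len=4):
    """Return the best left-to-right mapping of `chars` to words.

    Source spans are consumed in order and never reordered, which is
    what makes the decoding monotone. Returns None if the table cannot
    cover the whole input.
    """
    n = len(chars)
    # best[i] holds (score, words) for the best mapping of chars[:i].
    best = [(-math.inf, None) for _ in range(n + 1)]
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            prev_score, prev_words = best[start]
            if prev_words is None:
                continue  # chars[:start] has no valid mapping
            for word, logp in table.get(chars[start:end], []):
                score = prev_score + logp
                if score > best[end][0]:
                    best[end] = (score, prev_words + [word])
    return best[n][1]


if __name__ == "__main__":
    print(monotone_decode("abc", TOY_TABLE))
```

In this toy setting the output word sequence already carries the target-side segmentation, so a downstream standard-to-foreign translation step would receive input segmented the way its model expects.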
Authors
- Sumita Eiichiro (National Inst. Communications Technol. Kyoto‐fu Jpn)
- Finch Andrew (National Institute of Information and Communications Technology)
- Paul Michael (National Institute of Information and Communications Technology)
Related papers
- A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation
- Splitting Input for Machine Translation Using N-gram Language Model Together with Utterance Similarity(Natural Language Processing)
- E_019 Achilles : A Chinese Morphological Analyzer
- Imposing Constraints from the Source Tree on ITG Constraints for SMT
- Introducing a Translation Dictionary into Phrase-Based SMT
- Constraining a Generative Word Alignment Model with Discriminative Output
- Bilingual Cluster Based Models for Statistical Machine Translation
- Paraphrase Lattice for Statistical Machine Translation
- An Empirical Comparison of Parsers in Constraining Reordering for E-J Patent Machine Translation
- Japanese Argument Reordering Based on Dependency Structure for Statistical Machine Translation
- Database of Human Evaluations of Machine Translation Systems for Patent Translation
- Joint Phrase Alignment and Extraction for Statistical Machine Translation