Statistical Language Model Adaptation with Additional Text Generated by Machine Translation
Abstract
Language model adaptation needs only a small corpus of the application domain (the "target corpus"), and that corpus must be written in the language of the model. However, it is sometimes difficult to collect even a small corpus, especially of spoken language, because of the high cost. To address this problem, this paper proposes a novel scheme that generates a small target corpus in the language of the model by machine-translating a target corpus written in another language. Experiments showed that the language model improvement was about half of that obtained with a human-collected corpus.
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2003-04-01
Authors
- Nakajima Hideharu
  ATR Spoken Language Translation Research Laboratories (present address: NTT Cyber Space Laboratories)
- Yamamoto Hirofumi
  ATR Spoken Language Translation Res. Lab., Kyoto-fu, Jpn
- Watanabe Taro
  ATR Spoken Language Translation Research Laboratories: Department of Intelligence and Technology Grad
- Yamamoto Hirofumi
  ATR Spoken Language Translation Research Laboratories
Related papers
- A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation
- Imposing Constraints from the Source Tree on ITG Constraints for SMT
- Introducing a Translation Dictionary into Phrase-Based SMT
- Training Set Selection for Building Compact and Efficient Language Models
- Bilingual Cluster Based Models for Statistical Machine Translation
- Statistical Language Model Adaptation with Additional Text Generated by Machine Translation