E_019 Achilles : A Chinese Morphological Analyzer
Abstract
We created a new Chinese morphological analyzer, Achilles, by integrating rule-based and dictionary-based methods with a statistical machine learning method, conditional random fields (CRFs). The rule-based method recognizes patterns that regular expressions can describe: numbers, times, and alphabetic strings. The dictionary-based method finds in-vocabulary (IV) words, while out-of-vocabulary (OOV) words are detected by the CRFs. Finally, a confidence-measure-based approach weighs all the results and outputs the best ones. We tested Achilles on data from the SIGHAN Bakeoff 2005. Achilles outperforms the best participant on the CITYU, PKU, and MSR corpora, achieving the highest F-scores.
- 2006-08-21
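To make the division of labor in the abstract concrete, here is a minimal Python sketch of the first two stages only: a rule pass for numbers and alphabetic strings, followed by a dictionary lookup for in-vocabulary words. It is not the Achilles implementation. The pattern `RULES`, the function `segment`, the `max_len` parameter, and the use of greedy forward maximum matching are all assumptions made for illustration; the abstract does not specify the actual rules or the dictionary search strategy. Characters that match neither stage are simply tagged `"unknown"`, standing in for the OOV case that the paper handles with CRFs, and the confidence-based weighting step is not modeled at all.

```python
import re

# Hypothetical rule set: decimal numbers and runs of Latin letters.
# The paper's real rules (including time expressions) are not given in the abstract.
RULES = re.compile(r"\d+(?:\.\d+)?|[A-Za-z]+")


def segment(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation: rules first, then longest dictionary match."""
    tokens = []
    i = 0
    while i < len(text):
        # Stage 1: rule-based recognition at the current position.
        m = RULES.match(text, i)
        if m:
            tokens.append((m.group(), "rule"))
            i = m.end()
            continue
        # Stage 2: forward maximum matching against the lexicon
        # (an assumed strategy, chosen here for brevity).
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in lexicon:
                tokens.append((text[i:i + n], "dict"))
                i += n
                break
        else:
            # Neither stage matched: placeholder for the OOV case,
            # which the paper assigns to a CRF rather than this fallback.
            tokens.append((text[i], "unknown"))
            i += 1
    return tokens
```

Calling `segment("我们在北京2005年", {"我们", "北京", "在"})` tags `2005` as a rule match, the three lexicon entries as dictionary matches, and `年` as unknown. In a combined system like the one the abstract describes, the candidates from each component would be scored and the best overall segmentation selected, rather than being committed greedily as they are here.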
Authors
-
Sumita Eiichiro
National Inst. Communications Technol. Kyoto‐fu Jpn
-
Sumita Eiichiro
National Institute of Information and Communications Technology: ATR Spoken Language Communication Re
-
Zhang Ruiqiang
National Institute of Information and Communications Technology
-
Zhang Ruiqiang
National Institute of Information and Communications Technology: ATR Spoken Language Communication Re
Related papers
- A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation
- Splitting Input for Machine Translation Using N-gram Language Model Together with Utterance Similarity(Natural Language Processing)
- E_019 Achilles : A Chinese Morphological Analyzer
- Imposing Constraints from the Source Tree on ITG Constraints for SMT
- Introducing a Translation Dictionary into Phrase-Based SMT
- Constraining a Generative Word Alignment Model with Discriminative Output
- Bilingual Cluster Based Models for Statistical Machine Translation
- Paraphrase Lattice for Statistical Machine Translation
- An Empirical Comparison of Parsers in Constraining Reordering for E-J Patent Machine Translation
- Japanese Argument Reordering Based on Dependency Structure for Statistical Machine Translation
- Database of Human Evaluations of Machine Translation Systems for Patent Translation
- How to Translate Dialects: A Segmentation-Centric Pivot Translation Approach
- Joint Phrase Alignment and Extraction for Statistical Machine Translation