Learning Transfer Rules from Annotated English-Vietnamese Bilingual Corpus(Text Mining I)
Abstract
Due to differences in language typology, English-to-Vietnamese machine translation requires many transfer rules in the lexical and structural transfer stages. Recently, many NLP (Natural Language Processing) tasks have shifted from rule-based approaches to corpus-based approaches that rely on large annotated corpora. Corpus-based NLP tasks for popular languages such as English, French, etc. have been well studied, with satisfactory results. In contrast, corpus-based NLP for Vietnamese is at a deadlock due to the absence of annotated training data. Furthermore, hand-annotation of even reasonably well-determined features such as part-of-speech (POS) tags has proved to be labor-intensive and costly. In this paper, we present issues in the collection and annotation (word alignment, Vietnamese word segmentation, and part-of-speech tagging) of an English-Vietnamese parallel corpus named EVC (English-Vietnamese Corpus). From this EVC, transfer rules have been automatically mined for use in training Vietnamese-related NLP tasks and in studying English-Vietnamese comparative linguistics.
- Paper published by the Information Processing Society of Japan (IPSJ)
- 2004-12-04
Authors
-
Dien Dinh
Faculty of Information Technology, University of Natural Sciences, VNU-HCMC
-
Kiem Hoang
Center of Information Technology Development, Vietnam National University of HCMC
-
Kiem Hoang
Center for Information Technology, Vietnam National University
Related papers
- Learning Transfer Rules from Annotated English-Vietnamese Bilingual Corpus(Text Mining I)
- Learning Transfer Rules from Annotated English-Vietnamese Bilingual Corpus
- Developing Text Mining Based Algorithms for Classifying Biological Sequences(Text Mining I)(Joint Workshop of Vietnamese Society of AI, SIGKBS-JSAI, ICS-IPSJ, and IEICE-SIGAI on Active Mining)
- Developing Text Mining Based Algorithms for Classifying Biological Sequences(Text Mining I)
- Learning Transfer Rules from Annotated English-Vietnamese Bilingual Corpus(Text Mining I)(Joint Workshop of Vietnamese Society of AI, SIGKBS-JSAI, ICS-IPSJ, and IEICE-SIGAI on Active Mining)