An Alignment Model for Extracting English-Korean Translations of Term Constituents (Natural Language Processing)
Abstract
Technical terms are linguistic representations of domain concepts, and their constituents are the components used to represent those concepts. Technical terms are usually multi-word terms whose meanings can be inferred from their constituents, so term constituents are essential for understanding the designated meaning of a technical term. However, finding the correct meaning of a technical term from its constituents raises several problems. First, in Korean technical terms a term constituent is usually a morphological unit rather than a conceptual unit, so conceptual units must first be identified by chunking term constituents. Second, conceptual units are sometimes homonyms or synonyms, and their meanings are domain dependent. Natural language applications therefore need information about conceptual units and their possible meanings, including homonyms, synonyms, and domain dependency, to handle technical terms properly. In this paper, we propose a term constituent alignment algorithm that extracts such information from bilingual technical term pairs. For given English-Korean term pairs, the algorithm recognizes conceptual units and their meanings by finding English term constituents and their corresponding Korean term constituents. Our experimental results indicate that the method finds conceptual units and their meanings effectively, with an alignment error rate (AER) of about 6% on manually analyzed experimental data and about 14% on automatically analyzed experimental data.
- 2006-12-01
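The abstract reports results as an alignment error rate (AER) but does not define it. As a rough illustration only, the sketch below computes AER using the definition commonly used in word-alignment evaluation, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the set of predicted links, S the set of "sure" gold links, and P the set of "possible" gold links (with S contained in P). Whether the paper uses exactly this definition is an assumption. The function name, the representation of a link as an (English index, Korean index) pair, and the sample links are all hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple

# A link pairs the index of an English constituent with the index of a
# Korean constituent. This representation is an assumption for illustration.
Link = Tuple[int, int]


def alignment_error_rate(
    predicted: Iterable[Link],
    sure: Iterable[Link],
    possible: Optional[Iterable[Link]] = None,
) -> float:
    """Return AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    If no separate set of possible links is given, P is taken to equal S.
    """
    a: Set[Link] = set(predicted)
    s: Set[Link] = set(sure)
    # By convention every sure link also counts as a possible link.
    p: Set[Link] = s if possible is None else set(possible) | s

    denominator = len(a) + len(s)
    if denominator == 0:
        # No predicted links and no gold links: nothing to get wrong.
        return 0.0
    return 1.0 - (len(a & s) + len(a & p)) / denominator


# Hypothetical term pair with three gold links; the third predicted link
# points at the wrong Korean constituent.
gold_links = {(0, 0), (1, 1), (2, 2)}
predicted_links = {(0, 0), (1, 1), (2, 3)}
print(f"AER = {alignment_error_rate(predicted_links, gold_links):.3f}")
```

Lower values are better: an AER of 0 means every predicted link is a gold link and every sure gold link was found.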
Authors
- OH Jong-Hoon
  Computational Linguistics Group of the National Institute of Information and Communications Technolo
- CHOI Key-Sun
  Dept. of Computer Science, KAIST
- ISAHARA Hitoshi
  Computational Linguistics Group of the National Institute of Information and Communications Technolo
Related papers
- Normalizing Syntactic Structure Using Part-of-Speech Tags and Binary Rules (Development of Advanced Computer Systems)
- Extracting Partial Parsing Rules from Tree-Annotated Corpus: Toward Deterministic Global Parsing (Natural Language Processing)
- Use of Multiple Documents as Evidence with Decreased Adding in a Japanese Question-answering System
- Automatic F-term Classification of Japanese Patent Documents Using the k-Nearest Neighborhood Method and the SMART Weighting
- Statistical-Based Approach to Non-segmented Language Processing (Knowledge, Information and Creativity Support System)
- Improving Search Performance: A Lesson Learned from Evaluating Search Engines Using Thai Queries (Knowledge, Information and Creativity Support System)
- Related Word Lists Effective in Creativity Support (Knowledge, Information and Creativity Support System)
- A Model of Discourse Segmentation and Segment Title Assignment for Lecture Speech Indexing (Knowledge, Information and Creativity Support System)
- Remarks on Relational Nouns and Relational Categories
- Toolbar to Highlight Important Expressions in Scientific Articles on Atomic and Molecular Physics