Utilizing the Web for Automatic Word Spacing
Abstract
This paper presents a new approach to word spacing problems that mines reliable words from the Web and uses them as additional resources. Conventional approaches to automatic word spacing train the parameters of word spacing models on noise-free data. However, the insufficiency and irrelevance of training examples are consistently the main bottleneck in automatic word spacing. To mitigate this data-sparseness problem, the paper proposes an algorithm that discovers reliable words on the Web to expand the vocabulary, and a model that uses those words as additional resources. The proposed approach is simple and practical to adapt to new domains. Experimental results show that it outperforms conventional word spacing approaches.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2009-12-01
Authors
-
RIM Hae-Chang
Korea University
-
Rim Hae‐chang
Korea Univ. Kor
-
Rim Hae
Department Of Computer Science Korea University
-
Song Young-in
Korea University
-
Rim Hae-chang
Department Of Computer Science Engineering Korea University
-
HONG Gumwon
Korea University
-
LEE Jeong-Hoon
Korea University
-
LEE Do-Gil
Korea University
Related papers
- Three-Phase Text Error Correction Model for Korean SMS Messages
- Interannual variation in quantitative relationships among egg production and densities of larvae and juveniles of the Japanese mantis shrimp Oratosquilla oratoria in Tokyo Bay, Japan
- Automatic Acronym Dictionary Construction Based on Acronym Generation Types
- Utilizing the Web for Automatic Word Spacing
- Incorporating Frame Information to Semantic Role Labeling
- Simple Weighting Techniques for Query Expansion in Biomedical Document Retrieval(Contents Technology and Web Information Systems)
- A Definitional Question Answering System Based on Phrase Extraction Using Syntactic Patterns(Natural Language Processing)
- Topic Document Model Approach for Naive Bayes Text Classification(Natural Language Processing)
- Changes in growth of marbled sole Pseudopleuronectes yokohamae between high and low stock-size periods in Tokyo Bay, Japan
- Minimizing Human Intervention for Constructing Korean Part-of-Speech Tagged Corpus
- Comparison between surface-reading and cross-section methods using sagittal otolith for age determination of the marbled sole Pseudopleuronectes yokohamae