Construction of Context Models for Word Sense Disambiguation
Abstract
This paper presents a study on the use of word context features for Word Sense Disambiguation (WSD). State-of-the-art WSD systems achieve high accuracy by using resources such as dictionaries, taggers, lexical analyzers, or topic modeling packages. However, these resources are either too heavy or do not have sufficient coverage for large-scale tasks such as information retrieval. The use of local context for WSD is common, but the formulation of features is often based on trial and error rather than a clear rationale. We therefore investigate the notion of relatedness of context words to the target word (the word to be disambiguated), and propose an unsupervised method for finding the optimal weights for context words based on their distance to the target word. The key idea behind the method is that the optimal weights should maximize the similarity of two context models constructed from different context samples of the same word. Our experimental results show that the strength of the relation between two words approximately follows a power law. The resulting context models are used in Naïve Bayes classifiers for word sense disambiguation. Our evaluation on Semeval WSD tasks in both English and Japanese shows that our method can achieve state-of-the-art effectiveness even though, unlike most existing methods, it does not use external tools. Its high efficiency makes it possible to use our method in large-scale applications such as information retrieval.
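As an illustration of the idea described in the abstract, the following is a minimal Python sketch of a distance-weighted context model. It is not the authors' implementation: the function names, the fixed exponent `alpha`, the weight form `distance ** -alpha`, and the use of cosine similarity to compare two models are all assumptions made for the example. The abstract only states that weights depend on distance to the target word, that good weights should make context models built from different samples of the same word similar, and that the relation strength approximately follows a power law.

```python
import math
from collections import defaultdict


def power_law_weight(distance, alpha=1.0):
    """Weight of a context word at `distance` (>= 1) tokens from the target.

    The power-law form and the default exponent are assumptions for
    illustration; the paper learns its weights rather than fixing them.
    """
    return distance ** -alpha


def build_context_model(sentences, target, alpha=1.0):
    """Build a normalized, distance-weighted distribution over context words.

    `sentences` is a list of token lists. Every occurrence of `target`
    contributes weighted counts for all other tokens in the same sentence.
    """
    counts = defaultdict(float)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            for j, ctx in enumerate(tokens):
                if j != i:
                    counts[ctx] += power_law_weight(abs(j - i), alpha)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def cosine_similarity(p, q):
    """Cosine similarity of two sparse vectors stored as dicts.

    Cosine is an assumed stand-in for whatever similarity the paper uses.
    """
    dot = sum(v * q.get(w, 0.0) for w, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


if __name__ == "__main__":
    # Two hypothetical context samples for the same target word.
    sample_a = [["she", "sat", "on", "the", "river", "bank", "today"]]
    sample_b = [["the", "muddy", "river", "bank", "was", "steep"]]
    # Compare how similar the two models are under different decay rates.
    for alpha in (0.0, 0.5, 1.0, 2.0):
        model_a = build_context_model(sample_a, "bank", alpha)
        model_b = build_context_model(sample_b, "bank", alpha)
        print(alpha, cosine_similarity(model_a, model_b))
```

In this sketch, scanning candidate values of `alpha` and keeping the one that gives the highest similarity between the two models is a crude analogue of the weight-selection criterion the abstract describes. The resulting distributions could then serve as per-sense word likelihoods in a Naïve Bayes classifier.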
Authors
- Kando Noriko (National Center For Science Information Systems)
- Nie Jian-yun (Universite De Montreal)
- Brosseau-Villeneuve Bernard (Universite De Montreal)
Related papers
- Ranking the NTCIR ACLIA IR4QA Systems without Relevance Assessments
- Revisiting NTCIR ACLIA IR4QA with Additional Relevance Assessments
- Evaluation Methods for Web Retrieval Tasks Considering Hyperlink Structure(Special Issue on Text Processing for Information Access)
- A further note on alternatives to Bpref (Fundamentals of Informatics)
- A Graph-based Method for Automatic Generation of Multilingual Keyword Clusters and Its Applications
- Construction of Context Models for Word Sense Disambiguation
- Construction of context models for Word Sense Disambiguation ([Japanese Word Sense Disambiguation Centered on the SemEval-2 Japanese Task])
- The Concept of Sensitive Word in Chinese : A Survey in a Machine-Readable Dictionary