Construction of Domain Dictionary for Fundamental Vocabulary and its Application to Automatic Blog Categorization with the Dynamic Estimation of Unknown Words' Domains
Abstract
For natural language understanding, it is essential to reveal semantic relations between words. To date, only the IS-A relation has been publicly available as a thesaurus. Toward deeper natural language understanding, we semi-automatically constructed a domain dictionary that represents the domain relation between Japanese fundamental words. Our method does not require a document collection. As a task-based evaluation of the domain dictionary, we performed blog categorization, in which we assigned a domain to each word in a blog article and categorized the article under its most dominant domain. In doing so, we dynamically estimated the domains of unknown words, i.e., those not listed in the domain dictionary. Our blog categorization achieved an accuracy of 94.0% (564/600), and the domain estimation technique for unknown words achieved an accuracy of 76.6% (383/500).
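The categorization step described in the abstract (assign a domain to each word, then pick the most dominant domain) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the toy dictionary entries, the domain labels, the `categorize` function, and the `estimate_unknown` callback are hypothetical, and the abstract does not specify how unknown-word domains are actually estimated, so the estimator is left as a pluggable placeholder.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Optional

# Hypothetical toy domain dictionary; the entries and labels are illustrative
# only and are not taken from the paper's actual dictionary.
TOY_DOMAIN_DICT: Dict[str, str] = {
    "pitcher": "SPORTS",
    "stadium": "SPORTS",
    "recipe": "DIET",
}


def categorize(
    words: Iterable[str],
    domain_dict: Dict[str, str],
    estimate_unknown: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Return the most frequent domain among the words, or None if no word has one.

    `estimate_unknown` is a placeholder for any unknown-word domain estimator;
    the abstract does not describe the real technique, so none is assumed here.
    """
    counts: Counter = Counter()
    for word in words:
        domain = domain_dict.get(word)
        if domain is None and estimate_unknown is not None:
            # Word is not in the dictionary: fall back to the estimator.
            domain = estimate_unknown(word)
        if domain is not None:
            counts[domain] += 1
    if not counts:
        return None
    # Ties are broken by first occurrence, which is an arbitrary choice.
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    article = ["pitcher", "stadium", "ramen", "sushi", "udon"]
    # Without an estimator, unknown words are simply ignored.
    print(categorize(article, TOY_DOMAIN_DICT))
    # With a (trivial, hypothetical) estimator, unknown words also vote.
    print(categorize(article, TOY_DOMAIN_DICT, lambda w: "DIET"))
```

In this sketch the unknown words change the outcome once they are given a domain, which is why estimating their domains matters for the final category.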
- Papers from the Association for Natural Language Processing (言語処理学会)
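- An Efficient Method for Determining Field-Associated Terms of Compound Words
- Construction of a Paraphrase Corpus Using a Class-Oriented Example Collection Method
- Large-Scale Assignment of Usage Examples to a Verb Argument Structure Dictionary
- Research Trends in Paraphrasing Technology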
- Morpho-Syntactic Rules for Detecting Japanese Term Variation: Establishment and Evaluation