Investigating Approaches to Semantic Category Disambiguation Using Large Lexical Resources and Approximate String Matching
Abstract
This paper proposes and investigates several improvements to an existing machine learning-based system for the task of semantic category disambiguation. For querying large-scale lexical resources containing millions of entries using approximate string matching, we investigate four approaches: applying a semantically motivated distance measure, adding start/end markers to the query, selecting the most beneficial lexical resources out of a set, and applying similarity thresholds. These approaches are evaluated on six datasets from the BioNLP domain. While some modest improvements are observed, we fail to establish a consistent benefit for any of the suggested methods across all datasets. The introduced system and all related resources are freely available for research purposes at: https://github.com/ninjin/simsem
- 2011-11-14
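To make the abstract's terms concrete, the following is a minimal Python sketch of approximate string matching against a lexical resource, with optional start/end markers on the strings and a similarity threshold on the results. It is an illustration under stated assumptions, not the SimSem implementation: the character-trigram cosine measure, the marker characters `^` and `$`, the function names, and the three-entry `LEXICON` are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Hypothetical boundary markers; assumed not to occur in the lexicon strings.
START, END = "^", "$"

# Hypothetical toy lexicon of (entry, semantic category) pairs.
LEXICON = [
    ("kinase", "Protein"),
    ("kinases", "Protein"),
    ("aspirin", "Chemical"),
]


def char_ngrams(text, n=3, markers=True):
    """Return the multiset of character n-grams of `text`.

    With markers=True the string is wrapped in START/END first, so
    n-grams at the beginning and end become distinct features.
    """
    if markers:
        text = START + text + END
    if len(text) < n:
        return Counter([text]) if text else Counter()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def query(lexicon, text, threshold=0.7, n=3, markers=True):
    """Return (entry, category, score) for entries scoring >= threshold.

    This is a linear scan for clarity; a resource with millions of
    entries would need an index instead.
    """
    q = char_ngrams(text.lower(), n, markers)
    hits = []
    for entry, category in lexicon:
        score = cosine(q, char_ngrams(entry.lower(), n, markers))
        if score >= threshold:
            hits.append((entry, category, score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)
```

In this sketch, raising `threshold` trades recall for precision, and enabling `markers` penalizes differences at the string boundaries (for example a plural suffix) more heavily than leaving them off.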
Authors
-
Sophia Ananiadou
School of Computer Science, University of Manchester, Manchester, UK|National Centre for Text Mining Un
-
Pontus Stenetorp
Aizawa Laboratory, Department of Computer Science, University of Tokyo, Tokyo, Japan
-
Sampo Pyysalo
School of Computer Science, University of Manchester, Manchester, UK|National Centre for Text Mining
-
Jun'ichi Tsujii
Microsoft Research Asia, Beijing, People's Republic of China