Discovery of Predicate-Oriented Relations among Named Entities Extracted from Thai Texts
Abstract
Extracting named entities (NEs) and their relations is more difficult in Thai than in other languages because of several Thai-specific characteristics: no explicit boundaries for words, phrases, and sentences; few case markers and modifier clues; high ambiguity in compound words and serial verbs; and flexible word order. Unlike most previous work, which focused on NE relations of specific actions such as work_for, live_in, located_in, and kill, this paper proposes a more general type of NE relation, called the predicate-oriented relation (PoR), in which an extracted action part (verb) serves as the core component that associates related named entities extracted from Thai texts. Lacking a practical parser for the Thai language, we present three types of surface features, i.e., punctuation marks (such as token spaces), entity types, and the number of entities, and then apply five commonly used alternative learning schemes to investigate their performance on predicate-oriented relation extraction. The experimental results show that our approach achieves F-measures of 97.76%, 99.19%, 95.00%, and 93.50% on four types of predicate-oriented relation (action-location, location-action, action-person, and person-action) in crime-related news documents, using a data set of 1,736 entity pairs. The effects of NE extraction techniques, feature sets, and class imbalance on relation extraction performance are also explored.
- 2012-07-01
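The three surface feature types named in the abstract (punctuation marks such as token spaces, entity types, and the number of entities) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Token` class, the `surface_features` function, the feature names, and the English placeholder tokens are hypothetical and are not taken from the paper, whose exact feature definitions are not given in this record.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Token:
    text: str
    kind: str  # hypothetical tags: "ACTION", "PERSON", "LOCATION", "SPACE", "OTHER"


# Token kinds treated as named entities in this sketch (an assumption).
NE_KINDS = {"PERSON", "LOCATION"}


def surface_features(tokens: List[Token], i: int, j: int) -> Dict[str, Union[str, int]]:
    """Build surface features for the candidate pair (tokens[i], tokens[j])."""
    lo, hi = sorted((i, j))
    between = tokens[lo + 1:hi]
    return {
        # Entity-type features: the kinds of the two paired items, in text order.
        "first_type": tokens[lo].kind,
        "second_type": tokens[hi].kind,
        # Punctuation feature: number of token spaces between the pair.
        "spaces_between": sum(t.kind == "SPACE" for t in between),
        # Entity-count feature: number of other named entities between the pair.
        "entities_between": sum(t.kind in NE_KINDS for t in between),
    }


# English placeholders standing in for an already segmented and tagged Thai sentence.
tokens = [
    Token("police", "PERSON"),
    Token(" ", "SPACE"),
    Token("arrest", "ACTION"),
    Token("suspect", "PERSON"),
    Token(" ", "SPACE"),
    Token("at", "OTHER"),
    Token("Chiang Mai", "LOCATION"),
]

# Candidate action-location pair: "arrest" (index 2) and "Chiang Mai" (index 6).
features = surface_features(tokens, 2, 6)
print(features)
```

A feature dictionary like this would be computed for each candidate pair and passed to a classifier that decides whether the pair is related. The abstract states that five learning schemes were compared but does not name them here, so the sketch stops at feature extraction.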
Authors
- Theeramunkong Thanaruk
  School of Information and Computer Technology, SIIT, Thammasat University
- TONGTEP Nattapong
  School of Information, Computer and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University
Related papers
- News Relation Discovery Based on Association Rule Mining with Combining Factors
- Extracting Chemical Reactions from Thai Text for Semantics-Based Information Retrieval
- Extracting Semantic Frames from Thai Medical-Symptom Unstructured Text with Unknown Target-Phrase Boundaries
- Special Section on Knowledge, Information and Creativity Support System
- Quality Evaluation for Document Relation Discovery Using Citation Information(Data Mining)
- Discovery of Predicate-Oriented Relations among Named Entities Extracted from Thai Texts