Extracting Semantic Frames from Thai Medical-Symptom Unstructured Text with Unknown Target-Phrase Boundaries
Abstract
Due to the limitations of language-processing tools for the Thai language, pattern-based information extraction from Thai documents requires supplementary techniques. Based on sliding-window rule application and extraction filtering, we present a framework for extracting semantic information from medical-symptom phrases with unknown boundaries in Thai unstructured-text information entries. A supervised rule-learning algorithm is employed to construct information extraction rules automatically from hand-tagged training symptom phrases. Two filtering components are introduced: one uses a classification model to predict rule application across a symptom-phrase boundary, based on instantiation features of rule-internal wildcards; the other uses weighted classification confidence to resolve conflicts arising from overlapping extractions. In our experimental study, we focus on two basic types of symptom phrasal descriptions: one concerns abnormal characteristics of observable entities, and the other concerns human-body locations at which primitive symptoms appear. The experimental results show that the filtering components improve precision while preserving recall satisfactorily.
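The pipeline described in the abstract (sliding-window rule application, a filter on each rule application, and confidence-based resolution of overlapping extractions) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `Extraction`, `apply_rules`, `resolve_overlaps`, the fixed window size, the toy English tokens, the hand-written rules, and the constant scores. A plain score threshold stands in for the paper's classification model, and a greedy highest-score-first selection stands in for its weighted-confidence conflict resolution.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Extraction:
    start: int         # index of the first token covered
    end: int           # index one past the last token covered
    frame: Dict[str, str]
    confidence: float

# Hypothetical rule type: takes a window of tokens, returns a frame or None.
Rule = Callable[[List[str]], Optional[Dict[str, str]]]

def apply_rules(tokens, rules, window, score, min_confidence=0.5):
    """Slide a fixed-size window over the tokens and try every rule on it."""
    candidates = []
    for start in range(max(len(tokens) - window + 1, 1)):
        span = tokens[start:start + window]
        for rule in rules:
            frame = rule(span)
            if frame is None:
                continue
            conf = score(frame, span)
            # Stand-in for the first filter: drop low-scoring applications.
            if conf >= min_confidence:
                candidates.append(Extraction(start, start + len(span), frame, conf))
    return candidates

def resolve_overlaps(candidates):
    """Stand-in for the second filter: greedily keep the highest-scoring
    extractions whose token ranges do not overlap."""
    kept = []
    for cand in sorted(candidates, key=lambda c: c.confidence, reverse=True):
        if all(cand.end <= k.start or cand.start >= k.end for k in kept):
            kept.append(cand)
    return sorted(kept, key=lambda c: c.start)

# Toy rules for the two phrase types named in the abstract (hypothetical).
def abnormal_rule(span):
    if len(span) == 3 and span[1] == "rash":
        return {"type": "abnormal", "entity": span[1], "attribute": span[0]}
    return None

def location_rule(span):
    if len(span) == 3 and span[0] == "on":
        return {"type": "location", "place": " ".join(span[1:])}
    return None

def toy_score(frame, span):
    # Constant scores chosen only to make the conflict visible.
    return 0.9 if frame["type"] == "abnormal" else 0.6

tokens = ["patient", "has", "red", "rash", "on", "left", "arm", "today"]
candidates = apply_rules(tokens, [abnormal_rule, location_rule], 3, toy_score)
result = resolve_overlaps(candidates)
```

In this toy input the two candidate extractions share the token "on", so only the higher-scoring one survives. A real system would learn the rules from tagged phrases and would score candidates with a trained classifier rather than constants.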
- Paper of the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2011-03-01
Authors
- Theeramunkong Thanaruk
  Thammasat Univ. Tha
- Theeramunkong Thanaruk
  School Of Information And Computer Technology Siit Thammasat University
- Nantajeewarawat Ekawit
  Computer Science Program Sirindhorn International Institute Of Technology Thammasat University
- INTARAPAIBOON Peerasak
  School of Information and Computer Technology Sirindhorn International Institute of Technology, Tham
- NANTAJEEWARAWAT Ekawit
  School of Information and Computer Technology Sirindhorn International Institute of Technology, Tham
- Intarapaiboon Peerasak
  School Of Information And Computer Technology Sirindhorn International Institute Of Technology Thamm
Related papers
- An EM-Based Approach for Mining Word Senses from Corpora(Natural Language Processing)
- OWL/XDD Application Profiles(Knowledge, Information and Creativity Support System)
- Construction of Thai Lexicon from Existing Dictionaries and Texts on the Web(Natural Language Processing)
- News Relation Discovery Based on Association Rule Mining with Combining Factors
- Extracting Chemical Reactions from Thai Text for Semantics-Based Information Retrieval
- Extracting Semantic Frames from Thai Medical-Symptom Unstructured Text with Unknown Target-Phrase Boundaries
- Special Section on Knowledge, Information and Creativity Support System
- Quality Evaluation for Document Relation Discovery Using Citation Information(Data Mining)
- Discovery of Predicate-Oriented Relations among Named Entities Extracted from Thai Texts