Quality Evaluation for Document Relation Discovery Using Citation Information (Data Mining)
Abstract
Assessment of discovered patterns is an important issue in the field of knowledge discovery. This paper presents an evaluation method that utilizes citation (reference) information to assess the quality of discovered document relations. With the concept of transitivity as direct/indirect citations, a series of evaluation criteria is introduced to define the validity of discovered relations. Two kinds of validity, called soft validity and hard validity, are proposed to express the quality of the discovered relations. For the purpose of impartial comparison, the expected validity is statistically estimated based on the generative probability of each relation pattern. The proposed evaluation is investigated using more than 10,000 documents obtained from a research publication database. With frequent itemset mining as a process to discover document relations, the proposed method was shown to be a powerful way to evaluate the relations in four aspects: soft/hard scoring, direct/indirect citation, relative quality over the expected value, and comparison to human judgment.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2007-08-01
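The abstract names soft and hard validity over direct/indirect citations but does not define them here. The following is only a minimal sketch under assumed readings, not the authors' formulation: citation links are treated as undirected, "order" means the maximum length of a citation chain (1 = direct citation), a document set is taken to be hard-valid when every pair is linked within that order, and soft-valid when at least one document is linked to all the others. All function names are hypothetical, and the expected-validity estimate mentioned in the abstract is not modelled.

```python
from collections import deque
from itertools import combinations


def build_links(citations):
    """Turn (citing, cited) pairs into an undirected adjacency map (assumption)."""
    links = {}
    for citing, cited in citations:
        links.setdefault(citing, set()).add(cited)
        links.setdefault(cited, set()).add(citing)
    return links


def related(links, a, b, max_order):
    """True if a and b are joined by a chain of at most max_order citations."""
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        doc, dist = queue.popleft()
        if doc == b:
            return True
        if dist == max_order:
            continue  # do not follow chains longer than the allowed order
        for nxt in links.get(doc, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return False


def hard_valid(links, docset, max_order):
    """Assumed reading: every pair of documents in the set is related."""
    return all(related(links, a, b, max_order) for a, b in combinations(docset, 2))


def soft_valid(links, docset, max_order):
    """Assumed reading: at least one document is related to all the others."""
    docs = list(docset)
    return any(
        all(related(links, hub, other, max_order) for other in docs if other != hub)
        for hub in docs
    )


def validity_ratio(links, docsets, max_order, check):
    """Fraction of discovered document sets that pass the given validity check."""
    if not docsets:
        return 0.0
    return sum(1 for d in docsets if check(links, d, max_order)) / len(docsets)


# Toy citation chain A -> B -> C -> D (illustrative data, not from the paper).
links = build_links([("A", "B"), ("B", "C"), ("C", "D")])
discovered = [{"A", "B", "C"}, {"A", "B", "D"}]
soft_ratio = validity_ratio(links, discovered, 1, soft_valid)
hard_ratio = validity_ratio(links, discovered, 1, hard_valid)
```

With direct citations only (order 1), the set {A, B, C} passes the soft check through B but fails the hard check because A and C are two citations apart; raising the order to 2 makes it pass both under these assumed definitions.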
Authors
-
Theeramunkong Thanaruk
School Of Information And Computer Technology Sirindhorn International Institute Of Technology Thamm
-
Theeramunkong Thanaruk
School Of Information And Computer Technology Siit Thammasat University
-
SRIPHAEW Kritsada
School of Information and Computer Technology, Sirindhorn International Institute of Technology, Tha
-
Sriphaew Kritsada
School Of Information And Computer Technology Sirindhorn International Institute Of Technology Thamm
Related papers
- News Relation Discovery Based on Association Rule Mining with Combining Factors
- Extracting Chemical Reactions from Thai Text for Semantics-Based Information Retrieval
- Extracting Semantic Frames from Thai Medical-Symptom Unstructured Text with Unknown Target-Phrase Boundaries
- Special Section on Knowledge, Information and Creativity Support System
- Quality Evaluation for Document Relation Discovery Using Citation Information (Data Mining)
- Discovery of Predicate-Oriented Relations among Named Entities Extracted from Thai Texts