Controlling the Penalty on Late Arrival of Relevant Documents in Information Retrieval Evaluation with Graded Relevance
Abstract
Large-scale information retrieval evaluation efforts such as TREC and NTCIR have always used binary-relevance evaluation metrics, even when graded relevance data were available. However, the NTCIR-6 crosslingual task has finally announced that it will use graded-relevance metrics, though only as additional metrics. This paper compares graded-relevance metrics in terms of the ability to control the balance between retrieving highly relevant documents and retrieving any relevant documents early in the ranked list. We argue and demonstrate that Q-measure is more flexible than normalised Discounted Cumulative Gain and generalised Average Precision. We then suggest a brief guideline for conducting a reliable information retrieval evaluation with graded relevance.
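To make the comparison concrete, the following sketch (not code from the paper) computes nDCG and Q-measure for a toy ranked list with graded relevance. The gain values, the log base b of the nDCG discount, and the beta parameter of Q-measure are assumptions based on commonly used formulations; a parameter of this kind is what lets the evaluator control the balance described above, since a large beta emphasises retrieving highly relevant documents, while beta approaching zero reduces Q-measure to binary Average Precision, which only rewards retrieving any relevant document early.

```python
# A minimal sketch (not code from the paper) of two graded-relevance metrics
# named in the abstract: nDCG and Q-measure. Gain values, the log base b of
# the nDCG discount, and the beta parameter of Q-measure are illustrative
# assumptions based on commonly used formulations.
import math


def ndcg(gains, ideal_gains, b=2.0):
    """nDCG: cumulative gain discounted by log_b(rank) beyond rank b,
    normalised by the same quantity for the ideal ranked list."""
    def dcg(gs):
        return sum(g / max(1.0, math.log(r, b))
                   for r, g in enumerate(gs, start=1))
    return dcg(gains) / dcg(ideal_gains)


def q_measure(gains, ideal_gains, num_relevant, beta=1.0):
    """Q-measure: mean of a blended ratio over ranks that hold relevant
    documents. Larger beta weights cumulative (graded) gain more heavily;
    with beta = 0 the ratio becomes count/rank, i.e. binary precision,
    and the metric collapses to ordinary Average Precision."""
    cg = cg_ideal = 0.0   # cumulative gain of the run / of the ideal list
    count = 0             # number of relevant documents seen so far
    total = 0.0
    for r, g in enumerate(gains, start=1):
        cg += g
        cg_ideal += ideal_gains[r - 1] if r <= len(ideal_gains) else 0.0
        if g > 0:
            count += 1
            total += (beta * cg + count) / (beta * cg_ideal + r)
    return total / num_relevant


# Toy run with gains 3/2/1 for highly/fairly/partially relevant documents.
ranked = [0, 3, 0, 1, 2]              # gains in the system's ranked order
ideal = sorted(ranked, reverse=True)  # ideal ordering of the same pool
print("nDCG     :", ndcg(ranked, ideal))
print("Q-measure:", q_measure(ranked, ideal, num_relevant=3))
```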
- A publication of the Information Processing Society of Japan (IPSJ)
- 2006-09-12
Authors
Related papers
- High-Precision Search via Question Abstraction for Japanese Question Answering
- A Note on the Reliability of Japanese Question Answering Evaluation
- A Further Note on Evaluation Metrics for the Task of Finding One Highly Relevant Document (Information Retrieval and Classification; theme: "Utilisation of Digital Archives (Applications)" and general topics)