<Paper> An Attempt to Classify Books into NDC Categories
Abstract
In information retrieval, texts are usually retrieved by matching them with queries. In this study, an approach is proposed in which texts are automatically classified into categories and retrieved by matching them with queries classified in the same way. For efficient information retrieval using automatic classification, methods for extracting words from texts and methods for matching are essential. Several methods for extracting words from Japanese texts have been proposed in natural language processing. However, it is difficult to extract significant words from Japanese texts because they are written without spaces separating words. As for matching methods, many weighting methods have been proposed, as well as vector space models and probabilistic models.

This article reports the results of an experiment in classifying Japanese texts into Nippon Decimal Classification (NDC) categories based on the title information in Japanese MARC records. In this experiment, three extraction methods (juman, MHSA, and n-gram) were tested on a set of 1,000 books. Four weighting methods, including relative term frequency between categories, tf·idf, and tf(max)·idf, were tested. The results indicate that the extraction method using juman performed best and that the best weighting method was the relative term frequency between categories, which selected the correct classification category (the upper three digits of the NDC code) for about 55.99060 of 1,000 books.
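As a rough illustration of the pipeline the abstract describes, the sketch below pairs character n-gram extraction with a tf·idf weighting to pick a three-digit category for a title. It is not the authors' implementation. The toy titles, the function names (`char_ngrams`, `train`, `classify`), and the choice to treat each category as a single document when computing idf are all assumptions made for illustration. The juman and MHSA extractors and the relative term frequency weighting are not shown, since the abstract does not define them.

```python
import math
from collections import Counter

# Hypothetical toy training data: three-digit category code -> book titles.
TOY_TITLES = {
    "010": ["図書館情報学入門", "図書館の歴史"],
    "410": ["線形代数入門", "微分積分学"],
}

def char_ngrams(text, n=2):
    """Return overlapping character n-grams, so no word segmentation is needed."""
    text = "".join(text.split())  # drop any whitespace
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def train(titles_by_category, n=2):
    """Compute a tf-idf weight for each n-gram in each category.

    Assumption: each category is treated as one document, so idf is
    log(number of categories / number of categories containing the n-gram).
    """
    tf = {
        cat: Counter(g for title in titles for g in char_ngrams(title, n))
        for cat, titles in titles_by_category.items()
    }
    df = Counter(g for counts in tf.values() for g in counts)
    num_cats = len(tf)
    weights = {}
    for cat, counts in tf.items():
        total = sum(counts.values())
        weights[cat] = {
            g: (c / total) * math.log(num_cats / df[g])
            for g, c in counts.items()
        } if total else {}
    return weights

def classify(title, weights, n=2):
    """Return the category whose summed n-gram weights for the title are highest."""
    grams = char_ngrams(title, n)
    scores = {
        cat: sum(w.get(g, 0.0) for g in grams)
        for cat, w in weights.items()
    }
    return max(scores, key=scores.get)

if __name__ == "__main__":
    weights = train(TOY_TITLES)
    print(classify("図書館学概論", weights))
```

With only two categories, an n-gram that appears in both (such as 入門 here) receives an idf of zero and so contributes nothing to either score. A morphological analyzer such as juman would replace `char_ngrams` with a word-level extractor, leaving the weighting and scoring steps unchanged.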