Topic Trend Detection and Mining in World Wide Web (Learning & Discovery) (<Special Issue> Doctorial Theses on Artificial Intelligence)
Abstract
Technology capable of capturing and analyzing changes on the Web is vital for providing needed information in time for one to stay competitive in this fast-changing information age. This dissertation presents an approach toward the automatic journalism of new information (changes) on the Web. These information changes can be classified into two types: "flow" and "stock". "Flow" type information (i.e., news) arrives on the Web constantly and regularly, at a rather fast pace. "Stock" type information, mainly static web pages, changes unpredictably: one does not know when or in what form. Our system aims to advance the technology by using a new TF * PDF (Term Frequency * Proportional Document Frequency) algorithm to detect the prominent topics in the changes. Within the framework and problem domain addressed, this algorithm is superior to the conventional TF * IDF algorithm in that it does not need a retrospective corpus, and it poses minimal risk of losing track of popular topics during detection and tracking. Our system also requires less computation while offering more flexibility. It crawls the Web, collects the changes, and journalizes a summary of popular topics for the user. It does more than conventional web tracking systems, which merely report the URLs of changed pages. It can become our personalized e-journalist on the Web, periodically providing us with a collection and e-publication of currently popular events.
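The abstract names the TF * PDF algorithm but does not give its formula. As a rough illustration only, the sketch below assumes the formulation usually cited for TF * PDF: the weight of a term is summed over sources ("channels"), and in each channel the normalized term frequency is multiplied by the exponential of the proportion of that channel's documents containing the term. The function name `tf_pdf`, the channel dictionary layout, and the sample data are hypothetical and not taken from the dissertation.

```python
import math
from collections import Counter


def tf_pdf(channels):
    """Sketch of TF * PDF term weighting (assumed formulation, not the authors' code).

    channels: dict mapping a channel name to a list of documents,
              where each document is a list of already-tokenized terms.
    Returns a dict mapping each term to its weight summed over channels.
    """
    weights = Counter()
    for docs in channels.values():
        if not docs:
            continue  # an empty channel contributes nothing
        term_freq = Counter()  # occurrences of each term in this channel
        doc_freq = Counter()   # number of documents containing each term
        for doc in docs:
            term_freq.update(doc)
            doc_freq.update(set(doc))
        # Normalize term frequencies so that large channels do not dominate.
        norm = math.sqrt(sum(f * f for f in term_freq.values()))
        if norm == 0:
            continue
        n_docs = len(docs)
        for term, freq in term_freq.items():
            # The proportional document frequency enters through exp(),
            # so terms covered by many documents in a channel are boosted.
            weights[term] += (freq / norm) * math.exp(doc_freq[term] / n_docs)
    return dict(weights)


# Hypothetical sample data: two news channels with tokenized documents.
sample_channels = {
    "channel_a": [["quake", "tokyo"], ["quake", "relief"]],
    "channel_b": [["quake", "election"], ["sports"]],
}
sample_weights = tf_pdf(sample_channels)
top_term = max(sample_weights, key=sample_weights.get)
```

Note that the weights are computed from the current batch of documents alone, which matches the abstract's point that no retrospective corpus is needed, unlike an IDF term estimated from a background collection.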
- Paper published by the Japanese Society for Artificial Intelligence
- 2005-01-01