Improvements of HITS Algorithms for Spam Links
Abstract
The HITS algorithm proposed by Kleinberg is one of the representative methods of scoring Web pages by using hyperlinks. When the algorithm was proposed, most of the pages it scored highly were genuinely related to the given topic, and hence the algorithm could be used to find related pages. However, the algorithm and its variants, including Bharat's improved HITS (abbreviated to BHITS) proposed by Bharat and Henzinger, can no longer be used to find related pages on today's Web because of the increase in spam links. In this paper, we first propose three methods to find "linkfarms," that is, sets of spam links forming a densely connected subgraph of a Web graph. We then present an algorithm, called a trust-score algorithm, that gives high scores to pages that are, with high probability, not spam pages. Combining the three methods and the trust-score algorithm with BHITS, we obtain several variants of the HITS algorithm. We ascertain by experiments that one of them, named TaN+BHITS, which uses the trust-score algorithm and the method of finding linkfarms by employing name servers, is the most suitable for finding related pages on today's Web. Our algorithms take no more time and memory than the original HITS algorithm requires, and can be executed on a PC with a small amount of main memory.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2008-02-01
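For readers unfamiliar with the baseline, the hub/authority scoring that the abstract builds on can be sketched as below. This is a minimal illustration of the generic HITS iteration in its textbook form, not the paper's BHITS, linkfarm detection, or trust-score algorithm, whose details the abstract does not give. The function names (`hits`, `normalize`), the edge-list input format, and the fixed iteration count are all assumptions made for this example.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length (left unchanged if all are zero)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hits(edges, iterations=50):
    """Generic HITS iteration on a directed graph.

    `edges` is a list of (source, target) hyperlink pairs (assumed format).
    Returns (hub, authority) score dictionaries keyed by page.
    """
    pages = sorted({page for edge in edges for page in edge})
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking to the page.
        new_auth = {page: 0.0 for page in pages}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub score: sum of the authority scores of the pages it links to.
        new_hub = {page: 0.0 for page in pages}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


# Hypothetical toy graph: pages "a" and "b" both link to "c"; "a" also links to "d".
example_edges = [("a", "c"), ("b", "c"), ("a", "d")]
hub_scores, auth_scores = hits(example_edges)
```

In this toy graph, "c" receives the highest authority score because two pages link to it, and "a" receives the highest hub score because it links to both authorities. A densely interlinked set of spam pages can inflate these scores in the same way, which is the weakness the abstract describes.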
Authors
- NISHIZEKI Takao, Graduate School of Information Sciences, Tohoku University
- ASANO Yasuhito, Department of Information Sciences, Faculty of Science and Engineering, Tokyo Denki University
- TEZUKA Yu, Mobile Entertainment Category, Victor Company of Japan, Limited
Related papers
- No-Bend Orthogonal Drawings of Subdivisions of Planar Triconnected Cubic Graphs (Foundations of Computer Science)
- Improvements of HITS Algorithms for Spam Links
- Graph Coloring Algorithms (Special Issue on Algorithm Engineering: Surveys)