Finding Web Communities by Maximum Flow Algorithm Using Well-Assigned Edge Capacities (Special Section: Information Processing Technology for Web Utilization)
Abstract
A web community is a set of web pages that provide resources on a specific topic. Various link-analysis methods for finding web communities have been proposed in the literature. The method proposed in this paper builds on the maximum-flow-based method of [7], [8]. Our objective in using the maximum flow algorithm is to extract a subgraph that can be recognized as a good web community in terms of both quantity and quality. This paper first discusses the features of the maximum-flow-based method. The previously proposed approach has the problem that a certain graph structure containing noise (i.e., irrelevant pages) is always extracted. This problem is mainly caused by assigning a constant value to every edge capacity. This paper proposes assigning variable edge capacities based on the hub and authority scores obtained from the HITS calculation. To examine the effect of the proposed method, we performed experiments using a Japanese web archive crawled in February 2002. The experimental results demonstrate that the proposed method removes the noise pages caused by constant edge capacities and improves the quality of the extracted web communities.
- Paper published by the Institute of Electronics, Information and Communication Engineers (IEICE)
- 2004-02-01
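The idea summarized in the abstract above (HITS scores used to set edge capacities, followed by a maximum-flow cut around seed pages) can be illustrated with a minimal sketch. The abstract does not give the capacity formula, so everything specific here is an assumption made for illustration: the formula `scale * (hub[u] + auth[v])`, the `scale` parameter, adding capacity in both directions of each link, the unit capacity toward the virtual sink, and the names `hits`, `min_cut_source_side`, `find_community`, `SOURCE`, and `SINK`. It should not be read as the authors' implementation.

```python
import math
from collections import defaultdict, deque

SOURCE, SINK = "__source__", "__sink__"  # hypothetical virtual nodes


def _normalize(scores):
    norm = math.sqrt(sum(x * x for x in scores.values())) or 1.0
    return {n: x / norm for n, x in scores.items()}


def hits(edges, iterations=50):
    """Plain power-iteration HITS; returns (hub, authority) dicts."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        auth = {n: 0.0 for n in nodes}
        for u, v in edges:
            auth[v] += hub[u]      # good authorities are linked by good hubs
        auth = _normalize(auth)
        hub = {n: 0.0 for n in nodes}
        for u, v in edges:
            hub[u] += auth[v]      # good hubs link to good authorities
        hub = _normalize(hub)
    return hub, auth


def min_cut_source_side(capacity, source, sink, eps=1e-12):
    """Edmonds-Karp max flow; returns the nodes on the source side of a min cut."""
    residual = defaultdict(lambda: defaultdict(float))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
            residual[v][u] += 0.0  # make sure the reverse edge exists
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > eps and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return set(parent)     # everything still reachable from the source
        bottleneck, v = math.inf, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u


def find_community(edges, seeds, scale=4.0):
    """Extract the pages on the seed side of the cut (seeds included)."""
    hub, auth = hits(edges)
    seeds = set(seeds)
    nodes = {n for edge in edges for n in edge} | seeds
    capacity = defaultdict(dict)
    for u, v in edges:
        # ASSUMPTION: illustrative variable capacity, not the paper's formula.
        c = scale * (hub[u] + auth[v])
        capacity[u][v] = capacity[u].get(v, 0.0) + c
        capacity[v][u] = capacity[v].get(u, 0.0) + c
    for s in seeds:
        capacity[SOURCE][s] = math.inf   # seeds can never be cut away
    for n in nodes - seeds:
        capacity[n][SINK] = 1.0          # assumed unit capacity to the sink
    return min_cut_source_side(capacity, SOURCE, SINK) - {SOURCE}
```

Calling `find_community(edges, seeds)` with a list of `(from_page, to_page)` links and a set of seed pages returns the seed-side pages. Replacing the assumed formula with a constant capacity makes every link equally hard to cut, which is the setting the abstract identifies as the source of noise pages.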
Authors
- Kitsuregawa Masaru
  Institute of Industrial Science, The University of Tokyo
- Imafuji Noriko
  Institute of Industrial Science, The University of Tokyo
Related papers
- Display Wall Empowered Visual Mining for CEOP Data Archive(Coordinated Enhanced Observing Period(CEOP))
- Data Analysis System Attached to the CEOP Centralized Data Archive System(Coordinated Enhanced Observing Period(CEOP))
- QUASUR : Web-based Quality Assurance System for CEOP Reference Data(Coordinated Enhanced Observing Period(CEOP))
- Initial CEOP-based Review of the Prediction Skill of Operational General Circulation Models and Land Surface Models(Coordinated Enhanced Observing Period(CEOP))
- Overview of the Super Database Computer (SDC-I) (Special Issue on Super Chip for Intelligent Integrated Systems)
- Mining Communities on the Web Using a Max-Flow and a Site-Oriented Framework(Data Mining)
- Finding Neighbor Communities in the Web Using an Inter-Site Graph(Database)
- Speculative Transaction Processing Approach for Database Systems
- An Economic Dynamic Replication Model for Mobile-P2P networks (Summer Database Workshop DBWS 2006)
- An Economic Dynamic Replication Model for Mobile-P2P networks
- Performance Evaluation of Flash SSDs in a Transaction Processing System
- Rank Optimization of Personalized Search
- High Performance Parallel Query Processing on a 100 Node ATM Connected PC Cluster (Special Issue on New Generation Database Technologies)
- Web Community Chart : A Tool for Navigating the Web and Observing Its Evolution
- Detecting Hijacked Sites by Web Spammer Using Link-Based Algorithms
- A Study of Link Farm Evolution Using a Time-series of Web Snapshots
- Efficient Analyzing General Dominant Relationship Based on Partial Order Models
- Examination of Criterion for Choosing a Run Time Method in GN Hash Join Algorithm
- Finding Web Communities by Maximum Flow Algorithm Using Well-Assigned Edge Capacities(Information Processing Technology for Web Utilization)
- D-3 An Link-Contents Coupled Clustering for Web Search Results
- Speculative Transaction Processing in Distributed Database Systems
- Foreword to the Special Issue on Japanese Microprocessors
- Virtual Striping: A Storage Management Scheme with Dynamic Striping (Special Issue on Architectures, Algorithms and Networks for Massively parallel Computing)
- A Study on Characteristics of Topic-Specific Information Cascade in Twitter (Data Engineering)
- A Study on Efficient Searching Top-k Semantic Similar Sentences (Data Engineering)
- Efficient Classification with Conjunctive Features
- A Study on Characteristics of Topic-Specific Information Cascade in Twitter
- A Study on Efficient Searching Top-k Semantic Similar Sentences
- A Study on Graph Similarity Search
- Semi-supervised Sentiment Classification in Resource-Scarce Language : A Comparative Study
- Exploration on Efficient Similar Sentences Extraction
- A Study on Similar Words Searching (Data Engineering)
- Collective Sentiment Classification Based on User Leniency and Product Popularity
- A Study on Similar Words Searching