Load Balancing Scheme on the Basis of Huffman Coding for P2P Information Retrieval
スポンサーリンク
概要
- 論文の詳細を見る
Although a distributed index on a distributed hash table (DHT) enables efficient document query processing in Peer-to-Peer information retrieval (P2P IR), the index costs a lot to construct and it tends to be an unfair management because of the unbalanced term frequency distribution. We devised a new distributed index, named Huffman-DHT, for P2P IR. The new index uses an algorithm similar to Huffman coding with a modification to the DHT structure based on the term distribution. In a Huffman-DHT, a frequent term is assigned to a short ID and allocated a large space in the node ID space in DHT. Throuth ID management, the Huffman-DHT balances the index registration accesses among peers and reduces load concentrations. Huffman-DHT is the first approach to adapt concepts of coding theory and term frequency distribution to load balancing. We evaluated this approach in experiments using a document collection and assessed its load balancing capabilities in P2P IR. The experimental results indicated that it is most effective when the P2P system consists of about 30, 000 nodes and contains many documents. Moreover, we proved that we can construct a Huffman-DHT easily by estimating the probability distribution of the term occurrence from a small number of sample documents.
- (社)電子情報通信学会の論文
- 2009-10-01
著者
-
TAKASU Atsuhiro
National Institute of Informatics
-
Takasu Atsuhiro
National Center For Science Information Systems
-
Adachi Jun
National Institute Of Informatics
-
KURASAWA Hisashi
Graduate School of Information Science and Technology, The University of Tokyo
-
Kurasawa Hisashi
Graduate School Of Information Science And Technology The University Of Tokyo
-
Adachi Jun
National Center For Science And Information Systems
関連論文
- Probabilistic Automaton-Based Fuzzy English-Text Retrieval(Software Systems)
- Load Balancing Scheme on the Basis of Huffman Coding for P2P Information Retrieval
- Special Section on Information Processing Technology for Web Utilization
- A Minimum Path Decomposition of the Hasse Diagram for Testing the Consistency of Functional Dependencies
- Margin-Based Pivot Selection for Similarity Search Indexes
- The 2nd International Conference on Japanese Information in Science, Technology, and Commerce
- Optimal Pivot Selection Method Based on the Partition and the Pruning Effect for Metric Space Indexes
- Comparison of Electrochemical Impedance Spectroscopy between Illumination and Dark Conditions