Training Multiple Support Vector Machines for Personalized Web Content Filters
Abstract
The abundance of information published on the Internet makes filtering hazardous Web pages a difficult yet important task. Supervised learning methods such as Support Vector Machines (SVMs) can be used to identify hazardous Web content. However, scalability is a major challenge, especially when multiple classifiers must be trained because different policies exist on what kind of information is hazardous. We therefore propose two strategies for training multiple SVMs for personalized Web content filters. The first strategy identifies common data clusters and then performs optimization on these clusters to obtain good initial solutions for the individual problems. This initialization shortens the path to the optimal solutions and reduces the training time on the individual training sets. The second strategy trains all SVMs simultaneously. We introduce an SMO-based kernel-biased heuristic that balances the reduction rate of the individual objective functions against the computational cost of the kernel matrix. The heuristic relies primarily on the optimality conditions of all optimization problems and secondarily on the pre-calculated part of the whole kernel matrix. This strategy increases the amount of information shared among learning tasks and thus reduces the number of kernel calculations and the training time. In our experiments on inconsistently labeled training examples, both strategies predicted hazardous Web pages accurately (> 91%) with training times of only 26% and 18%, respectively, of that of normal sequential training.
- A paper of The Institute of Electronics, Information and Communication Engineers
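The information sharing behind the second strategy can be illustrated with a small sketch. It is not the SMO-based kernel-biased heuristic from the paper. It only shows the underlying observation that kernel values depend on the examples and not on the labels, so several filters trained on the same pages under different labeling policies can reuse one kernel cache. The names `SharedKernelCache`, `train_dual`, and `decision`, the RBF kernel, the toy data, and the bias-free dual coordinate ascent solver are all assumptions made for illustration.

```python
import math


class SharedKernelCache:
    """Caches RBF kernel values by example index so all tasks can reuse them."""

    def __init__(self, points, gamma=0.5):
        self.points = points
        self.gamma = gamma
        self.cache = {}
        self.evaluations = 0  # number of kernel values actually computed

    def __call__(self, i, j):
        key = (i, j) if i <= j else (j, i)  # the kernel is symmetric
        if key not in self.cache:
            self.evaluations += 1
            sq = sum((a - b) ** 2
                     for a, b in zip(self.points[i], self.points[j]))
            self.cache[key] = math.exp(-self.gamma * sq)
        return self.cache[key]


def train_dual(kernel, labels, C=1.0, epochs=50):
    """Bias-free SVM dual solved by coordinate ascent (a simplification, not SMO)."""
    n = len(labels)
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            margin = labels[i] * sum(alpha[j] * labels[j] * kernel(i, j)
                                     for j in range(n))
            # Exact one-dimensional maximizer, clipped to the box [0, C].
            step = (1.0 - margin) / kernel(i, i)
            alpha[i] = min(C, max(0.0, alpha[i] + step))
    return alpha


def decision(kernel, alpha, labels, i):
    """Decision value of the trained classifier on training example i."""
    return sum(alpha[j] * labels[j] * kernel(i, j) for j in range(len(labels)))


# Toy data: two well-separated groups of "pages" in a 2-D feature space.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (3.0, 3.0), (3.1, 2.8), (2.9, 3.2)]
# Two hypothetical filtering policies that disagree on the third page.
policy_a = [1, 1, 1, -1, -1, -1]
policy_b = [1, 1, -1, -1, -1, -1]

kernel = SharedKernelCache(points)
alphas_a = train_dual(kernel, policy_a)
evals_after_first = kernel.evaluations
alphas_b = train_dual(kernel, policy_b)  # reuses every cached kernel value

accuracy_a = sum(
    1 for i, y in enumerate(policy_a)
    if (decision(kernel, alphas_a, policy_a, i) > 0) == (y > 0)
) / len(policy_a)

print("kernel evaluations after first task:", evals_after_first)
print("kernel evaluations after both tasks:", kernel.evaluations)
print("training accuracy under policy A:", accuracy_a)
```

In this sketch the second classifier triggers no new kernel computations because every pair of examples was already evaluated for the first one. The heuristic described in the abstract goes further by choosing which variables to optimize based on both the optimality conditions and the part of the kernel matrix that has already been computed.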
Authors
-
Hattori Gen
KDDI R&D Laboratories
-
Matsumoto Kazunori
KDDI R&D Inc.
-
Ono Chihiro
KDDI R&D Laboratories
-
NGUYEN Dung
Institute of Information Technology, Vietnam Academy of Science and Technology
-
ERDMANN Maike
KDDI R&D Laboratories
-
TAKEYOSHI Tomoya
KDDI R&D Laboratories
Related papers
- Lightweight FIPA Compliant Agent Platform on Java-enabled Mobile Phone for Ubiquitous Services (Special Issue: Broadband Network Services)
- AN EFFICIENT WINNER DETERMINATION ALGORITHM FOR COMBINATORIAL ASCENDING AUCTIONS
- Context-Aware Users' Preference Models by Integrating Real and Supposed Situation Data
- A New Diagnostic Method Using Probabilistic Temporal Fault Models (Special Issue on the 2000 IEICE Excellent Paper Award)
- Evaluation of Hybrid Message-Delivering Scheme for Massive Mobile Agents' Intercommunication (Autonomous Decentralized Systems)
- Training Multiple Support Vector Machines for Personalized Web Content Filters