Example-Based Outlier Detection for High Dimensional Datasets
Abstract
Detecting outliers is an important problem in applications such as fraud detection, financial analysis, and health monitoring. Such applications typically involve high dimensional datasets. Many recent approaches detect outliers according to some reasonable, pre-defined concept of an outlier (e.g., distance-based or density-based). Most of these concepts are proximity-based: they define an outlier by its relationship to the rest of the data. In high dimensional space, however, the data becomes sparse, which implies that every object can be regarded as an outlier from the point of view of similarity. Furthermore, a fundamental issue is that the notion of which objects are outliers typically varies between users, problem domains, or even datasets. In this paper, we present a novel solution to this problem: detecting outliers in high dimensional datasets based on user-provided examples. By studying the behavior of projections of a few such outlier examples in the dataset, the proposed method discovers the hidden view of outliers and picks out further objects that are outstanding in the projection where the examples stand out greatly. Our experiments on both real and synthetic datasets demonstrate the ability of the proposed method to detect outliers that match users' intentions.
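The idea of searching for a projection in which the user's examples stand out, and then flagging other objects that stand out there too, can be illustrated with a small sketch. This is not the paper's algorithm: it assumes axis-parallel projections searched exhaustively, a z-scored mean k-nearest-neighbour distance as the outlier score, and a simple threshold rule (flag anything scoring at least as high as the weakest example). All function names and parameters (`detect_by_examples`, `proj_size`, `k`) are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def knn_distance(points, i, k):
    """Mean Euclidean distance from points[i] to its k nearest neighbours."""
    dists = sorted(
        sum((a - b) ** 2 for a, b in zip(points[i], q)) ** 0.5
        for j, q in enumerate(points) if j != i
    )
    return mean(dists[:k])


def project(data, dims):
    """Keep only the chosen attribute indices (an axis-parallel projection)."""
    return [tuple(row[d] for d in dims) for row in data]


def outlier_scores(data, dims, k):
    """Z-scored k-NN distance of every object inside one projection."""
    pts = project(data, dims)
    raw = [knn_distance(pts, i, k) for i in range(len(pts))]
    mu, sigma = mean(raw), pstdev(raw)
    return [(r - mu) / sigma if sigma > 0 else 0.0 for r in raw]


def detect_by_examples(data, example_idx, proj_size=2, k=3):
    """Find the projection where the examples score highest, then flag look-alikes."""
    n_dims = len(data[0])
    best_dims, best_scores, best_avg = None, None, float("-inf")
    for dims in combinations(range(n_dims), proj_size):
        scores = outlier_scores(data, dims, k)
        avg = mean(scores[i] for i in example_idx)
        if avg > best_avg:
            best_dims, best_scores, best_avg = dims, scores, avg
    # Assumed rule: flag objects scoring at least as high as the weakest example.
    threshold = min(best_scores[i] for i in example_idx)
    found = [i for i, s in enumerate(best_scores)
             if s >= threshold and i not in example_idx]
    return best_dims, found
```

Calling `detect_by_examples(data, example_idx=[...])` returns the selected attribute indices and the indices of further candidate outliers. The exhaustive search over attribute subsets grows combinatorially with dimensionality, so this sketch is only practical for small illustrative datasets.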
- Papers from the Information and Media Technologies Editorial Board
Authors
- Faloutsos Christos, School of Computer Science, Carnegie Mellon University
- Kitagawa Hiroyuki, Graduate School of Environmental Studies, Nagoya University
- Zhu Cui, Graduate School of Systems and Information Engineering, University of Tsukuba
Related papers
- Environmental magnetic record and paleosecular variation data for the last 40kyrs from the Lake Biwa sediments, Central Japan
- Atmospheric radiocarbon calibration curve beyond 12.4 cal kyr BP (Proceedings of the 19th Symposium on Chronological Studies at the Nagoya University Center for Chronological Research in 2006, Part 1)
- An algorithm for parallel holistic twig joins on a PC cluster (Database Systems)
- Detecting outliers in high dimensional datasets with examples (Database Systems)
- 2T-6 A Robust Method of Detecting DB-Outliers in High Dimensional Datasets
- 3R-9 Keyword Search Including Metadata in Relational Databases
- Detecting outliers in high dimensional datasets with examples (Data Engineering)
- 3T-2 Continuous Query over Uncertain Data Streams
- Social Bookmarking Induced Active Page Ranking
- MV-OPES : Multivalued-Order Preserving Encryption Scheme : A Novel Scheme for Encrypting Integer Value to Many Different Values
- Cube-Based Analysis for Maintaining XML Data Partition for Holistic Twig Joins
- Example-Based Outlier Detection for High Dimensional Datasets
- Interactive Outlier Detection Adaptive to Users' Intentions (Summer Database Workshop DBWS2004)
- Interactive Outlier Detection Adaptive to Users' Intentions (Summer Database Workshop (DBWS2004))
- A Method of Detecting Outliers Matching User's intentions
- Querying Topic Evolution in Time Series Document Clusters