Finding Frequent Closed Itemsets in Sliding Window in Linear Time
スポンサーリンク
概要
- 論文の詳細を見る
One of the most well-studied problems in data mining is computing the collection of frequent itemsets in large transactional databases. Since the introduction of the famous Apriori algorithm [14], many others have been proposed to find the frequent itemsets. Among such algorithms, the approach of mining closed itemsets has raised much interest in data mining community. The algorithms taking this approach include TITANIC [8], CLOSET+[6], DCI-Closed [4], FCI-Stream [3], GC-Tree [15], TGC-Tree [16] etc. Among these algorithms, FCI-Stream, GC-Tree and TGC-Tree are online algorithms work under sliding window environments. By the performance evaluation in [16], GC-Tree [15] is the fastest one. In this paper, an improved algorithm based on GC-Tree is proposed, the computational complexity of which is proved to be a linear combination of the average transaction size and the average closed itemset size. The algorithm is based on the essential theorem presented in Sect. 4.2. Empirically, the new algorithm is several orders of magnitude faster than the state of art algorithm, GC-Tree.
- (社)電子情報通信学会の論文
- 2008-10-01
著者
-
WANG Xinyu
Computer College of Zhejiang University
-
CHEN Junbo
Computer College of Zhejiang University
-
ZHOU Bo
Computer College of Zhejiang University
-
DING Yiqun
Computer College of Zhejiang University
-
CHEN Lu
State Street Corporation
関連論文
- Inconsistency Resolution Method for RBAC Based Interoperation
- Security Violation Detection for RBAC Based Interoperation in Distributed Environment
- Security Violation Detection for RBAC Based Interoperation in Distributed Environment
- Inconsistency Resolution Method for RBAC Based Interoperation
- Mining Noise-Tolerant Frequent Closed Itemsets in Very Large Database
- Finding Frequent Closed Itemsets in Sliding Window in Linear Time
- Feature Location in Source Code by Trace-Based Impact Analysis and Information Retrieval