Sliding window-based frequent pattern mining over data streams

被引:115
作者
Tanbeer, Syed Khairuzzaman [1 ]
Ahmed, Chowdhury Farhan [1 ]
Jeong, Byeong-Soo [1 ]
Lee, Young-Koo [1 ]
机构
[1] Kyung Hee Univ, Dept Comp Engn, Youngin Si 446701, Gyeonggi Do, South Korea
关键词
Frequent pattern; Data stream; Sliding window; Tree restructuring; ASSOCIATION RULES; EFFICIENT ALGORITHM; ITEMSETS; DISCOVERY;
D O I
10.1016/j.ins.2009.07.012
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Finding frequent patterns in a continuous stream of transactions is critical for many applications such as retail market data analysis, network monitoring, web usage mining, and stock market prediction. Even though numerous frequent pattern mining algorithms have been developed over the past decade, new solutions for handling stream data are still required due to the continuous, unbounded, and ordered sequence of data elements generated at a rapid rate in a data stream. Therefore, extracting frequent patterns from more recent data can enhance the analysis of stream data. In this paper, we propose an efficient technique to discover the complete set of recent frequent patterns from a high-speed data stream over a sliding window. We develop a Compact Pattern Stream tree (CPS-tree) to capture the recent stream data content and efficiently remove the obsolete, old stream data content. We also introduce the concept of dynamic tree restructuring in our CPS-tree to produce a highly compact frequency-descending tree structure at runtime. The complete set of recent frequent patterns is obtained from the CPS-tree of the current window using an FP-growth mining technique. Extensive experimental analyses show that our CPS-tree is highly efficient in terms of memory and time complexity when finding recent frequent patterns from a high-speed data stream. (c) 2009 Elsevier Inc. All rights reserved.
引用
收藏
页码:3843 / 3865
页数:23
相关论文
共 44 条
[1]  
Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
[2]  
Chang J.H., 2003, P 9 ACM SIGKDD INT C, P487, DOI DOI 10.1145/956750.956807
[3]   estWin:: Online data stream mining of recent frequent itemsets by sliding window method [J].
Chang, JH ;
Lee, WS .
JOURNAL OF INFORMATION SCIENCE, 2005, 31 (02) :76-90
[4]   Fuzzy association rules and the extended mining algorithms [J].
Chen, GQ ;
Wei, Q .
INFORMATION SCIENCES, 2002, 147 (1-4) :201-228
[5]   Catch the moment: maintaining closed frequent itemsets over a data stream sliding window [J].
Chi, Yun ;
Wang, Haixun ;
Yu, Philip S. ;
Muntz, Richard R. .
KNOWLEDGE AND INFORMATION SYSTEMS, 2006, 10 (03) :265-294
[6]  
Gaber MM, 2005, SIGMOD REC, V34, P18, DOI 10.1145/1083784.1083789
[7]  
Giannella C, 2004, DATA MINING NEXT GEN
[8]  
Gopalan R. P., 2004, P SIAM INT WORKSH HI
[9]  
HAN J, 2000, P 2000 ACM SIGMOD IN, P1, DOI DOI 10.1145/342009.335372
[10]   Frequent pattern mining: current status and future directions [J].
Han, Jiawei ;
Cheng, Hong ;
Xin, Dong ;
Yan, Xifeng .
DATA MINING AND KNOWLEDGE DISCOVERY, 2007, 15 (01) :55-86