A Novel Strategy for Mining Frequent Closed Itemsets in Data Streams

Cited by: 7
Authors
Tang, Keming [1, 2]
Dai, Caiyan [1 ]
Chen, Ling [3 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Jiangsu, Peoples R China
[2] Yancheng Teachers Univ, Dept Software Engn, Informat Sci & Technol Coll, Comp Sci, Yancheng, Jiangsu, Peoples R China
[3] Yangzhou Univ, Dept Comp Sci, Yangzhou, Jiangsu, Peoples R China
Keywords
Stream data; mining closed frequent data itemsets; sliding window
DOI
10.4304/jcp.7.7.1564-1573
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Mining frequent itemsets from data streams is an important task in stream data mining. This paper presents an algorithm, Stream_FCI, for mining frequent closed itemsets from data streams under the sliding-window model. The algorithm detects the frequent closed itemsets in each sliding window using a DFP-tree with a head table. When processing each new transaction, the algorithm updates the head table and then modifies the DFP-tree according to the items that changed in the head table. The algorithm also keeps the frequent closed itemsets in a separate table, which avoids time-consuming searches of the whole DFP-tree when transactions are added or deleted. Our experimental results show that the algorithm is more efficient and has lower time and memory complexity than the similar algorithms Moment and FPCFI-DS.
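To make the problem statement concrete, the following is a minimal Python sketch of what "frequent closed itemsets in a sliding window" means. It is a brute-force illustration of the definition only and is not Stream_FCI: it uses no DFP-tree, head table, or closed-itemset table, and it recomputes everything for each window. The function names `closed_frequent_itemsets` and `mine_stream`, and the choice of `min_support` as an absolute transaction count, are assumptions made for this example.

```python
from collections import deque
from itertools import combinations


def closed_frequent_itemsets(window, min_support):
    """Return {itemset: support} for the closed frequent itemsets in `window`.

    Brute force (hypothetical helper, not the paper's method): count the
    support of every subset of every transaction, keep those that meet
    min_support, then drop any itemset that has a proper superset with
    the same support.
    """
    support = {}
    for transaction in window:
        items = sorted(set(transaction))  # count each item once per transaction
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                key = frozenset(subset)
                support[key] = support.get(key, 0) + 1

    frequent = {s: c for s, c in support.items() if c >= min_support}

    closed = {}
    for itemset, count in frequent.items():
        # `itemset < other` is the proper-subset test on frozensets.
        has_equal_superset = any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
        if not has_equal_superset:
            closed[itemset] = count
    return closed


def mine_stream(transactions, window_size, min_support):
    """Yield the closed frequent itemsets of each full sliding window."""
    window = deque(maxlen=window_size)  # oldest transaction drops out automatically
    for transaction in transactions:
        window.append(transaction)
        if len(window) == window_size:
            yield closed_frequent_itemsets(window, min_support)
```

Because the number of subsets grows exponentially with transaction length, this sketch is only practical for tiny inputs. Avoiding that full recomputation on every window slide is the kind of cost the incremental structures described in the abstract are meant to reduce.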
Pages: 1564-1573
Page count: 10
Related papers
11 entries in total
  • [1] Agrawal R., 1994, P 20 INT C VER LARG, P487
  • [2] An Efficient Algorithm for Mining Closed Frequent Itemsets in Data Streams
    Ao, Fujiang
    Du, Jing
    Yan, Yuejin
    Liu, Baohong
    Huang, Kedi
    [J]. 8TH IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY WORKSHOPS: CIT WORKSHOPS 2008, PROCEEDINGS, 2008, : 37 - +
  • [3] Chang JH, 2003, 9 ACM SIGKDD INT C K, P487, DOI DOI 10.1145/956750.956807
  • [4] Moment: Maintaining closed frequent itemsets over a stream sliding window
    Chi, Y
    Wang, HX
    Yu, PS
    Muntz, RR
    [J]. FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2004, : 59 - 66
  • [5] Gannell C., 2003, MINING FREQUENT ITEM
  • [6] Han JW, 2000, SIGMOD RECORD, V29, P1
  • [7] Li HF, 2006, ICDM 2006: SIXTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, WORKSHOPS, P672
  • [8] Pasquier N, 1999, LECT NOTES COMPUT SC, V1540, P398
  • [9] Singh Manku G., 2002, Proceedings of the Twenty-eighth International Conference on Very Large Data Bases, P346
  • [10] Wei-Guang T., 2003, P 29 INT C VER LARG, P607