GenMax: An efficient algorithm for mining maximal frequent itemsets

被引:93
作者
Gouda, K [1 ]
Zaki, MJ
机构
[1] Fac Sci, Dept Math, Banha, Egypt
[2] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
基金
美国国家科学基金会;
关键词
maximal itemsets; frequent itemsets; association rules; data mining; backtracking search;
D O I
10.1007/s10618-005-0002-x
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We present GenMax, a backtrack search based algorithm for mining maximal frequent itemsets. GenMax uses a number of optimizations to prune the search space. It uses a novel technique called progressive focusing to perform maximality checking, and diffset propagation to perform fast frequency computation. Systematic experimental comparison with previous work indicates that different methods have varying strengths and weaknesses based on dataset characteristics. We found GenMax to be a highly efficient method to mine the exact set of maximal patterns.
引用
收藏
页码:223 / 242
页数:20
相关论文
共 12 条
[1]  
Agarwal R. C., 2000, Proceedings. KDD-2000. Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, P108, DOI 10.1145/347090.347114
[2]  
[Anonymous], 1996, Advances in Knowledge Discovery and Data Mining, DOI DOI 10.1007/978-3-319-31750-2.
[3]  
BAYARDO RJ, 1998, SIGMOD 98, P85, DOI DOI 10.1145/276304.276313
[4]   MAFIA: A maximal frequent itemset algorithm for transactional databases [J].
Burdick, D ;
Calimlim, M ;
Gehrke, J .
17TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2001, :443-452
[5]  
Goethals B., 2004, ACM SIGKDD Explor. Newsl, V6, P109, DOI [10.1145/1007730.1007744, DOI 10.1145/1007730.1007744]
[6]   Discovering all most specific sentences [J].
Gunopulos, D ;
Khardon, R ;
Mannila, H ;
Saluja, S ;
Toivonen, H ;
Sharma, RS .
ACM TRANSACTIONS ON DATABASE SYSTEMS, 2003, 28 (02) :140-174
[7]  
Jiawei Han, 2000, SIGMOD Record, V29, P1, DOI 10.1145/335191.335372
[8]  
LIN DL, 1998, 6 INT C EXT DAT TECH, P105
[9]   AN ALGORITHM FOR DYNAMIC SUBSET AND INTERSECTION TESTING [J].
YELLIN, DM .
THEORETICAL COMPUTER SCIENCE, 1994, 129 (02) :397-406
[10]  
Zaki M. J., 2000, Proceedings. KDD-2000. Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, P34, DOI 10.1145/347090.347101