Pushing convertible constraints in frequent itemset mining

被引:44
作者
Pei, J
Han, JW
Lakshmanan, LVS
机构
[1] SUNY Buffalo, Buffalo, NY 14260 USA
[2] Univ Illinois, Dept Comp Sci, Urbana, IL 61801 USA
[3] Univ British Columbia, Vancouver, BC V6T 1Z4, Canada
基金
加拿大自然科学与工程研究理事会;
关键词
frequent itemset mining; constraint; convertible constraint; algorithm; pruning;
D O I
10.1023/B:DAMI.0000023674.74932.4c
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Recent work has highlighted the importance of the constraint-based mining paradigm in the context of frequent itemsets, associations, correlations, sequential patterns, and many other interesting patterns in large databases. Constraint pushing techniques have been developed for mining frequent patterns and associations with antimonotonic, monotonic, and succinct constraints. In this paper, we study constraints which cannot be handled with existing theory and techniques in frequent pattern mining. For example, avg(S)thetav, median(S)thetav, sum(S)thetav (S can contain items of arbitrary values, thetais an element of {>, <,<=, >=} and v is a real number.) are customarily regarded as "tough" constraints in that they cannot be pushed inside an algorithm such as A priori. We develop a notion of convertible constraints and systematically analyze, classify, and characterize this class. We also develop techniques which enable them to be readily pushed deep inside the recently developed FP-growth algorithm for frequent itemset mining. Results from our detailed experiments show the effectiveness of the techniques developed.
引用
收藏
页码:227 / 252
页数:26
相关论文
共 24 条
[1]  
AGRAWAL R, 1995, PROC INT CONF DATA, P3, DOI 10.1109/ICDE.1995.380415
[2]  
Agrawal R, 1994, P 20 INT C VER LARG, V1215, P487
[3]  
[Anonymous], P 1998 ACM SIGMOD IN
[4]  
[Anonymous], P ACM SIGMOD 98
[5]  
[Anonymous], 1999, Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining p, DOI [10.1145/312129., DOI 10.1145/312129, 10.1145/312129, 10.1145/312129.312191]
[6]   Constraint-based rule mining in large, dense databases [J].
Bayardo, RJ ;
Agrawal, R ;
Gunopulos, D .
15TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 1999, :188-197
[7]  
Brin S., 1997, P 1997 ACM SIGMOD IN, P265, DOI DOI 10.1145/253262.253327
[8]  
Garofalakis MN, 1999, PROCEEDINGS OF THE TWENTY-FIFTH INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES, P223
[9]  
Grahne G., 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073), P512, DOI 10.1109/ICDE.2000.839450
[10]  
HAN J, 2000, P 2000 ACM SIGMOD IN, P1, DOI DOI 10.1145/342009.335372