Extending the state-of-the-art of constraint-based pattern discovery

被引:35
作者
Bonchi, Francesco
Lucchese, Claudio
机构
[1] CNR, ISTI, Pisa KDD Lab, I-56124 Pisa, Italy
[2] Univ Ca Foscari, Dept Comp Sci, I-30172 Venice, Italy
关键词
frequent pattern mining; constraint pushing techniques; data reduction;
D O I
10.1016/j.datak.2006.02.006
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In the last years, in the context of the constraint-based pattern discovery paradigm, properties of constraints have been studied comprehensively and on the basis of this properties, efficient constraint-pushing techniques have been defined. In this paper we review and extend the state-of-the-art of the constraints that can be pushed in a frequent pattern computation. We introduce novel data reduction techniques which are able to exploit convertible anti-monotone constraints (e.g., constraints on average or median) as well as tougher constraints (e.g., constraints on variance or standard deviation). A thorough experimental study is performed and it confirms that our framework outperforms previous algorithms for convertible constraints, and exploit the tougher ones with the same effectiveness. Finally, we highlight that the main advantage of our approach, i.e., pushing constraints by means of data reduction in a level-wise framework, is that different properties of different constraints can be exploited all together, and the total benefit is always greater than the sum of the individual benefits. This consideration leads to the definition of a general Apriori-like algorithm which is able to exploit all possible kinds of constraints studied so far. (c) 2006 Elsevier B.V. All rights reserved.
引用
收藏
页码:377 / 399
页数:23
相关论文
共 20 条
[1]  
Agarwal R., 1994, P 20 INT C VER LARG, V487, P499
[2]  
BESSONJK, 2005, INTELL DATA ANAL, V9, P59
[3]  
BONCHI F, 2004, P ADV KNOWL DISC DAT
[4]  
BONCHI F, 2003, P 3 IEEE INT C DAT M
[5]  
BONCHI F, 2004, P 3 IEEE INT C DAT M
[6]  
BONCHI F, 2006, P 22 INT C DAT ENG I
[7]  
BONCHI F, 2005, P ADV KNOWLEDGE DISC
[8]  
BONCHI F, 2003, P 7 EUR C PRINC PRAC
[9]  
BOULICAUT JF, 2002, INTELL DATA ANAL J, V6, P341
[10]  
BUCILA C, 2002, P 17 INT JOINT C ART