Constraint-based rule mining in large, dense databases

Cited by: 123
Authors
Bayardo, RJ [1]
Agrawal, R [1]
Gunopulos, D [1]
Affiliation
[1] IBM Corp, Almaden Res Ctr, Armonk, NY 10504 USA
Source
15TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS | 1999
DOI
10.1109/ICDE.1999.754924
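Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 [Computer science and technology];
Abstract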
Constraint-based rule miners find all rules in a given dataset meeting user-specified constraints such as minimum support and confidence. We describe a new algorithm that directly exploits all user-specified constraints including minimum support, minimum confidence, and a new constraint that ensures every mined rule offers a predictive advantage over any of its simplifications. Our algorithm maintains efficiency even at low supports on data that is dense (e.g. relational data). Previous approaches such as Apriori and its variants exploit only the minimum support constraint, and as a result are ineffective on dense data due to a combinatorial explosion of "frequent itemsets".
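The three constraints named in the abstract can be illustrated with a small brute-force check. This is only a sketch under stated assumptions, not the algorithm the paper describes: it assumes that "predictive advantage over any of its simplifications" is measured as the rule's confidence minus the highest confidence of any rule with the same consequent and a proper subset of the antecedent, and that support is counted as the number of rows containing both antecedent and consequent. The function names (`support`, `confidence`, `improvement`, `passes`) and the toy rows are hypothetical.

```python
from itertools import combinations


def support(rows, items):
    """Count the rows that contain every item in `items`."""
    return sum(1 for row in rows if items <= row)


def confidence(rows, antecedent, consequent):
    """Fraction of rows matching the antecedent that also contain the consequent."""
    covered = support(rows, antecedent)
    if covered == 0:
        return 0.0
    return support(rows, antecedent | {consequent}) / covered


def improvement(rows, antecedent, consequent):
    """Confidence gain over the best simplification (assumed definition).

    A simplification is the same rule with a proper subset of the antecedent.
    A rule with an empty antecedent has none, so its own confidence is returned.
    """
    items = sorted(antecedent)
    best_simpler = max(
        (
            confidence(rows, frozenset(subset), consequent)
            for size in range(len(items))  # proper subsets only
            for subset in combinations(items, size)
        ),
        default=0.0,
    )
    return confidence(rows, antecedent, consequent) - best_simpler


def passes(rows, antecedent, consequent, min_sup, min_conf, min_imp):
    """Check one candidate rule against all three user-specified thresholds."""
    return (
        support(rows, antecedent | {consequent}) >= min_sup
        and confidence(rows, antecedent, consequent) >= min_conf
        and improvement(rows, antecedent, consequent) >= min_imp
    )


# Hypothetical toy data: each row is the set of items it contains.
rows = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a", "c"}),
    frozenset({"b"}),
    frozenset({"a"}),
    frozenset({"c"}),
]
rule_antecedent = frozenset({"a", "b"})
rule_consequent = "c"
```

On this toy data the rule {a, b} -> c has confidence 1.0, while its best simplification {a} -> c has confidence 0.75, so the rule survives a minimum improvement of 0.2 but not one of 0.3. Enumerating every subset is exponential in the antecedent size, which is why a check like this is useful only to show what the constraints mean, not how to mine under them efficiently.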
Pages: 188-197
Page count: 10