A statistical theory for quantitative association rules

Cited by: 74
Authors
Aumann, Y
Lindell, Y
Affiliations
[1] IBM TJ Watson Res, Hawthorne, NY 10532 USA
[2] Bar Ilan Univ, Dept Comp Sci, IL-52900 Ramat Gan, Israel
Keywords
data mining; knowledge discovery in databases; quantitative association rules; statistical inference theory
DOI
10.1023/A:1022812808206
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Association rules are a key data-mining tool and as such have been well researched. So far, this research has focused predominantly on databases containing categorical data only. However, many real-world databases contain quantitative attributes and current solutions for this case are so far inadequate. In this paper we introduce a new definition of quantitative association rules based on statistical inference theory. Our definition reflects the intuition that the goal of association rules is to find extraordinary and therefore interesting phenomena in databases. We also introduce the concept of sub-rules which can be applied to any type of association rule. Rigorous experimental evaluation on real-world datasets is presented, demonstrating the usefulness and characteristics of rules mined according to our definition.
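The abstract does not spell out the statistical test behind the definition, so the following is only a minimal sketch of the general idea: a rule is treated as interesting when the mean of a quantitative attribute inside a subset of records differs markedly from the mean over the remaining records. The function name `mean_shift_rule`, the two-sample z-style statistic, and the 1.96 threshold are all illustrative assumptions, not the authors' algorithm.

```python
import math
from statistics import mean, variance


def mean_shift_rule(rows, in_subset, target, z_threshold=1.96):
    """Compare the mean of `target` inside a subset against the rest.

    Hypothetical helper for illustration only. `rows` is a list of dicts,
    `in_subset` is a predicate selecting the rule's left-hand side, and
    `target` names the quantitative attribute on the right-hand side.
    Returns None when either group is too small to estimate a variance
    or when both groups have zero variance.
    """
    inside = [r[target] for r in rows if in_subset(r)]
    outside = [r[target] for r in rows if not in_subset(r)]
    if len(inside) < 2 or len(outside) < 2:
        return None

    # Standard error of the difference of means (unequal variances allowed).
    se = math.sqrt(variance(inside) / len(inside)
                   + variance(outside) / len(outside))
    if se == 0:
        return None

    z = (mean(inside) - mean(outside)) / se
    return {
        "subset_mean": mean(inside),
        "rest_mean": mean(outside),
        "z": z,
        "significant": abs(z) >= z_threshold,  # assumed cutoff
    }


if __name__ == "__main__":
    # Made-up toy records, not data from the paper.
    rows = [{"group": "a", "value": v} for v in (10, 11, 12, 13)]
    rows += [{"group": "b", "value": v} for v in (1, 2, 3, 4)]
    print(mean_shift_rule(rows, lambda r: r["group"] == "a", "value"))
```

A rule found this way reads as "records in the subset have a mean `target` of X, versus Y for everyone else", which matches the abstract's emphasis on extraordinary phenomena rather than on frequent co-occurrence.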
Pages: 255-283
Number of pages: 29