A new approach to online generation of association rules

Cited by: 60
Authors
Aggarwal, CC [1 ]
Yu, PS [1 ]
Affiliation
[1] IBM Corp, TJ Watson Res Ctr, Hawthorne, NY 10532 USA
Keywords
OLAP; association rules; data mining; knowledge discovery
DOI
10.1109/69.940730
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We discuss the problem of online mining of association rules in a large database of sales transactions. The online mining is performed by preprocessing the data effectively in order to make it suitable for repeated online queries. We store the preprocessed data in such a way that online processing may be done by applying a graph theoretic search algorithm whose complexity is proportional to the size of the output. The result is an online algorithm which is independent of the size of the transactional data and the size of the preprocessed data. The algorithm is almost instantaneous in the size of the output. The algorithm also supports techniques for quickly discovering association rules from large itemsets. The algorithm is capable of finding rules with specific items in the antecedent or consequent. These association rules are presented in a compact form, eliminating redundancy. The use of nonredundant association rules helps significantly in the reduction of irrelevant noise in the data mining process.
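The preprocess-then-query idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_lattice`, `frequent_itemsets`, `rules_from`), the `primary_minsup` parameter, and the toy transactions are all assumptions made for illustration, and the redundancy elimination mentioned in the abstract is not attempted. The sketch precomputes itemset supports once, links each itemset to its one-item extensions, and then answers a query by a graph search that never rereads the transactions.

```python
from itertools import combinations


def build_lattice(transactions, primary_minsup):
    """Preprocessing (hypothetical): count every itemset whose support count
    is at least primary_minsup and link it to its one-item extensions."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    support = {frozenset(): len(transactions)}
    children = {frozenset(): []}
    frontier = [frozenset()]
    while frontier:
        next_frontier = []
        for node in frontier:
            for item in items:
                if item in node:
                    continue
                cand = node | {item}
                if cand not in support:
                    count = sum(1 for t in transactions if cand <= t)
                    if count < primary_minsup:
                        continue  # too rare to store at all
                    support[cand] = count
                    children[cand] = []
                    next_frontier.append(cand)
                children[node].append(cand)
        frontier = next_frontier
    return support, children


def frequent_itemsets(support, children, minsup):
    """Online query: depth-first search from the empty itemset. Support never
    grows when an item is added, so the search only expands itemsets that
    belong to the answer; the transactions are never touched."""
    found, stack, seen = [], [frozenset()], {frozenset()}
    while stack:
        node = stack.pop()
        for child in children[node]:
            if child not in seen and support[child] >= minsup:
                seen.add(child)
                found.append(child)
                stack.append(child)
    return found


def rules_from(itemset, support, minconf):
    """Rules antecedent => consequent drawn from one large itemset."""
    out = []
    for r in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), r):
            a = frozenset(antecedent)
            conf = support[itemset] / support[a]
            if conf >= minconf:
                out.append((a, itemset - a, conf))
    return out


# Toy data, invented for this example.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support, children = build_lattice(transactions, primary_minsup=1)
large = frequent_itemsets(support, children, minsup=2)
rules = rules_from(frozenset({"bread", "butter"}), support, minconf=0.9)
```

In this sketch the query cost depends on the number of itemsets returned and their stored extensions, not on the number of transactions, which is the property the abstract emphasizes. Restricting the antecedent or consequent to specific items would amount to filtering inside `rules_from`.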
Pages: 527-540
Page count: 14