A high-performance distributed algorithm for mining association rules

被引:24
作者
Schuster, A
Wolff, R [1 ]
Trock, D
机构
[1] Technion Israel Inst Technol, Dept Comp Sci, IL-32000 Haifa, Israel
[2] Technion Israel Inst Technol, Dept Elect Engn, IL-32000 Haifa, Israel
关键词
association rule; data mining; distributed data mining; high-performance computing;
D O I
10.1007/s10115-004-0176-3
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We present a new distributed association rule mining (D-ARM) algorithm that demonstrates superlinear speed-up with the number of computing nodes. The algorithm is the first D-ARM algorithm to perform a single scan over the database. As such, its performance is unmatched by any previous algorithm. Scale-up experiments over standard synthetic benchmarks demonstrate stable run time regardless of the number of computers. Theoretical analysis reveals a tighter bound on error probability than the one shown in the corresponding sequential algorithm. As a result of this tighter bound and by utilizing the combined memory of several computers, the algorithm generates far fewer candidates than comparable sequential algorithms-the same order of magnitude as the optimum.
引用
收藏
页码:458 / 475
页数:18
相关论文
共 29 条
[1]   Parallel mining of association rules [J].
Agrawal, R ;
Shafer, JC .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1996, 8 (06) :962-969
[2]  
Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
[3]  
Agrawal R, 1994, P 20 INT C VER LARG, V1215, P487
[4]  
ANANTHANARAYANA VS, 2000, P HIPC 00 BANG IND, P559
[5]  
[Anonymous], P INT C VER LARG DAT
[6]  
Brin Sergey, 1997, SIGMOD REC, V6, P255, DOI DOI 10.1145/253262.253325
[7]  
CHEUNG D, 1998, 12 PAC AS C KNOWL DI, P48
[8]  
Cheung DW, 1996, PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED INFORMATION SYSTEMS, P31, DOI 10.1109/PDIS.1996.568665
[9]   A GUIDED TOUR OF CHERNOFF BOUNDS [J].
HAGERUP, T ;
RUB, C .
INFORMATION PROCESSING LETTERS, 1990, 33 (06) :305-308
[10]   Scalable parallel data mining for association rules [J].
Han, EH ;
Karypis, G ;
Kumar, V .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2000, 12 (03) :337-352