Fast parallel association rule mining without candidacy generation

Cited by: 69
Authors
Zaïane, OR [1]
El-Hajj, M [1]
Lu, P [1]
Affiliations
[1] Univ Alberta, Edmonton, AB T6G 2M7, Canada
Source
2001 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2001
DOI
10.1109/ICDM.2001.989600
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we introduce a new parallel algorithm, MLFPT (Multiple Local Frequent Pattern Tree) [11], for parallel mining of frequent patterns. Based on FP-growth mining, it uses only two full I/O scans of the database, eliminates the need to generate candidate items, and distributes the work fairly among processors. We have devised partitioning strategies at different stages of the mining process to achieve near-optimal balancing between processors. We have successfully tested our algorithm on datasets larger than 50 million transactions.
Pages: 665-668
Page count: 4
Related papers
13 in total
[1]   Parallel mining of association rules [J].
Agrawal, R ;
Shafer, JC .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1996, 8 (06) :962-969
[2]   Agrawal R., 1993, SIGMOD Record, V22, P207, DOI 10.1145/170036.170072
[3]   Agrawal R., 1994, P 20 INT C VER LARG, V1215, P487
[4]   ALMADEN I, QUEST SYNTHETIC DATA
[5]   Brin S., 1997, SIGMOD Record, V26, P255, DOI [10.1145/253262.253327, 10.1145/253262.253325]
[6]   Cheung DW, 1996, PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED INFORMATION SYSTEMS, P31, DOI 10.1109/PDIS.1996.568665
[7]   Han J., 2012, Data Mining, P393, DOI [10.1016/B978-0-12-381479-1.00009-5, 10.1016/B978-0-12-381479-1.00001-0]
[8]   HAN J, 2000, ACM SIGMOD DALL
[9]   Jong Soo Park, 1995, SIGMOD Record, V24, P175, DOI 10.1145/568271.223813
[10]   Parallel Data Mining for Association Rules on Shared-Memory Systems [J].
S. Parthasarathy ;
M. J. Zaki ;
M. Ogihara ;
W. Li .
Knowledge and Information Systems, 2001, 3 (1) :1-29