Incremental mining of sequential patterns in large databases

Cited by: 85
Authors
Masseglia, F
Poncelet, P
Teisseire, M
Affiliations
[1] INRIA Sophia Antipolis, FR-06902 Sophia Antipolis, France
[2] LIRMM, F-34392 Montpellier 5, France
[3] Ecole Mines Ales, Lab LGI2P, F-30035 Nimes 1, France
Keywords
sequential patterns; incremental mining; data mining;
DOI
10.1016/S0169-023X(02)00209-4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we consider the problem of incrementally mining sequential patterns when new transactions or new customers are added to an original database. We present a new algorithm for mining frequent sequences that uses information collected during an earlier mining process to cut down the cost of finding new sequential patterns in the updated database. Our tests show that the algorithm performs significantly faster than the naive approach of mining the whole updated database from scratch. The difference is so pronounced that the algorithm could also be useful for mining sequential patterns: in many cases, breaking the database down into an original database plus an increment and applying our algorithm is faster than mining sequential patterns with a standard algorithm. (C) 2003 Elsevier Science B.V. All rights reserved.
Pages: 97-121
Page count: 25