Clustering in dynamic spatial databases

被引:33
作者
Zhang, J [1 ]
Hsu, W [1 ]
Lee, M [1 ]
机构
[1] Natl Univ Singapore, Sch Comp, Singapore 117543, Singapore
关键词
spatial databases; data mining; multi-resolution clustering; incremental clustering; Minimum Spanning Tree;
D O I
10.1007/s10844-005-0265-0
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Efficient clustering in dynamic spatial databases is currently an open problem with many potential applications. Most traditional spatial clustering algorithms are inadequate because they do not have an efficient support for incremental clustering. In this paper, we propose DClust, a novel clustering technique for dynamic spatial databases. DClust is able to provide multi-resolution view of the clusters, generate arbitrary shapes clusters in the presence of noise, generate clusters that are insensitive to ordering of input data and support incremental clustering efficiently. DClust utilizes the density criterion that captures arbitrary cluster shapes and sizes to select a number of representative points, and builds the Minimum Spanning Tree (MST) of these representative points, called R-MST. After the initial clustering, a summary of the cluster structure is built. This summary enables quick localization of the effect of data updates on the current set of clusters. Our experimental results show that DClust outperforms existing spatial clustering methods such as DBSCAN, C2P, DENCLUE, Incremental DBSCAN and BIRCH in terms of clustering time and accuracy of clusters found.
引用
收藏
页码:5 / 27
页数:23
相关论文
共 17 条
[1]  
Berchtold S, 1996, PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES, P28
[2]   INCREMENTAL CLUSTERING FOR DYNAMIC INFORMATION-PROCESSING [J].
CAN, F .
ACM TRANSACTIONS ON INFORMATION SYSTEMS, 1993, 11 (02) :143-164
[3]  
Ester M., 1998, Proceedings of the Twenty-Fourth International Conference on Very-Large Databases, P323
[4]  
Ester M., 1996, Proc. Second Int. Conf. Knowl. Discov. Data Min, P226, DOI DOI 10.5555/3001460.3001507
[5]  
Fisher D. H., 1987, Machine Learning, V2, P139, DOI 10.1007/BF00114265
[6]   Clustering large datasets in arbitrary metric spaces [J].
Ganti, V ;
Ramakrishnan, R ;
Gehrke, J ;
Powell, A ;
French, J .
15TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 1999, :502-511
[7]  
GANTI V, 2001, IEEE T KNOWL DAT ENG
[8]  
Guha S., 1998, SIGMOD Record, V27, P73, DOI 10.1145/276305.276312
[9]  
Hinneburg A., 1998, Proceedings Fourth International Conference on Knowledge Discovery and Data Mining, P58
[10]  
MacQueen J., 1967, P 5 BERK S MATH STAT, V14, P281, DOI DOI 10.1234/12345678