WaveCluster: a wavelet-based clustering approach for spatial data in very large databases

Cited by: 173
Authors
Sheikholeslami, G [1]
Chatterjee, S [1]
Zhang, AD [1]
Affiliation
[1] SUNY Buffalo, Dept Comp Sci & Engn, Buffalo, NY 14260 USA
DOI
10.1007/s007780050009
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Many applications require the management of spatial data in a multidimensional feature space. Clustering large spatial databases is an important problem, which tries to find the densely populated regions in the feature space to be used in data mining, knowledge discovery, or efficient information retrieval. A good clustering approach should be efficient and detect clusters of arbitrary shape. It must be insensitive to the noise (outliers) and the order of input data. We propose WaveCluster, a novel clustering approach based on wavelet transforms, which satisfies all the above requirements. Using the multiresolution property of wavelet transforms, we can effectively identify arbitrarily shaped clusters at different degrees of detail. We also demonstrate that WaveCluster is highly efficient in terms of time complexity. Experimental results on very large datasets are presented, which show the efficiency and effectiveness of the proposed approach compared to the other recent clustering methods.
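The pipeline the abstract describes can be illustrated with a minimal Python sketch, under stated assumptions. The abstract does not give the algorithm's steps, so everything here is illustrative: 2D points are quantized onto a grid, one level of a Haar low-pass filter (chosen for brevity, not taken from the abstract) smooths the cell counts, cells above a hypothetical `threshold` are treated as dense, and 8-connected dense cells form a cluster. The function names, `grid_size`, and `threshold` are hypothetical, not the authors' implementation.

```python
from collections import deque


def quantize(points, grid_size, lo, hi):
    """Count points per cell on a grid_size x grid_size grid over [lo, hi]^2."""
    grid = [[0.0] * grid_size for _ in range(grid_size)]
    scale = grid_size / (hi - lo)
    for x, y in points:
        i = min(int((x - lo) * scale), grid_size - 1)
        j = min(int((y - lo) * scale), grid_size - 1)
        grid[i][j] += 1
    return grid


def haar_approximation(grid):
    """One level of a 2D Haar transform, keeping only the low-pass subband.

    Scaled as a 2x2 block average; grid side length is assumed to be even.
    """
    n = len(grid) // 2
    return [
        [
            (grid[2 * i][2 * j] + grid[2 * i + 1][2 * j]
             + grid[2 * i][2 * j + 1] + grid[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def label_components(dense):
    """Label 8-connected regions of True cells; label 0 means 'no cluster'."""
    n = len(dense)
    labels = [[0] * n for _ in range(n)]
    next_label = 0
    for i in range(n):
        for j in range(n):
            if not dense[i][j] or labels[i][j] != 0:
                continue
            next_label += 1
            labels[i][j] = next_label
            queue = deque([(i, j)])
            while queue:  # breadth-first flood fill over neighbouring cells
                a, b = queue.popleft()
                for da in (-1, 0, 1):
                    for db in (-1, 0, 1):
                        u, v = a + da, b + db
                        if (0 <= u < n and 0 <= v < n
                                and dense[u][v] and labels[u][v] == 0):
                            labels[u][v] = next_label
                            queue.append((u, v))
    return labels, next_label


def wavecluster_sketch(points, grid_size=16, lo=0.0, hi=1.0, threshold=0.5):
    """Return (label per point, number of clusters); label 0 marks noise."""
    grid = quantize(points, grid_size, lo, hi)
    approx = haar_approximation(grid)
    dense = [[value >= threshold for value in row] for row in approx]
    cell_labels, count = label_components(dense)
    scale = grid_size / (hi - lo)
    point_labels = []
    for x, y in points:
        # Each coarse cell covers a 2x2 block of fine cells.
        i = min(int((x - lo) * scale), grid_size - 1) // 2
        j = min(int((y - lo) * scale), grid_size - 1) // 2
        point_labels.append(cell_labels[i][j])
    return point_labels, count
```

In this sketch the cost after quantization depends on the number of grid cells rather than the number of points, and the result does not depend on input order because only per-cell counts are used. Applying `haar_approximation` repeatedly before thresholding would give coarser levels of detail, which is one simple way to picture the multiresolution property mentioned above.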
Pages: 289-304
Page count: 16