A new hybrid method based on partitioning-based DBSCAN and ant clustering

被引:70
作者
Jiang, Hua [1 ]
Li, Jing [1 ]
Yi, Shenghe [1 ]
Wang, Xiangyang [1 ]
Hu, Xin [1 ]
机构
[1] NE Normal Univ, Coll Comp Sci, Changchun 130117, Jilin, Peoples R China
关键词
Clustering; Density-based clustering; DBSCAN; PDBSCAN; Ant clustering algorithm; K-HARMONIC MEANS; ALGORITHM;
D O I
10.1016/j.eswa.2011.01.135
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Clustering problem is an unsupervised learning problem. It is a procedure that partition data objects into matching clusters. The data objects in the same cluster are quite similar to each other and dissimilar in the other clusters. Density-based clustering algorithms find clusters based on density of data points in a region. DBSCAN algorithm is one of the density-based clustering algorithms. It can discover clusters with arbitrary shapes and only requires two input parameters. DBSCAN has been proved to be very effective for analyzing large and complex spatial databases. However, DBSCAN needs large volume of memory support and often has difficulties with high-dimensional data and clusters of very different densities. So, partitioning-based DBSCAN algorithm (PDBSCAN) was proposed to solve these problems. But PDBSCAN will get poor result when the density of data is non-uniform. Meanwhile, to some extent. DBSCAN and PDBSCAN are both sensitive to the initial parameters. In this paper, we propose a new hybrid algorithm based on PDBSCAN. We use modified ant clustering algorithm (ACA) and design a new partitioning algorithm based on 'point density' (PD) in data preprocessing phase. We name the new hybrid algorithm PACA-DBSCAN. The performance of PACA-DBSCAN is compared with DBSCAN and PDBSCAN on five data sets. Experimental results indicate the superiority of PACA-DBSCAN algorithm. (C) 2011 Elsevier Ltd. All rights reserved.
引用
收藏
页码:9373 / 9381
页数:9
相关论文
共 24 条
[1]  
[Anonymous], 1991, SIMULATION ADAPTIVE
[2]  
[Anonymous], P 3 INT C SIM AD BEH
[3]  
[Anonymous], 1996, P AAAI INT C KNOWL D
[4]   ST-DBSCAN: An algorithm for clustering spatial-temp oral data [J].
Birant, Derya ;
Kut, Alp .
DATA & KNOWLEDGE ENGINEERING, 2007, 60 (01) :208-221
[5]   Density-Based Clustering over an Evolving Data Stream with Noise [J].
Cao, Feng ;
Ester, Martin ;
Qian, Weining ;
Zhou, Aoying .
PROCEEDINGS OF THE SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006, :328-+
[6]  
Chu S, 2003, CHINESE J ELECTRON, V12, P349
[7]   Distribution free decomposition of multivariate data [J].
Comaniciu, D ;
Meer, P .
PATTERN ANALYSIS AND APPLICATIONS, 1999, 2 (01) :22-30
[8]  
DALLI A, 2003, EACL 2003 BUD
[9]  
GAN GJ, 2007, ASA SIAM SERIES STAT, pCH7
[10]   K-harmonic means data clustering with Tabu-search method [J].
Gungor, Zulal ;
Unler, Alper .
APPLIED MATHEMATICAL MODELLING, 2008, 32 (06) :1115-1125