REDPC: A residual error-based density peak clustering algorithm

Cited by: 61
Authors
Parmar, Milan [1, 2]
Wang, Di [3]
Zhang, Xiaofeng [4]
Tan, Ah-Hwee [3, 5]
Miao, Chunyan [3, 5]
Jiang, Jianhua [2]
Zhou, You [1]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun, Jilin, Peoples R China
[2] Jilin Univ Finance & Econ, Sch Management Sci & Informat Engn, Changchun, Jilin, Peoples R China
[3] Nanyang Technol Univ, Joint NTU UBC Res Ctr Excellence Act Living Elder, Singapore, Singapore
[4] Harbin Inst Technol, Dept Comp Sci, Shenzhen, Peoples R China
[5] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
Funding
National Research Foundation, Singapore
Keywords
Clustering; Density peak clustering; Anomaly detection; Residual error; Low-density data points; SYSTEM;
DOI
10.1016/j.neucom.2018.06.087
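Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
140502 [Artificial intelligence]
Abstract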
The density peak clustering (DPC) algorithm was designed to identify arbitrary-shaped clusters by finding density peaks in the underlying dataset. Owing to its relatively low computational complexity and small number of control parameters, DPC was soon widely adopted. However, because DPC takes the entire data space into consideration when computing local density, which is then used to generate a decision graph for identifying cluster centroids, it may have difficulty differentiating overlapping clusters and dealing with low-density data points. In this paper, we propose a residual error-based density peak clustering algorithm named REDPC to better handle datasets comprising various data distribution patterns. Specifically, REDPC adopts residual error computation to measure the local density within a neighbourhood region. As such, compared with DPC, REDPC provides a better decision graph for identifying cluster centroids and better handles low-density data points. Experimental results on both synthetic and real-world datasets show that REDPC performs better than DPC and other algorithms. (C) 2018 Elsevier B.V. All rights reserved.
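For orientation, the decision graph mentioned in the abstract can be illustrated with a minimal sketch of baseline DPC as it is commonly described: a local density rho (here a hard cutoff count) and, for each point, the distance delta to the nearest point of higher density. This is not REDPC; the abstract does not specify how the residual error is computed, so that step is not reproduced. The function names, the cutoff parameter `d_c`, and the `rho * delta` ranking are assumptions made for illustration.

```python
import math


def dpc_decision_values(points, d_c):
    """Return (rho, delta) per point for baseline DPC with a hard cutoff.

    Assumption: rho counts the other points strictly closer than d_c.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points within the cutoff distance.
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]

    delta = []
    for i in range(n):
        # Distances to all points that are strictly denser than point i.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser point gets its largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


def pick_centres(rho, delta, k):
    """Rank points by rho * delta and return the indices of the top k."""
    gamma = [r * d for r, d in zip(rho, delta)]
    order = sorted(range(len(gamma)), key=lambda i: gamma[i], reverse=True)
    return order[:k]


if __name__ == "__main__":
    # Two small square clusters, each with one point in the middle.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
    rho, delta = dpc_decision_values(pts, d_c=0.8)
    print(pick_centres(rho, delta, k=2))
```

Plotting `delta` against `rho` gives the decision graph: points that are both dense and far from any denser point stand out as candidate centroids. Because `rho` here depends on a single global cutoff, sparse clusters produce weak peaks, which is the limitation the abstract attributes to DPC.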
Pages: 82-96
Number of pages: 15