Density-based clustering using approximate natural neighbours

被引:8
作者
Angelova, Maia [1 ]
Beliakov, Gleb [1 ]
Zhu, Ye [1 ]
机构
[1] Deakin Univ, Sch Informat Technol, Melbourne Burwood Campus, Burwood, Vic 3125, Australia
基金
澳大利亚研究理事会; 日本学术振兴会; 中国国家自然科学基金; 新加坡国家研究基金会;
关键词
Choquet integral; Fuzzy measure; Fuzzy betweenness relation; Density estimate; Clustering; INDEX;
D O I
10.1016/j.asoc.2019.105867
中图分类号
TP18 [人工智能理论];
学科分类号
140502 [人工智能];
摘要
We propose a computationally efficient natural neighbour based metric for discovering clusters of arbitrary shape based on fuzzy measures. The approximate natural neighbours are found with the help of the Choquet integral with respect to a specially designed two-additive fuzzy measure. Fuzzy betweenness relation is used to construct such a measure and helps determine the natural neighbours of a query point. The natural neighbours of a datum allow the computation of point density estimate, which in turn defines a density-based metric suitable for clustering. The proposed method overcomes the exponential computational complexity of the Delaunay triangulation traditionally used to identify the natural neighbours. The run-time of the density estimate by this metric keeps the same quadratic trends as the Euclidean distance based estimates with respect to the data size. Empirical evaluation based on 20 synthetic and real-world datasets shows that this metric has a higher clustering accuracy for existing state-of-the-art density-based clustering algorithms, such as DBSCAN, SNN and DP. Furthermore, the proposed metric is easily combined with these algorithms, and the enhanced clustering algorithms inherit positive features such as resistance to noise and imbalanced data. (C) 2019 Elsevier B.V. All rights reserved.
引用
收藏
页数:10
相关论文
共 45 条
[1]
Aggarwal CC, 2014, CH CRC DATA MIN KNOW, P1
[2]
[Anonymous], DATA MINING ANAL FUN
[3]
[Anonymous], P ICASSP 03 2003 IEE
[4]
[Anonymous], 1981, Interpreting Multivariate Data
[5]
[Anonymous], 2008, PROC IPMU
[6]
[Anonymous], 2009, Proceedings of the VLDB Endowment, DOI [DOI 10.14778/1687627.1687770, 10.14778/1687627.1687770]
[7]
AURENHAMMER F, 1991, COMPUT SURV, V23, P345, DOI 10.1145/116873.116880
[8]
Beliakov G., 2019, Discrete fuzzy measures: computational aspects
[9]
Beliakov G., 2007, Aggregation Functions: A Guide for Practitioners
[10]
Graph clustering using k-Neighbourhood Attribute Structural similarity [J].
Boobalan, M. Parimala ;
Lopez, Daphne ;
Gao, X. Z. .
APPLIED SOFT COMPUTING, 2016, 47 :216-223