Automatic clustering and boundary detection algorithm based on adaptive influence function

被引:76
作者
Nosovskiy, Gleb V. [2 ]
Liu, Dongquan [1 ]
Sourina, Olga [1 ]
机构
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Moscow MV Lomonosov State Univ, Fac Mech & Math, Moscow, Russia
关键词
clustering algorithms; data mining; density-based clustering;
D O I
10.1016/j.patcog.2008.01.021
中图分类号
TP18 [人工智能理论];
学科分类号
081104 [模式识别与智能系统]; 0812 [计算机科学与技术]; 0835 [软件工程]; 1405 [智能科学与技术];
摘要
Clustering became a classical problem in databases, data warehouses, pattern recognition, artificial intelligence, and computer graphics. Applications in large spatial databases, point-based graphics, etc., give rise to new requirements for the clustering algorithms: automatic discovering of arbitrary shaped and/or non-homogeneous clusters, discovering of clusters located in low-dimensional hyperspace, detecting cluster boundaries. On that account, a new clustering and boundary detecting algorithm, ADACLUS, is proposed. It is based on the specially constructed adaptive influence function, and therefore, discovers clusters of arbitrary shapes and diverse densities, adequately captures clusters boundaries, and it is robust to noise. Normally ADACLUS performs clustering purely automatically without any preliminary parameter settings. But it also gives the user an optional possibility to set three parameters with clear meaning in order to adjust clustering for special applications. The algorithm was tested on various two-dimensional data sets, and it exhibited its effectiveness in discovering clusters of complex shapes and diverse densities. Linear complexity of the ADACLUS gives it an advantage over some well-known algorithms. (c) 2008 Elsevier Ltd. All rights reserved.
引用
收藏
页码:2757 / 2776
页数:20
相关论文
共 36 条
[1]
Approximating bounded, non-orientable surfaces from points [J].
Adamson, A ;
Alexa, M .
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SHAPE MODELING AND APPLICATIONS, 2004, :243-+
[2]
Ankerst M., 1999, SIGMOD Record, V28, P49, DOI 10.1145/304181.304187
[3]
[Anonymous], P 10 INT S SPAT DAT
[4]
[Anonymous], 1996, The EM Algorithm and Extensions
[5]
[Anonymous], 1968, 1 COURSE STOCHASTIC
[6]
Ester M., 1996, Proc. Second Int. Conf. Knowl. Discov. Data Min, P226, DOI DOI 10.5555/3001460.3001507
[7]
Estivill-Castro V., 2000, P 5 INT C GEOC
[8]
Estivill-Castro V., 2000, P 9 INT S SPAT DAT H, P1
[9]
Model-based clustering, discriminant analysis, and density estimation [J].
Fraley, C ;
Raftery, AE .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2002, 97 (458) :611-631
[10]
MCLUST: Software for model-based cluster analysis [J].
Fraley, C ;
Raftery, AE .
JOURNAL OF CLASSIFICATION, 1999, 16 (02) :297-306