Automatic clustering using genetic algorithms

被引:103
作者
Liu, Yongguo [1 ,2 ,3 ,4 ]
Wu, Xindong [4 ]
Shen, Yidong [2 ]
机构
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
[2] Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing 100191, Peoples R China
[3] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[4] Univ Vermont, Dept Comp Sci, Burlington, VT 05405 USA
基金
国家高技术研究发展计划(863计划); 中国国家自然科学基金; 美国国家科学基金会;
关键词
Clustering; Genetic algorithms; Noising method; Davies-Bouldin index; K-means algorithm; NOISING METHOD; OPTIMIZATION;
D O I
10.1016/j.amc.2011.06.007
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
In face of the clustering problem, many clustering methods usually require the designer to provide the number of clusters as input. Unfortunately, the designer has no idea, in general, about this information beforehand. In this article, we develop a genetic algorithm based clustering method called automatic genetic clustering for unknown K (AGCUK). In the AGCUK algorithm, noising selection and division-absorption mutation are designed to keep a balance between selection pressure and population diversity. In addition, the Davies-Bouldin index is employed to measure the validity of clusters. Experimental results on artificial and real-life data sets are given to illustrate the effectiveness of the AGCUK algorithm in automatically evolving the number of clusters and providing the clustering partition. (C) 2011 Elsevier Inc. All rights reserved.
引用
收藏
页码:1267 / 1279
页数:13
相关论文
共 38 条
[1]   An evolutionary technique based on K-Means algorithm for optimal clustering in RN [J].
Bandyopadhyay, S ;
Maulik, U .
INFORMATION SCIENCES, 2002, 146 (1-4) :221-237
[2]   Genetic clustering for automatic evolution of clusters and application to image classification [J].
Bandyopadhyay, S ;
Maulik, U .
PATTERN RECOGNITION, 2002, 35 (06) :1197-1208
[3]   Clustering using simulated annealing with probabilistic redistribution [J].
Bandyopadhyay, S ;
Maulik, U ;
Pakhira, MK .
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2001, 15 (02) :269-285
[4]   Nonparametric genetic clustering: Comparison of validity indices [J].
Bandyopadhyay, S ;
Maulik, U .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2001, 31 (01) :120-125
[5]   Genetic algorithm with elitist model and its convergence [J].
Bhandari, D ;
Murthy, CA ;
Pal, SK .
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 1996, 10 (06) :731-747
[6]  
Blake C. L., 1998, Uci repository of machine learning databases
[7]   A robust dynamic niching genetic algorithm with niche migration for automatic clustering problem [J].
Chang, Dong-Xia ;
Zhang, Xian-Da ;
Zheng, Chang-Wen ;
Zhang, Dao-Ming .
PATTERN RECOGNITION, 2010, 43 (04) :1346-1360
[8]   A new query reweighting method for document retrieval based on genetic algorithms [J].
Chang, Yu-Chuan ;
Chen, Shyi-Ming .
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2006, 10 (05) :617-622
[9]   Noising methods for a clique partitioning problem [J].
Charon, I ;
Hudry, O .
DISCRETE APPLIED MATHEMATICS, 2006, 154 (05) :754-769
[10]   THE NOISING METHOD - A NEW METHOD FOR COMBINATORIAL OPTIMIZATION [J].
CHARON, I ;
HUDRY, O .
OPERATIONS RESEARCH LETTERS, 1993, 14 (03) :133-137