Validity index for crisp and fuzzy clusters

Cited by: 526
Authors
Pakhira, MK [1]
Bandyopadhyay, S
Maulik, U
Affiliations
[1] Kalyani Govt Engn Coll, Dept Comp Sci & Technol, Kalyani 741235, W Bengal, India
[2] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700108, W Bengal, India
[3] Kalyani Govt Engn Coll, Dept Comp Sci & Technol, Kalyani 741235, W Bengal, India
Keywords
clustering; expectation maximization algorithm; fuzzy c-means algorithm; k-means algorithm; unsupervised classification; validity index;
DOI
10.1016/j.patcog.2003.06.005
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, a cluster validity index and its fuzzification are described. The index provides a measure of the goodness of clustering on different partitions of a data set. The maximum value of this index, called the PBM-index, across the hierarchy provides the best partitioning. The index is defined as a product of three factors, maximization of which ensures the formation of a small number of compact clusters with large separation between at least two clusters. We have used both the k-means and the expectation maximization algorithms as underlying crisp clustering techniques. For fuzzy clustering, we have utilized the well-known fuzzy c-means algorithm. Results demonstrating the superiority of the PBM-index in appropriately determining the number of clusters, as compared to three other well-known measures (the Davies-Bouldin index, Dunn's index and the Xie-Beni index), are provided for several artificial and real-life data sets. © 2003 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
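The abstract does not spell out the three factors. As a minimal sketch, assuming the commonly cited crisp form PBM(K) = ((1/K) × (E_1/E_K) × D_K)^2, where E_1 is the summed distance of all points to the overall centroid, E_K the summed distance of points to their own cluster centers, and D_K the largest distance between two cluster centers, the index could be computed as follows. The function name `pbm_index` and its signature are hypothetical and chosen for illustration only; they are not taken from the paper.

```python
import math


def pbm_index(points, labels):
    """Crisp PBM-style index, assuming ((1/K) * (E_1 / E_K) * D_K) ** 2.

    points: list of equal-length numeric tuples
    labels: list of cluster labels, one per point
    """
    dims = len(points[0])

    def centroid(rows):
        # Component-wise mean of a list of points.
        return tuple(sum(r[d] for r in rows) / len(rows) for d in range(dims))

    # Group points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    k = len(clusters)
    if k < 2:
        raise ValueError("at least two clusters are required")

    centers = {label: centroid(rows) for label, rows in clusters.items()}
    grand_center = centroid(points)

    # E_1: scatter of the whole data set around a single center.
    e_1 = sum(math.dist(p, grand_center) for p in points)
    # E_K: within-cluster scatter (smaller means more compact clusters).
    e_k = sum(math.dist(p, centers[lab]) for p, lab in zip(points, labels))
    if e_k == 0:
        raise ValueError("within-cluster scatter is zero; index is undefined")

    # D_K: largest separation between any two cluster centers.
    center_list = list(centers.values())
    d_k = max(
        math.dist(a, b)
        for i, a in enumerate(center_list)
        for b in center_list[i + 1:]
    )

    return ((1.0 / k) * (e_1 / e_k) * d_k) ** 2
```

Under this assumed form, the 1/K factor penalizes a large number of clusters, E_1/E_K rewards compactness, and D_K rewards separation between at least two clusters, which matches the three effects named in the abstract. In use, one would evaluate the index on the partitions produced for each candidate K and keep the K with the largest value.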
Pages: 487-501
Number of pages: 15
Related papers
17 in total
[1]  
Anderberg M.R., 1973, Probability and Mathematical Statistics
[2]  
[Anonymous], 1989, GENETIC ALGORITHM SE
[3]   Some new indexes of cluster validity [J].
Bezdek, JC ;
Pal, NR .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 1998, 28 (03) :301-315
[4]  
BRADLEY PS, 1998, SCALING EM EXPECTAIO
[5]   CLUSTER SEPARATION MEASURE [J].
DAVIES, DL ;
BOULDIN, DW .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1979, 1 (02) :224-227
[6]  
Devijver P., 1982, PATTERN RECOGN
[7]   CLUSTERING TECHNIQUES - USERS DILEMMA [J].
DUBES, R ;
JAIN, AK .
PATTERN RECOGNITION, 1976, 8 (04) :247-260
[8]  
Dunn J.C., 1973, J CYBERNETICS, V3, P32, DOI 10.1080/01969727308546046
[9]   The use of multiple measurements in taxonomic problems [J].
Fisher, RA .
ANNALS OF EUGENICS, 1936, 7 :179-188
[10]  
Hartigan J. A., 1975, CLUSTERING ALGORITHM