A fuzzy extension of the Rand index and other related indexes for clustering and classification assessment

Cited by: 155
Author
Campello, R. J. G. B. [1]
Affiliation
[1] Univ Sao Paulo, SCC, ICMC, BR-13560970 Sao Carlos, SP, Brazil
Funding
São Paulo Research Foundation, Brazil;
Keywords
fuzzy clustering; fuzzy classification; external validity indexes; Rand index; adjusted Rand index; Jaccard coefficient; Minkowski measure; Fowlkes-Mallows index; Gamma statistics;
DOI
10.1016/j.patrec.2006.11.010
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A fuzzy extension of the Rand index [Rand, W.M., 1971. Objective criteria for the evaluation of clustering methods. J. Amer. Statist. Assoc. 846-850] is introduced in this paper. The Rand index is a traditional criterion for assessing and comparing the results provided by classifiers and clustering algorithms. It measures the quality of different hard partitions of a data set from a classification perspective, including partitions with different numbers of classes or clusters. The original Rand index is extended here so that it can evaluate a fuzzy partition of a data set (provided by a fuzzy clustering algorithm or a classifier with fuzzy-like outputs) against a reference hard partition that encodes the actual (known) data classes. A theoretical formulation based on formal concepts from fuzzy set theory is derived and used as a basis for the mathematical interpretation of the proposed Fuzzy Rand Index. The fuzzy counterparts of five other related indexes, namely the Adjusted Rand Index of Hubert and Arabie, the Jaccard coefficient, the Minkowski measure, the Fowlkes-Mallows Index, and the Γ statistics, are also derived from this formulation. © 2006 Elsevier B.V. All rights reserved.
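For orientation, the following is a minimal sketch of the classical (hard) Rand index that the paper takes as its starting point, together with the hard Jaccard coefficient and Fowlkes-Mallows index computed from the same pair counts. It is not the fuzzy extension proposed in the paper, whose formulation is not given in the abstract. The function names and the toy labels are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pair_counts(reference, partition):
    """Count object pairs by agreement between two hard labelings.

    a: same class in reference, same cluster in partition
    b: same class in reference, different clusters in partition
    c: different classes in reference, same cluster in partition
    d: different classes in reference, different clusters in partition
    """
    a = b = c = d = 0
    for i, j in combinations(range(len(reference)), 2):
        same_ref = reference[i] == reference[j]
        same_part = partition[i] == partition[j]
        if same_ref and same_part:
            a += 1
        elif same_ref:
            b += 1
        elif same_part:
            c += 1
        else:
            d += 1
    return a, b, c, d


def rand_index(reference, partition):
    # Fraction of pairs on which the two labelings agree.
    a, b, c, d = pair_counts(reference, partition)
    return (a + d) / (a + b + c + d)


def jaccard_coefficient(reference, partition):
    # Like the Rand index, but ignores pairs separated in both labelings.
    a, b, c, _ = pair_counts(reference, partition)
    return a / (a + b + c)


def fowlkes_mallows_index(reference, partition):
    # Geometric mean of pairwise precision and recall.
    a, b, c, _ = pair_counts(reference, partition)
    return a / sqrt((a + b) * (a + c))


# Hypothetical toy data: 2 known classes versus a 3-cluster result.
reference = [0, 0, 1, 1]
partition = [0, 0, 1, 2]
print(pair_counts(reference, partition))  # (1, 1, 0, 4)
print(rand_index(reference, partition))   # 5/6
```

The toy data uses a reference with two classes and a partition with three clusters, which illustrates the abstract's point that the index can compare partitions with different numbers of groups.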
Pages: 833-841
Number of pages: 9