Rough Cluster Quality Index Based on Decision Theory

Cited by: 89
Authors
Lingras, Pawan [1 ]
Chen, Min [2 ]
Miao, Duoqian [2 ]
Affiliations
[1] St Marys Univ, Dept Math & Comp Sci, Halifax, NS B3H 3C3, Canada
[2] Tongji Univ, Elect & Informat Engn Dept, Shanghai 201804, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Cluster validity; decision theory; loss functions; rough-set-based clustering; k-means clustering; WEB;
DOI
10.1109/TKDE.2008.236
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The quality of clustering is an important issue in the application of clustering techniques. Most traditional cluster validity indices are geometry-based cluster quality measures. This paper proposes a cluster validity index based on the decision-theoretic rough set model by considering various loss functions. Experiments with synthetic, standard, and real-world retail data show the usefulness of the proposed validity index for the evaluation of rough and crisp clustering. The measure is shown to help determine the optimal number of clusters, as well as an important parameter called the threshold in rough clustering. The experiments with a promotional campaign for the retail data illustrate the ability of the proposed measure to incorporate financial considerations in evaluating the quality of a clustering scheme. This ability to deal with monetary values distinguishes the proposed decision-theoretic measure from other distance-based measures. The proposed validity index can also be extended for evaluating other clustering algorithms such as fuzzy clustering.
Pages: 1014-1026
Page count: 13
Related papers
45 in total
[1]  
[Anonymous], P 6 INT C SOFT COMP
[2]   Rough support vector clustering [J].
Asharaf, S ;
Shevade, SK ;
Murty, MN .
PATTERN RECOGNITION, 2005, 38 (10) :1779-1783
[3]   Rough fuzzy MLP: Knowledge encoding and classification [J].
Banerjee, M ;
Mitra, S ;
Pal, SK .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1998, 9 (06) :1203-1216
[4]  
Bezdek J. C., 1981, Pattern recognition with fuzzy objective function algorithms
[5]   Some new indexes of cluster validity [J].
Bezdek, JC ;
Pal, NR .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 1998, 28 (03) :301-315
[6]  
Caliński T., 1974, Communications in Statistics-theory and Methods, V3, P1, DOI 10.1080/03610927408827101
[7]   CLUSTER SEPARATION MEASURE [J].
DAVIES, DL ;
BOULDIN, DW .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1979, 1 (02) :224-227
[8]  
Dunn J.C., 1974, J CYBERNETICS, V3, P95, DOI 10.1080/01969727408546059
[9]  
Falkenauer E., 1998, GENETIC ALGORITHMS G
[10]  
Halkidi M, 2002, SIGMOD REC, V31, P19, DOI 10.1145/601858.601862