Inappropriateness of the criterion of k-way normalized cuts for deciding the number of clusters

被引:2
作者
Nagai, Ayumu [1 ]
机构
[1] Gunma Univ, Dept Comp Sci, Gunma 3768515, Japan
关键词
clustering; spectral clustering; number of clusters; cluster validation;
D O I
10.1016/j.patrec.2007.05.020
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Spectral clustering is a completely different algorithm from other existing clustering algorithms in that it relies on a linear algebraic approach including spectral decomposition. Normalized Cuts is a representative algorithm of spectral clustering. It incorporates a criterion for deciding the number k of clusters to partition. This paper shows that the criterion is not appropriate for deciding k. We showed this by proving that the optimal bipartition (that is, when k = 2) becomes the optimal clustering. Namely, based on the criterion, the evaluation becomes better when k is small. We also show that the criterion is inappropriate for comparing approximate solutions with various k. Especially we prove that a bipartition which surpasses the best given approximate solution H-k can be constructed from H-k within the time complexity at most O(k(3)), where k is the number of clusters contained in H-k. Based on these two reasons, the Normalized Cuts Criterion is not appropriate for deciding k. An alternative criterion is necessary. (C) 2007 Elsevier B.V. All rights reserved.
引用
收藏
页码:1981 / 1986
页数:6
相关论文
共 20 条
[1]   NEW LOOK AT STATISTICAL-MODEL IDENTIFICATION [J].
AKAIKE, H .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1974, AC19 (06) :716-723
[2]  
[Anonymous], 2004, P 21 INT C MACHINE L
[3]  
[Anonymous], SPSS 13 0 STAT PROCE
[4]  
Cheeseman P.C., 1996, ADV KNOWLEDGE DISCOV, V180, P153, DOI https://doi.org/10.5555/257938.257954
[5]  
Cour T, 2005, PROC CVPR IEEE, P1124
[6]  
DHILLON IS, 2001, INT C KNOW DISC DAT, P269
[7]  
Dhillon IS, 2004, P 10 ACM SIGKDD INT, P551, DOI [DOI 10.1145/1014052.1014118, 10.1145/1014052.1014118]
[8]   A min-max cut algorithm for graph partitioning and data clustering [J].
Ding, CHQ ;
He, XF ;
Zha, HY ;
Gu, M ;
Simon, HD .
2001 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2001, :107-114
[9]   NEW SPECTRAL METHODS FOR RATIO CUT PARTITIONING AND CLUSTERING [J].
HAGEN, L ;
KAHNG, AB .
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 1992, 11 (09) :1074-1085
[10]   On clusterings: Good, bad and spectral [J].
Kannan, R ;
Vempala, S ;
Vetta, A .
JOURNAL OF THE ACM, 2004, 51 (03) :497-515