Consistency of spectral clustering

被引:331
作者
von Luxburg, Ulrike [1 ]
Belkin, Mikhail [2 ]
Bousquet, Olivier [3 ]
机构
[1] Max Planck Inst Biol Cybernet, D-72076 Tubingen, Germany
[2] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
[3] Pertinence, F-75002 Paris, France
关键词
spectral clustering; graph Laplacian; consistency; convergence of eigenvectors;
D O I
10.1214/009053607000000640
中图分类号
O21 [概率论与数理统计]; C8 [统计学];
学科分类号
020208 ; 070103 ; 0714 ;
摘要
Consistency is a key property of all statistical procedures analyzing randomly sampled data. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of the popular family of spectral clustering algorithms, which clusters the data with the help of eigenvectors of graph Laplacian matrices. We develop new methods to establish that, for increasing sample size, those eigenvectors converge to the eigenvectors of certain limit operators. As a result, we can prove that one of the two major classes of spectral clustering (normalized clustering) converges under very general conditions, while the other (unnormalized clustering) is only consistent under strong additional assumptions, which are not always satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized spectral clustering.
引用
收藏
页码:555 / 586
页数:32
相关论文
共 51 条
[41]   PARTITIONING SPARSE MATRICES WITH EIGENVECTORS OF GRAPHS [J].
POTHEN, A ;
SIMON, HD ;
LIOU, KP .
SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 1990, 11 (03) :430-452
[42]  
Shawe-Taylor J, 2002, LECT NOTES ARTIF INT, V2533, P23
[43]   Normalized cuts and image segmentation [J].
Shi, JB ;
Malik, J .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2000, 22 (08) :888-905
[44]   Spectral partitioning works: Planar graphs and finite element meshes [J].
Spielman, DA ;
Teng, SH .
37TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS, 1996, :96-105
[45]  
van der Vaart A., 1996, WEAK CONVERGENCE EMP
[46]   AN IMPROVED SPECTRAL BISECTION ALGORITHM AND ITS APPLICATION TO DYNAMIC LOAD BALANCING [J].
VANDRIESSCHE, R ;
ROOSE, D .
PARALLEL COMPUTING, 1995, 21 (01) :29-48
[47]  
VONLUXBURG U, 2007, IN PRESS STAT COMPUT, V17
[48]  
VONLUXBURG U, 2004, THESIS TECHNICAL U B
[49]  
Weiss Y., 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision, P975, DOI 10.1109/ICCV.1999.790354
[50]  
WILLIAMS CKI, 2000, P 17 INT C MACH LEAR, P1159