Spectral redemption in clustering sparse networks

被引:428
作者
Krzakala, Florent [1 ,2 ]
Moore, Cristopher [3 ]
Mossel, Elchanan [4 ]
Neeman, Joe [4 ]
Sly, Allan [4 ]
Zdeborova, Lenka [5 ,6 ]
Zhang, Pan [1 ,3 ]
机构
[1] Ecole Super Phys & Chim Ind Ville Paris, F-75005 Paris, France
[2] Ecole Normale Super, F-75005 Paris, France
[3] Santa Fe Inst, Santa Fe, NM 87501 USA
[4] Univ Calif Berkeley, Berkeley, CA 94720 USA
[5] Commissariat Energie Atom Saclay, Inst Phys Theor, F-91190 Gif Sur Yvette, France
[6] CNRS, Unite Rech Associee 2306, F-91190 Gif Sur Yvette, France
基金
美国国家科学基金会; 欧洲研究理事会;
关键词
RANDOM GRAPHS; STOCHASTIC BLOCKMODELS; INFORMATION-FLOW; EIGENVALUE; COMMUNITY;
D O I
10.1073/pnas.1312486110
中图分类号
O [数理科学和化学]; P [天文学、地球科学]; Q [生物科学]; N [自然科学总论];
学科分类号
07 ; 0710 ; 09 ;
摘要
Spectral algorithms are classic approaches to clustering and community detection in networks. However, for sparse networks the standard versions of these algorithms are suboptimal, in some cases completely failing to detect communities even when other algorithms such as belief propagation can do so. Here, we present a class of spectral algorithms based on a nonbacktracking walk on the directed edges of the graph. The spectrum of this operator is much better-behaved than that of the adjacency matrix or other commonly used matrices, maintaining a strong separation between the bulk eigenvalues and the eigenvalues relevant to community structure even in the sparse case. We show that our algorithm is optimal for graphs generated by the stochastic block model, detecting communities all of the way down to the theoretical limit. We also show the spectrum of the nonbacktracking operator for some real-world networks, illustrating its advantages over traditional spectral clustering.
引用
收藏
页码:20935 / 20940
页数:6
相关论文
共 37 条
[1]  
Adamic Lada A., 2005, P 3 INT WORKSHOP LIN, P36, DOI DOI 10.1145/1134271.1134277
[2]   Non-backtracking random walks mix faster [J].
Alon, Noga ;
Benjamini, Itai ;
Lubetzky, Eyal ;
Sodin, Sasha .
COMMUNICATIONS IN CONTEMPORARY MATHEMATICS, 2007, 9 (04) :585-603
[3]  
Amini AA, 2012, ARXIV12072340
[4]  
Angel Omer, 2007, ARXIV07120192
[5]   Efficient and principled method for detecting communities in networks [J].
Ball, Brian ;
Karrer, Brian ;
Newman, M. E. J. .
PHYSICAL REVIEW E, 2011, 84 (03)
[6]  
Bass Hyman, 1992, International Journal of Mathematics, V3, P717, DOI DOI 10.1142/S0129167X92000357
[7]   A nonparametric view of network models and Newman-Girvan and other modularities [J].
Bickel, Peter J. ;
Chen, Aiyou .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2009, 106 (50) :21068-21073
[8]   The phase transition in inhomogeneous random graphs [J].
Bollobas, Bela ;
Janson, Svante ;
Riordan, Oliver .
RANDOM STRUCTURES & ALGORITHMS, 2007, 31 (01) :3-122
[9]   Graph Partitioning via Adaptive Spectral Techniques [J].
Coja-Oghlan, Amin .
COMBINATORICS PROBABILITY & COMPUTING, 2010, 19 (02) :227-284
[10]   A Spectral Approach to Analysing Belief Propagation for 3-Colouring [J].
Coja-Oghlan, Amin ;
Mossel, Elchanan ;
Vilenchik, Dan .
COMBINATORICS PROBABILITY & COMPUTING, 2009, 18 (06) :881-912