Hypergraph Spectral Clustering in the Weighted Stochastic Block Model

Cited by: 60
Authors
Ahn, Kwangjun [1]
Lee, Kangwook [2]
Suh, Changho [2]
Affiliations
[1] Korea Adv Inst Sci & Technol, Dept Math Sci, Daejeon 34141, South Korea
[2] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 34141, South Korea
Funding
National Research Foundation, Singapore
Keywords
Clustering; hypergraph clustering; information-theoretic limits; stochastic block model; subspace clustering; COMMUNITY DETECTION; CONSISTENCY; ALGORITHM; RECOVERY;
DOI
10.1109/JSTSP.2018.2837638
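Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
080906 [Electromagnetic information functional materials and structures]; 082806 [Agricultural information and electrical engineering]
Abstract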
Spectral clustering is a celebrated algorithm that partitions objects based on pairwise similarity information. While this approach has been successfully applied to a variety of domains, it comes with limitations: in many other applications, only multiway similarity measures are available. This motivates us to explore the multiway measurement setting. In this paper, we develop two algorithms intended for this setting: hypergraph spectral clustering (HSC) and hypergraph spectral clustering with local refinement (HSCLR). Our main contribution lies in the performance analysis of these polynomial-time algorithms under a random hypergraph model, which we name the weighted stochastic block model, in which objects and multiway measures are modeled as nodes and weights of hyperedges, respectively. Denoting by n the number of nodes, our analysis reveals the following: 1) HSC outputs a partition which is better than a random guess if the sum of edge weights (to be explained later) is Ω(n); 2) HSC outputs a partition which coincides with the hidden partition except for a vanishing fraction of nodes if the sum of edge weights is ω(n); and 3) HSCLR exactly recovers the hidden partition if the sum of edge weights is on the order of n log n. Our results improve upon the state of the art recently established under the model, and they are the first to settle the orderwise optimal results for the binary edge weight case. Moreover, we show that our results lead to efficient sketching algorithms for subspace clustering, a computer vision application. Finally, we show that HSCLR achieves the information-theoretic limits for a special yet practically relevant model, thereby showing that there is no computational barrier for that case.
Pages: 959-974
Page count: 16