Community detection via heterogeneous interaction analysis

Cited by: 154
Authors
Tang, Lei [1]
Wang, Xufei [2]
Liu, Huan [2]
Affiliations
[1] Yahoo Labs Silicon Valley, Santa Clara, CA 95054 USA
[2] Arizona State Univ, Dept Comp Sci & Engn, Tempe, AZ 85287 USA
Keywords
Community detection; Heterogeneous interactions; Network integration; Multi-dimensional networks; Social media; NETWORKS; SETS
DOI
10.1007/s10618-011-0231-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The pervasiveness of Web 2.0 and social networking sites has enabled people to interact with each other easily through various social media. For instance, popular sites like Del.icio.us, Flickr, and YouTube allow users to comment on shared content (bookmarks, photos, videos), and users can tag their favorite content. Users can also connect with one another, and subscribe to or become a fan or a follower of others. These diverse activities result in a multi-dimensional network among actors, forming group structures with group members sharing similar interests or affiliations. This work systematically addresses two challenges. First, it is challenging to effectively integrate interactions over multiple dimensions to discover hidden community structures shared by heterogeneous interactions. We show that representative community detection methods for single-dimensional networks can be presented in a unified view. Based on this unified view, we present and analyze four possible integration strategies to extend community detection from single-dimensional to multi-dimensional networks. In particular, we propose a novel integration scheme based on structural features. Another challenge is the evaluation of different methods without ground truth information about community membership. We employ a novel cross-dimension network validation (CDNV) procedure to compare the performance of different methods. We use synthetic data to deepen our understanding, and real-world data to compare integration strategies as well as baseline methods on a large scale. We further study the computational time of different methods, the effect of normalization during integration, sensitivity to related parameters, and alternative community detection methods for integration.
Pages: 1-33
Number of pages: 33