Friendship Prediction and Homophily in Social Media

Cited by: 288
Authors
Aiello, Luca Maria [1]
Barrat, Alain [2, 3]
Schifanella, Rossano [1]
Cattuto, Ciro
Markines, Benjamin [4]
Menczer, Filippo [4]
Affiliations
[1] Univ Turin, I-10124 Turin, Italy
[2] Aix Marseille Univ, Marseille, France
[3] Univ Sud Toulon, Toulon, France
[4] Indiana Univ, Bloomington, IN 47405 USA
Funding
National Science Foundation (United States)
Keywords
Algorithms; Experimentation; Measurement; Social media; folksonomies; collaborative tagging; social network; homophily; link prediction; topical similarity; maximum information path; networks; patterns
DOI
10.1145/2180861.2180866
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Social media have attracted considerable attention because their open-ended nature allows users to create lightweight semantic scaffolding to organize and share content. To date, the interplay of the social and topical components of social media has been only partially explored. Here, we study the presence of homophily in three systems that combine tagging social media with online social networks. We find a substantial level of topical similarity among users who are close to each other in the social network. We introduce a null model that preserves user activity while removing local correlations, allowing us to disentangle the actual local similarity between users from statistical effects due to the assortative mixing of user activity and centrality in the social network. This analysis suggests that users with similar interests are more likely to be friends, and therefore topical similarity measures among users based solely on their annotation metadata should be predictive of social links. We test this hypothesis on several datasets, confirming that social networks constructed from topical similarity capture actual friendship accurately. When combined with topological features, topical similarity achieves a link prediction accuracy of about 92%.
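The hypothesis stated in the abstract, that topical similarity computed from annotation metadata alone should predict social links, can be illustrated with a minimal sketch. This record does not specify the similarity measures the authors used, so cosine similarity over per-user tag counts is swapped in here as a simple stand-in. The user names, tag profiles, and the helper names `cosine` and `rank_pairs` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two tag-count profiles (dict-like)."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    # Users with no tags have no defined direction; treat them as dissimilar.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_pairs(tag_profiles):
    """Return all user pairs as (score, u, v), most similar first."""
    scores = [
        (cosine(tag_profiles[u], tag_profiles[v]), u, v)
        for u, v in combinations(sorted(tag_profiles), 2)
    ]
    return sorted(scores, reverse=True)


# Hypothetical toy data: how often each user applied each tag.
profiles = {
    "alice": Counter({"jazz": 3, "vinyl": 1}),
    "bob": Counter({"jazz": 2, "vinyl": 2}),
    "carol": Counter({"hiking": 4}),
}

ranked = rank_pairs(profiles)
for score, u, v in ranked:
    print(f"{u}-{v}: {score:.3f}")
```

Under the hypothesis, pairs near the top of such a ranking would be the most likely candidates for a friendship link. A real evaluation would compare the ranking against the observed social network, which this sketch does not attempt.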
Pages: 33