Applying Link-Based Classification to Label Blogs

Cited by: 19
Authors
Bhagat, Smriti [1]
Cormode, Graham [1]
Rozenbaum, Irina [1]
Affiliations
[1] Rutgers State Univ, Piscataway, NJ 08855 USA
Source
ADVANCES IN WEB MINING AND WEB USAGE ANALYSIS | 2009 / Vol. 5439
Keywords
Graph labeling; Relational learning; Social networks
DOI
10.1145/1348549.1348560
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In analyzing data from social and communication networks, we encounter the problem of classifying objects where there is explicit link structure amongst the objects. We study the problem of inferring the classification of all the objects from a labeled subset, using only link-based information between objects. We abstract the above as a labeling problem on multigraphs with weighted edges. We present two classes of algorithms, based on local and global similarities. Then we focus on multigraphs induced by blog data, and carefully apply our general algorithms to specifically infer labels such as age, gender and location associated with the blog based only on the link-structure amongst them. We perform a comprehensive set of experiments with real, large-scale blog data sets and show that significant accuracy is possible from little or no non-link information, and our methods scale to millions of nodes and edges.
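To make the idea of link-only labeling concrete, the following is a minimal illustrative sketch of one simple local approach: weighted neighbor voting on a multigraph. It is not the authors' algorithm. The function name `propagate_labels`, the `(u, v, weight)` edge format, the undirected treatment of links, and the tie-breaking rule are all assumptions made for this example.

```python
from collections import defaultdict


def propagate_labels(edges, seed_labels, max_rounds=10):
    """Infer labels for unlabeled nodes by weighted neighbor voting.

    edges:       iterable of (u, v, weight); parallel edges are allowed
    seed_labels: dict mapping some nodes to their known labels
    """
    # Collapse the multigraph: parallel edges between the same pair of
    # nodes are merged by summing their weights.
    neighbors = defaultdict(lambda: defaultdict(float))
    for u, v, w in edges:
        neighbors[u][v] += w
        neighbors[v][u] += w  # assumption: links are treated as undirected

    labels = dict(seed_labels)
    for _ in range(max_rounds):
        new_labels = {}
        for node in neighbors:
            if node in labels:
                continue  # seed and already-inferred labels are kept fixed
            votes = defaultdict(float)
            for nbr, w in neighbors[node].items():
                if nbr in labels:
                    votes[labels[nbr]] += w
            if votes:
                # Highest total weight wins; sorting makes ties deterministic.
                new_labels[node] = max(sorted(votes), key=votes.get)
        if not new_labels:
            break  # nothing new could be labeled in this round
        labels.update(new_labels)
    return labels


# Hypothetical toy data: two parallel links a-b, and location labels as seeds.
edges = [("a", "b", 1.0), ("a", "b", 2.0), ("b", "c", 1.0),
         ("c", "d", 1.0), ("a", "c", 0.5)]
result = propagate_labels(edges, {"a": "NJ", "d": "NY"})
```

In this toy graph, node `b` inherits the label of `a` through their two parallel links, while node `c` follows `d` because that link outweighs its weaker link to `a`. A global method would instead compare nodes by the similarity of their whole neighborhoods rather than by direct neighbors only.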
Pages: 97-117
Page count: 21