Mining Microblog User Interest Based on a User-Topic Model (in English)

Cited by: 16
Authors
He Li (何力) [1]
Jia Yan (贾焰) [1]
Han Weihong (韩伟红) [1]
Ding Zhaoyun (丁兆云) [2]
Affiliations
[1] School of Computer Science, National University of Defense Technology
[2] College of Information and Management, National University of Defense Technology
Keywords
microblogs; topic mining; user interest; LDA; user-topic model
DOI
Not available
Chinese Library Classification (CLC) number
TP393.092; TP391.1 [text information processing]
Discipline classification codes
080402; 081203; 0835
Abstract
Microblogs have become an important platform for people to publish and transmit information and to acquire knowledge. This paper focuses on the problem of discovering user interest in microblogs. We propose a topic mining model based on Latent Dirichlet Allocation (LDA), named the user-topic model. For each user, interests are divided into two parts according to how the microblogs are generated: original interest and retweet interest. We present a Gibbs sampling implementation for inferring the parameters of our model, which discovers not only a user's original interest but also the user's retweet interest. We then combine original interest and retweet interest to compute interest words for users. Experiments on a dataset of Sina microblogs demonstrate that our model discovers user interest effectively and outperforms existing topic models on this task. We also find that original interest and retweet interest are similar and that the topics of interest contain user labels. The interest words discovered by our model reflect user labels but cover a much broader range.
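As a rough illustration of the Gibbs sampling inference mentioned in the abstract, the following is a minimal sketch of collapsed Gibbs sampling for plain LDA. It is not the paper's user-topic model: the abstract does not give the sampling equations or how original and retweet microblogs are separated, so none of that is reproduced here. The function name `lda_gibbs`, its parameters, and the toy documents are assumptions made for illustration only.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for standard LDA (illustrative sketch only).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (theta, topic_word): per-document topic proportions and
    per-topic word counts.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]             # n(d, k)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n(k, w)
    topic_total = [0] * num_topics                            # n(k)
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalized conditional p(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                # Add the token back under its newly sampled topic.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word


# Toy usage: two tiny "documents" over a four-word vocabulary.
theta, topic_word = lda_gibbs([[0, 1, 0, 1], [2, 3, 2, 3]], num_topics=2, vocab_size=4)
print(theta)
```

In a user-level setting, one common simplification (an assumption here, not the paper's stated method) is to treat all of a user's microblogs as a single document, so that each row of `theta` can be read as that user's topic proportions.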
Pages: 131-144
Page count: 14