Box office prediction based on microblog

Cited by: 72
Authors
Du, Jingfei [1, 2]
Xu, Hua [1]
Huang, Xiaoqiu [1, 2]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
[2] Beijing Univ Posts & Telecommun, Beijing 100876, Peoples R China
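Funding
National Natural Science Foundation of China
Keywords
Box office; Microblog; Social media; Prediction model
DOI
10.1016/j.eswa.2013.08.065
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract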
As the importance and popularity of online social media have become more obvious, more research aims to make use of the information they contain. One important topic is predicting the future with social media. This paper focuses on predicting box office revenue using microblogs. Compared with previous work, which simply uses the count of related microblogs, this paper exploits the information from social media more deeply. Two sets of features are extracted: count-based features and content-based features. For the former, user-level information is exploited, which decreases the influence of garbage microblogs. For the content-based features, a new box-office-oriented semantic classification method is provided to make the features more relevant to box office revenue. Meanwhile, more complex machine learning models such as SVM and neural networks are applied in the prediction method. Our prediction model is more accurate and reliable. With our prediction method, data from Tencent microblog is used to predict the box office of certain movies in China. The results demonstrate the strength of our method and the predictive power of online social media. (C) 2013 Elsevier Ltd. All rights reserved.
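Illustrative sketch (not taken from the paper): the abstract does not say how user information is folded into the count-based features. One simple possibility, assumed here purely for illustration, is to report the number of distinct posting users per day next to the raw microblog count, so that repeated posts from a single account carry less weight. The function name `count_features`, the `(day, user_id)` input layout, and the sample data are all hypothetical.

```python
from collections import defaultdict


def count_features(posts):
    """Compute per-day count features for one movie.

    posts: iterable of (day, user_id) pairs (hypothetical input layout).
    Returns {day: (raw_count, distinct_user_count)}.
    """
    raw = defaultdict(int)      # number of microblogs per day
    users = defaultdict(set)    # distinct posting users per day
    for day, user_id in posts:
        raw[day] += 1
        users[day].add(user_id)
    return {day: (raw[day], len(users[day])) for day in raw}


# Made-up sample: user "u1" posts three times on the same day.
posts = [
    ("2013-01-01", "u1"),
    ("2013-01-01", "u1"),
    ("2013-01-01", "u1"),
    ("2013-01-01", "u2"),
    ("2013-01-02", "u3"),
]
features = count_features(posts)
```

In this sample the first day has four microblogs but only two distinct users, which is the kind of gap a user-aware count could expose. Feature vectors of this sort could then be passed to a regression model such as an SVM; the paper's actual features and model settings are not given in this record.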
Pages: 1680-1689
Number of pages: 10