Sentiment Strength Detection in Short Informal Text

Cited by: 1054
Authors
Thelwall, Mike [1]
Buckley, Kevan [1]
Paltoglou, Georgios [1]
Cai, Di [1]
Kappas, Arvid [2]
Affiliations
[1] Wolverhampton Univ, Sch Comp & Informat Technol, Stat Cybermetr Res Grp, Wolverhampton WV1 1SB, England
[2] Jacobs Univ Bremen, Sch Humanities & Social Sci, D-28759 Bremen, Germany
Source
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY | 2010 / Vol. 61 / Issue 12
Keywords
NEGATIVE AFFECT; OPINIONS; INDEPENDENCE; POLARITY; EMOTION; WORDS
DOI
10.1002/asi.21416
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
A huge number of informal messages are posted every day in social network sites, blogs, and discussion forums. Emotions seem to be frequently important in these texts for expressing friendship, showing social support or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behavior to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviors. This article partly fills this gap with a new algorithm, SentiStrength, to extract sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments and with a lookup table of term sentiment strengths optimized by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1-5. The former, but not the latter, is better than baseline and a wide range of general machine learning approaches.
Pages: 2544-2558
Page count: 15