Sentiment Strength Detection in Short Informal Text

Cited by: 1054
Authors
Thelwall, Mike [1]
Buckley, Kevan [1]
Paltoglou, Georgios [1]
Cai, Di [1]
Kappas, Arvid [2]
Affiliations
[1] Wolverhampton Univ, Sch Comp & Informat Technol, Stat Cybermetr Res Grp, Wolverhampton WV1 1SB, England
[2] Jacobs Univ Bremen, Sch Humanities & Social Sci, D-28759 Bremen, Germany
Source
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY | 2010 / Vol. 61 / Issue 12
Keywords
NEGATIVE AFFECT; OPINIONS; INDEPENDENCE; POLARITY; EMOTION; WORDS
DOI
10.1002/asi.21416
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
A huge number of informal messages are posted every day in social network sites, blogs, and discussion forums. Emotions seem to be frequently important in these texts for expressing friendship, showing social support or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behavior to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviors. This article partly fills this gap with a new algorithm, SentiStrength, to extract sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments and with a lookup table of term sentiment strengths optimized by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1-5. The former, but not the latter, is better than baseline and a wide range of general machine learning approaches.
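The abstract's core idea, a lookup table of term strengths that yields separate positive and negative scores on 1-5 scales, can be illustrated with a minimal Python sketch. This is not the published SentiStrength implementation: the table entries, the strongest-term-wins rule, and the repeated-letter normalization (a stand-in for handling informal spelling) are all assumptions made for illustration.

```python
import re

# Hypothetical term strengths, invented for illustration only.
# Positive terms use 2..5, negative terms use -2..-5; unlisted words are neutral.
TERM_STRENGTHS = {
    "love": 3,
    "great": 3,
    "nice": 2,
    "hate": -4,
    "awful": -4,
    "sad": -3,
}


def collapse_repeats(word: str) -> str:
    """Reduce runs of 3+ identical letters to one, e.g. 'looove' -> 'love'."""
    return re.sub(r"(.)\1{2,}", r"\1", word)


def score(text: str) -> tuple[int, int]:
    """Return (positive, negative) strengths, each on a 1-5 scale."""
    positive, negative = 1, 1  # 1 means no sentiment detected on that scale
    for raw in re.findall(r"[a-z]+", text.lower()):
        strength = TERM_STRENGTHS.get(raw)
        if strength is None:
            # Assumed fallback: retry the lookup after collapsing repeated letters.
            strength = TERM_STRENGTHS.get(collapse_repeats(raw), 0)
        if strength > 0:
            positive = max(positive, strength)  # assumed rule: strongest term wins
        elif strength < 0:
            negative = max(negative, -strength)
    return positive, negative
```

Under these assumptions, `score("I looove you but hate mondays")` reports both scales at once instead of netting them into a single polarity, which matches the abstract's separate accuracy figures for positive and negative emotion.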
Pages: 2544-2558
Number of pages: 15