Sentiment Polarity Detection for Software Development

Cited by: 157
Authors
Calefato, Fabio [1]
Lanubile, Filippo [2]
Maiorano, Federico [2]
Novielli, Nicole [2]
Affiliations
[1] Univ Bari A Moro, Dipartimento Jon, Via Duomo 259, I-74123 Taranto, Italy
[2] Univ Bari A Moro, Dipartimento Informat, Via E Orabona 4, I-70125 Bari, Italy
Keywords
Sentiment Analysis; Communication Channels; Stack Overflow; Word Embedding; Social Software Engineering; REPRESENTATION;
DOI
10.1007/s10664-017-9546-9
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
The role of sentiment analysis is increasingly emerging to study software developers' emotions by mining crowd-generated content within social software engineering tools. However, off-the-shelf sentiment analysis tools have been trained on non-technical domains and general-purpose social media, thus resulting in misclassifications of technical jargon and problem reports. Here, we present Senti4SD, a classifier specifically trained to support sentiment analysis in developers' communication channels. Senti4SD is trained and validated using a gold standard of Stack Overflow questions, answers, and comments manually annotated for sentiment polarity. It exploits a suite of both lexicon- and keyword-based features, as well as semantic features based on word embedding. With respect to a mainstream off-the-shelf tool, which we use as a baseline, Senti4SD reduces the misclassifications of neutral and positive posts as emotionally negative. To encourage replications, we release a lab package including the classifier, the word embedding space, and the gold standard with annotation guidelines.
Pages: 1352-1382
Page count: 31