Measuring praise and criticism: Inference of semantic orientation from association

Cited by: 718
Authors
Turney, PD
Littman, ML
Affiliations
[1] Natl Res Council Canada, Inst Informat Technol, Ottawa, ON K1A 0R6, Canada
[2] Rutgers State Univ, Dept Comp Sci, Piscataway, NJ 08854 USA
Keywords
algorithms; experimentation; semantic orientation; semantic association; web mining; text mining; text classification; unsupervised learning; mutual information; latent semantic analysis
DOI
10.1145/944012.944013
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The evaluative character of a word is called its semantic orientation. Positive semantic orientation indicates praise (e.g., "honest", "intrepid") and negative semantic orientation indicates criticism (e.g., "disturbing", "superfluous"). Semantic orientation varies in both direction (positive or negative) and degree (mild to strong). An automated system for measuring semantic orientation would have application in text classification, text filtering, tracking opinions in online discussions, analysis of survey responses, and automated chat systems (chatbots). This article introduces a method for inferring the semantic orientation of a word from its statistical association with a set of positive and negative paradigm words. Two instances of this approach are evaluated, based on two different statistical measures of word association: pointwise mutual information (PMI) and latent semantic analysis (LSA). The method is experimentally tested with 3,596 words (including adjectives, adverbs, nouns, and verbs) that have been manually labeled positive (1,614 words) and negative (1,982 words). The method attains an accuracy of 82.8% on the full test set, but the accuracy rises above 95% when the algorithm is allowed to abstain from classifying mild words.
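The idea in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formula, so the sketch assumes a common reading of the PMI variant: the orientation score is the summed PMI between a word and the positive paradigm words minus the summed PMI with the negative paradigm words. The counts, the one-word paradigm sets, the smoothing constant `eps`, and every function name are hypothetical and chosen only for illustration; they are not taken from the article.

```python
import math

def pmi(cooc, count_a, count_b, total, eps=0.01):
    """Pointwise mutual information in bits.

    eps is an assumed smoothing constant that avoids log(0)
    when two words never co-occur.
    """
    p_ab = (cooc + eps) / total
    p_a = count_a / total
    p_b = count_b / total
    return math.log2(p_ab / (p_a * p_b))

def semantic_orientation(word, counts, cooc, total, pos_words, neg_words):
    """Association with positive paradigm words minus association
    with negative paradigm words (assumed form of the score)."""
    def assoc(other):
        pair = frozenset((word, other))
        return pmi(cooc.get(pair, 0), counts[word], counts[other], total)
    return sum(assoc(p) for p in pos_words) - sum(assoc(n) for n in neg_words)

def classify(score, threshold=0.0):
    """Return a label, or None to abstain on mild (low-magnitude) scores."""
    if abs(score) <= threshold:
        return None
    return "positive" if score > 0 else "negative"

# Hypothetical toy corpus statistics (document counts), not real data.
TOTAL = 1000
COUNTS = {"honest": 50, "disturbing": 40, "good": 100, "bad": 100}
COOC = {
    frozenset(("honest", "good")): 20,
    frozenset(("honest", "bad")): 2,
    frozenset(("disturbing", "good")): 2,
    frozenset(("disturbing", "bad")): 16,
}
POS, NEG = ["good"], ["bad"]

for w in ("honest", "disturbing"):
    so = semantic_orientation(w, COUNTS, COOC, TOTAL, POS, NEG)
    print(w, round(so, 2), classify(so, threshold=1.0))
```

The sign of the score gives the direction and its magnitude gives the degree. Raising `threshold` makes the classifier abstain on mild words, which is the trade-off the abstract describes between coverage and accuracy.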
Pages: 315-346
Page count: 32