A study of supervised term weighting scheme for sentiment analysis

Cited by: 100
Authors
Deng, Zhi-Hong [1]
Luo, Kun-Hu [1]
Yu, Hong-Liang [1]
Affiliations
[1] Peking Univ, Sch Elect Engn & Comp Sci, Dept Machine Intelligence, Key Lab Machine Percept,Minist Educ, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Sentiment analysis; Term weighting; Supervised learning; Experimentation; Performance;
DOI
10.1016/j.eswa.2013.10.056
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Term weighting is a strategy that assigns weights to terms to improve the performance of sentiment analysis and other text mining tasks. In this paper, we propose a supervised term weighting scheme based on two basic factors, the importance of a term in a document (ITD) and the importance of a term for expressing sentiment (ITS), to improve the performance of sentiment analysis. For ITD, we explore three definitions based on term frequency. Seven statistical functions are then employed to learn the ITS of each term from training documents with category labels. Compared with previous unsupervised term weighting schemes that originated in information retrieval, our scheme can make full use of the available labeling information to assign appropriate weights to terms. We have experimentally evaluated the proposed method against the state-of-the-art method. The experimental results show that our method outperforms that method and produces the best accuracy on two of the three data sets. © 2013 Elsevier Ltd. All rights reserved.
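The two-factor idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's definitions: ITD is taken to be raw term frequency, ITS is taken to be the absolute smoothed log-ratio of a term's document frequency in positive versus negative training documents, and the two factors are combined by multiplication. The abstract does not say which three ITD definitions or seven statistical functions the authors use, and the function names `learn_its` and `weight_document` are hypothetical.

```python
import math
from collections import Counter


def learn_its(docs, labels, smoothing=1.0):
    """Learn a hypothetical ITS score per term from labeled training documents.

    Assumption: ITS is the absolute smoothed log-ratio of the term's document
    frequency in positive (label 1) versus negative (any other label) documents.
    This stands in for the paper's statistical functions, which the abstract
    does not specify.
    """
    pos_df, neg_df = Counter(), Counter()
    n_pos = n_neg = 0
    for tokens, label in zip(docs, labels):
        if label == 1:
            n_pos += 1
            pos_df.update(set(tokens))  # count each term once per document
        else:
            n_neg += 1
            neg_df.update(set(tokens))
    its = {}
    for term in set(pos_df) | set(neg_df):
        p_pos = (pos_df[term] + smoothing) / (n_pos + 2 * smoothing)
        p_neg = (neg_df[term] + smoothing) / (n_neg + 2 * smoothing)
        its[term] = abs(math.log(p_pos / p_neg))
    return its


def weight_document(tokens, its):
    """Weight each term as ITD * ITS, with ITD assumed to be raw term frequency."""
    tf = Counter(tokens)
    return {term: tf[term] * its.get(term, 0.0) for term in tf}


train_docs = [
    ["good", "movie"],
    ["bad", "movie"],
    ["good", "good", "plot"],
    ["bad", "plot"],
]
train_labels = [1, 0, 1, 0]

its = learn_its(train_docs, train_labels)
weights = weight_document(["good", "good", "movie"], its)
```

In this toy data, "movie" appears equally often in both classes, so its assumed ITS is zero and it contributes nothing to the document vector, while "good" is weighted by both its frequency and its class skew. An unsupervised scheme such as tf-idf could not make that distinction, because it never sees the labels.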
Pages: 3506-3513
Number of pages: 8