Exploring Co-Training Strategies for Opinion Detection

Cited by: 15
Author
Yu, Ning [1]
Affiliation
[1] Univ Kentucky, Sch Lib & Informat Sci, Lexington, KY 40506 USA
Keywords
text mining; machine learning; automatic classification; semantic orientation; classification
DOI
10.1002/asi.23111
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
For the last decade or so, sentiment analysis, which aims to automatically identify opinions, polarities, or emotions in user-generated content (e.g., blogs, tweets), has attracted interest from both academic and industrial communities. Most sentiment analysis strategies fall into two categories: lexicon-based and corpus-based approaches. While the latter often requires sentiment-labeled data to build a machine learning model, both approaches need sentiment-labeled data for evaluation. Unfortunately, most data domains lack sufficient quantities of labeled data, especially at the subdocument level. Semisupervised learning (SSL), a machine learning technique that requires only a few labeled examples and can automatically label unlabeled data, is a promising strategy for dealing with insufficient labeled data. Although previous studies have reported promising results from applying various SSL algorithms to sentiment analysis problems, co-training, an SSL algorithm, has attracted little attention for sentiment analysis, largely because of its restrictive assumptions. This study therefore revisits co-training in depth and discusses several co-training strategies for sentiment analysis under a looser assumption. Results suggest that co-training can be more effective than other currently adopted SSL methods for sentiment analysis.
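To make the co-training idea in the abstract concrete, the following is a minimal sketch of a generic two-view co-training loop. It is not the paper's method: the abstract does not say which feature views, classifiers, or selection rules the study uses, so the toy Naive Bayes class `TinyNB`, the helper names `fit_views` and `co_train`, the "opinion"/"fact" labels, the split of each sentence into two token views, and the example data are all assumptions made for illustration.

```python
import math
from collections import Counter


class TinyNB:
    """Toy multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for tokens, y in zip(docs, labels):
            self.counts[y].update(tokens)
        self.vocab = {t for c in self.classes for t in self.counts[c]}
        return self

    def predict_proba(self, tokens):
        n = sum(self.prior.values())
        v = len(self.vocab) + 1  # one extra slot for unseen tokens
        logp = {}
        for c in self.classes:
            total = sum(self.counts[c].values())
            lp = math.log(self.prior[c] / n)
            for t in tokens:
                lp += math.log((self.counts[c][t] + 1) / (total + v))
            logp[c] = lp
        m = max(logp.values())
        z = sum(math.exp(lp - m) for lp in logp.values())
        return {c: math.exp(lp - m) / z for c, lp in logp.items()}


def fit_views(labeled):
    """Train one classifier per view on the current labeled set."""
    ys = [y for _, y in labeled]
    return tuple(TinyNB().fit([x[v] for x, _ in labeled], ys) for v in (0, 1))


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """Each round, each view's classifier labels its most confident pool items."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        for view, clf in enumerate(fit_views(labeled)):
            scored = []
            for i, x in enumerate(pool):
                probs = clf.predict_proba(x[view])
                label = max(probs, key=probs.get)
                scored.append((probs[label], i, label))
            picked = sorted(scored, reverse=True)[:per_round]
            labeled.extend((pool[i], label) for _, i, label in picked)
            taken = {i for _, i, _ in picked}
            pool = [x for i, x in enumerate(pool) if i not in taken]
    clf_a, clf_b = fit_views(labeled)
    return clf_a, clf_b, labeled


# Hypothetical data: each example is (view A tokens, view B tokens).
seed = [
    ((["i", "think"], ["great", "camera"]), "opinion"),
    ((["it", "has"], ["two", "cameras"]), "fact"),
]
unlabeled_pool = [
    (["i", "think"], ["terrible", "battery"]),
    (["it", "has"], ["four", "speakers"]),
    (["we", "feel"], ["great", "camera"]),
    (["they", "sell"], ["two", "cameras"]),
]

clf_a, clf_b, grown = co_train(seed, unlabeled_pool)
for (view_a, view_b), label in grown:
    print(" ".join(view_a + view_b), "->", label)
```

In this sketch each classifier sees only its own view, and an example that one view labels confidently becomes training data for both views in the next round. That lets "we feel", which the view A classifier has never seen, be labeled through the view B evidence "great camera".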
Pages: 2098-2110
Page count: 13