Exploring Co-Training Strategies for Opinion Detection

Cited by: 15

Authors:

Yu, Ning [1]

Affiliations:

[1] Univ Kentucky, Sch Lib & Informat Sci, Lexington, KY 40506 USA

Source:

JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY | 2014 / Vol. 65 / No. 10

Keywords:

text mining; machine learning; automatic classification; SEMANTIC ORIENTATION; CLASSIFICATION

DOI:

10.1002/asi.23111

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

For the last decade or so, sentiment analysis, which aims to automatically identify opinions, polarities, or emotions from user-generated content (e.g., blogs, tweets), has attracted interest from both academic and industrial communities. Most sentiment analysis strategies fall into 2 categories: lexicon-based and corpus-based approaches. While the latter often requires sentiment-labeled data to build a machine learning model, both approaches need sentiment-labeled data for evaluation. Unfortunately, most data domains lack sufficient quantities of labeled data, especially at the subdocument level. Semisupervised learning (SSL), a machine learning technique that requires only a few labeled examples and can automatically label unlabeled data, is a promising strategy to deal with the issue of insufficient labeled data. Although previous studies have shown promising results of applying various SSL algorithms to solve a sentiment-analysis problem, co-training, an SSL algorithm, has not attracted much attention for sentiment analysis largely due to its restricted assumptions. Therefore, this study focuses on revisiting co-training in depth and discusses several co-training strategies for sentiment analysis following a loose assumption. Results suggest that co-training can be more effective than can other currently adopted SSL methods for sentiment analysis.


Pages: 2098-2110

Page count: 13
