Short Text Classification in Twitter to Improve Information Filtering

Cited by: 365

Authors:

Sriram, Bharath [1]

Fuhry, David [1]

Demir, Engin [1]

Ferhatosmanoglu, Hakan [1]

Demirbas, Murat

Affiliations:

[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA

Source:

SIGIR 2010: PROCEEDINGS OF THE 33RD ANNUAL INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH DEVELOPMENT IN INFORMATION RETRIEVAL | 2010

Keywords:

Short text; classification; Twitter; feature selection;

DOI:

10.1145/1835449.1835643

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

In microblogging services such as Twitter, users may become overwhelmed by the raw data. One solution to this problem is the classification of short text messages. Because short texts do not provide sufficient word occurrences, traditional classification methods such as "Bag-Of-Words" have limitations. To address this problem, we propose to use a small set of domain-specific features extracted from the author's profile and text. The proposed approach effectively classifies the text into a predefined set of generic classes such as News, Events, Opinions, Deals, and Private Messages.
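The idea of replacing a sparse bag-of-words vector with a handful of domain-specific features can be sketched as follows. This is only an illustration: the abstract does not list the actual features, so the function name `extract_features`, the feature names, the cue word list, and the regular expressions below are all assumptions made for this example, not the authors' feature set.

```python
import re

# Hypothetical cue lists and patterns, chosen only for illustration;
# they are NOT taken from the paper.
OPINION_WORDS = {"love", "hate", "awesome", "terrible", "best", "worst"}
TIME_EVENT_PATTERN = re.compile(
    r"\b(today|tonight|tomorrow|\d{1,2}\s?(am|pm))\b", re.IGNORECASE
)
DEAL_PATTERN = re.compile(r"[$€£%]")


def extract_features(author: str, text: str) -> dict:
    """Map one short message to a small, fixed set of features.

    Unlike bag-of-words, the number of features does not grow with the
    vocabulary, so very short texts still yield a dense representation.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        # Feature taken from the author's profile (assumed: the account name).
        "author": author,
        # Features taken from the text itself (all assumed for this sketch).
        "starts_with_mention": text.lstrip().startswith("@"),
        "has_opinion_word": any(t in OPINION_WORDS for t in tokens),
        "has_time_event_phrase": bool(TIME_EVENT_PATTERN.search(text)),
        "has_currency_or_percent": bool(DEAL_PATTERN.search(text)),
    }
```

A dictionary like this could then be passed to any standard classifier to predict one of the generic classes (News, Events, Opinions, Deals, Private Messages); for instance, a message beginning with a mention might plausibly lean toward Private Messages, while currency or percent signs might suggest Deals.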

Pages: 841-842

Page count: 2

References (6 entries):

[1]

Altingovde Ismail Sengor, 2008, P SIGIR SING JUL, P861

[2]

[Anonymous], P 30 ANN INT ACM SIG

[3]

[Anonymous], 2009, P 17 ACM SIGSP INT C

[4]

Hu Xia., 2009, Proceedings of the 18th ACM Conference on Information and Knowledge Management, Hong Kong, China, P919

[5]

Java A., 2007, Proceedings of the 9th WebKDD and 1st SNA-KDD Workshop on Web Mining and Social Network Analysis, P56

[6]

Phan Xuan-Hieu, 2008, Proceedings of the 17th international conference on World Wide Web, WWW '08, P91
