A one-class classification approach for bot detection on Twitter

Cited by: 71
Authors
Rodriguez-Ruiz, Jorge [1]
Israel Mata-Sanchez, Javier [2]
Monroy, Raul [3]
Loyola-Gonzalez, Octavio [4]
Lopez-Cuevas, Armando [5]
Affiliations
[1] Tecnol Monterrey, Sch Engn & Sci, Av Carlos Lazo 100, Ciudad De Mexico 01389, Mexico
[2] Tecnol Monterrey, Sch Engn & Sci, Av Eugenio Garza Sada 2501 Sur, Monterrey 64849, NL, Mexico
[3] Tecnol Monterrey, Sch Engn & Sci, Carretera Lego Guadalupe Km 3-5, Atizapan De Zaragoza 52926, Estado De Mexic, Mexico
[4] Tecnol Monterrey, Sch Engn & Sci, Via Atlixcayotl 2301, Puebla 72453, Mexico
[5] Tecnol Monterrey, Sch Engn & Sci, Av Gen Ramon Corona 2514, Zapopan 45138, Jalisco, Mexico
Keywords
Twitter bot detection; Supervised classification; One-class classifiers; Anomaly detection; Social networks; STATISTICAL COMPARISONS; CLASSIFIERS; ENSEMBLE; ACCOUNTS; TWEETS;
DOI
10.1016/j.cose.2020.101715
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Twitter is a popular online social network with hundreds of millions of users, yet a significant share of its accounts are not operated by humans. Approximately 48 million Twitter accounts are managed by automated programs called bots, which represent up to 15% of all accounts. Some bots serve good purposes, such as automatically posting information about news and academic papers, or even providing help during emergencies. Nevertheless, Twitter bots have also been used for malicious purposes, such as distributing malware or influencing public perception of a topic. Existing mechanisms can detect bots on Twitter automatically; however, they rely on examples of known bots to distinguish them from legitimate accounts. As the bot landscape changes, with bot creators using more sophisticated methods to avoid detection, new mechanisms for distinguishing legitimate accounts from bots are needed. In this paper, we propose using one-class classification to enhance Twitter bot detection, as it allows detecting novel bot accounts and requires only examples of legitimate accounts. Our experimental results show that our proposal consistently detects different types of bots with an AUC above 0.89, without requiring prior information about them. (C) 2020 Elsevier Ltd. All rights reserved.
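The idea in the abstract, training on legitimate accounts only and scoring how far a new account deviates from them, can be illustrated with a minimal sketch. This is not the classifier used in the paper (the abstract does not name one); it assumes a simple per-feature z-score model, two hypothetical numeric features per account, and synthetic data generated purely for illustration. AUC is computed as the probability that a bot receives a higher anomaly score than a legitimate account.

```python
import random
import statistics


def fit_one_class(legit_rows):
    """Learn per-feature mean and std from legitimate accounts only."""
    columns = list(zip(*legit_rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 if a feature is constant, to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def anomaly_score(model, row):
    """Sum of squared z-scores: larger means less like the training class."""
    means, stds = model
    return sum(((x - m) / s) ** 2 for x, m, s in zip(row, means, stds))


def auc(positive_scores, negative_scores):
    """Probability that a positive outscores a negative (ties count half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


rng = random.Random(0)  # fixed seed so the sketch is reproducible


def sample(center, count):
    """Synthetic accounts: Gaussian noise around a hypothetical feature center."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(count)]


# Hypothetical features, e.g. (posting rate, follower/friend ratio).
legit_train = sample([5.0, 1.0], 200)   # the only data used for training
legit_test = sample([5.0, 1.0], 100)
bot_test = sample([11.0, 7.0], 100)     # never seen during training

model = fit_one_class(legit_train)
bot_scores = [anomaly_score(model, row) for row in bot_test]
legit_scores = [anomaly_score(model, row) for row in legit_test]
result = auc(bot_scores, legit_scores)
print(f"AUC = {result:.3f}")
```

Because the model never sees a bot during fitting, a bot type that appears later is still flagged as long as its features fall outside the region occupied by legitimate accounts, which is the property the abstract relies on.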
Pages: 14