Hate Speech Detection with Comment Embeddings

Cited by: 337
Authors
Djuric, Nemanja [1]
Zhou, Jing [1]
Morris, Robin [1]
Grbovic, Mihajlo [1]
Radosavljevic, Vladan [1]
Bhamidipati, Narayan [1]
Affiliation
[1] Yahoo Labs, 701 First Ave, Sunnyvale, CA 94089 USA
Source
WWW'15 COMPANION: PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB | 2015
DOI
10.1145/2740908.2742760
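Chinese Library Classification number
TP301 [Theory and methods];
Discipline classification code
080201 [Mechanical manufacturing and automation];
Abstract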
We address the problem of hate speech detection in online user comments. Hate speech, defined as an "abusive speech targeting specific group characteristics, such as ethnicity, religion, or gender", is an important problem plaguing websites that allow users to leave feedback, having a negative impact on their online business and overall user experience. We propose to learn distributed low-dimensional representations of comments using recently proposed neural language models, that can then be fed as inputs to a classification algorithm. Our approach addresses issues of high-dimensionality and sparsity that impact the current state-of-the-art, resulting in highly efficient and effective hate speech detectors.
Pages: 29-30
Number of pages: 2