Us and them: identifying cyber hate on Twitter across multiple protected characteristics

Authors: Pete Burnap; Matthew L Williams
Keywords: cyber hate; hate speech; Twitter; NLP; machine learning
Journal: EPJ Data Science
Publication year: 2016
Publication date: December 2016
Volume: 5
Issue: 1
Full-text size: 1,197 KB
Journal subjects: Computer Appl. in Social and Behavioral Sciences; Socio- and Econophysics, Population and Evolutionary Models; Complexity
Publisher: Springer Berlin Heidelberg
ISSN: 2193-1127

Abstract

Hateful and antagonistic content published and propagated via the World Wide Web has the potential to cause harm and suffering on an individual basis, and to lead to social tension and disorder beyond cyberspace. Despite new legislation aimed at prosecuting those who misuse new forms of communication to post threatening, harassing, or grossly offensive language (cyber hate), and despite large social media companies having committed to protecting their users from harm, cyber hate goes largely unpunished because online public spaces are difficult to police. To support the automatic detection of cyber hate online, specifically on Twitter, we build multiple individual models to classify cyber hate for a range of protected characteristics including race, disability and sexual orientation. We use text parsing to extract typed dependencies, which represent syntactic and grammatical relationships between words and are shown to capture ‘othering’ language, consistently improving machine classification for different types of cyber hate beyond the use of a Bag of Words and known hateful terms. Furthermore, we build a data-driven blended model of cyber hate to improve classification where more than one protected characteristic may be attacked (e.g. race and sexual orientation), contributing to the nascent study of intersectionality in hate crime.
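
As a rough illustration of the feature idea described in the abstract, the sketch below combines Bag of Words tokens with typed-dependency features in a single sparse count vector. It is not the authors' implementation. The function names, the `bow:` and `dep:` prefixes, the example sentence, and the hand-written dependency triples are all assumptions made for illustration; in practice the triples would come from a syntactic parser, and the relation labels used here (`nsubj`, `aux`, `dobj`, `poss`) are assumed to follow Stanford-style naming.

```python
from collections import Counter


def bag_of_words(text):
    """Lowercased whitespace tokens with surrounding punctuation stripped."""
    tokens = (tok.strip(".,!?;:'\"()").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def dependency_features(triples):
    """Render (relation, governor, dependent) triples as string features."""
    return [f"{rel}({gov.lower()},{dep.lower()})" for rel, gov, dep in triples]


def featurize(text, triples):
    """Count unigram and typed-dependency features in one sparse vector."""
    features = Counter(f"bow:{tok}" for tok in bag_of_words(text))
    features.update(f"dep:{feat}" for feat in dependency_features(triples))
    return features


# Hypothetical input: the triples are written by hand here, standing in
# for the output of a dependency parser.
text = "They should leave our town"
triples = [
    ("nsubj", "leave", "They"),
    ("aux", "leave", "should"),
    ("dobj", "leave", "town"),
    ("poss", "town", "our"),
]
vector = featurize(text, triples)
```

A vector of this kind could then be passed to any standard classifier. Features such as `dep:nsubj(leave,they)` tie a pronoun to the action it governs, which is the sort of relational signal that word counts alone do not preserve.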
