Maarten Sap

Risk of Racial Bias in Hate Speech Detection

There are three files:

Founta dataset with dialect extracted

founta_all_dial.csv

Davidson dataset with dialect extracted

davidson_dial.csv

MTurk re-annotations with no/dialect/race priming:

Collected with this MTurk template, sap2019risk_mTurkExperiment.csv contains the annotations from our pilot study, with the following columns:
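The column names are easiest to confirm from the files themselves. A minimal sketch for reading the header row and counting the data rows of any of the three CSV files might look like the following. The helper names `read_header` and `count_rows` are hypothetical, and the sketch assumes the files are UTF-8 encoded, have a header row, and sit in the current working directory.

```python
import csv
from pathlib import Path


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    # Assumption: UTF-8 encoding; change `encoding` if the file differs.
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        return next(reader, [])


def count_rows(csv_path):
    """Count data rows (excluding the header) without loading the whole file."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row
        return sum(1 for _ in reader)


if __name__ == "__main__":
    # Assumption: the three files are in the current working directory.
    file_names = [
        "founta_all_dial.csv",
        "davidson_dial.csv",
        "sap2019risk_mTurkExperiment.csv",
    ]
    for name in file_names:
        path = Path(name)
        if path.exists():
            print(name, read_header(path), count_rows(path))
        else:
            print(name, "not found")
```

Opening the files with `newline=""` lets the `csv` module handle quoted fields that contain line breaks, which can occur in tweet text.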

Citations

Annotators with Attitudes

Download annotated data: annWithAttitudes.tgz

Qual file: annWithAttitudes-Qual.html

Large-scale question: annWithAttitudes-LargeScale.html

Citation

Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi & Noah A. Smith (2022) Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection. NAACL.