Social Bias Frames

Reasoning about Social and Power Implications of Language

What are Social Bias Frames?

Social Bias Frames is a new way of representing the biases and offensiveness that are implied in language. For example, these frames are meant to distill the implication that "women (candidates) are less qualified" behind the statement "we shouldn’t lower our standards to hire more women."

Why did you create Social Bias Frames?

Language has the power to reinforce stereotypes and project social biases onto others, yet most approaches today are limited in how they detect such biases. For example, most hate speech and toxic language detection tools output only yes/no decisions without further explanation, and have been shown to backfire against minority speech.

Social Bias Frames takes an important step toward distilling potential language biases in a much more holistic way, considering a statement's offensiveness and the speaker's intent, as well as explanations of why the implication is biased, drawing on knowledge about social dynamics and stereotypes.

Is there data that I can download?

Yes! We collected the Social Bias Inference Corpus (SBIC) which contains 150k structured annotations of social media posts, covering over 34k implications about a thousand demographic groups.

You can download the SBIC here.
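As a sketch of what working with structured annotations like SBIC's might look like, here is a small in-memory example. The rows and column names below are illustrative assumptions for this sketch, not the corpus's actual file layout or schema:

```python
import pandas as pd

# Illustrative rows in the spirit of SBIC's structured annotations;
# the column names are assumptions, not the corpus's actual schema.
rows = [
    {"post": "we shouldn't lower our standards to hire more women",
     "offensive": "yes", "intentional": "no",
     "targeted_group": "women",
     "implied_statement": "women are less qualified"},
    {"post": "what a lovely day",
     "offensive": "no", "intentional": "no",
     "targeted_group": "", "implied_statement": ""},
]
df = pd.DataFrame(rows)

# Fraction of posts annotated as offensive
offensive_rate = (df["offensive"] == "yes").mean()
print(offensive_rate)  # 0.5
```

The same kind of aggregation over the full corpus is what enables the collective analyses mentioned below.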

Can Social Bias Frames be predicted automatically?

Somewhat. State-of-the-art neural models can predict the offensiveness of a statement reasonably well, but still struggle to generate the correct implied bias. This is a complex task, since correctly predicting a frame requires making several categorical decisions and generating the correct targeted group and implied statement. We hope that future work will consider structured modeling to tackle this challenge.
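One common way to handle such mixed categorical-plus-generative outputs is to linearize the whole frame into a single target sequence for a text generation model. The sketch below illustrates that idea only; the tag names and format are illustrative assumptions, not the paper's actual training setup:

```python
# A sketch (not the paper's actual method) of casting frame prediction
# as conditional text generation: categorical and free-text fields are
# flattened into one target string a language model could be trained to
# generate after reading the post. The [off]/[int]/[grp]/[tgt]/[imp]
# tags are illustrative assumptions.
def linearize_frame(offensive, intentional, group_targeted,
                    targeted_group="", implied_statement=""):
    group = "group" if group_targeted else "individual"
    return (f"[off] {offensive} [int] {intentional} [grp] {group} "
            f"[tgt] {targeted_group} [imp] {implied_statement}")

target = linearize_frame(
    offensive="yes", intentional="no", group_targeted=True,
    targeted_group="women",
    implied_statement="women are less qualified",
)
print(target)
# [off] yes [int] no [grp] group [tgt] women [imp] women are less qualified
```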

Isn't this a little unethical/problematic?

Simply avoiding the issue of online toxicity does not make it go away; tackling the issue requires us to confront online content that may be offensive or disturbing. Of course, there are multiple ethical concerns with such research, all of which we discuss in detail in section 7 of the paper.

Assessing social media content through the lens of Social Bias Frames is important for automatic flagging or AI-augmented writing interfaces, where potentially harmful online content can be analyzed with detailed explanations for users or moderators to consider and verify. In addition, the collective analysis over large corpora can also be insightful for educating people on reducing unconscious biases in their language.

Understanding and explaining why an arguably innocuous statement is potentially unjust requires reasoning about conversational implicatures and commonsense implications with respect to the underlying intent, offensiveness, and power differentials between social groups. Social Bias Frames aims to represent the various pragmatic meanings related to social bias implications by combining categorical and free-text annotations, e.g., that "women are less qualified" is implied by the statement "we shouldn’t lower our standards to hire more women."
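The combination of categorical and free-text fields described above can be sketched as a simple record type. The field names and label values here are illustrative assumptions, not the paper's formal schema:

```python
from dataclasses import dataclass
from typing import Optional

# A minimal sketch of a Social Bias Frame as a record mixing
# categorical decisions with free-text implications; field names
# and label sets are illustrative assumptions.
@dataclass
class SocialBiasFrame:
    post: str
    offensive: str                 # categorical, e.g. "yes" / "no"
    intentional: str               # was the offense likely intended?
    group_targeted: bool           # a group vs. an individual
    targeted_group: Optional[str] = None     # free text, e.g. "women"
    implied_statement: Optional[str] = None  # free-text implication

frame = SocialBiasFrame(
    post="we shouldn’t lower our standards to hire more women",
    offensive="yes",
    intentional="no",
    group_targeted=True,
    targeted_group="women",
    implied_statement="women are less qualified",
)
print(frame.implied_statement)  # women are less qualified
```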

Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith & Yejin Choi (2020).
Social Bias Frames: Reasoning about Social and Power Implications of Language. ACL.

@inproceedings{sap2020socialbiasframes,
   title={Social Bias Frames: Reasoning about Social and Power Implications of Language},
   author={Sap, Maarten and Gabriel, Saadia and Qin, Lianhui and Jurafsky, Dan and Smith, Noah A and Choi, Yejin},
   booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL)},
   year={2020}
}