Maarten Sap

I am an assistant professor at CMU's LTI department with a courtesy appointment in HCII, and a part-time research scientist and AI safety lead at the Allen Institute for AI (AI2). My research focuses on (1) measuring and improving AI systems' social and interactional intelligence, (2) assessing and combating social inequality, safety risks, and socio-cultural biases in human- or AI-generated language, and (3) building narrative language technologies for prosocial outcomes.

I received my PhD from the University of Washington where I was advised by Noah Smith and Yejin Choi.
[bio for talks]

Recent updates:

May 2025 🧑‍💻🏆: Super excited to announce that our paper Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance received the Best Paper Runner-Up award at NAACL 2025. Huge congratulations to Kaitlyn!

April 2025 🏜️🚂: Though I will not be attending NAACL 2025, my students and collaborators will be presenting some exciting papers: Joel Mire on Rejected Dialects: Biases Against African American Language in Reward Models; Akhila Yerukola on NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models; Kaitlyn Zhou on Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance; Xuhui Zhou on AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents.

April 2025 🦞👨🏼‍🏫: Excited to give a talk at the MIT CSAIL NLP seminar on the challenges of socially aware and culturally adaptable LLMs.

March 2025 👩‍💻🤖: It was fun to give a talk at SXSW on How to Be a Smarter AI User to a full room! Read the CNET article here.

January 2025 👨🏼‍🏫🧠: Happy to give a talk on Artificial Social Intelligence at the Cluster of Excellence "Science of Intelligence" (SCIoI) at the Technische Universität Berlin.

January 2025 👨🏼‍🏫📢: I'm happy to be giving a talk at the First Workshop on Multilingual Counterspeech Generation at COLING 2025 (remotely)!

December 2024 🇨🇦⛰️: Excited to be attending my very first NeurIPS conference in Vancouver, BC! I'll be giving a talk at New in ML at 3pm on Tuesday!

[older news]


My research group:

Dan Chechelnitsky

LTI PhD student
co-advised with Chrysoula Zerva

Joel Mire

LTI MLT student

Karina Halevy

LTI PhD student
co-advised with Mona Diab

Jimin Mun

LTI PhD student

Jocelyn Shen

MIT PhD student
co-advised with Cynthia Breazeal

Akhila Yerukola

LTI PhD student

Mingqian Zheng

LTI PhD student
co-advised with Carolyn Rosรฉ

Xuhui Zhou

LTI PhD student


Overarching Research Themes

Themes extracted and images generated with the OpenAI API; there may be inconsistencies.

Ethics and Responsible AI Dynamics

My research group explores the intersection of ethics, accountability, and human-centered AI. We emphasize cultural sensitivity and bias mitigation by evaluating AI systems through a social lens. Key contributions include [Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures](https://arxiv.org/abs/2502.17710), which examines how AI systems handle the nuances of culturally offensive non-verbal communication. Our study [Mitigating Bias in RAG: Controlling the Embedder](https://arxiv.org/abs/2502.17390) shows how controlling the embedding model can mitigate bias in retrieval-augmented generation. Finally, [Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and Preferences](https://arxiv.org/abs/2506.00195) sheds light on how safety guardrails shape users' perceptions of and trust in AI systems.

Understanding Narratives and Empathy

My research group explores the rich complexity of narratives and the role they play in shaping human experiences and AI interactions. Our work investigates how stories evoke empathy and how this can be leveraged in AI models. Key papers include [HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs](https://arxiv.org/abs/2405.17633), which examines the stylistic features that evoke empathy in personal stories. We also study narrative flow in [Quantifying the narrative flow of imagined versus autobiographical stories](https://www.pnas.org/doi/10.1073/pnas.2211715119), which provides a framework for analyzing how imagined and autobiographical stories differ. Lastly, [The Empirical Variability of Narrative Perceptions of Social Media Texts](https://aclanthology.org/2024.emnlp-main.1113/) investigates how perceptions of the same social media narratives vary across readers.

AI Agents and Social Intelligence

My research group explores the development of AI agents with enhanced social intelligence that can interact meaningfully with humans. One focus is how personality traits shape LLM behavior in conversation, as shown in [BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data](http://arxiv.org/abs/2410.16491). We also investigate AI agents in collaborative settings in [Interactive Agents to Overcome Ambiguity in Software Engineering](https://arxiv.org/abs/2502.13069), which examines how interactive agents can resolve ambiguous instructions in software engineering tasks. Finally, [Exploring Big Five Personality and AI Capability Effects in LLM-Simulated Negotiation Dialogues](https://arxiv.org/abs/2506.15928) analyzes how personality traits and AI capability affect the outcomes of LLM-simulated negotiations.

Addressing Toxic Language in AI

My research group explores methodologies for identifying and mitigating toxic language generated or perpetuated by AI systems. We focus on frameworks that not only detect toxicity but also promote healthier communication patterns. One significant contribution is [PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages](https://arxiv.org/abs/2504.04377), which presents a robust mechanism for moderating harmful content across 17 languages. Another important paper, [Counterspeakers' Perspectives: Unveiling Barriers and AI Needs in the Fight against Online Hate](https://arxiv.org/abs/2403.00179), surfaces the practical barriers counterspeakers face and the AI support they need when combating online hate. Furthermore, our research considers context in evaluations, exemplified by [COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements](http://arxiv.org/abs/2306.01985), which develops a nuanced, context-dependent understanding of how offensive statements affect social dynamics.