Maarten Sap

I am an assistant professor at CMU's LTI department with a courtesy appointment in HCII, and a part-time research scientist and AI safety lead at the Allen Institute for AI (AI2). My research focuses on (1) measuring and improving AI systems' social and interactional intelligence, (2) assessing and combatting social inequality, safety risks, and socio-cultural biases in human- or AI-generated language, and (3) building narrative language technologies for prosocial outcomes.

I received my PhD from the University of Washington, where I was advised by Noah Smith and Yejin Choi.
[bio for talks]

Recent updates:

July 2025 πŸ§ πŸ›‘οΈ: Five papers were accepted to COLM 2025! Highlights include HAICOSYSTEM, a framework for sandboxing safety risks in human-AI interaction; ALFA, which aligns LLMs to ask better clinical questions; and PolyGuard, a multilingual moderation tool for unsafe content. Two other papers to be released soon :)

May 2025 πŸ§‘β€πŸ’»πŸ†: Super super excited to announce that our paper Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance received the Best Paper Runner-Up award at NAACL 2025. Huge congratulations to Kaitlyn!

April 2025 πŸœοΈπŸš‚: Though I will not be attending NAACL 2025, my students and collaborators will be presenting some exciting papers: Joel Mire on Rejected Dialects: Biases Against African American Language in Reward Models, Akhila Yerukola on NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models; Kaitlyn Zhou on Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance; Xuhui Zhou on AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents.

April 2025 πŸ¦žπŸ‘¨πŸΌβ€πŸ«: Excited to give a talk at the MIT CSAIL NLP seminar on the challenges of socially aware and culturally adaptable LLMs.

March 2025 πŸ‘©β€πŸ’»πŸ€–: It was fun to give a talk at SxSW on How to Be a Smarter AI User to a full room! Read the CNET article here.

January 2025 πŸ‘¨πŸΌβ€πŸ«πŸ§ : Happy to give a talk in Artificial Social Intelligence at the Cluster of Excellence "Science of Intelligence" (SCIoI) at the Technische UniversitΓ€t Berlin.

January 2025 πŸ‘¨πŸΌβ€πŸ«πŸ“’: I'm happy to be giving a talk at the First Workshop on Multilingual Counterspeech Generation at COLING 2025 (remotely)!

[older news]


My research group:

- Dan Chechelnitsky, LTI PhD student (co-advised with Chrysoula Zerva)
- Joel Mire, LTI MLT student
- Karina Halevy, LTI PhD student (co-advised with Mona Diab)
- Jimin Mun, LTI PhD student
- Jocelyn Shen, MIT PhD student (co-advised with Cynthia Breazeal)
- Vasudha Varadarajan, LTI postdoc
- Akhila Yerukola, LTI PhD student
- Mingqian Zheng, LTI PhD student (co-advised with Carolyn RosΓ©)
- Xuhui Zhou, LTI PhD student


Overarching Research Themes

Themes extracted and images generated with the OpenAI API; there may be inconsistencies.

Ethics and Human-Centered AI

My research group explores the ethical implications of AI technologies and strives for responsible AI deployment. We investigate public attitudes towards AI in papers like [Why (not) use AI? Analyzing People's Reasoning and Conditions for AI Acceptability](https://arxiv.org/abs/2502.07287), which sheds light on the factors that influence AI acceptance. Another line of work, [HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions](http://arxiv.org/abs/2409.16427), introduces a framework for sandboxing and uncovering safety risks in human-AI interactions. Additionally, our research on [Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures](https://arxiv.org/abs/2502.17710) highlights the importance of cultural sensitivity in AI design.

Exploring Narratives and Empathy

My research group explores the intersection of narratives and empathy, focusing on how storytelling shapes human interactions and perceptions. The paper [Quantifying the narrative flow of imagined versus autobiographical stories](https://www.pnas.org/doi/10.1073/pnas.2211715119) shows that imagined and autobiographical stories differ in their narrative flow, offering insight into how memory and imagination shape storytelling. Additionally, our study [HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs](https://arxiv.org/abs/2405.17633) investigates how language models can analyze narrative style to enhance empathetic communication. Another contribution, [Modeling Empathic Similarity in Personal Narratives](https://arxiv.org/abs/2305.14246), examines how shared experiences resonate across personal stories.

Social Intelligence in AI Agents

My research group explores the development of AI agents that exhibit social intelligence and adapt to human interactions. A prominent paper, [Is This the Real Life? Is This Just Fantasy? The Misleading Success of Simulating Social Interactions With LLMs](http://arxiv.org/abs/2403.05020), critically analyzes the limitations of current LLMs in simulating complex social interactions. We also develop frameworks like [SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents](https://arxiv.org/abs/2310.11667), which provides tools for evaluating AI agents' social capabilities. Furthermore, our research on [AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents](https://aclanthology.org/2025.naacl-long.595/) reveals the challenges of keeping LLM agents both useful and truthful in their interactions with users.

Addressing Bias and Discrimination in AI

My research group explores mechanisms to identify and mitigate biases in AI systems, working toward fairness and inclusivity. In [Rejected Dialects: Biases Against African American Language in Reward Models](https://arxiv.org/abs/2502.12858), we document systematic biases against African American Language in reward models and their implications for language technologies. Our work on [Particip-AI: A Democratic Surveying Framework for Anticipating Future AI Use Cases, Harms and Benefits](https://arxiv.org/abs/2403.14791) emphasizes community involvement in anticipating AI's harms and benefits. Lastly, [User-Driven Value Alignment: Understanding Users' Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions](https://arxiv.org/abs/2409.00862) examines how users perceive and respond to biased and discriminatory statements from AI companions.