Maarten Sap

I am an assistant professor at CMU's LTI department with a courtesy appointment in HCII, and a part-time research scientist and AI safety lead at the Allen Institute for AI (AI2). My research focuses on (1) measuring and improving AI systems' social and interactional intelligence, (2) assessing and combatting social inequality, safety risks, and socio-cultural biases in human- or AI-generated language, and (3) building narrative language technologies for prosocial outcomes. I was named a 2025 Packard Fellow and a recipient of the 2025 Okawa Research Award.

I received my PhD from the University of Washington where I was advised by Noah Smith and Yejin Choi.

Recent updates:

December 2025 🏅📃: Very excited to have our paper Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond) selected for a Best Paper Award at NeurIPS 2025 (Datasets and Benchmarks Track)!! Huge congrats to the first author Liwei Jiang!!!

November 2025 💎🚀: Honored to be a Spring 2025 recipient of the Amazon Research Award for our project on measuring AI agentic safety!

October 2025 🏅⭐: I'm super excited and grateful to announce that I'm part of the 2025 class of Packard Fellows. The Packard Foundation and this fellowship will allow me to explore exciting research directions towards culturally responsible and safe AI 🌍🌈

October 2025 🔍🧑‍🎓: Because my lab is already quite full, I'm not looking to take on any new students in this upcoming PhD application cycle 😟.

October 2025 🇨🇦🎉: Excited to be attending COLM 2025 in Montreal this October! I'll be giving a talk at the Social Sim Workshop on Unlocking Social Intelligence in AI agents. I'm also thrilled that five papers I co-authored will be presented by my amazing collaborators at COLM: HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions (led by Xuhui Zhou et al.), ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning (co-led by Jimin Mun et al.), PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages, Fluid Language Model Benchmarking, and The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains.

August 2025 🌟: Incredibly honored to be one of 7 US recipients of the 2025 Okawa Research Grant from the Okawa Foundation!

August 2025 🧑‍🎓: Welcoming my first postdoc, Vasudha Varadarajan, to the lab!



My research group:

Dan Chechelnitsky

CMU Portugal LTI PhD student
co-advised with Chrysoula Zerva

Joel Mire

LTI PhD student

Karina Halevy

LTI PhD student
co-advised with Mona Diab

Malia Morgan

Pre-doctoral Young Investigator at Ai2

Jimin Mun

LTI PhD student

Jocelyn Shen

MIT PhD student
co-advised with Cynthia Breazeal

Kynnedy Smith

HCII PhD student
co-advised with Motahhare Eslami

Vasudha Varadarajan

LTI Postdoc

Akhila Yerukola

LTI PhD student

Mingqian Zheng

LTI PhD student
co-advised with Carolyn Rosé

Xuhui Zhou

LTI PhD student


Overarching Research Themes

Themes extracted and images generated with the OpenAI API; there may be inconsistencies.

Human-centered AI and responsible use

My research group explores how people perceive, trust, and are affected by AI systems in real interactions, with a focus on safety, usability, and responsible deployment. Recent work shows that user-centered framing can meaningfully change reliance and preferences, as in [Framing an AI with Values Reduces AI Reliance in AI-supported Writing Tasks](https://arxiv.org/abs/2605.20512) and [Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance](https://aclanthology.org/2025.naacl-long.556/). We also see stronger attention to vulnerable users and high-stakes settings through [Lost in Delusion: Examining LLM Safety Under User Delusions and Distress](https://arxiv.org/abs/2606.00975) and [Why (not) use AI? Analyzing People's Reasoning and Conditions for AI Acceptability](https://arxiv.org/abs/2502.07287). Together, these papers suggest a shift from abstract model evaluation toward measuring what people actually experience, need, and risk in practice.

Narrative understanding and stories

My research group explores how LLMs and language technologies interpret, generate, and analyze stories, personal narratives, and the social meaning embedded in them. A central thread is narrative intent and reception, highlighted by [Social Story Frames: Contextual Reasoning about Narrative Intent and Reception](https://arxiv.org/abs/2512.15925) and [HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs](https://arxiv.org/abs/2405.17633). Related work on [Modeling Empathic Similarity in Personal Narratives](https://arxiv.org/abs/2305.14246) and [The Empirical Variability of Narrative Perceptions of Social Media Texts](https://aclanthology.org/2024.emnlp-main.1113/) shows that narrative interpretation is subjective, context-sensitive, and shaped by reader perspective. This line of research is increasingly useful for understanding how systems process lived experience, persuasion, empathy, and backstory-dependent language use.

Social intelligence and agent simulations

My research group explores AI agents that can model social behavior, reason about other minds, and participate in believable multi-party interactions. Recent papers such as [OdysSim: Building Foundation Models for Human Behavior Simulation](https://arxiv.org/abs/2606.14199), [When Should AI Read the Room? Public Perceptions of Social Intelligence in AI Agents](https://arxiv.org/abs/2605.29938), and [Reinforcing Human Behavior Simulation via Verbal Feedback](https://arxiv.org/abs/2605.20506) indicate rapid progress toward agents that can simulate and adapt to human social norms. At the same time, work like [Social World Models](https://arxiv.org/abs/2509.00559) and [SOTOPIA-ToM: Evaluating Information Management in Multi-Agent Interaction with Theory of Mind](https://arxiv.org/abs/2605.02307) emphasizes rigorous evaluation of social reasoning, information management, and theory of mind. Overall, the field is moving from isolated dialogue competence toward richer social simulation and agentic coordination.

Cultural bias, values, and ethics

My research group explores how AI systems encode values, reproduce cultural bias, and respond to ethically sensitive or socially diverse contexts. A recent cluster of work includes [NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models](https://aclanthology.org/2025.naacl-long.120/), [EVALUESTEER: Measuring Reward Model Steerability Towards Values and Preference](https://arxiv.org/abs/2510.06370), and [Rejected Dialects: Biases Against African American Language in Reward Models](https://arxiv.org/abs/2502.12858), which together point to the need for models that are both norm-aware and fair across communities. The field is also grappling with social harm and perspective-taking through [PluriHarms: Benchmarking the Full Spectrum of Human Judgments on AI Harm](https://arxiv.org/abs/2601.08951) and [Not Like Us, Hunty: Measuring Perceptions and Behavioral Effects of Minoritized Anthropomorphic Cues in LLMs](https://arxiv.org/abs/2505.05660). These studies suggest that responsible AI now requires measuring harm, adapting to culture, and explicitly accounting for values and identity in model behavior.