Maarten Sap

I am an assistant professor at CMU's LTI department with a courtesy appointment in HCII, and a part-time research scientist and AI safety lead at the Allen Institute for AI (AI2). My research focuses on (1) measuring and improving AI systems' social and interactional intelligence, (2) assessing and combatting social inequality, safety risks, and socio-cultural biases in human- or AI-generated language, and (3) building narrative language technologies for prosocial outcomes. I was named a 2025 Packard Fellow and a recipient of the 2025 Okawa Research Award.

I received my PhD from the University of Washington where I was advised by Noah Smith and Yejin Choi.
[bio for talks]

Recent updates:

December 2025 πŸ…πŸ“ƒ: Very excited to have our paper Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond) selected for a Best Paper Award at NeurIPS 2025 (Datasets and Benchmarks Track)!! Huge congrats to the first author Liwei Jiang!!!

November 2025 πŸ’ŽπŸš€: Honored to be a Spring 2025 recipient of the Amazon Research Award for our project on measuring AI agentic safety!

October 2025 πŸ…β­: I’m super excited and grateful to announce that I'm part of the 2025 class of Packard Fellows. The Packard Foundation and this fellowship will allow me to explore exciting research directions towards culturally responsible and safe AI 🌍🌈

October 2025 πŸ”πŸ§‘β€πŸŽ“: Due to my lab being quite full already, I'm not taking looking for any new students in this upcoming PhD application cycle 😟.

October 2025 πŸ‡¨πŸ‡¦πŸŽ‰: Excited to be attending COLM 2025 in Montreal this October! I'll be giving a talk at the Social Sim Workshop on Unlocking Social Intelligence in AI agents. I'm also thrilled that five papers I co-authored will be presented by my amazing collaborators at COLM: HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions (led by Xuhui Zhou et al.), ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning (co-led by Jimin Mun et al.), PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages, Fluid Language Model Benchmarking, and The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains.

August 2025 🌟: Incredibly honored to be one of 7 US recipients of the 2025 Okawa Research Grant from the Okawa Foundation!

August 2025 πŸ§‘β€πŸŽ“: Welcoming my first postdoc, Vasudha Varadarajan, to the lab!

[older news]


My research group:

Dan Chechelnitsky

CMU Portugal LTI PhD student
co-advised with Chrysoula Zerva

Joel Mire

LTI PhD student

Karina Halevy

LTI PhD student
co-advised with Mona Diab

Jimin Mun

LTI PhD student

Jocelyn Shen

MIT PhD student
co-advised with Cynthia Breazeal

Kynnedy Smith

HCII PhD student
co-advised with Motahhare Eslami

Vasudha Varadarajan

LTI Postdoc

Akhila Yerukola

LTI PhD student

Mingqian Zheng

LTI PhD student
co-advised with Carolyn RosΓ©

Xuhui Zhou

LTI PhD student


Overarching Research Themes

Themes extracted and images generated with the OpenAI API; there may be inconsistencies.

Ethics and Responsible AI

My research group explores the ethical implications of AI systems, focusing on their interactions with users and their societal impacts. Our work 'The Hidden Puppet Master' investigates emotional manipulation by large language models (LLMs) and highlights the importance of transparency and user autonomy in AI interactions. We also introduce 'OpenAgentSafety', a comprehensive framework for evaluating AI agent safety in real-world contexts, to help ensure that AI technology aligns with ethical standards. Furthermore, our paper 'Let Them Down Easy!' provides insights into how LLM guardrails shape user perceptions and can influence trust in AI.

Understanding Narratives and Communication

My research group explores the dynamics of narratives and their effects on interpersonal understanding and communication. Our recent work titled 'Social Story Frames' investigates contextual reasoning about narrative intent and how it shapes reception among diverse audiences. Additionally, 'Words Like Knives' focuses on detecting violent communication through backstory-personalized modeling, underscoring the importance of context in understanding narratives. We also analyze the variability in narrative perceptions in social media texts through 'The Empirical Variability of Narrative Perceptions of Social Media Texts', revealing how public reception influences narrative structure and effectiveness.

Advancing AI Agents and Simulations

My research group explores the development and evaluation of intelligent AI agents, particularly their ability to simulate human-like interactions. In our paper 'Mind the Sim2Real Gap in User Simulation for Agentic Tasks', we discuss the challenges of bridging simulated environments and real-world applications for effective AI performance. We further address safety in our framework 'OpenAgentSafety', which evaluates the risks AI agents pose across a variety of social settings. Moreover, 'SOTOPIA' enables interactive evaluations of social intelligence in language agents, pushing the boundaries of how AI can emulate human connection.

Human-Centered Approaches in AI

My research group explores innovative methodologies to keep AI systems centered on human needs and realities. Our study 'ALFA' investigates how aligning LLMs to ask meaningful questions can enhance clinical reasoning, demonstrating the critical role AI can play in healthcare dialogues. Additionally, we focus on mitigating bias in AI outputs, as presented in 'Mitigating Bias in RAG', where we propose strategies to enhance the accuracy and fairness of AI systems. Our ongoing research also includes 'NormAd', a framework for measuring the cultural adaptability of LLMs, helping ensure that AI systems honor diverse cultural contexts.