Maarten Sap (he/him)

msap2@andrew.cmu.edu https://scholar.google.com/citations?user=gFN4QUYAAAAJ

Positions

| | |
| ------------------------------------------------------------ | -------------: |
| **Carnegie Mellon University**: School of Computer Science | |
| Assistant Professor - *Language Technologies Institute* | 2022 – present |
| Affiliated Faculty - *Human-Computer Interaction Institute* | 2024 – present |
| **Allen Institute for AI** | |
| Visiting Senior Research Scientist & AI Safety Lead | 2026 – present |
| Visiting Research Scientist & AI Safety Lead | 2024 – 2026 |
| Visiting Research Scientist | 2022 – 2024 |
| Postdoctoral Researcher / Young Investigator | 2021 – 2022 |
| Research Intern | 2018 – 2019 |
| **Microsoft Research** | |
| Research Intern | 2019 |

Education

| | |
| ------------------------------------------------------------ | ----------: |
| **University of Washington**: Paul G. Allen School of Computer Science & Engineering | 2015 – 2022 |
| PhD in Computer Science & Engineering, research focus on Natural Language Processing | |
| advised by Yejin Choi & Noah Smith | |
| Thesis: [Positive AI with Social Commonsense Models](pdfs/sap2021positiveAIwithSocialCommonsenseModels.pdf) | |
| **École Polytechnique Fédérale de Lausanne**: School of Computer and Communication Sciences | 2010 – 2014 |
| BS in Communications and Information Systems | |

Advising

### PhD & MLT students

#### Current

| | | |
| ------------------------------------------------------------ | ------------- | --------------------: |
| [Kynnedy Smith](http://kynnedysimone.com/) she/her (*co-advised with [Motahhare Eslami](https://www.motahhare.com/)*) | HCII PhD | 09/2025 – present |
| [Joel Mire](https://joel-mire.github.io/) he/him | LTI PhD | 09/2023 – present |
| [Dan Chechelnitsky](https://chechelnitskd.github.io/) he/him (*co-advised with [Chrysoula Zerva](https://scholar.google.com/citations?user=S5NGkFsAAAAJ&hl=en&oi=ao)*) | LTI PhD Portugal | 09/2024 – present |
| [Mingqian Zheng](https://eeelisa.github.io/) she/her (*co-advised with [Carolyn Rosé](https://www.cs.cmu.edu/~cprose/)*) | LTI PhD | 09/2024 – present |
| [Karina Halevy](https://enscma2.github.io/) she/her (*co-advised with [Mona Diab](https://scholar.google.com.vn/citations?user=-y6SIhQAAAAJ&hl=vi)*) | LTI PhD | 09/2023 – present |
| [Jimin Mun](https://jiminmun.github.io/) she/her | LTI PhD | 09/2022 – present |
| [Akhila Yerukola](https://akhila-yerukola.github.io/) she/her | LTI PhD | 09/2022 – present |
| [Xuhui Zhou](https://xuhuizhou.github.io/) he/him | LTI PhD | 09/2022 – present |

#### Graduated

| | | |
| ------------------------------------------------------------ | ------------- | --------------------: |
| [Jocelyn Shen](https://jocelynshen.com/) she/her (*co-advised with [Cynthia Breazeal](https://www.media.mit.edu/people/cynthiab/overview/)*) | MIT Media Lab | 11/2023 – 05/2026 |

### Research Interns & Research Masters

| | | |
| ------------------------------------------------------------ | ------------------- | --------------------: |
| Minju Hong she/her | CMU Research Intern | 05/2026 – present |
| Anisha Reddy she/her | CMU Research Intern | 05/2026 – present |
| Esther Suh she/her | CMU Research Intern | 05/2026 – present |
| Mikayla Campbell she/her | CMU Research Intern | 06/2025 – 04/2026 |
| [Kaitlyn Zhou](https://cs.stanford.edu/~katezhou/) she/her | AI2 Research Intern | 06/2023 – 06/2025 |
| [Zhe Su](https://bugsz.github.io/) he/him | CMU MSML | 09/2023 – 12/2024 |
| [Ashutosh Baheti](https://abaheti95.github.io/) he/him | AI2 Research Intern | 09/2022 – 07/2024 |
| [Yiming Zhang](https://y0mingzhang.github.io/) he/him (*co-advised with [Sherry Tongshuang Wu]()*) | UChicago MS | 09/2022 – 09/2023 |
| [Athiya Deviyani](https://www.athiyadeviyani.com/) she/her | LTI MSAII | 09/2022 – 09/2023 |
| [Julia Mendelsohn](https://juliamendelsohn.github.io/) she/her | AI2 Research Intern | 06/2022 – 01/2023 |
| [Sebastin Santy](http://sebastinsanty.com/) he/him | AI2 Research Intern | 06/2022 – 01/2023 |

### Undergraduates & Professional Masters

| | | |
| ------------------------------------------------------------ | ------------------- | --------------------: |
| Jenna Godsey she/her | CMU BS | 09/2024 – present |
| Medha Hira she/her | CMU LTI MIIS | 09/2025 – present |
| Keyu He he/him | CMU LTI MIIS | 09/2025 – present |
| Nikita Chaudhari she/her | CMU LTI MSAII | 09/2025 – present |
| Vedika Agarwal she/her | CMU LTI MSAII | 08/2026 – present |
| Mingxi Yan she/her | CMU BS | 03/2026 – present |
| Wenhan Li he/him | CMU LTI MCDS | 01/2026 – present |
| Alan Zhang he/him | CMU LTI MCDS | 01/2026 – present |
| Jishnu Mittapalli he/him | CMU LTI MCDS | 01/2026 – present |
| Kevin Lu he/him | CMU LTI MCDS | 01/2026 – present |
| Yi Dai he/him | CMU LTI MCDS | 01/2026 – present |
| Sihan He he/him | CMU LTI MCDS | 01/2026 – present |
| Jacky Huang he/him | CMU LTI MCDS | 01/2026 – present |
| Emma Bai she/her | CMU LTI MCDS | 01/2026 – present |
| Neel Bhandari he/him | CMU LTI MIIS | 09/2024 – 12/2025 |
| Supriti Vijay she/her | CMU LTI MIIS | 09/2025 – 12/2025 |
| Shihua Zeng he/him | CMU LTI MCDS | 02/2025 – 12/2025 |
| Ruichen Wang she/her | CMU LTI MCDS | 02/2025 – 12/2025 |
| Yashwanth Yerabudala Surendra he/him | CMU LTI MCDS | 02/2025 – 12/2025 |
| Aditi Saini she/her | CMU LTI MCDS | 02/2025 – 12/2025 |
| Krishnaprasad Vijayshankar he/him | CMU LTI MCDS | 02/2025 – 12/2025 |
| Akshita Gupta she/her | CMU LTI MCDS | 02/2025 – 12/2025 |
| Nishoak Kosaraju he/him | CMU LTI MCDS | 02/2025 – 12/2025 |
| Shrey Jain he/him | CMU LTI MCDS | 02/2025 – 12/2025 |
| Sathwik Acharya he/him | CMU LTI MCDS | 02/2025 – 12/2025 |
| James Ding he/him | CMU LTI MCDS | 02/2025 – 12/2025 |
| Tiya Cao she/her | CMU LTI MIIS | 08/2024 – 06/2025 |
| Bruno Neira he/him | CMU BS | 08/2024 – 06/2025 |
| Kshitish Ghate he/him (*co-advised with [Mona Diab](https://scholar.google.com.vn/citations?user=-y6SIhQAAAAJ&hl=vi)*) | CMU LTI MLT | 08/2024 – 06/2025 |
| Sophie Feng she/her | CMU BS | 08/2024 – 06/2025 |
| Wenkai Li he/him (*co-advised with [Mona Diab](https://scholar.google.com.vn/citations?user=-y6SIhQAAAAJ&hl=vi)*) | CMU LTI MIIS | 02/2024 – 06/2025 |
| [Devansh Jain](https://devanshrj.github.io/) he/him | CMU LTI MIIS | 09/2023 – 06/2025 |
| [Priyanshu Kumar](https://scholar.google.com/citations?user=SHQikPwAAAAJ) he/him | CMU LTI MIIS | 09/2023 – 06/2025 |
| Liwen Sun he/him | CMU LTI MIIS | 08/2024 – 12/2024 |
| Zhenxiang Guan he/him | CMU LTI MIIS | 08/2024 – 12/2024 |
| Jiarui Liu he/him (*co-advised with [Mona Diab](https://scholar.google.com.vn/citations?user=-y6SIhQAAAAJ&hl=vi)*) | CMU LTI MLT | 02/2024 – 09/2024 |
| [Abhinav Rao](https://aetherprior.github.io/) he/him | CMU LTI MIIS | 09/2023 – 07/2024 |
| [Vishwa Shah](https://sites.google.com/view/vishwavshah/) she/her | CMU LTI MIIS | 09/2023 – 07/2024 |
| Sanketh Rangreji he/him | CMU LTI MIIS | 09/2023 – 12/2023 |
| Anubha Kabra she/her | CMU LTI MIIS | 09/2023 – 12/2023 |
| Sravani Nanduri she/her *(co-advised with [Liwei Jiang](https://liweijiang.me/))* | UW CSE BS | 09/2021 – 10/2022 |
| [Skyler Hallinan](https://skylerhallinan.com/) he/him | UW CSE BS | 01/2021 – 08/2022 |
| Zhilin Wang he/him | UW CLMS | 01/2021 – 09/2021 |
| Michelle Ma she/her *(co-advised with [Hannah Rashkin](https://hrashkin.github.io/))* | UW CSE BS | 09/2019 – 12/2020 |
| Sam Gehman he/him | UW CSE MS | 09/2019 – 07/2020 |
| Aishwarya Nirmal she/her | UW CSE MS | 01/2018 – 06/2019 |
| Kenta Takatsu he/him | Cornell BS | 07/2018 – 03/2019 |
| [Zachary Horvitz](https://zacharyhorvitz.github.io/) he/him *(co-advised with [Antoine Bosselut](https://atcbosselut.github.io/))* | AI2 Research Intern | 07/2018 – 03/2019 |
| Sarah Yu she/her | UW CSE BS | 03/2018 – 06/2018 |
| Lanhao Wu he/him *(co-advised with [Saadia Gabriel](https://saadiagabriel.com/))* | UW CSE BS | 03/2018 – 06/2018 |
| Boyan Li he/him *(co-advised with [Saadia Gabriel](https://saadiagabriel.com/))* | UW CSE BS | 01/2018 – 06/2018 |
| Amy Shah she/her *(co-advised with [Elizabeth Clark](https://eaclark07.github.io/))* | UW CSE BS | 09/2017 – 06/2018 |
| [Emily Allaway](https://emilyallaway.github.io/) she/her *(co-advised with [Hannah Rashkin](https://hrashkin.github.io/))* | UW CSE BS | 07/2017 – 06/2018 |
| Marcela Cindy Prasetio she/her *(co-advised with [Hannah Rashkin](https://hrashkin.github.io/))* | UW CSE BS | 01/2016 – 06/2017 |

Publications

Journal

Karina Halevy, Julia Mendelsohn, Chan Young Park, Yulia Tsvetkov & Maarten Sap (2026) Evaluating Large Language Models for Antisemitic Incident Classification. Digital Hate Review.
Ashutosh Baheti, Debanjana Chakraborty, Faeze Brahman, Ronan Le Bras, Ximing Lu, Nouha Dziri, Yejin Choi, Mark Riedl & Maarten Sap (2025) Multi-Attribute Constraint Satisfaction via Language Model Rewriting. TMLR.
Liwei Jiang, Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jenny T. Liang, Jesse Dodge, Keisuke Sakaguchi, Maxwell Forbes, Jon Borchardt, Saadia Gabriel, Yulia Tsvetkov, Oren Etzioni, Maarten Sap, Regina Rini & Yejin Choi (2025) An Empirical Investigation of Machines’ Capabilities for Moral Judgment with the Delphi Experiment. Nature Machine Intelligence.
Jocelyn Shen, Daniella DiPaola, Safinah Ali, Maarten Sap, Hae Won Park & Cynthia Breazeal (2024) Empathy Towards AI vs Human Experiences: The Role of Transparency in Mental Health and Social Support Chatbot Design. JMIR Mental Health.
Maarten Sap, Anna Jafarpour, Yejin Choi, Noah A. Smith, James W. Pennebaker & Eric Horvitz (2022) Quantifying the narrative flow of imagined versus autobiographical stories. PNAS.
Gregory Park, H Andrew Schwartz, Maarten Sap, Margaret L Kern, Evan Weingarten, Johannes C Eichstaedt, Jonah Berger, David J Stillwell, Michal Kosinski, Lyle H Ungar & Martin E P Seligman (2017) Living in the Past, Present, and Future: Measuring Temporal Orientation with Language. Journal of Personality.
Margaret L Kern, Gregory Park, Johannes C Eichstaedt, H Andrew Schwartz, Maarten Sap, Laura K Smith & Lyle H Ungar (2016) Gaining Insights From Social Media Language: Methodologies and Challenges. Psychological Methods.
Johannes C Eichstaedt, H Andrew Schwartz, Margaret L Kern, Gregory Park, Darwin R Labarthe, Raina M Merchant, Sneha Jha, Megha Agrawal, Lukasz A Dziurzynski, Maarten Sap, Christopher Weeg, Emily Larson, Lyle H Ungar & Martin E P Seligman (2015) Psychological Language on Twitter Predicts County-level Heart Disease Mortality. Psychological Science 26(2). SAGE Publications. 159–169.
Charlene A Wong, Maarten Sap, Hansen Andrew Schwartz, Robert Town, Tom Baker, Lyle Ungar & Raina M Merchant (2015) Twitter Sentiment Predicts Affordable Care Act Marketplace Enrollment. Journal of Medical Internet Research 17(2). JMIR Publications Inc.
Raina M. Merchant, Yoonhee P. Ha, Charlene A. Wong, H. Andrew Schwartz, Maarten Sap, Lyle H. Ungar & David A. Asch (2014) The 2013 US Government Shutdown (#Shutdown) and Health: An Emerging Role for Social Media. American Journal of Public Health 2014. e1–e3.

Conference

Daniel Chechelnitsky, Sireesh Gururaja, Seyi Olojo, Wesley Hanwen Deng, Giuseppe Attanasio, Chrysoula Zerva & Maarten Sap (2026) Locating Translation as a Craft in the Age of AI Translation. AIES.
Roshni Kaushik, Maarten Sap & Koichi Onoue (2026) Examining the Effect of Explanations of AI Privacy Redaction in AI-mediated Interactions. AIES.
Eunkyu Park, Wesley Hanwen Deng, Gunhee Kim, Motahhare Eslami & Maarten Sap (2026) Cognitive Chain-of-Thought: Structured Multimodal Reasoning about Social Situations. COLM.
Xuhui Zhou, Weiwei Sun, Qianou Ma, Yiqing Xie, Jiarui Liu, Weihua Du, Sean Welleck, Yiming Yang, Graham Neubig, Tongshuang Sherry Wu & Maarten Sap (2026) Mind the Sim2Real Gap in User Simulation for Agentic Tasks. COLM.
Yashwanth YS, Ruichen Wang, Shihua Zeng, Xuhui Zhou, Koichi Onoue, Vasudha Varadarajan & Maarten Sap (2026) SOTOPIA-ToM: Evaluating Information Management in Multi-Agent Interaction with Theory of Mind. COLM.
Xuhui Zhou, Jiarui Liu, Akhila Yerukola, Hyunwoo Kim & Maarten Sap (2026) Social World Models. COLM.
Jocelyn Shen, Amina Luvsanchultem, Jessica Kim, Kynnedy Smith, Valdemar Danry, Kantwon Rogers, Sharifa Alghowinem, Hae Won Park, Maarten Sap & Cynthia Breazeal (2026) The Hidden Puppet Master: Predicting Human Belief Change in Manipulative LLM Dialogues. COLM.
Mingqian Zheng, Malia Morgan, Liwei Jiang, Carolyn Rose & Maarten Sap (2026) Useless but Safe? Benchmarking Utility Recovery with User Intent Clarification in Multi-Turn Conversations. COLM.
Jimin Mun, Chani Jung, Xuhui Zhou, Hyunwoo Kim & Maarten Sap (2026) GoodPoint: Learning Constructive Scientific Paper Feedback from Author Responses. COLM.
Vasudha Varadarajan, Akhila Yerukola, Mona T. Diab & Maarten Sap (2026) CCBENCH: Assessing LLM Cultural Competence via Implicitly Signaled Norms using Health Queries. COLM.
Akhila Yerukola, Fabrice Y. Harel-Canada, Simran Khanuja, Abhinav Sukumar Rao, Ashima Suvarna, Nanyun Peng, Saadia Gabriel & Maarten Sap (2026) NormViz: A Benchmark and Framework for Grounding Multimodal Reasoning in Global Cultures. COLM.
Alice Gao, Andrew N. Meltzoff, Maarten Sap & Katharina Reinecke (2026) Framing an AI with Values Reduces AI Reliance in AI-supported Writing Tasks. FAccT.
Jordan Taylor, William Agnew, Maarten Sap, Sarah E. Fox & Haiyi Zhu (2026) The Algorithmic Gaze: An Audit and Ethnography of the LAION-Aesthetics Predictor Model. FAccT.
Xuhui Zhou, Valerie Chen, Zora Zhiruo Wang, Graham Neubig, Maarten Sap & Xingyao Wang (2026) TOM-SWE: User Mental Modeling For Software Engineering Agents. ICML.
Yunze Xiao, Gordon Dai, Shahan Ali Memon, Jen-tse Huang, Maarten Sap & Mona Diab (2026) Position: AI Welfare Is Bullshit. ICML.
Jordan Taylor, Joel Mire, Alicia DeVrio, Maarten Sap, Haiyi Zhu & Sarah E. Fox (2026) "I Just Don't Want My Work Being Fed Into The AI Blender": Queer Artists on Refusing and Resisting Generative AI. CSCW.
Myke C. Cohen, Mingqian Zheng, Neel Bhandari, Hsien-Te Kao, Xuhui Zhou, Daniel Nguyen, Laura Cassani, Maarten Sap & Svitlana Volkova (2026) Imperfectly Cooperative Human-AI Interactions: Comparing the Impacts of Human and AI Attributes in Simulated and User Studies. Findings of ACL.
Joel Mire, Maria Antoniak, Steven R. Wilson, Zexin Ma, Achyutarama R. Ganti, Andrew Piper & Maarten Sap (2026) Social Story Frames: Contextual Reasoning about Narrative Intent and Reception. ACL.
Sanidhya Vijayvargiya, Aditya Bharat Soni, Xuhui Zhou, Zora Zhiruo Wang, Nouha Dziri, Graham Neubig & Maarten Sap (2026) OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety. ICLR.
Sanidhya Vijayvargiya, Xuhui Zhou, Akhila Yerukola, Maarten Sap & Graham Neubig (2026) Ambig-SWE: Interactive Agents to Overcome Underspecificity in Software Engineering. ICLR.
Jing-Jing Li, Joel Mire, Eve Fleisig, Valentina Pyatkin, Anne Collins, Maarten Sap & Sydney Levine (2026) PluriHarms: Benchmarking the Full Spectrum of Human Judgments on AI Harm. ICLR.
Mikayla Campbell, Joel Mire, Mark Diaz & Maarten Sap (2026) Black LLMirror: User (Self) Perceptions in Black American English Interactions with LLMs. CHI.
Tianyu Cao, Neel Bhandari, Akhila Yerukola, Akari Asai & Maarten Sap (2026) Out of Style: RAG's Fragility to Linguistic Variation. EACL.
Karina H. Halevy, Kimi Wenzel, Seyun Kim, Kyle Dean Bauer, Bruno Neira, Mona T. Diab & Maarten Sap (2026) Common Sense or Ableism? Rethinking Commonsense Reasoning Through the Lens of Disability. EACL.
Wenkai Li, Liwen Sun, Zhenxiang Guan, Xuhui Zhou & Maarten Sap (2026) 1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning. IASEAI.
Zhonghao He, Tianyi Qiu, Hirokazu Shirado & Maarten Sap (2025) Martingale Score: An Unsupervised Metric for Bayesian Rationality in LLM Reasoning. NeurIPS.
Xianzhe Fan, Xuhui Zhou, Chuyang Jin, Kolby Nottingham, Hao Zhu & Maarten Sap (2025) SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions. NeurIPS D&B.
Liwei Jiang, Yuanjun Chai, Margaret Li, Mickel Liu, Raymond Fok, Nouha Dziri, Yulia Tsvetkov, Maarten Sap, Alon Albalak & Yejin Choi (2025) Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond). NeurIPS D&B.
Mingqian Zheng, Wenjia Hu, Patrick Zhao, Motahhare Eslami, Jena D. Hwang, Faeze Brahman, Carolyn Rose & Maarten Sap (2025) Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and Preferences. Findings of EMNLP.
Jiarui Liu, Yueqi Song, Yunze Xiao, Mingqian Zheng, Lindia Tjuatja, Jana Schaich Borg, Mona T. Diab & Maarten Sap (2025) Synthetic Socratic Debates: Examining Persona Effects on Moral Decision and Persuasion Dynamics. EMNLP.
Jocelyn Shen, Akhila Yerukola, Xuhui Zhou, Cynthia Breazeal, Maarten Sap^* & Hae Won Park^* (2025) Words Like Knives: Backstory-Personalized Modeling and Detection of Violent Communication. EMNLP.
Ritam Dutt, Carolyn Rose & Maarten Sap (2025) Social Scaffolds: A Generalization Framework for Social Understanding Tasks. EMNLP.
Jimin Mun, Wei Bin Au Yeong, Wesley Hanwen Deng, Jana Schaich Borg & Maarten Sap (2025) Why (not) use AI? Analyzing People's Reasoning and Conditions for AI Acceptability. AIES.
Xuhui Zhou, Hyunwoo Kim, Faeze Brahman, Liwei Jiang, Hao Zhu, Ximing Lu, Frank Xu, Bill Yuchen Lin, Yejin Choi, Niloofar Mireshghallah, Ronan Le Bras & Maarten Sap (2025) HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions. COLM.
Shuyue Stella Li, Jimin Mun, Faeze Brahman, Jonathan S. Ilgen, Yulia Tsvetkov & Maarten Sap (2025) ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning. COLM.
Priyanshu Kumar, Devansh Jain, Akhila Yerukola, Liwei Jiang, Himanshu Beniwal, Thomas Hartvigsen & Maarten Sap (2025) PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages. COLM.
Valentin Hofmann, David Heineman, Ian Magnusson, Kyle Lo, Jesse Dodge, Maarten Sap, Pang Wei Koh, Chun Wang, Hannaneh Hajishirzi & Noah A. Smith (2025) Fluid Language Model Benchmarking. COLM.
Scott Geng, Hamish Ivison, Chun-Liang Li, Maarten Sap, Jerry Li, Ranjay Krishna & Pang Wei Koh (2025) The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains. COLM.
Akhila Yerukola, Saadia Gabriel, Nanyun Peng & Maarten Sap (2025) Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures. ACL.
Wenkai Li, Jiarui Liu, Andy Liu, Xuhui Zhou, Mona T. Diab & Maarten Sap (2025) BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data. ACL.
Anjali Kantharuban, Jeremiah Milbauer, Maarten Sap, Emma Strubell & Graham Neubig (2025) Stereotype or Personalization? User Identity Biases Chatbot Recommendations. Findings of ACL.
Taeyoun Kim, Jacob Springer, Aditi Raghunathan & Maarten Sap (2025) Mitigating Bias in RAG: Controlling the Embedder. Findings of ACL.
Jen-tse Huang, Jiaxu Zhou, Tailin Jin, Xuhui Zhou, Zixi Chen, Wenxuan Wang, Youliang Yuan, Michael R. Lyu & Maarten Sap (2025) On the Resilience of Multi-Agent Systems with Malicious Agents. ICML.
Jing-Jing Li, Valentina Pyatkin, Max Kleiman-Weiner, Liwei Jiang, Nouha Dziri, Anne G. E. Collins, Jana Schaich Borg, Maarten Sap, Yejin Choi & Sydney Levine (2025) SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation. ICML.
Jeffrey Basoah, Daniel Chechelnitsky, Tao Long, Katharina Reinecke, Chrysoula Zerva, Kaitlyn Zhou, Mark Díaz & Maarten Sap (2025) Not Like Us, Hunty: Measuring Perceptions and Behavioral Effects of Minoritized Anthropomorphic Cues in LLMs. FAccT.
Jordan Taylor, Joel Mire, Franchesca Spektor, Alicia DeVrio, Maarten Sap, Haiyi Zhu & Sarah E. Fox (2025) Un-Straightening Generative AI: How Queer Artists Surface and Challenge the Normativity of Generative AI Models. FAccT.
Joel Mire^*, Zubin Trivadi Aysola^*, Daniel Chechelnitsky, Nicholas Deas, Chrysoula Zerva & Maarten Sap (2025) Rejected Dialects: Biases Against African American Language in Reward Models. Findings of NAACL.
Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, Nouha Dziri, Dan Jurafsky & Maarten Sap (2025) Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance. NAACL.
Zhe Su, Xuhui Zhou, Sanketh Rangreji, Anubha Kabra, Julia Mendelsohn, Faeze Brahman & Maarten Sap (2025) AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents. NAACL.
Abhinav Rao^*, Akhila Yerukola^*, Vishwa Shah, Katharina Reinecke & Maarten Sap (2025) NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models. NAACL.
Xianzhe Fan, Qing Xiao, Xuhui Zhou, Jiaxin Pei, Maarten Sap, Zhicong Lu & Hong Shen (2025) User-Driven Value Alignment: Understanding Users' Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions. CHI.
Jiaxin Ge, Zora Zhiruo Wang, Xuhui Zhou, Yi-Hao Peng, Sanjay Subramanian, Qinyue Tan, Maarten Sap, Alane Suhr, Daniel Fried, Graham Neubig & Trevor Darrell (2025) AutoPresent: Designing Structured Visuals from Scratch. CVPR.
Joel Mire, Maria Antoniak, Elliott Ash, Andrew Piper & Maarten Sap (2024) The Empirical Variability of Narrative Perceptions of Social Media Texts. EMNLP.
Xuhui Zhou, Zhe Su, Tiwalayo Eisape, Hyunwoo Kim & Maarten Sap (2024) Is This the Real Life? Is This Just Fantasy? The Misleading Success of Simulating Social Interactions With LLMs. EMNLP.
Jocelyn Shen, Joel Mire, Hae Won Park, Cynthia Breazeal & Maarten Sap (2024) HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs. EMNLP.
Liwei Jiang, Kavel Rao, Seungju Han, Allyson Ettinger, Faeze Brahman, Sachin Kumar, Niloofar Mireshghallah, Ximing Lu, Maarten Sap, Nouha Dziri & Yejin Choi (2024) WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models. NeurIPS.
Jimin Mun, Liwei Jiang, Jenny T. Liang, Inyoung Cheong, Nicole DeCario, Yejin Choi, Tadayoshi Kohno & Maarten Sap (2024) Particip-AI: A Democratic Surveying Framework for Anticipating Future AI Use Cases, Harms and Benefits. AIES.
Devansh Jain, Priyanshu Kumar, Samuel Gehman, Xuhui Zhou, Thomas Hartvigsen & Maarten Sap (2024) PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models. COLM.
Maria Antoniak, Joel Mire, Maarten Sap, Elliott Ash & Andrew Piper (2024) Where Do People Tell Stories Online? Story Detection Across Online Communities. ACL.
Akhila Yerukola, Saujas Vadugur, Daniel Fried & Maarten Sap (2024) Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Non-Literal Intent Resolution in LLMs. ACL.
Kaitlyn Zhou, Jena D. Hwang, Xiang Ren & Maarten Sap (2024) Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty. ACL.
Ruiyi Wang, Haofei Yu, Wenxin Zhang, Zhengyang Qi, Maarten Sap, Graham Neubig, Yonatan Bisk & Hao Zhu (2024) SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents. ACL.
Jimin Mun, Cathy Buerger, Jenny T. Liang, Joshua Garland & Maarten Sap (2024) Counterspeakers’ Perspectives: Unveiling Barriers and AI Needs in the Fight against Online Hate. CHI.
Natalie Shapira, Mosh Levy, Hossein Seyed Alavi, Xuhui Zhou, Yejin Choi, Yoav Goldberg, Maarten Sap & Vered Shwartz (2024) Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models. EACL.
Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig & Maarten Sap (2024) SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents. ICLR.
Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri & Yejin Choi (2024) Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory. ICLR.
Ashutosh Baheti, Ximing Lu, Faeze Brahman, Ronan Le Bras, Maarten Sap & Mark Riedl (2024) Leftover-Lunch: Advantage-based Offline Reinforcement Learning for Language Models. ICLR.
Taylor Sorensen, Liwei Jiang, Jena D. Hwang, Sydney Levine, Valentina Pyatkin, Peter West, Nouha Dziri, Ximing Lu, Kavel Rao, Chandra Bhagavatula, Maarten Sap, John Tasioulas & Yejin Choi (2024) Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties. AAAI.
Akhila Yerukola, Xuhui Zhou, Elizabeth Clark & Maarten Sap (2023) "Don't Take This Out of Context!" On the Need for Contextual Models and Evaluations for Stylistic Rewriting. EMNLP.
Yiming Zhang, Sravani U. Nanduri, Liwei Jiang, Tongshuang Sherry Wu & Maarten Sap (2023) BiasX: "Thinking Slow" in Toxic Content Moderation with Explanations of Implied Social Biases. EMNLP.
Jocelyn Shen, Maarten Sap, Pedro Colon-Hernandez, Hae Won Park & Cynthia Breazeal (2023) Modeling Empathic Similarity in Personal Narratives. EMNLP.
Jimin Mun, Emily Allaway, Akhila Yerukola, Laura Vianna, Sarah-Jane Leslie & Maarten Sap (2023) Beyond Denouncing Hate: Strategies for Countering Implied Biases and Stereotypes in Language. Findings of EMNLP.
Hyunwoo Kim, Melanie Sclar, Xuhui Zhou, Ronan Le Bras, Gunhee Kim, Yejin Choi & Maarten Sap (2023) FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions. EMNLP.
Hyunwoo Kim, Jack Hessel, Liwei Jiang, Peter West, Ximing Lu, Youngjae Yu, Pei Zhou, Ronan Le Bras, Malihe Alikhani, Gunhee Kim, Maarten Sap & Yejin Choi (2023) SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization. EMNLP.
Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta & Maarten Sap (2023) COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements. Findings of ACL.
Julia Mendelsohn, Ronan Le Bras, Yejin Choi & Maarten Sap (2023) From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models. ACL.
Sebastin Santy^*, Jenny T. Liang^*, Ronan Le Bras, Katharina Reinecke & Maarten Sap (2023) NLPositionality: Characterizing Design Biases of Datasets and Models. ACL.
Skyler Hallinan, Alisa Liu, Yejin Choi & Maarten Sap (2023) Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts. ACL.
Organizers of Queer in AI, Anaelia Ovalle, Arjun Subramonian, Ashwin Singh, Claas Voelcker, Danica J. Sutherland, Davide Locatelli, Eva Breznik, Filip Klubicka, Hang Yuan, J Hetvi, Huan Zhang, Jaidev Shriram, Kruno Lehman, Luca Soldaini, Maarten Sap, Marc Peter Deisenroth, Maria Leonor Pacheco, Maria Ryskina, Martin Mundt, Milind Agarwal, Nyx McLean, Pan Xu, A Pranav, Raj Korpan, Ruchira Ray, Sarah Mathew, Sarthak Arora, St John, Tanvi Anand, Vishakha Agrawal, William Agnew, Yanan Long, Zijie J. Wang, Zeerak Talat, Avijit Ghosh, Nathaniel Dennler, Michael Noseworthy, Sharvani Jha, Emily Baylor, Aditya Joshi, Natalia Y. Bilenko, Andrew McNamara, Raphael Gontijo-Lopes, Alex Markham, Evyn Dǒng, Jackie Kay, Manu Saraswat, Nikhil Vytla & Luke Stark (2023) Queer In AI: A Case Study in Community-Led Participatory AI. FAccT.
Hyunwoo Kim, Youngjae Yu, Liwei Jiang, Ximing Lu, Daniel Khashabi, Gunhee Kim, Yejin Choi & Maarten Sap (2022) ProsocialDialog: A Prosocial Backbone for Conversational Agents. EMNLP.
Maarten Sap, Ronan Le Bras, Daniel Fried & Yejin Choi (2022) Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs. EMNLP.
Zhijing Jin, Sydney Levine, Fernando Gonzalez Adauto, Ojasv Kamal, Maarten Sap, Mrinmaya Sachan, Rada Mihalcea, Joshua B. Tenenbaum & Bernhard Schölkopf (2022) When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment. NeurIPS.
Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi & Noah A. Smith (2022) Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection. NAACL.
Prithviraj Ammanabrolu, Liwei Jiang, Maarten Sap, Hanna Hajishirzi, Yejin Choi & Noah A. Smith (2022) Aligning to Social Norms and Values in Interactive Narratives. NAACL.
Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, Maarten Sap, Dipankar Ray & Ece Kamar (2022) ToxiGen: Controlling Language Models to Generate Implied and Adversarial Toxicity. ACL.
Saadia Gabriel, Skyler Hallinan, Maarten Sap, Pemi Nguyen, Franziska Roesner, Eunsol Choi & Yejin Choi (2022) Misinfo Reaction Frames: Reasoning about Readers' Reactions to News Headlines. ACL.
Jesse Dodge, Maarten Sap, Ana Marasović, William Agnew, Gabriel Ilharco, Dirk Groeneveld, Margaret Mitchell & Matt Gardner (2021) Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus. EMNLP.
Ashutosh Baheti, Maarten Sap, Alan Ritter & Mark Riedl (2021) Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts. EMNLP.
Alisa Liu, Maarten Sap, Ximing Lu, Swabha Swayamdipta, Chandra Bhagavatula, Noah A. Smith & Yejin Choi (2021) DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts. ACL.
Albert Xu, Eshaan Pathak, Eric Wallace, Suchin Gururangan, Maarten Sap & Dan Klein (2021) Detoxifying Language Models Risks Marginalizing Minority Voices. NAACL.
Xuhui Zhou, Maarten Sap, Swabha Swayamdipta, Yejin Choi & Noah A. Smith (2021) Challenges in Automated Debiasing for Toxic Language Detection. EACL.
Xinyao Ma^*, Maarten Sap^*, Hannah Rashkin & Yejin Choi (2020) PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction. EMNLP.
Sam Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi & Noah A Smith (2020) RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models. Findings of EMNLP.
Maxwell Forbes, Jena D. Hwang, Vered Shwartz, Maarten Sap & Yejin Choi (2020) Social Chemistry 101: Learning to Reason about Social and Moral Norms. EMNLP.
Maarten Sap, Eric Horvitz, Yejin Choi, Noah A Smith & James W. Pennebaker (2020) Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models. ACL.
Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A Smith & Yejin Choi (2020) Social Bias Frames: Reasoning about Social and Power Implications of Language. ACL.
Maarten Sap^*, Hannah Rashkin^*, Derek Chen, Ronan Le Bras & Yejin Choi (2019) Social IQa: Commonsense Reasoning about Social Interactions. EMNLP.
Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi & Noah A Smith (2019) The Risk of Racial Bias in Hate Speech Detection. ACL.
Antoine Bosselut, Hannah Rashkin, Maarten Sap, Chaitanya Malaviya, Asli Celikyilmaz & Yejin Choi (2019) COMET: Commonsense Transformers for Automatic Knowledge Graph Construction. ACL.
Maarten Sap, Ronan Le Bras, Emily Allaway, Chandra Bhagavatula, Nicholas Lourie, Hannah Rashkin, Brendan Roof, Noah A Smith & Yejin Choi (2019) ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning. AAAI.
Hannah Rashkin, Antoine Bosselut, Maarten Sap, Kevin Knight & Yejin Choi (2018) Modeling Naive Psychology of Characters in Simple Commonsense Stories. ACL.
Hannah Rashkin^*, Maarten Sap^*, Emily Allaway, Noah A. Smith & Yejin Choi (2018) Event2Mind: Commonsense Inference on Events, Intents, and Reactions. ACL.
Maarten Sap, Marcella Cindy Prasetio, Ari Holtzman, Hannah Rashkin & Yejin Choi (2017) Connotation Frames of Power and Agency in Modern Films. EMNLP.
Roy Schwartz, Maarten Sap, Ioannis Konstas, Li Zilles, Yejin Choi & Noah A Smith (2017) The Effect of Different Writing Tasks on Linguistic Style: A Case Study of the ROC Story Cloze Task. CoNLL.
H. Andrew Schwartz, Gregory Park, Maarten Sap, Evan Weingarten, Johannes Eichstaedt, Margaret Kern, David Stillwell, Michal Kosinski, Jonah Berger, Martin Seligman & Lyle Ungar (2015) Extracting Human Temporal Orientation from Facebook Language. NAACL.
Maarten Sap, Gregory Park, Johannes C. Eichstaedt, Margaret L. Kern, David J. Stillwell, Michal Kosinski, Lyle H. Ungar & Hansen Andrew Schwartz (2014) Developing Age and Gender Predictive Lexica over Social Media. EMNLP.

Workshop

Qiaosi Wang, Xuhui Zhou, Maarten Sap, Jodi Forlizzi & Hong Shen (2025) Rethinking Theory of Mind Benchmarks for LLMs: Towards A User-Centered Perspective. CHI Workshop on Human-centered Evaluation and Auditing of Language Models (HEAL @ CHI).
Myke C. Cohen, Zhe Su, Hsien-Te Kao, Daniel Nguyen, Spencer Lynch, Maarten Sap & Svitlana Volkova (2025) Exploring Big Five Personality and AI Capability Effects in LLM-Simulated Negotiation Dialogues. KDD workshop on Evaluation and Trustworthiness of Agentic and Generative AI Models.
Runtao Zhou, Guangya Wan, Saadia Gabriel, Sheng Li, Alexander J Gates, Maarten Sap & Thomas Hartvigsen (2025) Disparities in LLM Reasoning Accuracy and Explanations: A Case Study on African American English. MELT Workshop.
Emily Allaway, Nina Taneja, Sarah-Jane Leslie & Maarten Sap (2022) Towards Countering Essentialism through Social Bias Reasoning. EMNLP workshop on NLP for Positive Impact.
Zhilin Wang, Anna Jafarpour & Maarten Sap (2022) Uncovering Surprising Event Boundaries in Narratives. Workshop on Narrative Understanding.
Tal August, Maarten Sap, Elizabeth Clark, Katharina Reinecke & Noah A. Smith (2020) Exploring the Effect of Author and Reader Identity in Online Story Writing: the StoriesInTheWild Corpus. Workshop on Narrative Understanding, Storylines, and Events (NUSE) @ ACL.
Roy Schwartz, Maarten Sap, Ioannis Konstas, Li Zilles, Yejin Choi & Noah A Smith (2017) Story Cloze task: UW NLP System. EACL Workshop LSDSem. 52--55.
Daniel Preotiuc-Pietro, Maarten Sap, H Andrew Schwartz & Lyle Ungar (2015) Mental Illness Detection at the World Well-Being Project for the CLPsych 2015 Shared Task. NAACL Workshop on CLPsych.
Daniel Preotiuc-Pietro, Johannes Eichstaedt, Gregory Park, Maarten Sap, Laura Smith, Victoria Tobolsky, H Andrew Schwartz & Lyle Ungar (2015) The Role of Personality, Age and Gender in Tweeting about Mental Illnesses. NAACL Workshop on CLPsych.
H Andrew Schwartz, Johannes Eichstaedt, Margaret L Kern, Gregory Park, Maarten Sap, David Stillwell, Michal Kosinski & Lyle Ungar (2014) Towards Assessing Changes in Degree of Depression through Facebook. ACL Workshop on CLPsych. 118--125.

Demo

Xuhui Zhou, Zhe Su, Sophie Feng, Jiaxu Zhou, Jen-tse Huang, Svitlana Volkova, Tongshuang Sherry Wu, Anita Woolley, Hao Zhu & Maarten Sap (2025) SOTOPIA-S4: A User-Friendly System for Flexible, Customizable, and Large-Scale Social Simulation. NAACL System Demonstrations.
Maria Antoniak, Anjalie Field, Jimin Mun, Melanie Walsh, Lauren F. Klein & Maarten Sap (2023) Riveter: Measuring Power and Social Dynamics Between Entities. ACL demonstrations.
Hao Fang, Hao Cheng, Maarten Sap, Elizabeth Clark, Ariel Holtzman, Yejin Choi, Noah A Smith & Mari Ostendorf (2018) Sounding Board: A User-Centric and Content-Driven Social Chatbot. NAACL System Demonstrations.
H Andrew Schwartz, Salvatore Giorgi, Maarten Sap, Patrick Crutchley, Lyle Ungar & Johannes Eichstaedt (2017) DLATK: Differential Language Analysis ToolKit. EMNLP System Demonstrations. 55--60.

Other

Maarten Sap (2021) Positive AI with Social Commonsense Models.
Hao Fang, Hao Cheng, Elizabeth Clark, Ariel Holtzman, Maarten Sap, Mari Ostendorf, Yejin Choi & Noah A Smith (2017) Sounding Board - University of Washington’s Alexa Prize Submission. Alexa Prize Proceedings.
H Andrew Schwartz, Maarten Sap, Margaret L Kern, Johannes C Eichstaedt, Adam Kapelner, Megha Agrawal, Eduardo Blanco, Lukasz Dziurzynski, Gregory Park, David Stillwell, Michal Kosinski, Martin E P Seligman & Lyle H Ungar (2016) Predicting individual well-being through the language of social media. Biocomputing 2016: Proceedings of the Pacific Symposium. 516--527.

Preprint

Roshni Kaushik, Maarten Sap & Koichi Onoue (2026) Exploring the Interaction of Explanation Styles, Context, and Trust of AI Privacy Redaction in AI-mediated Interactions. arXiv.
Xuhui Zhou, Weiwei Sun, Weihua Du, Jiarui Liu, Haojia Sun, Qianou Ma, Tongshuang Sherry Wu, Yiming Yang & Maarten Sap (2026) OdysSim: Building Foundation Models for Human Behavior Simulation. arXiv.
Andrew Aquilina, Chetna Nihalani, Vasudha Varadarajan, Nathan S. Fishbein, Yu-Ru Lin & Maarten Sap (2026) Lost in Delusion: Examining LLM Safety Under User Delusions and Distress. arXiv.
Giuseppe Attanasio, Beatrice Savoldi, Daniel Chechelnitsky, Matteo Negri, Marine Carpuat, Maarten Sap & André F. T. Martins (2026) Ouvia: A User-centered Framework for Measuring Usability of Speech Translation in Real-World Communication Scenarios. arXiv.
Leena Mathur, Jenny T. Liang, Vasudha Varadarajan, Jimin Mun, Xuhui Zhou, Jana Schaich Borg, Yonatan Bisk, Louis-Philippe Morency & Maarten Sap (2026) When Should AI Read the Room? Public Perceptions of Social Intelligence in AI Agents. arXiv.
Weiwei Sun, Xuhui Zhou, Jiarui Liu, Weihua Du, Haojia Sun, Yiqing Xie, Qianou Ma, Sihao Chen, Mengting Wan, Longqi Yang, Pei Zhou, Tongshuang Sherry Wu, Sean Welleck, Graham Neubig, Yiming Yang & Maarten Sap (2026) Reinforcing Human Behavior Simulation via Verbal Feedback. arXiv.
Eunkyu Park, Wesley Hanwen Deng, Vasudha Varadarajan, Mingxi Yan, Gunhee Kim, Maarten Sap & Motahhare Eslami (2025) Critical or Compliant? The Double-Edged Sword of Reasoning in Chain-of-Thought Explanations. arXiv.
Weiwei Sun, Xuhui Zhou, Weihua Du, Xingyao Wang, Sean Welleck, Graham Neubig, Maarten Sap & Yiming Yang (2025) Training Proactive and Personalized LLM Agents. arXiv.
Jiaxu Zhou, Jen-tse Huang, Xuhui Zhou, Man Ho Lam, Xintao Wang, Hao Zhu, Wenxuan Wang & Maarten Sap (2025) The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies. arXiv.
Kshitish Ghate, Andy Liu, Devansh Jain, Taylor Sorensen, Atoosa Kasirzadeh, Aylin Caliskan, Mona T. Diab & Maarten Sap (2025) EVALUESTEER: Measuring Reward Model Steerability Towards Values and Preference. arXiv.
Himanshu Beniwal, Youngwoo Kim, Maarten Sap, Soham Dan & Thomas Hartvigsen (2025) Breaking mBad! Supervised Fine-tuning for Cross-Lingual Detoxification. arXiv.
Yubin Kim, Hyewon Jeong, Shan Chen, Shuyue Stella Li, Mingyu Lu, Kumail Alhamoud, Jimin Mun, Cristina Grau, Minseok Jung, Rodrigo Gameiro, Lizhou Fan, Eugene Park, Tristan Lin, Joonsik Yoon, Wonjin Yoon, Maarten Sap, Yulia Tsvetkov, Paul Liang, Xuhai Xu, Xin Liu, Daniel McDuff, Hyeonhoon Lee, Hae Won Park, Samir Tulebaev & Cynthia Breazeal (2025) Medical Hallucinations in Foundation Models and Their Impact on Healthcare. arXiv.
Xianzhe Fan, Qing Xiao, Xuhui Zhou, Yuran Su, Zhicong Lu, Maarten Sap & Hong Shen (2024) Minion: A Technology Probe for Resolving Value Conflicts through Expert-Driven and User-Driven Strategies in AI Companion Applications. arXiv.

Awards

### Paper awards

| | | |
| --- | --- | ---: |
| Best Paper | [Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)](https://arxiv.org/abs/2510.22954) | NeurIPS (D&B track) 2025 |
| Best Paper Runner Up | [Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance](./publications.html#zhou2025relai) | NAACL 2025 |
| Outstanding Paper | [SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization](./publications.html#kim2023soda) | EMNLP 2023 |
| Outstanding Paper | [NLPositionality: Characterizing Design Biases of Datasets and Models](./publications.html#santy2023nlpositionality) | ACL 2023 |
| Best Paper | [Queer In AI: A Case Study in Community-Led Participatory AI](./publications.html#OrganizersOfQueerin2023QueerAI) | FAccT 2023 |
| Best Paper | [Social Bias Frames: Reasoning about Social and Power Implications of Language](./publications.html#sap2020socialbiasframes) | WeCNLP 2020 |
| Best Short Paper Nomination | [The Risk of Racial Bias in Hate Speech Detection](./publications.html#sap2019risk) | ACL 2019 |

### Research awards and honors

| | | |
| --- | --- | ---: |
| [Packard Fellowship](www.packard.org/2025fellows) | | 2025 |
| Amazon Research Award | OpenAgentSafety: Measuring and Mitigating Safety Harms of LLM-based AI Agent Interactions | 2025 |
| [2025 Okawa Research Award](http://okawa-foundation.or.jp/en/activities/research_grant/index.html) (one of 7 US-based recipients) | Toward Socially-Aware AI with Structured Social World Models | 2025 |
| Selected to speak at the National Academy of Engineering's [Frontiers of Engineering](https://www.naefrontiers.org/212813/2024-US-Frontiers-of-Engineering-Symposium) | Artificial Social Intelligence? On the challenges of Socially Aware and Ethically informed LLMs | 2024 |
| Google Academic Research Award | PARTICIP-AI: Studying Lay People’s Needs, Judgments, and Impact Assessment for Future AI Use Cases and AI Dilemmas | 2024 |
| Amazon Research Award | RLKF: Mitigating Factual Hallucinations and Social Biases with Knowledge-based Reinforcement Learning | 2023 |
| William Chan Memorial Dissertation Award | [Positive AI with Social Commonsense Models](./publications.html#sap2021positiveAIwithSocialCommonsenseModels) | 2021 |
| Amazon Alexa Prize | [Sounding Board](https://sounding-board.github.io/): [A User-Centric and Content-Driven Social Chatbot](./publications.html#fang2017alexatechreport) | 2017 |

### Other awards and honors

| | |
| --- | ---: |
| ARR outstanding area chair | October 2025 |

Thesis Committees

| | | | |
| --- | --- | --- | ---: |
| PhD | Eddie Ungless | Bjorn Ross, University of Edinburgh | 2025 |
| PhD | Kaitlyn Zhou | Dan Jurafsky, Stanford University | 2025 |
| PhD | Haoyang Wen | Alexander Hauptmann, CMU | 2025 |
| PhD | Ashutosh Baheti | Mark Riedl, GATech | 2024 |
| PhD | Kaixin Ma | Eric Nyberg, CMU | 2023 |
| PhD | Prakhar Gupta | Jeff Bigham, CMU | 2023 |
| MS | Jocelyn Chen | Cynthia Breazeal, MIT | 2023 |
| PhD | Chan Young Park | Yulia Tsvetkov, UW | 2023 |
| PhD | Paul Röttger | Scott Hale, University of Oxford | 2023 |

Teaching

### Courses

| | | |
| --- | --- | ---: |
| [11-430/830 Ethics, Safety, and Social Impact in NLP and LLMs](https://maartensap.com/11830/Spring2026/) | | Spring 2026 |
| [11-705 Introduction to Research in Language Technologies](https://maartensap.com/11705/Fall2025/) | | Fall 2025 |
| [11-631 Data Science Seminar](https://mcds-cmu.github.io/11631/f25/) | | Fall 2025 |
| [11-430/830 Ethics, Safety, and Social Impact in NLP and LLMs](https://maartensap.com/11830/Spring2025/) | | Spring 2025 |
| [11-631 Data Science Seminar](https://mcds-cmu.github.io/11631/f24/) | | Fall 2024 |
| [11-830 Ethics, Social Biases, and Positive Impact in Language Technologies](http://maartensap.com/11830/Spring2024) | | Spring 2024 |
| [11-631 Data Science Seminar](https://mcds-cmu.github.io/11631/f23/) | | Fall 2023 |
| [11-830 Computational Ethics](http://maartensap.com/11-830-Spring2023/) | | Spring 2023 |

### Guest lectures & Tutorials

| | | |
| --- | --- | ---: |
| Social intelligence of LLM agents | 05-899 Guest lecture | Fall 2024 |
| Bias in Natural Language Processing | 66-142 Guest lecture | Spring 2024 |
| Bias in Natural Language Processing | 11-711 Guest lecture | Spring 2024 |
| Toxicity in LLMs | 11-667 Guest lecture | Fall 2023 |
| Bias in Natural Language Processing | 11-711 Guest lecture | Fall 2023 |
| Bias in Natural Language Processing | 05-899 Guest lecture | Spring 2023 |
| Bias in Natural Language Processing | 15-884 Guest lecture | Fall 2022 |
| "[Crowdsourcing Beyond Annotation](https://nlp-crowdsourcing.github.io/)" | Tutorial | EMNLP 2021 |
| "[Commonsense Reasoning in Natural Language Processing](./acl2020-commonsense/index.html)" | Tutorial | ACL 2020 |

Service

### Workshops

| | | |
| --- | --- | ---: |
| [Social Simulation with LLMs - Fidelity in Applications](https://sites.google.com/view/social-sims-with-llms/social-sim26) | co-organizer | COLM 2026 |
| Dagstuhl "[Social Intelligence in AI Systems](https://www.dagstuhl.de/seminars/seminar-calendar/seminar-details/26241)" | co-organizer | Dagstuhl 2026 |
| [PoliSim@CHI 2026: LLM Agent Simulation for Policy](https://polisim.net/) | co-organizer | CHI 2026 |
| [PersonaLLM: Workshop on LLM Persona Modeling](https://personallmworkshop.github.io/) | co-organizer | NeurIPS 2025 |
| [NLP 4 Democracy](https://sites.google.com/andrew.cmu.edu/nlp4democracy/home) | co-organizer | COLM 2025 |
| [Socially Responsible Language Modelling Research (SoLaR)](https://solar-neurips.github.io/) | advisory board | COLM 2025 |
| [Agent Workshop @ CMU](https://cmu-agent-workshop.github.io/) | co-organizer | Spring 2025 |
| [Socially Responsible Language Modelling Research (SoLaR)](https://solar-neurips.github.io/) | co-organizer | NeurIPS 2024 |
| [Pluralistic Alignment](https://pluralistic-alignment.github.io/) | co-organizer | NeurIPS 2024 |
| [Multimodal Content Moderation Workshop](https://multimodal-content-moderation.github.io/) | co-organizer | CVPR 2024 |
| [Multimodal Content Moderation Workshop](https://multimodal-content-moderation.github.io/mmcm23/index.html) | co-organizer | CVPR 2023 |
| [NLP for Positive Impact Workshop](https://sites.google.com/view/nlp4positiveimpact) | steering committee | EMNLP 2022 |
| [NLP for Positive Impact Workshop](https://sites.google.com/view/nlp4positiveimpact/previous-workshops/acl-2021-workshop) | co-organizer | ACL 2021 |

#### Senior program committees

| | |
| --- | ---: |
| ACL rolling review | 2020–present |
| COLM | 2026 |
| ICLR | 2025–2026 |
| ICML | 2026 |
| FAccT | 2025 |
| AAAI | 2021 |

#### Reviewing

| | |
| --- | ---: |
| *Journals & conferences* | |
| ACL rolling review | 2020–present |
| ACL | 2019–present |
| FAccT | 2024 |
| CHI | 2024 |
| PNAS | 2024 |
| EMNLP | 2018–2023 |
| Journal of Psycholinguistic Research | 2023 |
| Computing Survey | 2023 |
| Transactions of ACL | 2020, 2022 |
| AAAI | 2020 |
| ICWSM | 2021 |
| Dementia and Geriatric Cognitive Disorders Journal | 2020 |
| Computational Linguistics | 2019, 2020 |
| Humanities and Social Sciences Communications | 2019 |
| Journal of Artificial Intelligence Research | 2019 |
| IEEE Transactions on Cognitive and Developmental Systems | 2019 |
| Social Psychological and Personality Science | 2018 |
| *Workshops* | |
| Workshop on NLP for Positive Impact | 2022 |
| Workshop on NLP for Causal Inference | 2021 |
| NAACL Student Research Workshop | 2019 |
| CLPsych workshop | 2016–2018 |
| Stylistic Variation workshop | 2018 |

### University committees

#### CMU LTI

| | | |
| --- | --- | ---: |
| Belonging and Engagement in Language Technologies Institute (BELTI) committee (lead since 2023) | CMU LTI | 2022–present |
| Faculty Hiring committee | CMU LTI | 2026 |
| MIIS admissions committee | CMU LTI | 2026 |
| PhD & MLT admissions committee | CMU LTI | 2022–2025 |

#### Other

| | | |
| --- | --- | ---: |
| Socio-cultural diversity and inclusion committee | ACL | 2020 |
| Diversity committee | UW CSE | 2016–2020 |
| Graduate student advisory council (G5PAC) | UW CSE | 01/2018–12/2020 |

### Advisory boards

| | | |
| --- | --- | ---: |
| Safety Committee Lead of [AI2's OMAI project](https://allenai.org/omai) | Ai2 | 09/2026–present |
| Member of the Ethics Committee of [AI2's OLMO project](https://allenai.org/olmo) | Ai2 | 09/2023–06/2024 |

### Panels & other service or outreach

| | |
| --- | ---: |
| CMU Gelfand Center **Communicating with People and AI** | 03/2026 |
| [ServiceNow x LawZero - Special AI Safety & Security Event](https://luma.com/0r8v2d0x) (panel opposite Yoshua Bengio & Nicolas Chapados) | 10/2025 |
| [Responsible AI](https://www.cmu.edu/block-center/responsible-ai/index.html) salon on generative AI at CMU | 03/2023 |
| Presentation to U.S. congressional appropriations committee about risks and implications of AI and LLMs | 03/2023 |
| [Red-teaming GPT-4](https://cdn.openai.com/papers/gpt-4-system-card.pdf) for OpenAI | 09/2022–12/2022 |

Talks

| | |
| --- | ---: |
| Unlocking Social Intelligence in AI agents | |
| [Failure Modes in Agentic AI Workshop at ICML](https://fagen-workshop.github.io/#speakers) (*invited speaker*) | 07/2026 |
| [SocialLLM Workshop at ICWSM](https://social-llm-workshop.github.io/) (*keynote speaker*) | 05/2026 |
| [Theory of Mind for AI (ToM4AI) Workshop](https://tom4ai.github.io/events/AAAI2026/) at AAAI 2026 (*invited speaker*) | 01/2026 |
| [Social Simulation with LLMs](https://sites.google.com/view/social-sims-with-llms) at COLM 2025 (*invited speaker*) | 10/2025 |
| Enabling Human-centric and Culturally Aware Safety of AI Agents | |
| [University of Minnesota NLP seminar](https://cse.umn.edu/cs/events/nlp-seminar-enabling-human-centric-and-culturally-aware-safety-ai-agents) | 04/2026 |
| [CMU HCII seminar](https://hcii.cmu.edu/news/event/2026/03/hcii-seminar-series-maarten-sap) | 03/2026 |
| [Agentic AI Frontier Seminar](https://agentic-ai-frontier-seminar.github.io/) | 02/2026 |
| Responsible AI for Diverse Users and Cultures | |
| [Gender Bias in NLP workshop](https://web.archive.org/web/20250716212130/https://gebnlp-workshop.github.io/keynotes.html) at ACL 2025 | 08/2025 |
| Artificial Social Intelligence? On the challenges of Socially Aware and Ethically informed LLMs | |
| MIT CSAIL NLP seminar | 04/2025 |
| UCLA CS 269 Guest Lecture | 02/2025 |
| [Cluster of Excellence "Science of Intelligence" (SCIoI)](https://www.scienceofintelligence.de/) | 01/2025 |
| NeurIPS [New In ML](https://newinml.github.io/) workshop (*invited speaker*) | 12/2024 |
| University of Pittsburgh CS colloquium | 11/2024 |
| Columbia NLP seminar | 10/2024 |
| [NAE Frontiers of Engineering](https://www.naefrontiers.org/212813/2024-US-Frontiers-of-Engineering-Symposium) (*invited talk*) | 09/2024 |
| DSTA Faculty speaker series | 09/2024 |
| Aptima Brown Bag | 07/2024 |
| [CMU Agent Workshop 2024](https://cmu-agent-workshop.github.io/) (*invited speaker*) | 05/2024 |
| [UNC Chapel Hill Symposium on AI and Society](https://cs.unc.edu/event/symposium-on-ai-and-society/) | 04/2024 |
| How to Be a Smarter AI user | |
| [SxSW](https://schedule.sxsw.com/2025/events/PP154312) | 03/2025 |
| Rethinking the Role of AI in Counterspeech | |
| [First Workshop on Multilingual Counterspeech Generation](https://sites.google.com/view/multilang-counterspeech-gen/) at COLING 2025 (*invited speaker*) | 01/2025 |
| Developing Computational Analyses of the Social Aspects of Narratives | |
| EMNLP [Workshop on Narrative Understanding](https://sites.google.com/cs.stonybrook.edu/wnu2024) (*invited speaker*) | 11/2024 |
| [Princeton Workshop on Narrative Possibilities](https://anthropology.princeton.edu/events/workshop-narrative-possibilities) (*invited speaker*) | 06/2024 |
| Towards Socially Aware AI with Pragmatic Competence | |
| [ICML workshop on Theory of Mind](https://tomworkshop.github.io/) (*invited speaker*) | 07/2023 |
| The Pivotal Role of Social Context in Toxic Language Detection | |
| [ACL workshop on online abuse and harms](https://www.workshopononlineabuse.com/) (*invited speaker*) | 07/2023 |
| [Dealing with meaning variation Workshop](https://sites.google.com/view/dealingwithmeaningvariation/demeva-2023-public-kickoff) (*invited speaker*) | 10/2023 |
| Toward Prosocial NLP: Reasoning About And Responding to Toxicity in Language | |
| MIT Media Lab Breazeal Group Meeting | 11/2022 |
| CMU S3D Computational Social Science Seminar | 11/2022 |
| Amazon Alexa Trust & Privacy | 11/2022 |
| University of Minnesota NLP seminar | 10/2022 |
| Detecting and Rewriting Social Biases in Language | |
| Pinterest NLP seminar | 09/2022 |
| UIUC Responsible Data Science Seminar Series | 02/2022 |
| MilaNLP seminar at Università Bocconi | 10/2021 |
| [PAN workshop at CLEF](https://pan.webis.de/clef21/pan21-web/index.html) (*invited speaker*) | 09/2021 |
| Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection | |
| NAACL | 07/2022 |
| Text As Data (TADA) | 10/2021 |
| Positive AI with Social Commonsense Models | |
| The Web Conf [Workshop UserNLP: User-centered Natural Language Processing Workshop](https://caisa.informatik.uni-marburg.de/user_nlp.html) (*invited speaker*) | 04/2022 |
| AKBC [Workshop on Commonsense Reasoning](https://akbc-cskb.github.io/) (*invited speaker*) | 10/2021 |
| University of Toronto Computer Science | 04/2021 |
| MIT EECS | 03/2021 |
| CMU LTI/MLD | 03/2021 |
| UChicago CS | 03/2021 |
| TTIC | 02/2021 |
| Emory CS | 02/2021 |
| Vanderbilt CS | 02/2021 |
| EPFL I&C | 01/2021 |
| Yale Data Science & Statistics seminar | 01/2021 |
| PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction | |
| EMNLP conference | 11/2020 |
| Social Bias Frames: Reasoning About Social and Power Dynamics | |
| WeCNLP Summit | 10/2020 |
| ACL Conference | 07/2020 |
| Reasoning about Social Dynamics and Social Bias in Language | |
| SRI seminar | 01/2021 |
| Georgia Tech NLP seminar | 10/2020 |
| Berkeley NLP seminar | 02/2020 |
| Stanford NLP seminar | 02/2020 |
| Social and Ethical Considerations in English Toxic Language Detection | |
| NLP with Friends | 08/2020 |
| Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models | |
| ACL Conference | 07/2020 |
| COMET: Commonsense Transformers for Automatic Knowledge Graph Construction | |
| DARPA Communicating with Computers grant meeting | 11/2019 |
| Social IQa: Commonsense Reasoning about Social Interactions | |
| EMNLP conference | 11/2019 |
| The Risk of Racial Bias in Hate Speech Detection | |
| ACL Conference | 07/2019 |
| ICML Queer in AI workshop (*invited speaker*) | 06/2019 |
| ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning | |
| AAAI conference | 01/2019 |
| AI2 seminar | 01/2019 |
| Event2Mind: Commonsense Inference on Events, Intents, and Reactions | |
| DARPA Communicating with Computers grant meeting | 07/2018 |
| Detecting Implicit Bias in Text through Connotative Language | |
| UW Social Psychology seminar | 04/2018 |

News Coverage

##### [Training Proactive and Personalized LLM Agents](publications.html#sun2025trainingProactive) (2025)

- [MarkTechPost.com](https://www.marktechpost.com/2025/11/06/cmu-researchers-introduce-ppp-and-userville-to-train-proactive-and-personalized-llm-agents/)

##### [Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and Preferences](publications.html#zheng2025letThemDownEasy) (2025)

- [Forbes.com](https://web.archive.org/web/20250821173527/https://www.forbes.com/sites/victordey/2025/08/21/anthropics-claude-ai-can-now-end-abusive-conversations-for-model-welfare/)

##### [Words Like Knives: Backstory-Personalized Modeling and Detection of Violent Communication](publications.html#shen2025wordsLikeKnives) (2025)

- [WSJ.com](https://web.archive.org/web/20250717125734/https://www.wsj.com/tech/can-ai-solve-the-content-moderation-problem-9ff2ea09)
- [MSN.com](https://web.archive.org/web/20250717130024/https://www.msn.com/en-us/money/other/can-ai-solve-the-content-moderation-problem/ar-AA1IpXhO)

##### [AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents](publications.html#su2025ailiedar) (2025)

- [theregister.com](https://web.archive.org/web/20250502022235/https://www.theregister.com/2025/05/01/ai_models_lie_research/)
- [CNET.com](https://web.archive.org/web/20250503011903/https://www.cnet.com/tech/services-and-software/openai-yanked-a-chatgpt-update-heres-what-it-said-and-why-it-matters/)

##### [Rejected Dialects: Biases Against African American Language in Reward Models](publications.html#mire2025rejectedDialects) (2025)

- [IBM.com](https://perma.cc/YMZ4-2NAM)

##### [NLPositionality: Characterizing Design Biases of Datasets and Models](publications.html#santy2023NLPositionality) (2023)

- [marktechpost.com](https://web.archive.org/web/20230717174128/https://www.marktechpost.com/2023/07/14/a-research-group-from-cmu-ai2-and-university-of-washington-introduces-nlpositionality-an-ai-framework-for-characterizing-design-biases-and-quantifying-the-positionality-of-nlp-datasets-and-models/)

##### [ProsocialDialog: A Prosocial Backbone for Conversational Agents](publications.html#kim2022prosocialDialog) (2022)

- [sciencefocus.com](https://web.archive.org/web/20230327201644/https://www.sciencefocus.com/news/chatgpt-ted-lasso-internet-hate-speech/)

##### [Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs](publications.html#sap2022neuralToM) (2022)

- [nytimes.com](https://web.archive.org/web/20230327184321/https://www.nytimes.com/2023/03/27/science/ai-machine-learning-chatbots.html)

##### [Delphi: Towards Machine Ethics and Norms](publications.html#jiang2021delphi) (2021)

- [nytimes.com](https://web.archive.org/web/20230305234708/https://www.nytimes.com/2021/11/19/technology/can-a-machine-learn-morality.html)
- [vox.com](https://web.archive.org/web/20221229172748/https://www.vox.com/future-perfect/2021/10/27/22747333/artificial-intelligence-ethics-delphi-ai)
- [theguardian.com](https://web.archive.org/web/20230213231329/https://www.theguardian.com/technology/2021/nov/02/delphi-online-ai-bot-philosophy)
- [wired.com](https://web.archive.org/web/20230329222042/https://www.wired.com/story/program-give-ai-ethics-sometimes/)
- [geekwire.com](https://web.archive.org/web/20230310030100/https://www.geekwire.com/2021/teaching-artificial-intelligence-right-from-wrong-new-tool-from-ai2-aims-to-model-ethical-judgments/)
- [futurism.com](https://web.archive.org/web/20230325165850/https://www.futurism.com/delphi-ai-ethics-racist)
- [Nature Outlook](https://web.archive.org/web/20231027052346/https://www.nature.com/articles/d41586-023-03258-1)

##### [Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus](publications.html#dodge2021documentingC4) (2021)

- [Nature.com](https://web.archive.org/web/20240924194623/https://www.nature.com/articles/s43588-024-00695-4)
- [washingtonpost.com](https://web.archive.org/web/20230419120558/https://www.washingtonpost.com/technology/interactive/2023/ai-chatbot-learning/)
- [wired.com](https://web.archive.org/web/20230330110735/https://www.wired.com/story/review-ai-chatbots-bing-bard-chat-gpt/)
- [wired.com](https://web.archive.org/web/20230314202533/https://www.wired.com/story/efforts-make-text-ai-less-racist-terrible/)
- [unite.ai](https://web.archive.org/web/20230321034425/https://www.unite.ai/minority-voices-filtered-out-of-google-natural-language-processing-models/)

##### [Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts](publications.html#baheti2021justSayNo) (2021)

- [thenextweb.com](https://web.archive.org/web/20230404211115/https://www.thenextweb.com/news/gpt-3-and-humans-twice-as-likely-agree-with-offensive-reddit-comments-chatbots)

##### [DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts](publications.html#liu2021dexperts) (2021)

- [geekwire.com](https://web.archive.org/web/20230404211810/https://www.geekwire.com/2021/researchers-develop-new-way-help-machine-generated-language-systems-reduce-toxic-language/)

##### [RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models](publications.html#gehman2020realtoxicityprompts) (2020)

- [ieee.org](https://web.archive.org/web/20230321062841/https://spectrum.ieee.org/open-ais-powerful-text-generating-tool-is-ready-for-business)
- [fortune.com](https://web.archive.org/web/20230404205249/https://www.fortune.com/2020/09/29/artificial-intelligence-openai-gpt3-toxic/)
- [geekwire.com](https://web.archive.org/web/20230404212031/https://www.geekwire.com/2021/curse-neural-toxicity-ai2-uw-researchers-help-computers-watch-language/)
- [wired.com](https://web.archive.org/web/20230314171124/https://www.wired.com/story/ai-fueled-dungeon-game-got-much-darker/)

##### [The Risk of Racial Bias in Hate Speech Detection](publications.html#sap2019risk) (2019)

- [forbes.com](https://web.archive.org/web/20221218082924/https://www.forbes.com/sites/nicolemartin1/2019/08/13/googles-artificial-intelligence-hate-speech-detector-is-racially-biased/?sh=37fa3a53326c)
- [vox.com](https://web.archive.org/web/20230404205546/https://www.vox.com/recode/2019/8/15/20806384/social-media-hate-speech-bias-black-african-american-facebook-twitter)
- [observer.com](https://web.archive.org/web/20220818104542/https://www.observer.com/2019/08/google-ai-hate-speech-detector-black-racial-bias-twitter-study/)
- [fortune.com](https://web.archive.org/web/20221206075315/https://www.fortune.com/2019/08/16/google-jigsaw-perspective-racial-bias/)
- [techcrunch.com](https://web.archive.org/web/20221130020534/https://www.techcrunch.com/2019/08/14/racial-bias-observed-in-hate-speech-detection-algorithm-from-google/)
- [newscientist.com](https://www.newscientist.com/article/2213064-googles-hate-speech-detecting-ai-appears-to-be-racially-biased/)
- [breitbart.com](http://web.archive.org/web/20200620080022/https://www.breitbart.com/tech/2019/08/17/ai-is-1-5-times-more-likely-to-flag-social-media-posts-by-black-people-as-offensive/)

##### [Connotation Frames of Power and Agency in Modern Films](publications.html#sap2017connotation) (2017)

- [futurity.org](https://web.archive.org/web/20211205141552/https://www.futurity.org/movie-scripts-gender-bias-1605212/)
- [kuow.org](https://web.archive.org/web/20230404210655/https://www.kuow.org/stories/record-wednesday-november-15-2017/)
- [technologynetworks.com](https://web.archive.org/web/20221007062604/https://www.technologynetworks.com/informatics/news/scientists-use-machine-learning-to-analyze-language-in-movies-294179)
- [dailymail.co.uk](https://www.dailymail.co.uk/femail/article-5081501/How-Hollywood-films-FUEL-sexism.html)
- [phys.org](https://phys.org/news/2017-11-ai-tool-quantifies-power-imbalance.html)
- [electronics360.globalspec.com](https://electronics360.globalspec.com/article/10344/analyzing-gender-bias-with-ai)

##### [Sounding Board - University of Washington’s Alexa Prize Submission](publications.html#fang2017alexatechreport) (2017)

- [theverge.com](https://www.theverge.com/2018/6/13/17453994/amazon-alexa-prize-2018-competition-conversational-ai-chatbots)
- [geekwire.com](https://www.geekwire.com/2018/secrets-500k-amazon-alexa-prize-winner-inside-univ-washingtons-socialbot/)
- [wired.com](https://www.wired.com/story/inside-amazon-alexa-prize/)
- [q13fox.com](http://q13fox.com/2018/01/08/uw-students-create-conversational-amazon-alexa-device/)
- [dailyuw.com](http://www.dailyuw.com/science/article_bc135d10-f0f7-11e7-a1ab-0be114c44429.html)
- [seattletimes.com](https://www.seattletimes.com/seattle-news/education/uw-students-teach-alexa-to-have-a-little-chat-with-us/)
- [komonews.com](http://komonews.com/news/local/uw-team-wins-500000-prize-from-amazon-for-conversational-bot)
- [amazon.com](https://developer.amazon.com/blogs/alexa/post/1a6a19d8-e45d-4b3b-981d-776a378ba625/university-of-washington-students-win-inaugural-alexa-prize)
- [komonews.com](http://komonews.com/news/local/uw-team-finalist-for-1-million-prize-to-hold-20-minute-conversation-amazons-alexa)
- [geekwire.com](https://www.geekwire.com/2017/amazon-reveals-3-university-finalists-2-5m-alexa-prize-including-one-uw/)

##### Miscellaneous

- 2025
  - On personality of AI systems: [ScienceNews.com](https://web.archive.org/web/20250207163618/https://www.sciencenews.org/article/ai-chatbot-personalities)
  - On how to be a smarter AI user: [CNet.com](https://web.archive.org/web/20250314162336/https://www.cnet.com/tech/services-and-software/5-ways-to-stay-smart-when-using-gen-ai-explained-by-computer-science-professors/)
  - On AI's cultural awareness: [TechCrunch.com](https://web.archive.org/web/20250320221705/https://techcrunch.com/2025/03/20/ais-answers-on-china-differ-depending-on-the-language-analysis-finds/)
  - On social biases in AI video generation: [Wired.com](https://web.archive.org/web/20250324123430/https://www.wired.com/story/openai-sora-video-generator-bias/)
  - On antisemitism and biases in AI: [CNN.com](https://www.cnn.com/2025/07/15/tech/ai-artificial-intelligence-antisemitism)
- 2024
  - On evaluation of AI: [TheMarkup.org](https://web.archive.org/web/20240717143254/https://themarkup.org/artificial-intelligence/2024/07/17/everyone-is-judging-ai-by-these-tests-but-experts-say-theyre-close-to-meaningless)
- 2023
  - On "Sparks of AGI": [NYTimes.com](https://web.archive.org/web/20230516091858/https://www.nytimes.com/2023/05/16/technology/microsoft-ai-human-reasoning.html)
  - On AI social coaches: [TechCrunch.com](https://web.archive.org/web/20230520020144/https://techcrunch.com/2023/05/13/ai-relationship-building-amorai/)
  - On GPT-4 red-teaming: [FinancialTimes.com](https://web.archive.org/web/20230414132856/https://www.ft.com/content/0876687a-f8b7-4b39-b513-5fee942831e8?accessToken=zwAAAYd_9J4skc8Idmh6-LdLOdO1E1_ulCgx6A.MEQCIFehPSHqO7vjyrQmUHmZGujI6tVxlndevV5vIQGnWzENAiBJg7ltMLzzeyNNXxQC36cpuLwYQ9BB26_O2upfGLGyyw&segmentId=e95a9ae7-622c-6235-5f87-51e412b47e97&shareType=enterprise)
  - On users falling in love with ChatGPT: [Time.com](https://web.archive.org/web/20230224153642/https://www.time.com/6257790/ai-chatbots-love/)
- 2022
  - On openness of large LMs: [spectrum.ieee.org](https://web.archive.org/web/20230321062817/https://spectrum.ieee.org/large-language-models-meta-openai)