# 11-430/830 Ethics, Safety, and Social Impact in NLP and LLMs
## **Spring 2026**
- **Time**: 11:00-12:20 Tuesdays & Thursdays
- **Place**: BH A36
- **Canvas**: [https://canvas.cmu.edu/courses/51508](https://canvas.cmu.edu/courses/51508) (for discussions, assignments, questions, etc.)
- **Zoom**: Only by request, 2 days in advance.
- For other concerns (e.g., more personal matters), email the instructors ([instructors-11-830@andrew.cmu.edu](mailto:instructors-11-830@andrew.cmu.edu)).
-----
### Summary
As language technologies have become increasingly prevalent, there is a growing awareness that decisions we make about our data, methods, and tools are often tied up with their impact on people and societies.
This course introduces students to real-world applications of language technologies and the potential ethical implications associated with them.
We discuss philosophical foundations of ethical research along with advanced state-of-the-art techniques. Discussion topics include:
- **Philosophical foundations:** ethical philosophies, history of human experiments, AI alignment and morality.
- **Bias, Misrepresentation, Alignment:** psychological origins of biases in humans, technical sources of AI biases, examples of and limitations to debiasing.
- **Toxicity and Safety of LLMs**: origins of toxicity, unsafe content in pretraining data, safety techniques for AI systems.
- **Language of manipulation:** origins of misinformation and manipulation, persuasion in AI language.
- **Privacy and Profiling:** algorithms for demographic inference, personality profiling, and anonymization of demographic and personal traits.
- **AI for Social Good:** Energy considerations of AI, global and multilingual AI, human-centered AI, AI for positive social impact.
- **Multidisciplinary perspective:** invited lectures from experts in behavioral and social sciences, rhetoric, etc.
----
### Schedule
|Week|Date|Theme|Topics|Assignments|
|---|---|---|---|---|
|1|01/13|Introduction|Motivation, requirements and overview||
||01/15|Introduction|Project expectations and ideas||
|2|01/20|Philosophical Foundations & History|Overview: Moral theories, trolley problem, history of ethics, etc.||
||01/22|Philosophical Foundations & History|Human subjects, IRB and crowdsourcing||
|3|01/27|Philosophical Foundations & History|Paper Discussion||
||01/29|Objectivity and Bias|Stereotypes, prejudice, and discrimination: background|Preproposals & project teams due|
|4|02/03|Objectivity and Bias|Bias in AI systems||
||02/05|Objectivity and Bias|Paper Discussion|Paper summary due (day before class)|
|5|02/10|Toxicity and Safety of LLMs|Overview: Hate speech, Toxicity, Hate speech detection|HW1 due|
||02/12|Toxicity and Safety of LLMs|Origins of LM toxicity and dataset filtering||
|6|02/17|Toxicity and Safety of LLMs|LLM safeguarding|Proposal due|
||02/19|Toxicity and Safety of LLMs|Safety benchmarking, red-teaming, and jailbreaking||
|7|02/24|Toxicity and Safety of LLMs|Paper Discussion|Paper summary due (day before class)|
||02/26|Guest lecture|Guest lecture: TAs|HW2 due|
|--|03/03|--- *Spring break* ---|---||
||03/05|--- *Spring break* ---|---||
|8|03/10|LLM persuasion and manipulation|Overview & history of persuasion and information manipulation||
||03/12|LLM persuasion and manipulation|LLM persuasion and manipulation|HW3 due|
|9|03/17|LLM persuasion and manipulation|Paper Discussion|Paper summary due (day before class)|
||03/19|Midterm project check-ins|Midterm project check-ins|Midterm project check-ins|
|10|03/24|Privacy, Profiling, Security|Overview: Privacy and profiling||
||03/26|Privacy, Profiling, Security|*Guest lecture: [Tianshi Li](https://tianshili.me/): LLMs and Privacy*||
|11|03/31|Privacy, Profiling, Security|Paper Discussion|Paper summary due (day before class)|
||04/02|No class -- Spring Carnival|...||
|12|04/07|NLP for Social Good|NLP for social good overview, pitfalls, challenges||
||04/09|NLP for Social Good|*Guest lecture: TBD*||
|13|04/14|NLP for Social Good|*Guest lecture: TBD*||
||04/16|NLP for Social Good|*Guest lecture: TBD*|HW4 due (*extra credit*)|
|14|04/21|Final project presentations||Final project presentations|
||04/23|Final project presentations||Final project presentations|
|15|04/28|---|---|Final reports due|
----
### Grading
Grades are based on a combination of homework and participation (individual) and a semester-long class project (group). All assignments are due at 11:59pm Eastern on the specified date.
- **Homework assignments (3 assignments; 40% total).** Each assignment contains a combination of coding, analysis, and discussion. For each assignment, completing the baseline requirements will obtain a passing (B-range) grade. A-range grades can be obtained through completing the open-ended “Advanced Analysis” part of the assignment. Assignments are not necessarily designed to focus on technical solutions, but instead to encourage students to think critically about the course material and understand how to approach ethical problems in NLP, while also allowing for exploration of various methodologies.
- **Project (30%).** A semester-long team project (expected group size is 3; 2-4 is acceptable; more details below and in class).
- **Discussions (20%).** Classes will include discussions of reading assignments. Students will be expected to read relevant papers and participate in class discussions.
- **Attendance (10%).** Attending classes, including lectures and guest lectures, is an expected part of this class.
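
As a rough illustration of how the four components above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale, and it does not model the extra-credit homework, rounding, or letter-grade cutoffs, none of which are specified here. The names `WEIGHTS` and `course_total` are hypothetical and are not part of any course tooling.

```python
# Weights in percentage points, taken from the grading breakdown above.
WEIGHTS = {"homework": 40, "project": 30, "discussions": 20, "attendance": 10}


def course_total(scores: dict[str, float]) -> float:
    """Return the weighted course total on a 0-100 scale.

    `scores` maps each component name to a score between 0 and 100
    (an assumption made for this sketch).
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical student: strong attendance, weaker homework.
example = {"homework": 85, "project": 90, "discussions": 95, "attendance": 100}
print(course_total(example))  # prints 90.0
```

In this example the homework score contributes 34 of the 90 points, the largest share of any single component.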
----
### Attendance Policy
Since you signed up for this class, you are expected to attend all lectures. However, we all know that sometimes things get in the way.
We will require you to check in at each class, but you get 4 unexcused absences without a warning (i.e., absences for which you don't have to explain why you're not there). You get as many excused absences (e.g., due to sickness) as needed, per CMU university policy. Note that for longer-term absences and accommodations (e.g., long-term sickness, taking care of family members), you will need to contact CMU's [Office of Disability Resources](https://www.cmu.edu/disability-resources/).
For discussion classes and project milestones (e.g., mid-semester check-in, final project presentation), you must be present; no unexcused absences are allowed. As usual, please email the instructors if there are conflicts.
Also note, if you arrive 5+ minutes late you will be marked as absent for that class.
----
### Projects
A major component of this course is a team project. It will be a substantial research effort carried out by each group of students (expected group size = 3; 2-4 is acceptable). You can find some project ideas and resources [here](https://docs.google.com/document/d/1PfUU79IyiQUlbENwthkjbg4pTE-hAhXX9p7PwQLTNhg/edit?usp=sharing). Please use the "Find project groups" discussion in Canvas to find classmates with complementary interests and form groups. There will be a number of project milestones throughout the semester:
- **Pre-proposal:** (2%) Brainstorming phase: pick two or three project ideas and flesh them out in 1-2 paragraphs describing the focus area of the project along with quick-and-dirty method descriptions and stretch goals. Also, define team members.
- **Proposal:** (8%) Pick a project: A 2-3 page document ([ACL format](https://github.com/acl-org/acl-style-files)) containing a literature review, concrete problem definition and bias statement, evaluation criteria, identification of baseline models, and ideas for final models. Sections should include Introduction (incl. Bias Statement), Related Work, Data, Evaluation, Baseline, Proposed Approach. Baselines should be clearly defined but do not need to be implemented yet.
- **Midterm check-ins:** (2%) An in-class presentation of project and current progress. Presentation should include problem definition, baseline models and results, and description of proposed models.
- **Final Presentations:** (6%) In-class presentations of the project will be held during the last week of classes. See here for [rubric and tips on how to make your final presentation](projectRubric.html).
- **Final Report:** (12%) A final project report will be due the following week. We will use the 8-page ACL Rolling Review [format](https://github.com/acl-org/acl-style-files), [author guidelines](https://aclrollingreview.org/authors), and [Code of Ethics](https://www.aclweb.org/portal/content/acl-code-ethics). See [here for a rubric](projectRubric.html).
----
### Discussion format
Some of the lectures will be discussion-based. Each student will be assigned a paper and a role, and students with the same paper will be in the same group. You will prepare a summary according to that role before class, and during class you will create a short presentation for the entire class, which will then be used to seed a broader entire-class discussion. After class, you will post a reflection on the discussion you just had.
#### Discussion roles
To make the discussion more lively and engaging, we will adopt reading roles. Each student will play one of the following roles:
- *Citation Trail Archaeologist*: who places the paper in the broader research lineage by identifying key prior influence(s) and meaningful later response(s).
    - Identify **one key predecessor** (an older paper cited by the current paper) that substantially shaped the current paper’s problem framing, method, or evaluation.
    - In 2–4 bullet points: summarize the predecessor’s relevant idea/result and explicitly state what the current paper inherits/adapts.
    - Identify **one key successor** (a newer paper that cites the current paper) that meaningfully extends, replicates, operationalizes, or critiques it.
    - In 2–4 bullet points: summarize what the successor changes (e.g., metric, dataset, threat model, method) and whether it strengthens or weakens the current paper’s conclusions.
    - End with **one “lineage takeaway”**: one sentence on how the conversation changed from predecessor → current paper → successor.
- *Within-Week Connector*: who connects the assigned paper to the other papers being discussed this week and identifies tensions worth debating.
    - Choose **two** of the other assigned papers for the week.
    - For each chosen paper, write 3–5 bullet points addressing:
        - **Connection:** what shared question/setting/assumption makes these papers comparable?
        - **Key difference:** where do they diverge (definitions, threat model, measurement/metric, dataset/population, intervention point, conclusions)?
        - **Implication:** what would each paper say in response to the other?
    - Provide **one cross-paper discussion question** that forces the class to compare assumptions or evidence across papers.
- *Evaluation & Validity Auditor*: who audits whether the paper’s evaluation and evidence justify its headline claims. Focus on *validity threats* rather than generic “limitations.”
    - Identify one main claim of the paper and outline the “evaluation chain” (dataset/population → metric(s) → conclusion).
    - Discuss **at least two** concrete validity threats (e.g., construct/metric mismatch, baseline unfairness, confounds, distribution shift/external validity, subgroup validity).
    - For each validity threat, propose a feasible fix (an additional experiment, alternative metric, new slice, ablation, revised annotation protocol, etc.) and explain how it could change the conclusions.
- *Next-Step Study Designer*: who proposes one follow-up study that is only possible/meaningful because this paper exists, specified enough that someone could actually run it.
    - State a clear **motivation**, **research question** and (if applicable) **a hypothesis**.
    - Describe a **minimal study design**: dataset/population, metric(s), baseline(s)/comparisons, and what result would change your mind (a falsification criterion).
- *Methodology Challenger (Alternative Methods Advocate)*: who proposes a plausible alternative way to study the same question (method, measure, or study design) and argues how it could change the paper’s conclusions.
    - Identify **one key methodological choice** the paper relies on (e.g., study setting, participants, tasks, dataset/labels, metric, analysis).
    - Propose **one concrete alternative methodology** (e.g., field deployment instead of lab study; interviews instead of survey; different operationalization/metric; different comparison condition) and what you’d measure instead.
    - In 2–4 bullet points: explain **what validity risk** the alternative addresses and **how results might differ** (what outcome would strengthen vs weaken the paper’s claim).
    - End with **one discussion question** that forces the class to compare the paper’s method to your alternative using a specific result from the paper.

**Anchor requirements (apply to every role write-up):**
- In your write-up, include **at least two anchors** to the assigned paper:
    1) **One artifact anchor**: a specific figure/table/algorithm box/appendix result (e.g., “Table 2, row X…”), and
    2) **One design-choice anchor**: a concrete methodological choice (dataset, labeling procedure, model family, threat model assumption, metric definition, prompt format, filtering policy, etc.).
- Role write-ups should be primarily bullet points and must include **one discussion question** that can’t be answered without referencing the paper’s specifics.
#### In-class discussion
- **20–30 minutes: small-group (“in-group”) discussion**
    - Each role should briefly share the key points from their notes (aim for ~2–3 minutes per role).
    - As a group, you will prepare a **very short slide deck** (intended to support discussion, not a formal presentation). Your slides should include only what the rest of the class needs in order to engage with your discussion question.
    - Please include the following (you may use 1–3 slides total):
        - **Paper title** (and author/year)
        - **Main research question / objective** + **main method/approach** (1–2 bullets each)
        - **Main findings / takeaways** (2–4 bullets)
        - **Key evidence**: at least one figure/table/result you think is central (you may paste a screenshot with a one-sentence caption)
        - **Your discussion question** (the question you want the class to spend time on)
- **50–60 minutes: full-class discussion**
    - For each paper:
        - The group will give a **brief presentation (max 2 minutes)** using their slides, ending by stating their discussion question.
        - We will then spend **~8–10 minutes** in full-class discussion. Discussion may include clarifying questions, critiques of assumptions/metrics, connections to other papers, and attempts to answer the group’s seed question.
    - **Participation expectation:** every student should contribute to the full-class discussion at least once during the class meeting (e.g., ask a question, offer a critique, connect two papers, or respond to a peer’s point).
- **Last 5 minutes: reflection**
    - Students will write a brief reflection identifying **one new thing** they learned from the discussion (conceptual insight, changed view, connection across papers, or a question they are still thinking about).
#### Discussion grading
- Pre-discussion summary (due day before class): 30%
- In-class presentation (graded as a group): 30%
- In-class participation (graded individually): 20%
- After-class reflection (due at midnight on the day of the discussion class): 20%
----
### Policies
**Late days policy.** There are no penalty-free late days for any assignment, whether homework or project deadlines. For projects, any lateness will result in a zero for that milestone or assignment. For homework assignments, each late day incurs a 10% deduction (e.g., if you are 2 days late, you will lose 20% of your grade). You have a 30-minute grace period before the lateness deduction is applied: you can submit up to 30 minutes after the deadline without any penalty, but a submission any later than that incurs the first 10% deduction.
However, if you need an extension for project or homework deadlines, you are free to ask the instructors and we will review on a case-by-case basis, but *you must email at least 24h before the deadline*.
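
To make the homework late-penalty arithmetic concrete, here is a minimal Python sketch under stated assumptions. It counts late days from the original deadline (not from the end of the grace period), treats any started day as a full late day, and caps the deduction at 100%. The policy text does not spell out these details, so treat them as one plausible reading rather than the official rule. The names `late_deduction`, `GRACE`, and `PENALTY_PER_DAY` are hypothetical.

```python
import math
from datetime import datetime, timedelta

GRACE = timedelta(minutes=30)   # grace period stated in the policy
PENALTY_PER_DAY = 0.10          # 10% per late day, as stated in the policy


def late_deduction(deadline: datetime, submitted: datetime) -> float:
    """Return the fraction of the homework grade deducted for lateness.

    Assumptions of this sketch: late days are counted from the deadline,
    partial days round up, and the deduction is capped at 1.0.
    """
    lateness = submitted - deadline
    if lateness <= GRACE:
        return 0.0  # on time, or within the 30-minute grace period
    days_late = math.ceil(lateness / timedelta(days=1))
    return min(1.0, round(PENALTY_PER_DAY * days_late, 2))


deadline = datetime(2026, 2, 10, 23, 59)  # 11:59pm on a hypothetical due date
print(late_deduction(deadline, deadline + timedelta(minutes=20)))       # prints 0.0
print(late_deduction(deadline, deadline + timedelta(minutes=31)))       # prints 0.1
print(late_deduction(deadline, deadline + timedelta(days=1, hours=1)))  # prints 0.2
```

This sketch applies to homework only; under the policy above, a late project milestone receives a zero regardless of how late it is.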
**Academic honesty.** Homework assignments are to be completed individually. Verbal collaboration on homework assignments is acceptable, as well as re-implementation of relevant algorithms from research papers, but everything you turn in must be your own work, and you must note the names of anyone you collaborated with on each problem and cite resources that you used to learn about the problem. The project is to be completed by a team. You are encouraged to use existing NLP components in your project; you must acknowledge these appropriately in the documentation. Suspected violations of academic integrity rules will be handled in accordance with the [CMU guidelines on collaboration and cheating](http://www.cmu.edu/policies/documents/Cheating.html).
**AI-assisted writing policy.** You are allowed to use LLMs and generative AI in your assignments and projects, unless specified otherwise. However, you are responsible for all of the AI's outputs as if they were your own, so verify the outputs, make sure they are detailed enough, etc. Also, **you must disclose in your write-ups how you used GenAI in your workflows** in a detailed manner (e.g., help with grammar correction, automating implementation of certain functions, turning bullets into prose, finding relevant papers for topic X, etc.); do not simply say things like "help with coding" or "help with writing", as that is not specific enough.
**Accommodations for students with disabilities.** If you have a disability and have an accommodations letter from the Disability Resources office, we encourage you to discuss your accommodations and needs with the instructors as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the [Office of Disability Resources](https://www.cmu.edu/disability-resources/), we encourage you to contact them via their website.
----
### Prerequisites and required skills
Since the focus is on critical thinking about AI and NLP technologies, this course is meant to be accessible to many people; only basic experience with machine learning or natural language processing is required.
Additionally, projects can have varying degrees of computational or algorithmic components, and can be qualitative in nature.
To help you calibrate, **in the past, several HCII students have taken the course and done well**.
----
### Note to students
**Take care of yourself!** As a student, you may experience a range of challenges that can interfere with learning, such as strained relationships, increased anxiety, substance use, feeling down, difficulty concentrating and/or lack of motivation. This is normal, and all of us benefit from support during times of struggle. There are many helpful resources available on campus and an important part of a healthy life is learning how to ask for help. Asking for support sooner rather than later is almost always helpful. CMU services are available free to students, and treatment does work. You can learn more about confidential mental health services available on campus through [Counseling and Psychological Services (CaPS)](https://www.cmu.edu/counseling/). Support is always available (24/7) at: 412-268-2922.
**Take care of your classmates and instructors!** In this class, every individual will and must be treated with respect. The ways we are diverse are many and are fundamental to building and maintaining an equitable and an inclusive campus community. These include but are not limited to: race, color, national origin, caste, sex, disability (visible or invisible), age, sexual orientation, gender identity, religion, creed, ancestry, belief, veteran status, or genetic information.
Research shows that greater diversity across individuals leads to greater creativity in the group. We at CMU work to promote diversity, equity and inclusion not only because it is necessary for excellence and innovation, but because it is just. Therefore, while we are imperfect, we ask you all to fully commit to working, both inside and outside of our classrooms, to build and sustain a campus community that embraces these core values. It is the responsibility of each of us to create a safer and more inclusive environment. Incidents of bias or discrimination, whether intentional or unintentional in their occurrence, contribute to creating an unwelcoming environment for individuals and groups at the university. If you experience or observe unfair or hostile treatment on the basis of identity, we encourage you to speak out for justice and offer support in the moment and/or share your experience using the following resources:
- [Center for Student Diversity and Inclusion](https://www.cmu.edu/student-diversity/): [csdi@andrew.cmu.edu](mailto:csdi@andrew.cmu.edu), (412) 268-2150
- [CMU anonymous reporting hotline](https://secure.ethicspoint.com/domain/media/en/gui/81082/index.html), (844) 587-0793