# [11-830 Computational Ethics in NLP]()

## **Spring 2023**

- **Time**: 11:00-12:20 Tuesdays & Thursdays
- **Place**: WEH 5403 (Wean Hall)
- **Canvas**: [https://canvas.cmu.edu/courses/32412](https://canvas.cmu.edu/courses/32412) (for discussions, assignments, questions, etc.)
- **Zoom**: see [Canvas](https://canvas.cmu.edu/courses/32412)
- For other (e.g., more personal) concerns, email the instructors: [11-830-instructors@cs.cmu.edu](mailto:11-830-instructors@cs.cmu.edu)

-----

### Summary

As language technologies have become increasingly prevalent, there is a growing awareness that decisions we make about our data, methods, and tools are often tied up with their impact on people and societies. This course introduces students to real-world applications of language technologies and the potential ethical implications associated with them. We discuss philosophical foundations of ethical research along with state-of-the-art techniques. Discussion topics include:

- **Philosophical foundations:** what is ethics, history, medical and psychological experiments, IRB and human subjects, ethical decision making.
- **Misrepresentation and bias:** algorithms to identify biases in models and data and adversarial approaches to debiasing.
- **Civility in communication:** techniques to monitor trolling, hate speech, abusive language, cyberbullying, toxic comments.
- **Democracy and the language of manipulation:** approaches to identify propaganda and manipulation in news, to identify fake news, political framing.
- **Privacy & security:** algorithms for demographic inference, personality profiling, and anonymization of demographic and personal traits.
- **NLP for Social Good:** Low-resource NLP, applications for disaster response and monitoring diseases, medical applications, psychological counseling, interfaces for accessibility.
- **Multidisciplinary perspective:** invited lectures from experts in behavioral and social sciences, rhetoric, etc.



Athiya (Tia) Deviyani

MSAII Student

Office hours: 1-2pm (Room TBD + Zoom)

Jeremiah Milbauer

PhD student, LTI

Office hours: 1:00 - 2:00 (GHC 6401 + Zoom)
Friday 10:00-11:00 (GHC 6401 + Zoom, 1:1, sign-up required)
See Canvas for Zoom link

----

## Syllabus

The syllabus is subject to change; this page will always show the latest version.

| Week | Date  | Theme                         | Topics                                                     | Instructor                                                                    | Readings | Assignments                                  |
|------|-------|-------------------------------|------------------------------------------------------------|-------------------------------------------------------------------------------|----------|----------------------------------------------|
| 1    | 01/17 | Introduction                  | Motivation, requirements and overview                      | Emma                                                                          |          |                                              |
|      | 01/19 | Introduction                  | Project expectations and ideas                             | Emma & Maarten                                                                |          |                                              |
| 2    | 01/24 | Foundations & History         | Overview: History of ethics, moral theories, etc.          | Maarten                                                                       |          |                                              |
|      | 01/26 | Foundations & History         | *Invited Talk* (virtual)                                   | [Sydney Levine](https://sites.google.com/site/sydneymlevine/)                 |          | Preproposals & project teams due Friday 1/27 |
| 3    | 01/31 | Foundations & History         | Human subjects research, IRB, crowdsourcing                | Maarten                                                                       |          |                                              |
|      | 02/02 | Foundations & History         | Discussion                                                 | Emma & Maarten                                                                |          | Proposal due Monday 2/06                     |
| 4    | 02/07 | Objectivity & Bias            | Overview: Stereotypes, prejudice, and discrimination       | Maarten                                                                       |          | HW 1 released                                |
|      | 02/09 | Objectivity & Bias            | Bias detection, debiasing, perspectivism                   | Maarten                                                                       |          |                                              |
| 5    | 02/14 | Objectivity & Bias            | Discussion                                                 | Emma & Maarten                                                                |          |                                              |
|      | 02/16 | Objectivity & Bias            | *Invited Talk*                                             | [David Widder](https://davidwidder.me/)                                       |          | HW 1 due                                     |
| 6    | 02/21 | Civility & Toxicity           | Overview: Online Toxicity, Hate Speech, Content Moderation | Maarten                                                                       |          | HW 2 released                                |
|      | 02/23 | Civility & Toxicity           | Toxicity in conversations                                  | Maarten                                                                       |          |                                              |
| 7    | 02/28 | Civility & Toxicity           | Detoxifying large LMs                                      | Maarten                                                                       |          |                                              |
|      | 03/02 | Civility & Toxicity           | Discussion                                                 | Emma & Maarten                                                                |          | HW 2 due                                     |
| 8    | 03/07 | No class - spring break       |                                                            |                                                                               |          |                                              |
|      | 03/09 | No class - spring break       |                                                            |                                                                               |          |                                              |
| 9    | 03/14 | Misinformation & Manipulation | Overview: Fake news                                        | Emma                                                                          |          |                                              |
|      | 03/16 | Misinformation & Manipulation | Discussion                                                 | Emma & Maarten                                                                |          |                                              |
| 10   | 03/21 | Midterm project check-ins     | ----                                                       | Emma & Maarten                                                                |          | Midterm project check-ins                    |
|      | 03/23 | Transparency & Open Science   | *Invited Talk*                                             | [Jesse Dodge](https://jessedodge.github.io/)                                  |          | HW 3 released                                |
| 11   | 03/28 | Misinformation & Manipulation | *Invited Talk* (virtual)                                   | [Saadia Gabriel](https://homes.cs.washington.edu/~skgabrie/)                  |          |                                              |
|      | 03/30 | Privacy, Profiling, Security  | Overview: Privacy and profiling                            | Emma                                                                          |          |                                              |
| 12   | 04/04 | Privacy, Profiling, Security  | *Invited Talk* (virtual)                                   | [Oluwaseyi Feyisetan](https://scholar.google.com/citations?user=XEnB9X4AAAAJ) |          | HW 3 due                                     |
|      | 04/06 | NLP for Social Good           | Overview: Challenges and pitfalls of NLP for social good   | Emma                                                                          |          |                                              |
| 13   | 04/11 | NLP for Social Good           | *Invited Talk*                                             | Jeremiah                                                                      |          |                                              |
|      | 04/13 | No class - Spring Carnival    |                                                            |                                                                               |          |                                              |
| 14   | 04/18 | NLP for Social Good           | Energy considerations of AI & Climate Change               | Emma                                                                          |          |                                              |
|      | 04/20 | NLP for Social Good           | Discussion                                                 | Emma & Maarten                                                                |          |                                              |
| 15   | 04/25 | Final project presentations   |                                                            | Emma & Maarten                                                                |          | Final project presentations                  |
|      | 04/27 | Final project presentations   |                                                            | Emma & Maarten                                                                |          | Final project presentations                  |

----

### Invited Speakers

Sydney Levine - Formal Models of Human Moral Flexibility

One of the most remarkable things about the human moral mind is its flexibility: we can make moral judgments about cases we have never seen before. Yet, on its face, morality often seems like a highly rigid system of clearly defined rules. Indeed, the past few decades of research in moral psychology have revealed that human moral judgment often depends on rules. But sometimes, it is morally appropriate to break the rules. And sometimes, new rules need to be created. The field of moral psychology is just now beginning to explore and understand this kind of flexibility. Meanwhile, the flexibility of the human moral mind poses a challenge for AI engineers. Current tools for building AI systems fall short of capturing moral flexibility and thus struggle to predict and produce human-like moral judgments in novel cases that the system hasn’t been trained on. I will present a series of experiments and models (inspired by theories from moral philosophy) that demonstrate and capture the human capacity for rule making and breaking. I propose that AI systems would benefit from formal models of human moral flexibility.

Sydney Levine is a Research Scientist at the Allen Institute for AI. She studies the cognitive science of moral judgment and thinks about how that can help us build artificial intelligence systems that can interpret, predict, and produce human-like moral decisions.

David Gray Widder - Talk TBA

David Gray Widder (he/him) studies how people creating “Artificial Intelligence” systems think about the downstream harms their systems make possible. He is a Doctoral Student in the School of Computer Science at Carnegie Mellon University, and previously worked at Intel Labs, Microsoft Research, and NASA’s Jet Propulsion Laboratory. He was born in Tillamook, Oregon and raised in Berlin and Singapore. He maintains a conceptual-realist artistic practice, advocates against police terror and pervasive surveillance, and enjoys distance running.

Jesse Dodge - Talk TBA

Jesse Dodge (he/him) is a Research Scientist at the Allen Institute for AI, on the AllenNLP team, working on natural language processing and machine learning. He is interested in the science of AI, and he works on reproducibility and efficiency in AI research. His research has highlighted the growing computational cost of AI systems, including the environmental impact of AI and inequality in the research community. He has worked extensively on improving transparency in AI research, including open sourcing and documenting datasets, data governance, and measuring bias in data. He has also worked on developing efficient methods, including model compression and improving efficiency of training large language models. His PhD is from the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University. He created the NLP Reproducibility Checklist, which has been used by five main NLP conferences, including EMNLP, NAACL, and ACL, totaling more than 10,000 submissions, and he has been an organizer for the ML Reproducibility Challenge since 2020. His research has won awards including a Best Student Paper at NAACL 2015 and a ten-year Test of Time award at ACL 2022, and is regularly covered by the press, including by outlets like The New York Times, Nature, MIT Tech Review, Wired, and others.

Saadia Gabriel - Talk TBA

Saadia Gabriel is a final-year PhD candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, where she is advised by Prof. Yejin Choi and Prof. Franziska Roesner. Her work focuses on measuring factuality and intent of human-written language. Specifically, she is interested in designing generalizable end-to-end modeling frameworks based upon objectives that are directly aligned with the underlying motivations of a task. Two key dimensions of machine reasoning that excite her are social commonsense reasoning and fairness in NLP. Previously she interned at SRI, in the Mosaic group at AI2, and at MSR.

Oluwaseyi Feyisetan - Talk TBA

Oluwaseyi Feyisetan is an Applied Scientist at Amazon Alexa where he works on Differential Privacy and Privacy Auditing mechanisms within the context of Natural Language Processing. He holds 2 pending patents with Amazon on preserving privacy in NLP systems. He completed his PhD at the University of Southampton in the UK and has published in top-tier conferences and journals on crowdsourcing, homomorphic encryption, and privacy in the context of Active Learning and NLP. He has served as a reviewer at top NLP conferences including ACL and EMNLP. He is the lead organizer of the Workshop on Privacy and Natural Language Processing (PrivateNLP) at WSDM with an upcoming event scheduled for EMNLP. Prior to working at Amazon in the US, he spent 7 years in the UK where he worked at different startups and institutions focusing on regulatory compliance, machine learning and NLP within the finance sector, most recently at the Bank of America.

----

### Grading

Grades are based on a combination of homework and participation (individual) and a semester-long class project (group). All assignments are due at 11:59pm Eastern on the specified date.

- **Homework assignments.** (3 assignments; 15% each) Each assignment contains a combination of coding, analysis, and discussion. For each assignment, completing the baseline requirements will earn a passing (B-range) grade. A-range grades can be earned by completing the open-ended “Advanced Analysis” part of the assignment. Assignments are not necessarily designed to focus on technical solutions, but instead to encourage students to think critically about the course material and understand how to approach ethical problems in NLP, while also allowing for exploration of various methodologies.
- **Project.** (30%) A semester-long 3-person team project (more details below and in class).
- **Participation.** (25%) Classes will include discussions of reading assignments. Students will be expected to read relevant papers and participate in class discussions. Participation points can also be earned by posting interesting questions and useful answers on Canvas/Slack.

----

### Projects

A major component of this course is a team project. It will be a substantial research effort carried out by each group of students (expected group size = 3; 2-4 is acceptable). You can find some project ideas and resources [here](https://docs.google.com/document/d/1PfUU79IyiQUlbENwthkjbg4pTE-hAhXX9p7PwQLTNhg/edit?usp=sharing). Please use the "Find project groups" discussion in Canvas to find classmates with complementary interests and form groups.

There will be a number of project milestones throughout the semester:

- **Pre-proposal:** (2%) Brainstorming phase: pick two or three project ideas and flesh them out in 1-2 paragraphs describing the focus area of the project along with quick-and-dirty method descriptions and stretch goals. Also, define team members.
- **Proposal:** (8%) Pick a project: A 2-3 page document ([ACL format](https://github.com/acl-org/acl-style-files)) containing a literature review, concrete problem definition, identification of baseline models, and ideas for final models. Sections should include Introduction, Related Work, Data, Baseline, Proposed Approach. Baselines should be clearly defined but do not need to be implemented yet.
- **Midterm check-ins:** (2%) An in-class presentation of the project and current progress. The presentation should include the problem definition, baseline models and results, and a description of proposed models.
- **Final Presentations:** (6%) In-class presentations of the project will be held during the last week of classes.
- **Final Report:** (12%) A final project report will be due the following week. We will use the 8-page ACL Rolling Review [format](https://github.com/acl-org/acl-style-files), [author guidelines](https://aclrollingreview.org/authors), and [Code of Ethics](https://www.aclweb.org/portal/content/acl-code-ethics).

----

### Discussion format

Some of the lectures will be discussion-based. To make the discussion more lively and engaging, we will adopt the role-playing reading roles from [Alec Jacobson and Colin Raffel](https://colinraffel.com/blog/role-playing-seminar.html), inspired by Aditi Raghunathan's [15-884 class](https://www.cs.cmu.edu/~aditirag/teaching/15-884F22.html).

Each student will play one of the following roles:

- *Positive reviewer*: who advocates for the paper to be accepted at a conference (e.g., ACL, EMNLP, FAccT).
- *Negative reviewer*: who advocates for the paper to be rejected at a conference (e.g., ACL, EMNLP, FAccT).
- *Archaeologist*: who determines where this paper sits in the context of previous and subsequent work. They must find and report on at least one older paper cited within the current paper that substantially influenced the current paper and at least one newer paper that cites this current paper.
  Keep an eye out for follow-up work that contradicts the takeaways in the current paper.
- *Academic researcher*: who proposes potential follow-up projects that not only build on the current paper but are only possible because of its existence and success.
- *Visitor from the past*: who is a researcher from the early 2000s. They must discuss how they comprehend the results of the paper, what they like or dislike about the settings and benchmarks considered, and what surprises them the most about the presented results.

Discussion grading:

- You will be automatically assigned a role for each discussion class.
- **Before each discussion class**, you will post your talking points on Canvas (due the day before class). These notes should include your role and 2-4 talking points. Make sure to go into enough depth that others can understand what you're saying (simply adding more shallow talking points will not lead to a better grade).
- **After each discussion class**, you will reply to your notes and briefly discuss one reflection or point that you had not previously thought about. This can be significantly shorter than the initial post.

----

### Policies

**Late policy.** Each student will have 4 total late days that may be used for HW assignments at any point during the semester. Once the 4 days have been used up, you can still submit your assignments late, but with a 20% penalty (i.e., the maximum grade is 80%), and not after the next assignment is due or after finals week. *Note: late days may not be used for project benchmarks.*

**Academic honesty.** Homework assignments are to be completed individually. Verbal collaboration on homework assignments is acceptable, as is re-implementation of relevant algorithms from research papers, but everything you turn in must be your own work, and you must note the names of anyone you collaborated with on each problem and cite resources that you used to learn about the problem. The project is to be completed by a team.
You are encouraged to use existing NLP components in your project; you must acknowledge these appropriately in the documentation. Suspected violations of academic integrity rules will be handled in accordance with the [CMU guidelines on collaboration and cheating](http://www.cmu.edu/policies/documents/Cheating.html).

**AI-assisted writing policy.** We will follow the [ACL 2023 policy](https://2023.aclweb.org/blog/ACL-2023-policy/) on the use of AI assistants for writing. This means you may use AI tools for checking your writing and correcting your grammar; any usage beyond that (idea generation, literature review, etc.) is strongly discouraged or forbidden; see the policy for details.

**Accommodations for students with disabilities.** If you have a disability and have an accommodations letter from the Disability Resources office, we encourage you to discuss your accommodations and needs with the instructors as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, we encourage you to contact them at [access@andrew.cmu.edu](mailto:access@andrew.cmu.edu).

----

### Note to students

**Take care of yourself!** As a student, you may experience a range of challenges that can interfere with learning, such as strained relationships, increased anxiety, substance use, feeling down, difficulty concentrating and/or lack of motivation. This is normal, and all of us benefit from support during times of struggle. There are many helpful resources available on campus and an important part of a healthy life is learning how to ask for help. Asking for support sooner rather than later is almost always helpful. CMU services are available free to students, and treatment does work.
You can learn more about confidential mental health services available on campus through [Counseling and Psychological Services (CaPS)](https://www.cmu.edu/counseling/). Support is always available (24/7) at: 412-268-2922.

**Take care of your classmates and instructors!** In this class, every individual will and must be treated with respect. The ways we are diverse are many and are fundamental to building and maintaining an equitable and inclusive campus community. These include but are not limited to: race, color, national origin, caste, sex, disability (visible or invisible), age, sexual orientation, gender identity, religion, creed, ancestry, belief, veteran status, or genetic information. Research shows that greater diversity across individuals leads to greater creativity in the group. We at CMU work to promote diversity, equity and inclusion not only because it is necessary for excellence and innovation, but because it is just. Therefore, while we are imperfect, we ask you all to fully commit to work, both inside and outside of our classrooms, to increase our commitment to build and sustain a campus community that embraces these core values. It is the responsibility of each of us to create a safer and more inclusive environment. Incidents of bias or discrimination, whether intentional or unintentional in their occurrence, contribute to creating an unwelcoming environment for individuals and groups at the university. If you experience or observe unfair or hostile treatment on the basis of identity, we encourage you to speak out for justice and offer support in the moment and/or share your experience using the following resources:

- [Center for Student Diversity and Inclusion](https://www.cmu.edu/student-diversity/): [csdi@andrew.cmu.edu](mailto:csdi@andrew.cmu.edu), (412) 268-2150
- [CMU anonymous reporting hotline](https://secure.ethicspoint.com/domain/media/en/gui/81082/index.html), (844) 587-0793