- In class: syllabus overview, intro/transcription exercise
- Before class:
- Read: Li-Young Lee, “Persimmons”
- HW0 due: Video intro (on Canvas)
- In class: close reading and Voyant exercise
- Before class:
- Read: Farhad Manjoo, "How Do You Know a Human Wrote This?"
- Spend at least 30 minutes playing AI Dungeon
- HW1 due: Introduce yourself with GPT-2 (on Canvas)
- In class: demo JupyterHub / Jupyter Notebook (class notebook)
- Before class:
- Read: Michael Witmore, “Text: A Massively Addressable Object”
- Read: Emily M. Bender and Timnit Gebru et al., “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?”
- In class: discussion and more intro to Jupyter / Python (class notebook)
- Before class:
- Read: Lilly Irani, “Justice for ‘Data Janitors’”
- Watch: Andrew Norman Wilson, "Workers Leaving the Googleplex"
- Read: Astrid Smith and Bridget Whearty, “All the Work You Do Not See” (Canvas)
- HW2 due: Intro to Python / Strings (HW2 notebook)
- In class: getting started with web scraping and HTML (class notebook and completed version)
- Before class:
- Read: Astead Herndon et al., “What Do Rally Playlists Say About the Candidates?”
- Read: Hanah Anderson and Matt Daniels, “Film Dialogue”
- In class: web scraping on the open web (class notebook and completed version)
- Before class:
- Read: Xavier Adam, “An Illustrated Introduction to APIs” and “API Whispering 101”
- Quiz 1 due: Scraping song lyrics from Genius.com
- In class: APIs (class notebook #1: API intro (completed version) and class notebook #2: Genius API (completed version))
- Before class:
- Read: Joel Spolsky, “The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)”
- Optional more technical version of the previous post: David Zentgraf, "What Every Programmer Absolutely, Positively Needs to Know about Encodings and Character Sets to Work with Text"
- Also optional but very interesting: Miriam Sweeney and Kelsea Whaley, “Technically White: Emoji Skin-Tone Modifiers as American Technoculture”
- In class: text parsing and regex with your song lyrics (class notebook #1: regex intro (completed version) and class notebook #2: regex w/ song lyrics (completed version))
- Before class:
- Read: Ethan Reed, “Measured Unrest in the Poetry of the Black Arts Movement”, "Poems with Pattern and VADER, Part 1: Quincy Troupe", "Poems with Pattern and VADER, Part 2: Nikki Giovanni"
- Read: Sujay Khandekar et al., “Opico: A Study of Emoji-first Communication in a Mobile Social App”
- Quiz 2 due: Scraping song lyrics using the Genius API
- In class: final lyrics scraping (notebook and complete notebook) and sentiment analysis (notebook and complete notebook)
- Before class:
- Read: Patrick Juola, “How a Computer Program Helped Show J.K. Rowling Wrote A Cuckoo’s Calling”
- Read: Milo Beckman, “These are the Phrases Each GOP Candidate Uses Most”
- Quiz 3 due: Sentiment analysis of your song lyrics
- In class: NLP with spaCy (notebook and complete notebook)
- Before class:
- Read: Maarten Sap et al., “Connotation Frames of Power and Agency in Modern Films”
- Read: Maria Antoniak et al., “Narrative Paths and Negotiation of Power in Birth Stories”
- In class: Guest lecture, Maria Antoniak, Cornell
- Before class:
- Optional: Daniel Jurafsky & James H. Martin, "Vector Semantics & Embeddings": SECTIONS 6-6.3
- In class: intro to scikit-learn (notebook and complete notebook)
- Before class:
- Read: Matt Daniels, “The Language of Hip Hop”
- Optional: Daniel Jurafsky & James H. Martin, "Vector Semantics & Embeddings": SECTIONS 6.5-6.6
- Optional: Lauren Klein, "Dimensions of Scale"
- In class: TF-IDF (notebook and complete notebook)
- Before class:
- Read: Lucy Li and David Bamman, “Gender and Representation Bias in GPT-3 Generated Stories”
- Optional: Richard Jean So, “Consecration: The Canon and Racial Inequality,” from Redlining Culture (Canvas)
- In class: topic modeling (notebook)
- Before class:
- Read: Lauren Klein and Sandeep Soni, “How Words Lead to Justice”
- Optional: Laura K. Nelson, “Leveraging the Alignment Between Machine Learning and Intersectionality” (Canvas)
- HW3 due: topic modeling notebook
- In class: word embeddings (notebook)
- Before class:
- Quiz 4 due: Exploratory research exercise
- In class: Guest lecture, Dr. Sandeep Soni, UC Berkeley
- Before class:
- Read: Anelise Hanson Shrout, "(Re)Humanizing Data: Digitally Navigating the Bellevue Almshouse"
- Optional: Jessica Marie Johnson, “Markup Bodies” (Canvas)
- In class: pandas (notebook); project brainstorming session
- Before class:
- Read: Timnit Gebru et al., “Datasheets for Datasets”
- Optional: Catherine D’Ignazio and Lauren Klein, “The Numbers Don’t Speak for Themselves,” from Data Feminism
- Final project prep (FPP) #1 due: Formal project brainstorm
- In class: pandas II (notebook #1 and notebook #2); discussion of data and its limits
- Before class:
- Read: Dan Sinykin and Edwin Roland, "Against Conglomeration: Nonprofit Publishing and American Literature after 1980"
- In class: classification, part 1 (notebook)
- Before class:
- Read: Terra Blevins et al., “Automatically Processing Tweets from Gang-Involved Youth: Towards Detecting Loss and Aggression”
- FPP #2 due: Datasheet OR project proposal
- In class: classification, part 2 (notebook)
- Before class:
- Read: Ben Schmidt, "Genre, Manifolds, and AI"
- Read: Matthew Wilkens, "Genre, Computation, and the Varieties of 20th Century U.S. Fiction" (Canvas)
- In class: clustering (notebook and complete notebook)
- Before class:
- FPP #3 due: Datasheet OR project proposal
- In class: classification with BERT (Colab notebook)
- Before class:
- Read: Lucy Li and David Bamman, “Characterizing English Variation across Social Media Communities with BERT”
- Read: Ted Underwood, "How Predictable Is Fiction?"
- In class: Guest lecture, Lucy Li, UC Berkeley
- Before class:
- Read: Dong Nguyen et al., “How we do things with words: Analyzing text as social and cultural data”
- Optional: Richard Jean So, "All Models are Wrong"
- FPP #4 due: Final project first pass
- In class: more BERT (word similarity notebook and HuggingFace pipeline functions notebook); discussion of models and their limits
November 30 – Project Presentations (sign-up sheet)
December 2 – Project Presentations (sign-up sheet)
This syllabus draws on previous iterations of QTM 340 taught by Dan Sinykin and me. It also incorporates materials and resources developed by Melanie Walsh, Jinho Choi, Allison Parrish, David Mimno, David Bamman, Ryan Cordell, and Ben Schmidt, as well as suggestions and other input from Heather Froehlich, Ted Underwood, Jacob Eisenstein, Jim Casey, Taylor Arnold, Lauren Tilton, Lisa Rhody, Eileen Clancy, and the Colored Conventions Project Team.