CS 333: NLP
Schedule (Tentative)
Week
Topic
Assignment
Readings
Week 1: 9/2 & 9/5
What is NLP?
Processing Text
Poem dataset
Class poetry bot
Homework 0 (due 9/4)
Starter Code
Homework 1 (due 9/11)
Starter Code
Bisk et al. (2021)
Reference:
Think Bayes Chapter 2
Week 2: 9/9 & 9/12
Information in Language
Tokenization
Gutenberg dataset
Word Frequency Script
Homework 2 (due 9/18)
Starter Code
J&M 2 (Quiz 1)
Reference:
In Defense of Tokenizers
Week 3: 9/16 & 9/19
Language Models
LM Evaluation and Smoothing
N-gram Language Model Script
Homework 3 (due 9/25)
Starter Code
J&M 3 (Quiz 2)
Week 4: 9/23 & 9/26
Naive Bayes Classifiers
Evaluating Classifiers
Homework 4 (due 10/6)
Starter Code
Big GoodReads Dataset
J&M Appendix B (Quiz 3)
Reference:
Naive Bayes recap
Week 5: 9/30 & 10/3
Vector Semantics
TF-IDF and Word2Vec
Word Embedding Demo
J&M 5 (Quiz 4)
Week 6: 10/7 & 10/10
Regression
Language ID dataset
Language ID script
Midterm 1 (10/10)
Formula Sheet
Reference Documentation
J&M 4 (No quiz)
Reference:
Regression in more depth
Week 7: 10/17
Gradient Descent
Language ID dataset
Regression code
Homework 5 (due 10/23)
Starter Code
Week 8: 10/21 & 10/24
Neural networks
Feedforward networks
Video of Gradient Computation
Language ID dataset (test/train splits)
Feedforward code
Homework 6 (due 11/1)
Starter Code
J&M 6 (Quiz 6)
Week 9: 10/31
RNNs
J&M 7 (Quiz 7 on Friday)
Reference:
Neural MT blog post
Week 10: 11/4 & 11/7
Attention
Midterm 2 (11/7)
Topic List
J&M 8 (No Quiz)
Week 11: 11/11 & 11/14
Transformers
Homework 7 (due 11/17)
Starter Code
J&M 10 (Quiz 9)
Reference:
GPT-2 blog post
Reference:
BERT paper
Week 12: 11/18 & 11/21
Transfer Learning and Alignment
Prompt Engineering and Alignment
Homework 8 (due 11/24)
Starter Code
J&M 9 (Quiz 10)
Reference:
Overview of transfer learning
Week 13: 11/25
Whiteboard practice
Final Project
Week 14: 12/2 & 12/5
Interpretability and Future of NLP
Guest lecture: Jin Zhao
Blasi et al. (2021)
Week 15: 12/9
Project presentations