Calendar

Previous offerings: Fall 2023, Spring 2023
The schedule below is tentative and subject to change.
All course materials can be found on GitHub.
We do not have a reference textbook, but some lectures follow materials from Speech and Language Processing (JM below) and Dive into Deep Learning (D2L below).

Supervised learning

Sep 4

Text classification [recording]: HW 1 out HW1 [pdf]

Course overview
Supervised learning basics
Feature-based text classification
Additional readings
- Textbook: JM Ch4.1, JM Ch5

Sep 5

Section Python/NumPy review [notebook]; BoW example [notebook]

Sep 11

Word embedding [recording]

Distributed representation of words
Learning word vectors
Additional readings
- Textbook: JM Ch6
- Original word2vec paper: Efficient Estimation of Word Representations in Vector Space

Sep 12

Section Word vector algebra [slides] [notebook]

Sep 18

Sequence modeling [recording]

Neural network basics
RNN and its variants
Attention and Transformers
Additional readings
- Textbook: D2L Ch9.4-9.7, D2L Ch10.1, D2L Ch11.1-11.7
- Original Transformer paper: Attention Is All You Need
- Coding: The Annotated Transformer

Sep 19

Section HPC and PyTorch tutorial [notebook]

Sep 20

HW 1 due

HW 2 out HW2 [pdf]

Sep 25

Sequence generation [recording]

Encoder-decoder models
Decoding algorithms
Additional readings
- Original attention paper: Neural Machine Translation by Jointly Learning to Align and Translate
- Original top-p sampling paper: The Curious Case of Neural Text Degeneration

Sep 26

Section Machine translation [slides]

Oct 2

Tasks and applications [recording]

Formulation of NLP tasks
Final project tips
Proposal Template

Oct 3

Section Data processing, Hugging Face Datasets, Datasheet [slides] [notebook]

Representation learning

Oct 9

Pretraining and finetuning (basics) [recording]: HW 2 due; HW 3 out HW3 [zip] [pdf]

Self-supervised learning
Encoder-only, decoder-only, encoder-decoder models

Oct 10

Section Hugging Face Transformers: Sec06 [notebook] Sec06 [slides]

Oct 16

Pretraining and finetuning (advanced) [recording]: Proposal due

Sub-word tokenization
Efficient pretraining
Parameter-efficient finetuning

Oct 17

Section Mixed-precision training, efficient inference

NLP via language modeling

Oct 23

Scaling language models [recording]

History of language models
Scaling law
Emergent capabilities

Oct 24

Section Scaling law review [slides]

Oct 30

Aligning language models (basics) [recording]: HW 3 due; HW 4 out HW4 [zip] [pdf]

Instruction tuning
Reinforcement learning

Oct 31

Section Prompt engineering [slides]

Nov 6

Guest lecture: Misalignment and Scalable Oversight by Ruiqi Zhong: Midterm report due

Nov 7

Section Project midterm peer review

Nov 13

Aligning language models (advanced) [recording]: HW 4 due

Reinforcement learning from human feedback
Direct preference optimization
Reward hacking

Nov 14

Section RLHF review

Nov 20

Benchmarking and evaluation

Building NLP datasets
Holistic evaluation
Challenges in evaluating LLMs

Nov 21

Section Evaluation tools, modern NLP datasets and benchmarks

Nov 27

Guest lecture: Can Language Models Reason? by Abulhair Saparov

Nov 28

Thanksgiving break (no section)

Dec 4

Presentation of final projects

Final Report Template

Dec 5

Section Presentation of final projects

Dec 11

Legislative Friday (no lecture)

Dec 12

Section Presentation buffer; last-minute project help