Calendar

Previous offerings: Fall 2024, Fall 2023, Spring 2023
The schedule below is tentative and subject to change.
All course materials can be found on GitHub.
We do not have a reference textbook, but some lectures follow materials from Speech and Language Processing (JM below) and Dive into Deep Learning (D2L below).

Course Schedule

Jan 22

Text classification: HW 1 out, HW1 [pdf]

Supervised learning basics
Feature-based text classification
Additional readings
- Textbook: JM Ch4.1, JM Ch5

Jan 29

Word embedding [online]: [annotated slides]

Distributed representation of words
Neural network basics
Additional readings
- Textbook: JM Ch6
- Original word2vec paper: Efficient estimation of word representations in vector space

Feb 5

Sequence modeling: HW 1 due, HW 2 out, HW2 [pdf] [annotated slides]

RNN and its variants
Attention and Transformers
Additional readings
- Textbook: D2L Ch9.4-9.7
- Original Transformer paper: Attention is all you need

Feb 12

Sequence generation: [annotated slides]

Encoder-decoder models
Decoding algorithms
Additional readings
- Original attention paper: Neural Machine Translation by Jointly Learning to Align and Translate

Feb 19

Pretraining and finetuning (basics)

Tokenization
Architecture, objective, optimization

Feb 26

Guest lecture "Efficient pretraining and finetuning" by Haitian Jiang: HW 2 due, HW 3 out, HW3 [pdf] [Hugging Face Transformers]

Flash attention
Architecture: mixture-of-experts, multi-head latent attention
Mixed precision training
Additional readings
- DeepSeek-V3 technical report

Mar 5

Guest lecture "Scaling language models" by Nick Lourie

Language model basics
Emergent capabilities
Scaling laws

Mar 12

Post-training of language models (basics)

Instruction tuning
Reinforcement learning basics

Mar 19

Post-training of language models (advanced): HW 3 due, HW 4 out, HW4 [pdf]

Advanced RLHF techniques
Alignment

Mar 26

Spring Break - No Lecture

Apr 2

Benchmarking and evaluation

Apr 9

Guest lecture "Retrieval-based LM at scale" by Sewon Min (UC Berkeley): HW 4 due

Apr 16

Guest lecture "Pretraining Data" by Hector Liu (MBZUAI)

Apr 23

Guest lecture "LM agent" by Yu Su (OSU)

Apr 30

Guest lecture "Qwen models" by Junyang Lin (Alibaba)