Calendar
Course Schedule
- Jan 22
- Text classification
- HW 1 out: HW1 [pdf]
- Supervised learning basics
- Feature-based text classification
- Additional readings
- Jan 29
- Word embedding [online]
- [annotated slides]
- Distributed representation of words
- Neural network basics
- Additional readings
- Feb 5
- Sequence modeling
- HW 1 due, HW 2 out: HW2 [pdf] [annotated slides]
- RNN and its variants
- Attention and Transformers
- Additional readings
- Feb 12
- Sequence generation
- [annotated slides]
- Encoder-decoder models
- Decoding algorithms
- Additional readings
- Feb 19
- Pretraining and finetuning (basics)
- Tokenization
- Architecture, objective, optimization
- Feb 26
- Guest lecture: Efficient pretraining and finetuning, by Haitian Jiang
- HW 2 due, HW 3 out
- Flash attention
- Architecture: mixture-of-experts, multi-head latent attention
- Mixed precision training
- Additional readings
- Mar 5
- Guest lecture: Scaling language models, by Nick Lourie
- Emergent capabilities
- Scaling laws
- Mar 12
- Post-training of language models (basics)
- Instruction tuning
- RLHF basics
- Mar 19
- Post-training of language models (advanced)
- HW 3 due, HW 4 out
- Advanced RLHF techniques
- Alignment
- Mar 26
- Spring Break - No Lecture
- Apr 2
- Benchmarking and evaluation
- Apr 9
- Guest lecture: Retrieval-augmented LM, by Sewon Min (UC Berkeley)
- HW 4 due
- Apr 16
- Guest lecture: Pretraining data, by Hector Liu (MBZUAI)
- Apr 23
- Guest lecture: LM agent, by Yu Su (OSU)
- Apr 30
- Guest lecture: Qwen models, by Junyang Lin (Alibaba)