CSCI-GA.2590 Natural Language Processing, Spring 2025
About
How can we empower machines to understand and generate human language, enabling them to summarize complex information, answer questions intelligently, or engage in meaningful conversation? This course dives into the principles and cutting-edge tools that make these capabilities possible. Students will explore three key paradigms in Natural Language Processing (NLP): supervised learning, the pretrain-then-finetune approach, and the latest advances in large language models. The course combines theoretical foundations with practical applications. Students are expected to engage with research papers, gain hands-on experience through coding assignments, and learn about the frontier from guest speakers.
Prerequisites
Students are expected to have a solid mathematics background and strong programming skills.
- Probability, statistics, linear algebra (DS-GA.1002, MATH-UA.140, MATH-UA.235)
- Algorithms and data structures (CSCI-UA.102)
- Basic knowledge of machine learning (DS-GA.1003, CSCI-UA.0473). We will not spend a significant amount of time on machine learning basics, so some prior exposure to the supervised learning framework (e.g., loss functions, SGD) is expected.
Logistics
- Lectures: Wed 4:55 PM - 6:55 PM, 31 Washington Pl (Silver Ctr) Room 408
- Join on Zoom using your NYU account
- Zoom recordings can be found on Brightspace
- Office hours: We will hold office hours each week: one with the instructor (lecture or general questions) and one with each TA (lecture or assignment questions). Details can be found on the Staff page. You are also encouraged to ask questions on Campuswire, which will be answered by the TAs or your classmates.
- Communication: We will use Campuswire as our main communication tool for announcements and answering questions related to the lectures, assignments, and projects. You can register here.
Grading
- Assignments (60%): There will be four assignments, each counting 15%.
- Quizzes (15%): There will be a quiz after each guest lecture.
- Final exam (25%): There will be an online final exam.
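The weights above can be combined into a single course score. The following is a minimal sketch, not an official grade calculator: it assumes every score is on a 0-100 scale and that all quizzes count equally within the 15% quiz share, neither of which is stated above. The function name `course_grade` is hypothetical.

```python
ASSIGNMENT_WEIGHT = 0.15  # each of the four assignments counts 15%
QUIZ_WEIGHT = 0.15        # all quizzes together count 15%
FINAL_EXAM_WEIGHT = 0.25  # the final exam counts 25%


def course_grade(assignment_scores, quiz_scores, final_exam_score):
    """Combine component scores (assumed to be on a 0-100 scale) into a course score."""
    if len(assignment_scores) != 4:
        raise ValueError("expected exactly four assignment scores")
    if not quiz_scores:
        raise ValueError("expected at least one quiz score")
    # Assumption: quizzes are averaged with equal weight.
    quiz_average = sum(quiz_scores) / len(quiz_scores)
    return (
        ASSIGNMENT_WEIGHT * sum(assignment_scores)
        + QUIZ_WEIGHT * quiz_average
        + FINAL_EXAM_WEIGHT * final_exam_score
    )
```

For example, `course_grade([80, 90, 70, 100], [100, 50], 60)` adds 51 points from the assignments, 11.25 from the quizzes, and 15 from the final exam.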
Coursework
Assignments
The assignments will contain both written problems and programming problems (in Python).
- Late policy: You have 8 late days in total that can be distributed among the assignments. You may use a maximum of 7 late days for each assignment, i.e., we will not accept submissions more than a week after the deadline. Once you have used all 8 late days, each additional late day will incur a 5% penalty on the assignment, e.g., if you are two days late (beyond the 8 free late days) and get 80 points on the assignment, your final points will be 80-80*(5% + 5%)=72.
- Collaboration policy: You may discuss problems with your classmates. However, you must write up the homework solutions and the code from scratch, without referring to notes from your joint session. In the submission, you must write down the name of every person with whom you discussed the problem; this will not affect your grade.
- Submission: Assignments are submitted through Gradescope. At the beginning of the semester, you will be added to the Gradescope roster through Brightspace. Please do not register on Gradescope separately or change your email; this will cause the rosters to be out of sync.
- Grading: We aim to release grades within two weeks of the submission date. Once the grades are released, you will have one week to submit any regrading requests.
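The late-day arithmetic in the late policy above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the official grading script: it assumes free late days are consumed before any penalty applies, and that the 7-day limit applies to the total number of days an assignment is late. The function name `penalized_score` and the constant names are hypothetical.

```python
FREE_LATE_DAYS = 8           # total free late days across all assignments
MAX_DAYS_PER_ASSIGNMENT = 7  # later submissions are not accepted
PENALTY_PER_DAY = 0.05       # 5% of the assignment score per penalized day


def penalized_score(raw_score, days_late, free_days_left):
    """Return (final_score, free_days_left_after) for one assignment."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > MAX_DAYS_PER_ASSIGNMENT:
        raise ValueError("submissions more than a week late are not accepted")
    # Assumption: remaining free late days are used before any penalty applies.
    free_used = min(days_late, free_days_left)
    penalized_days = days_late - free_used
    # The penalty is additive: 5% of the raw score for each penalized day.
    final_score = raw_score - raw_score * PENALTY_PER_DAY * penalized_days
    return final_score, free_days_left - free_used
```

Calling `penalized_score(80, 2, 0)` reproduces the example in the policy: two penalized days on an 80-point submission give 80-80*(5% + 5%)=72.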
Academic integrity
Work you submit should be your own. Please consult the CAS academic integrity policy for more information: http://cas.nyu.edu/page/academicintegrity. Penalties for violations of academic integrity may include failure of the course, suspension from the University, or even expulsion.
AI policy
For the assignments and projects, you are free to use any AI-powered tools, including ChatGPT and Copilot. However, you must declare how you used these tools in the submission. Using the tools will not affect your grades.