DS-GA.1011 Natural Language Processing with Representation Learning, Fall 2024
About
How can we empower machines to understand and generate human language, enabling them to summarize complex information, answer questions intelligently, or engage in meaningful conversation? This course dives into the principles and cutting-edge tools that make these capabilities possible. Students will explore three key paradigms in Natural Language Processing (NLP): supervised learning, the pretrain-then-finetune approach, and the latest advances in large language models, with a particular emphasis on representation learning techniques. The course combines theoretical foundations with practical applications. Students are expected to engage with research papers and gain hands-on experience through coding assignments and course projects.
Prerequisites
Students are expected to have a solid mathematics background and strong programming skills.
- Probability, statistics, linear algebra (DS-GA.1002, MATH-UA.140, MATH-UA.235)
- Algorithms and data structures (CSCI-UA.102)
- Basic knowledge of machine learning (DS-GA.1003, CSCI-UA.0473). We will not spend a significant amount of time on machine learning basics, so some prior exposure to the supervised learning framework (e.g., loss functions, SGD) is expected.
Logistics
- Lectures: Wed 4:55 PM - 6:35 PM, 19 West 4th St Room 101
- Join on Zoom using the NYU account
- Zoom recordings can be found on Brightspace
- Sections: Thu 4:55 PM - 5:45 PM, 6 Washington Pl (Meyer Hall) Room 121
- Join on Zoom using the NYU account
- Zoom recordings can be found on Brightspace
- Office hours: We will have five office hours each week: one with the instructor (lecture or general questions) and one with each of the four section leaders (lecture or assignment questions). Details can be found on the Staff page. You are also encouraged to ask questions on Campuswire, which will be answered by the TAs or your classmates.
- Communication: We will use Campuswire as our main communication tool for announcements and answering questions related to the lectures, assignments, and projects. The registration link is available on Brightspace.
Grading
- Assignments (60%): There will be four assignments, each counting 15%.
- Project (40%): You are required to complete a (group) project applying techniques learned in this course. All group members will receive the same grade.
- Proposal (5%)
- Midterm report and peer review (10%)
- Presentation (5%)
- Report (20%)
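As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes every component is scored on a 0-100 scale and that the final grade is a plain weighted average; the dictionary keys and the `course_grade` function are hypothetical names chosen for this example, not part of any course tooling.

```python
# Weights (in percent) taken from the grading breakdown above.
# The key names are hypothetical labels for this sketch.
WEIGHTS = {
    "assignment_1": 15,
    "assignment_2": 15,
    "assignment_3": 15,
    "assignment_4": 15,
    "proposal": 5,
    "midterm_report_and_peer_review": 10,
    "presentation": 5,
    "report": 20,
}

# Sanity check: the components should account for the whole grade.
assert sum(WEIGHTS.values()) == 100


def course_grade(scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Example: 90 on everything except the final report, which scores 80.
example_scores = {name: 90 for name in WEIGHTS}
example_scores["report"] = 80
print(course_grade(example_scores))  # 88.0
```

Because the final report carries 20% of the grade, a 10-point drop on it moves the overall grade by 2 points in this sketch.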
Coursework
Assignments
The assignments will contain both written problems and programming problems (in Python).
- Late policy: All assignments are due at noon (12:00 PM, New York time) on the due date, which will be announced with the assignment. You have 8 late days in total that can be distributed among the assignments. You may use a maximum of 7 late days for each assignment, i.e., we will not accept submissions more than a week after the deadline. Once you have used all 8 late days, each additional late day will incur a 5% penalty on the assignment. For example, if you are two days late (beyond the 8 allotted late days) and get 80 points on the assignment, your final points will be 80 - 80 * (5% + 5%) = 72.
- Collaboration policy: You may discuss problems with your classmates. However, you must write up the homework solutions and the code from scratch, without referring to notes from your joint session. In the submission, you must write down the name of every person with whom you discussed the problem; this will not affect your grade.
- Submission: Assignments are submitted through Gradescope. At the beginning of the semester, you will be added to the Gradescope roster through Brightspace. Please do not register on Gradescope separately or change your email; doing so will cause the rosters to be out of sync.
- Grading: We aim to release grades within two weeks of the submission date. Once the grades are released, you will have one week to submit any regrading requests.
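The late-day arithmetic in the late policy above can be sketched in Python as follows. This is one reading of the policy, not official grading code: it assumes late days are counted in whole days, that a submission more than 7 days late is rejected (represented here as `None`), and that the 5% per-day penalty applies to the earned points, as in the 80-point example. The function name and its parameters are hypothetical.

```python
MAX_LATE_DAYS_PER_ASSIGNMENT = 7  # submissions later than this are not accepted
PENALTY_PERCENT_PER_DAY = 5       # applied once the late-day budget is used up


def apply_late_policy(points, days_late, late_days_left):
    """Return (final_points, late_days_left) for one assignment.

    final_points is None when the submission is too late to be accepted
    (an assumption about how a rejected submission is represented).
    """
    if days_late > MAX_LATE_DAYS_PER_ASSIGNMENT:
        return None, late_days_left

    # Use the remaining free late days first; the rest are penalized.
    free_days = min(days_late, late_days_left)
    penalized_days = days_late - free_days

    final_points = points * (100 - PENALTY_PERCENT_PER_DAY * penalized_days) / 100
    return final_points, late_days_left - free_days


# The example from the policy: no late days left, two days late, 80 points.
print(apply_late_policy(80, 2, 0))  # (72.0, 0)

# Three days late with the full budget of 8 late days: no penalty, 5 days left.
print(apply_late_policy(80, 3, 8))  # (80.0, 5)
```

The sketch spends free late days before applying any penalty, which matches the wording "once you have used all 8 late days"; other orderings are not considered here.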
Project
The project is an important component of this course. It allows you to apply what you have learned to a real problem. You are asked to complete the project in a group of 1 to 5 students. We strongly recommend working in a team. Larger groups are expected to take on larger-scale, more ambitious projects in which each team member makes a significant contribution.
- Topics: You can choose any topic related to NLP. Take a look at the ACL proceedings for inspiration. Here are some general directions:
- A new algorithm or model for an important problem, e.g., identifying factual errors in LLM outputs
- An application of an existing technology, e.g., using NLP models for healthcare problems
- Analysis of a dataset, a model, or an approach, e.g., failure modes of chain-of-thought prompting
- Replication of a published result (see the Reproducibility Challenge)
- Deliverables:
- Proposal: A two-page document that describes the team and the project plan
- Midterm report: A two-page progress report
- Peer review: An in-class written review of a peer midterm report
- Presentation: A short presentation (3 minutes) followed by Q&A (2 minutes) in the last week.
- Report: A four-page final report using a provided template
- We will provide templates and guidelines for written deliverables.
- Note that all group deliverables are to be submitted as a group on Gradescope (i.e., one submission linked to all group members).
Academic integrity
Work you submit should be your own. Please consult the CAS academic integrity policy for more information: http://cas.nyu.edu/page/academicintegrity. Penalties for violations of academic integrity may include failure of the course, suspension from the University, or even expulsion.
AI policy
For the assignments and projects, you are free to use any AI-powered tools, including ChatGPT and Copilot. However, you must declare how you used these tools in the submission. Using the tools will not affect your grades.