Natural Language Processing

CSCI-GA 2590, New York University, Fall 2021

Logistics

Time and location

Lectures: Wed 5:10pm-7pm EST (Rm 109, WWH)
Instructor: He He, hhe@nyu.edu
TAs:
- Udit Arora (lead TA), ua388@nyu.edu
- Hyejin Kim, hk3342@nyu.edu
- Abed Qaddoumi, amq259@nyu.edu
- Wenqian Ye, wy2029@nyu.edu
Office hours:
- He He: Thur 3-4pm EST (Zoom)
- Udit Arora: Tue 3-4pm EST (Zoom)
- Hyejin Kim: Wed 3-4pm EST (Zoom)
- Abed Qaddoumi: Mon 3-4pm EST (Zoom)
- Wenqian Ye: Fri 3-4pm EST (Zoom)
Calendar: subscribe to get up-to-date times for lectures, office hours, due dates, etc.

Accessibility

We try our best to make all course material accessible. If you need additional accommodations, please send us an email. Please let us know in advance about any accommodations you need for assignments and the midterm. If you need additional time for the midterm, contact the Moses Center and send us an accommodation letter.

Communication

We will use Campuswire as our main communication tool for announcements and answering questions related to the lectures, assignments, and projects. The registration link is available on Brightspace.

Course information

How can we teach machines to understand language so that they can answer our queries, extract information from textual data, or even have a conversation with us? The primary goal of this course is to provide students with the principles and tools needed to solve a variety of NLP problems. We will focus on data-driven methods, including classification, sequence labeling, structured prediction, unsupervised learning, and deep learning. Specific applications include text classification, constituent parsing, semantic parsing, and generation.

Prerequisites

Students are expected to have a solid mathematics background and programming skills.

- Probability, statistics, and linear algebra (DS-GA.1002, MATH-UA.140, MATH-UA.235)
- Algorithms and data structures (CSCI-UA.102)
- Basic knowledge of machine learning (DS-GA.1003, CSCI-UA.0473) will be helpful

Resources

Textbook: There is no required textbook. Course notes and slides should be sufficient. Some lectures will be based on the following books (freely available online):

- Dan Jurafsky and James H. Martin. Speech and Language Processing. A classic textbook covering both traditional and modern approaches to NLP.
- Jacob Eisenstein. Introduction to Natural Language Processing. A comprehensive reference with additional coverage on relevant topics in linguistics and slightly more advanced topics in machine learning.
- Yoav Goldberg. Neural Network Methods for Natural Language Processing. Covers neural network models for NLP.
- Aston Zhang, Zack C. Lipton, Mu Li, and Alex J. Smola. Dive into Deep Learning. Covers many topics in neural networks and features numerous hands-on examples. We will use some examples from this book.

In the lecture notes, we will use JM, E, G, and D2L to refer to the above books, respectively.

Background: Here are some useful materials if you want to review the background knowledge.

- Probability and optimization in the appendix of Eisenstein’s book.
- Notes from DS-GA.1002 (Probability and Statistics for Data Science).
- Introductory machine learning material from DS-GA.1003.