Natural Language Processing

CSCI-GA 2590, New York University, Fall 2020

Logistics

  • Format: online lectures with (optional) in-person office hours

    • Please note that the entire course can be taken fully online.

  • Lectures: Tue 7:10pm-9pm EST (on Zoom)

  • Instructor: He He

  • TAs:

    • Shivesh Ganju

    • Gaomin Wu

    • Gauri Dhawan

  • Office hours: on Zoom

    • He He: Wed 4-5pm EST

    • Shivesh Ganju: Fri 7-8pm EST

    • Gaomin Wu: Tue 10-11am EST

  • Calendar: subscribe to get up-to-date times on lectures/office hours/due dates/etc.

  • Zoom instruction:

    • To attend lectures, log in to NYUClasses and click the Zoom tab on the left.

    • If you are not muted upon entering the room, please mute yourself.

    • Since it’s a large class, please send questions to the chat room; the instructor will check and answer them during the class.

    • Recordings will appear on Zoom a few days after the lecture (depending on how fast Zoom processes them). To watch the videos, go to NYUClasses –> Zoom –> Cloud Recordings.

Accessibility

We try to make all of the course material accessible. If you need additional accommodations, please send us an email. Please let us know in advance of any accommodations needed for assignments (in PDF) and the midterm (online).

Course information

How can we teach machines to understand language so that they can answer our queries, extract information from textual data, or even have a conversation with us? The primary goal of this course is to provide students with the principles and tools needed to solve a variety of NLP problems. We will focus on data-driven methods, including classification, sequence labeling, structured prediction, unsupervised learning, and deep learning. Specific applications include text classification, constituent parsing, semantic parsing, and generation.

Prerequisites

Students are expected to have a solid mathematical background and programming skills.

  • Probability, statistics, linear algebra (DS-GA.1002, MATH-UA.140, MATH-UA.235)

  • Algorithms and data structures (CSCI-UA.102)

  • Basic knowledge of machine learning (DS-GA.1003, CSCI-UA.0473) will be helpful

Resources

Textbook: There is no required textbook. Course notes/slides should be sufficient. Some lectures will be based on the following books (available freely online):

In the lecture notes, we will use JM, E, G, D2L to refer to the above books respectively.

Background: Here are some useful materials if you want to review the background knowledge.

  • Probability and optimization in the appendix of Eisenstein’s book.

  • Notes from DS-GA.1002.

  • Machine learning material from DS-GA.1003.