Natural Language Processing
===========================

**CSCI-GA 2590, New York University, Fall 2020**

Logistics
---------

- Format: online lectures with (optional) in-person office hours

  - Please note that the entire course can be taken *fully online*.

- Lectures: Tue 7:10pm-9pm EST (on Zoom)
- Instructor: `He He `__
- TAs:

  - Shivesh Ganju
  - Gaomin Wu
  - Gauri Dhawan

- Office hours: on Zoom

  - He He: Wed 4-5pm EST
  - Shivesh Ganju: Fri 7-8pm EST
  - Gaomin Wu: Tue 10-11am EST

- Calendar: `subscribe `__ to get up-to-date times for lectures, office hours, due dates, etc.
- Zoom instructions:

  - To attend lectures, log in to `NYUClasses `__ and click the Zoom tab on the left.
  - If you are not muted upon entering the room, please mute yourself.
  - Since it’s a large class, please send questions to the chat room; the instructor will check and answer them during the class.
  - Recordings will appear on Zoom a few days after the lecture (depending on how fast Zoom processes them). To watch the videos, go to NYUClasses -> Zoom -> Cloud Recordings.

Accessibility
-------------

We try to make all of the course material accessible. If you need additional accommodations, please send us an email. Please let us know in advance of any accommodations needed for assignments (in PDF) and the midterm (online).

Course information
------------------

How can we teach machines to understand language so that they can answer our queries, extract information from textual data, or even have a conversation with us? The primary goal of this course is to provide students with the principles and tools needed to solve a variety of NLP problems. We will focus on data-driven methods, including classification, sequence labeling, structured prediction, unsupervised learning, and deep learning. Specific applications include text classification, constituent parsing, semantic parsing, and generation.

Prerequisites
~~~~~~~~~~~~~

Students are expected to have a solid mathematical background and programming skills.

- Probability, statistics, linear algebra (DS-GA.1002, MATH-UA.140, MATH-UA.235)
- Algorithms and data structures (CSCI-UA.102)
- Basic knowledge of machine learning (DS-GA.1003, CSCI-UA.0473) will be helpful

Resources
~~~~~~~~~

**Textbook:** There is no required textbook. Course notes/slides should be sufficient. Some lectures will be based on the following books (freely available online):

- `Dan Jurafsky and James H. Martin. Speech and Language Processing. `__
  A classic textbook covering both traditional and modern approaches to NLP.
- `Jacob Eisenstein. Introduction to Natural Language Processing. `__
  A comprehensive reference with additional coverage of relevant topics in linguistics and slightly more advanced topics in machine learning.
- `Yoav Goldberg. Neural Network Methods for Natural Language Processing. `__
  Covers neural network models for NLP.
- `Aston Zhang, Zack C. Lipton, Mu Li, and Alex J. Smola. Dive into Deep Learning. `__
  Covers many topics in neural networks and features numerous hands-on examples. We will use some examples from this book.

In the lecture notes, we will use JM, E, G, and D2L to refer to the above books, respectively.

**Background**: Here are some useful materials if you want to review the background knowledge.

- Probability and optimization in the appendix of Eisenstein’s book.
- `Notes `__ from DS-GA.1002.
- Machine learning material from `DS-GA.1003 `__.

.. toctree::
   :maxdepth: 2
   :hidden:

   schedule
   coursework
   notes/index