CPSC 436H - Intro to NLP (2022)

Textbook

Selected Chapters of Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition 

by Daniel Jurafsky and James H. Martin (J&M). We will follow the draft chapters from the planned 3rd edition.

Lectures and Scheduling

 

Week Lectures Readings Notes/Supplemental Materials
Jan  10-14

Intro Course Overview (ppt, pdf)

Finite State Text Processing, Morphology, Pynini (ppt, pdf)

Chp. 2

FIRST CLASS on Jan 11

Background Reading on Finite State Automata: ONLY Sec. 2.2.1, 2.2.2, 2.2.3 (if you need it)

Quiz 1 (questions 1-6) background/quizzes: FSA, Regular Expressions (out 10 - due 17)

Jan  17-21

Finish FST + Text normalization, Spelling (ppt, pdf)

 

Language Models: Traditional vs Neural (ppt, pdf)

Appendix B

 

Chp. 3&7

Background reading on Probability and Information Theory (if you need it)

Quiz 1 (questions 7-12) background/quizzes: Conditional Prob., Bayes rule, ... (out 10 - due 17)

Hw1 Text Normalization, Pynini (W-FST) and Spelling Checker (out 18 - due 26)

background/quizzes: MLP?

Jan  24-28

Text Classification (Sentiment) (ppt, pdf)

 

Traditional vs. neural (MLP and CNN) (ppt, pdf)

Chp. 4&5

 

Hw2 Language Models Traditional vs. neural (out 27 - due Feb 3)

Jan 31 - Feb 4

Sequence labeling: Markov Models - POS tagging and NER (ppt, pdf)

Sequence labeling: RNNs, LSTMs (ppt, pdf)

Chp. 8&9

 

Hw3 text classification newsgroup BOW and fixed embeddings (out Feb 4 - due 13)

 

Feb    7-11

Encoder-Decoder, Attention (ppt, pdf)

Transformers (ppt, pdf)

Chp. 9&10

Hw4 + Seq modeling traditional & neural (only LSTM) (out 14 - due 23)

Feb 14-18

Finish Transformers (ppt, pdf)

Pre-trained language models, Transfer Learning with Contextual Embeddings (ppt, pdf)

Chp. 11 (draft now available)

BERT, BART, RoBERTa, ... Links: BERT; The Illustrated BERT, ELMo, and co.; Chen2019; BERTScore

 

Feb 21-25

Winter Session Term 2 mid-term break

Feb 28 - Mar 4

March 1st MIDTERM (Practice Questions)

 

Intro to syntax, Context Free Grammars and Parsing (ppt, pdf)

Chp. 12-13

 

 

 

Mar   7-11

Chunking (Shallow Constituency Parsing by Fine Tuning), Dependency Parsing, Treebanks (ppt, pdf)

(cont'd) Dependency/Constituency Parsing: PCFG, Traditional CKY / Neural for Both Const. and Dep. (ppt, pdf)

Chp. 13-14

Dependency/Constituency: Traditional CKY / Neural

Hw5 newsgroup classification (a) with RoBERTa document embeddings (b) with syntactic features (POS and dependency relations)

Mar 14-18

Intro Semantics (ppt, pdf)

 

& Lexical Semantics (ppt, pdf)

Chp. 16

SemEval (Semantic tasks)

Embeddings, WordNet, Concept Graphs, Lexicons for Sentiment

Hw6 Syntax & Lexical Semantics 

Mar 21-25

Finish Lexical Semantics (ppt, pdf)

BasicVectorRep + Topic Modeling (LDA) (ppt, pdf)

Chp. 18-19

Chp. 6

 

Hw7 LDA

Mar 28 - Apr 1

- Intro Discourse & Discourse Parsing and Neural topic segmentation/labeling (ppt, pdf)

Chp. 21&22

Coreference, Discourse Parsing (shift-reduce neural), e.g. for argumentation mining (mention the IBM Debater system)

Hw8 Discourse Parsing (?just to "explore" SOTA system?)

Apr 4 - Apr 8

- Intro Summarization (ppt, pdf)

- Apr 7 (read paper: PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization (ICML 2020), paper, blog-post) (ppt, pdf) QUESTIONS (part1)

Not in textbook: Extractive / Abstractive (introduce pre-training objectives tailored for a specific task)

Hw9  Summarization (?just to "explore" SOTA system?)

Wed Apr 20 7pm FINAL EXAM room IRC-4