CPSC 503 - Winter 2020 - Computational Linguistics

Readings, Syllabus, Assignments, Software & Data


Readings
Required
References


Syllabus, Assignments, Software & Data

1
Jan 8 Wed  Intro and Course Overview

We will communicate through Canvas: to log in use your CWL

J&M
Chp. 1  

Intro

- ACL
- NLP demos/videos
- Ambiguity

NLP toolkits: NLTK (Python), Stanford CoreNLP (Java)

ProbInfoTheory Handout

2
Jan 13 Mon English Morphology and Finite State Machines: FSA and FST
J&M
Chp. 2&3 (2nd Edition)
missing pages
a b c
Assignment1 on Canvas (due Jan 22)
Dementia Material: instructions, data, lib, run.py


Applications of FSTs in NLP Lauri Karttunen, CIAA, 2000.


  Jan 15 Wed CANCELED - SNOW    
3
Jan 20 Mon Finish FST + Stemming + Spelling
J&M
Chp. 3&4
4
Jan 22 Wed  Minimum Edit Distance + Probabilistic Models: N-grams - N-gram Evaluation


J&M
Chp. 4

Google ngrams model

Google books Ngrams viewer

An Empirical Study of Smoothing Techniques for Language Modeling. S.F. Chen, J. Goodman - TR CS Harvard Univ - 1998

5
Jan 27 Mon Intro - Neural Networks and Neural Language Models - Start Markov Models


J&M 3Ed
Chp. 7-8



Neural Network Demos

 

 

6
Jan 29 Wed

Markov Sequence Labelling Models - Part-of-speech Tagging

 


J&M 3Ed
Some of Appendix A
Chp. 8

- state of the art POS tagging

why tagging can be challenging for humans: Penn tagging scheme

Part-of-Speech Tagging from 97% to 100% C. Manning 2011

Assignment2 on Canvas (due Feb 12)
Corpora: wsj-p.txt  wsj-ps.txt  atis3.pos.tags.txt cmpt-hw2-3.txt

7
Feb 3 Mon  Neural Sequence Processing with Recurrent Neural Networks (RNN) (Attention and Transformers in additional lectures / readings)
J&M 3Ed
Chp. 9
see also Goldberg Chps 14-15-16



 8
Feb 5 Wed  Start English Syntax and Context-free Grammars - Parsing Algorithms
J&M 3Ed
Chp. 10-11


 Interactive tutorials on English grammar (not working in 2020?)
English Dept. University of Calgary.

Another resource on grammar from UCL

- NLTK (demos) - look at *Getting Started*
 - Some public parsers (including Stanford and MINIPAR visualization tools)

 9
Feb 10 Mon
Chunking / Dependency Grammars and Transition-based Dep. Parsing/ Treebanks -
 
J&M 3Ed
Chp. 13
- Stanford Parser - Popular Stat Parser

- MaltParser - State of the Art Dependency Parser

- Penn Treebank - Universal Dependency Treebanks

 10 Feb 12 Wed
Probabilistic CFGs - PCFG Parsing + Lexicalized PCFGs - Neural Constituency and Dependency Parsing
J&M 3Ed
Chp. 12
- Berkeley Parser with demo!

Feb 17 - 21 mid-term Break    
11 Feb 24 Mon Representing Meaning and Semantic Analysis
J&M Chp.  book on Computational Semantics

TimeML

Semantic Parser (Cornell - Yoav Artzi)

12
 Feb 26 Wed Lexical Semantics J&M Chp. - WordNet and YAGO (Wikipedia + WordNet + GeoNames). See also Probase, Freebase and BabelNet

- (Domain specific thesaurus) Medical Subject Headings (MeSH)
- FrameNet
- PropBank (adding semantic annotations to the Penn Treebank)

Assignment3 on Canvas (due March 6)  needed files

  Mar 2 Mon Canceled    
13
 Mar 4  Wed Computational Lexical Semantics (focus on Vector Semantics) J&M Chp. 6 - word2vec 

- A systematic comparison of context-counting vs. context-predicting semantic vectors (predicting is clearly better)

- generalization of skip-grams to sentences (skip-thought vectors) 2015

- SENSEVAL (Evaluation for WSD)

- WSD online public systems

- WSD with Deep Belief Networks
- Dependency-based word similarity demo
- TREC (Text REtrieval Conference)
- Semantic Labeling (ASSERT)

- Illinois Semantic Role Labeler

 

14
Mar 9 Mon CNNs, Semantic Role Labeling, Brief Intro to Pragmatics:
- Applied CL Discourse Research Lab
- DAMSL
- RST annotation tool
15 Mar 11 Wed  Encoder-Decoder, Attention and Transformers
Conditioned Neural Generation (Encoder-Decoder framework) pp. 195-211 - Y. Goldberg book 2017 - Chp. 17
Assignment4 on Canvas (due March 23)
- Transformers package
16 Mar 16 Mon Project Proposal Presentations -

see project work plan

    READINGS (what to do?)   avg. year 2015
17 Mar 18 Wed

Generic Topic Modeling (background reading Comm. ACM) and Topic Modeling in Asynchronous Conversations

  • Steyvers, M. & Griffiths, T. Probabilistic topic models. In T. Landauer, D. McNamara, S. Dennis, and W. Kintsch (eds), Latent Semantic Analysis: A Road to Meaning. Lawrence Erlbaum 2006 [pdf] or [pdf] [Kenny Chiu]

  • S. Joty, G. Carenini and R. T. Ng (2013) Topic Segmentation and Labeling in Asynchronous Conversations. JAIR, Volume 47, pages 521-573 (only intro, conclusions and the sections on topic segmentation, not labeling) [pdf] [Ramya Rao Basava]

18 Mar 23 Mon

Visual Text Analytics and Interactive Topic Modeling

 Intelligent User Interfaces (IUI), 2016 ] VIDEO  [Joseph Wonsil]  

19  Mar 25 Wed

Distributed Representations for Sentence + Summarization (1)

  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, 2018 [pdf] [Ganesh Jawahar]

  • (background reading:  Summarization Evaluation notes)
    Lidong Bing, Piji Li, Yi Liao, Wai Lam, Weiwei Guo, Rebecca J. Passonneau, Abstractive Multi-Document Summarization via Phrase Selection and Merging. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL 2015) [pdf] [Vaastav Anand]

20
Mar 30 Mon

 Summarization (2)

  • Jianpeng Cheng, Mirella Lapata, Neural Summarization by Extracting Sentences and Words. ACL-2016 [pdf] [Mariko Tatsumi]

  • Wen Xiao, Giuseppe Carenini: Extractive Summarization of Long Documents by Combining Global and Local Context. EMNLP/IJCNLP (1) 2019: 3009-3019 [pdf] [Shunsuke Ishige]

  

21
Apr 1 Wed Sentiment + Graph Based WSD

pre-reading for paper1: Chapter 18 of Y. Goldberg (only 5 pages)
  • Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher Manning, Andrew Ng and Christopher Potts, Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank Conference on Empirical Methods in Natural Language Processing (EMNLP 2013) [pdf] [Joel Penner]

  • Navigli, R. and Lapata, M. (2010) An experimental study of graph connectivity for unsupervised word sense disambiguation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(4), 678–692. [Myunghwan Lee]


22
Apr 6 Mon

Neural: Text Classification

- Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., Hovy, E.: Hierarchical attention networks for document classification. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. pp. 1480-1489. NAACL (2016) [pdf] [Tanzila Rahman]

- Weirui Kong, Hyeju Jang, Giuseppe Carenini, Thalia Shoshana Field: A Neural Model for Predicting Dementia from Language. MLHC (2019): 270-286 [pdf] [Sang-Wha Sien]


23
Apr 8 Wed  Natural Language Generation (data2text) + Discourse Parsing
  • F. Portet, E. Reiter, A. Gatt, J. Hunter, S. Sripada, Y. Freer, C. Sykes. Automatic Generation of Textual Summaries from Neonatal Intensive Care Data. Artificial Intelligence 173:789-816. 2009 [pdf] [Francis Nguyen]
  • pre-reading (not mandatory) for paper2: Chapter 20 (sec 20.1 and 20.2) of Y. Goldberg (only 9 pages, lots of figures ;-)
  • Yang Liu and Sujian Li. Implicit discourse relation classification via multi-task neural networks. In Proceedings of AAAI Conference, (2016). [pdf] [Peter Sullivan ] 
24 (Apr 13 holiday - Easter Monday)

Apr 15 Wed
Discourse Parsing:  Applications + Distant Supervision
  • Ji and Smith – Neural Discourse Structure for Text Categorization ACL-17  [pdf] [  Dujian Ding  ]

 

  • Patrick Huber, Giuseppe Carenini: Predicting Discourse Structure using Distant Supervision from Sentiment. EMNLP/IJCNLP (1) 2019: 2306-2316 [pdf] [Giuseppe/Patrick]

 

25 Cancelled Project Update Presentations    
26
Apr 24 (time TBD)

deadline for grade submission end of April
Project Final Presentations

Final Project Report Hand in






carenini at cs.ubc.ca