CPSC 503 - Winter 2019 - Computational Linguistics

Readings, Syllabus, Assignments, Software & Data


Readings
Required
References


Syllabus, Assignments, Software & Data

1
Jan 7 Mon  Intro and Course Overview

We will communicate through Canvas: to log in use your CWL

J&M
Chp. 1  

Intro

- ACL
- NLP demos/videos
- Ambiguity

NLP toolkits: NLTK (Python), Stanford CoreNLP (Java)

ProbInfoTheory Handout

2
Jan 9 Wed English Morphology and Finite State Machines: FSA and FST
J&M
Chp. 2&3 (2nd Edition)
missing pages
a b c
Assignment1 on Canvas (due Jan 21)
Dementia Material: instructions, data, lib, run.py


Applications of FSTs in NLP Lauri Karttunen, CIAA, 2000.


3
Jan 14 Mon Finish FST + Stemming + Spelling
J&M
Chp. 3&4
4
Jan 16 Wed  Minimum Edit Distance + Probabilistic Models: N-grams - N-gram Evaluation


J&M
Chp. 4

Google ngrams model

Google books Ngrams viewer

An Empirical Study of Smoothing Techniques for Language Modeling S.F. Chen, J. Goodman - TR CS Harvard Univ - 1998

5
Jan 21 Mon Intro - Neural Networks and Neural Language Models - Start Markov Models


J&M 3Ed
Chp. 7-8



Neural Network Demos

 

 

6
Jan 23 Wed

Markov Sequence Labelling Models - Part-of-speech Tagging

 


J&M 3Ed
Some of Appendix A
Chp. 8

- state of the art POS tagging

why tagging can be challenging for humans: Penn tagging scheme

Part-of-Speech Tagging from 97% to 100% C. Manning 2011

Assignment2 on Canvas (due Feb 11)
Corpora: wsj-p.txt  wsj-ps.txt  atis3.pos.tags.txt cmpt-hw2-3.txt

7
Jan 28 Mon  Sequence processing with Recurrent Neural Networks (RNN)  J&M 3Ed
Chp. 9
see also Goldberg Chps 14-15-16



 8
Jan 30 Wed  Start English Syntax and Context-free Grammars -- Parsing Algorithms J&M 3Ed
Chp. 10-11


 Interactive tutorials on the English grammar 
English Dept. University of Calgary.

- NLTK (demos) - look at *Getting Started*
 - Some public parsers (including Stanford and MINIPAR visualization tools)

 9
Feb 4 Mon Chunking / Dependency Grammars and Transition-based Dep. Parsing/ Treebanks -
 
J&M 3Ed
Chp. 13
 Stanford Parser -
-Popular Stat Parser

- MaltParser - State of the Art Dependency Parser

-Penn Treebank - Universal Dependencies Treebanks

 10 Feb 6 Wed
Probabilistic CFGs - PCFGs Parsing + Lexicalized PCFGs - Neural Constituency and Dependency Parsing  J&M 3Ed
Chp. 12
- Berkeley Parser with demo!
11
Feb 11 Mon
Representing Meaning and
Semantic Analysis
J&M Chp.  book on Computational Semantics

TimeML

Semantic Parser (Cornell - Yoav Artzi)

12
Feb 13 Wed Lexical Semantics J&M Chp. - WordNet and YAGO (Wikipedia + WordNet + GeoNames). See also Probase and Freebase and BabelNet

- (Domain specific thesaurus) Medical Subject Headings (MeSH)
- FrameNet
- PropBank (adding semantic annotations to the Penn Treebank)

Assignment3 on Canvas (due March 2nd)  needed files

  Feb 18 - 22 mid-term Break    
13
 Feb 25 Mon Computational Lexical Semantics (focus on Vector Semantics) J&M Chp. 6 - word2vec 

- A systematic comparison of context-counting vs. context-predicting semantic vectors ! (predicting is clearly better)

- generalization of skip-grams to sentences (skip-thought vectors) 2015

- SENSEVAL(Evaluation for WSD)

- WSD online public systems

- WSD with Deep Belief Networks
- Dependency-based word similarity demo
- TREC (Text REtrieval Conference)
- Semantic Labeling (ASSERT)

-Illinois Semantic Role Labeler

 

14
 Feb 27 Wed CNNs,  Semantic Role labeling, Brief Intro Pragmatics:

- DAMSL
- RST annotation tool
15 Mar 4  Mon Project Proposal Presentations -

see project work plan

   

  Natural Language Generation (NLG): sample system: Generator of Evaluative Arguments (GEA)
handout
- SIGGEN
- NLG systems book,   
STOP system, SimpleNLG
- NLG companies:   data2text  CoGenTex

LIST updated for 2019

  READINGS (what to do?)   avg. year 2014.5
16  Mar 6 Wed

Generic Topic Modeling (background reading Comm. ACM) and Topic Modeling in Asynchronous Conversations

  • Steyvers, M. & Griffiths, T. Probabilistic topic models. In T. Landauer, D. McNamara, S. Dennis, and W. Kintsch (eds), Latent Semantic Analysis: A Road to Meaning. Lawrence Erlbaum 2006 [ pdf ] or [ pdf ]  [ Roger Lo ]

  • S. Joty, G. Carenini and R. T. Ng (2013) Topic Segmentation and Labeling in Asynchronous Conversations. JAIR, Volume 47, pages 521-573 (only intro, conclusions and the sections on topic segmentation, not labeling) [pdf]  [ Patrick Boutet ]


17  Mar 11 Mon

Visual Text Analytics and Interactive Topic Modeling

18  Mar 13 Wed

Natural Language Generation (data2text)

  • F Portet, E Reiter, A Gatt, J Hunter, S Sripada, Y Freer, C Sykes  Automatic Generation of Textual Summaries from Neonatal Intensive Care Data. Artificial Intelligence 173:789-816. 2009 (pdf)    [  Martin Wang  ] 

  • (Pedagogical) Conditioned Neural Generation (Encoder-Decoder framework) pp. 195-211 - Y. Goldberg book 2017 - Chp. 17  [ Farnoosh Javadi ]
19 Mar 18 Mon

Distributed Representations for Sentence + Summarization (1)

  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. J. Devlin, M.W. Chang, K. Lee, K. Toutanova 2018 [pdf] [ Chiyu Zhang ]

  • (background reading:  Summarization Evaluation notes)
    Lidong Bing, Piji Li, Yi Liao, Wai Lam, Weiwei Guo, Rebecca J. Passonneau,  Abstractive Multi-Document Summarization via Phrase Selection and Merging. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics  (ACL 2015) pdf  [ Wen Xiao   ]   

20 Mar 20 Wed

 Summarization (2)

 

  • Janara Christensen, Stephen Soderland, Gagan Bansal, and Mausam "Hierarchical Summarization: Scaling Up Multi-Document Summarization" Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014) [pdf ]  [Muhammad Shayan  ]

 

  • Jianpeng Cheng and Mirella Lapata, Neural Summarization by Extracting Sentences and Words, ACL-2016 [pdf] [ Yuxi Peter Feng ]

  

21
 Mar 25 Mon Sentiment + Graph Based WSD

pre-reading for paper1: Chapter 18 of Y. Goldberg (only 5 pages)
  • Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher Manning, Andrew Ng and Christopher Potts, Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank Conference on Empirical Methods in Natural Language Processing (EMNLP 2013) [pdf] [ Peyman Bateni ]

  • Navigli, R. and Lapata, M. (2010) An experimental study of graph connectivity for unsupervised word sense disambiguation.  IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(4), 678–692. [  John-Jose Nunez   ]


22
Mar 27 Wed

Neural Text Classification + health application

- Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., Hovy, E.: Hierarchical attention networks for document classification. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. pp. 1480-1489. NAACL (2016) [pdf]   [ Bicheng Xu ]

ONTOHAN: An Ontology-based Neural Network Model for Patient Need Detection (draft paper PLEASE DO NOT DISTRIBUTE) pdf "announced" on Canvas 2019 [  Hyeju  Jang ]


23
Apr 1 Mon  Discourse Parsing
24
Apr 3 Wed Discourse Parsing Applications
  • Kelsey Allen, Giuseppe Carenini and Raymond Ng, Detecting Disagreement in Conversations using Pseudo-Monologic Rhetorical Structure - EMNLP-14 [ pdf ] [   Adebara Ife ]
  • Ji and Smith – Neural Discourse Structure for Text Categorization ACL-17  [pdf] [   Ariel Shann  ]


25 Apr 10 Wed 12:30-3 (room 304) Project Update Presentations    
26
Apr 24 Wed 9am-1pm (room 146) Project Final Presentations

Final Project Report Hand in






carenini at cs.ubc.ca