CPSC 503 - Winter 2016 - Computational Linguistics

Readings, Syllabus, Assignments, Software & Data


Jan 7 Th  Intro and Course Overview

We will communicate through Connect; to log in, use your CWL.

Chp. 1  


- NLP demos/videos
- Ambiguity

NLP toolkits: NLTK (Python), Stanford CoreNLP (Java)

Jan 12 Tu English Morphology and Finite State Machines: FSA and FST
Chp. 2&3
Applications of FSTs in NLP Lauri Karttunen, CIAA, 2000.
Assignment 1 on Connect (due Jan 21)

Jan 14 Th Finish FST + Stemming + Spelling
Chp. 3&4
Jan 19 Tu Minimum Edit Distance + Probabilistic Models: N-grams - N-grams Evaluation

Chp. 4

Google ngrams model

Google books Ngrams viewer

An Empirical Study of Smoothing Techniques for Language Modeling. S.F. Chen, J. Goodman. TR, CS, Harvard Univ., 1998

Jan 21 Th Intro - Neural Networks and Neural Language Models - Start Markov Models

Chp. 4-5-6

Neural Network Demos



Jan 26 Tu

Sequence Labelling Models - Part-of-speech Tagging


Chp. 12

- state of the art POS tagging

why tagging can be challenging for humans: Penn tagging scheme

Part-of-Speech Tagging from 97% to 100%. C. Manning, 2011

Assignment 2 on Connect (due Feb 11)
Corpora: wsj-p.txt  wsj-ps.txt  atis3.pos.tags.txt cmpt-hw2-3.txt

Jan 28 Th  Start English Syntax and Context-free Grammars -- Parsing Algorithms J&M Chp. 13

Interactive tutorials on English grammar, English Dept., University of Calgary.

- NLTK (demos) - look at *Getting Started*
- Some public parsers (including Stanford and MINIPAR visualization tools)

Feb 2 Tu Chunking / Dependency Grammars and Transition-based Dep. Parsing / Treebanks
- Stanford Parser - Popular Stat Parser
- MaltParser - State of the Art Dependency Parser
- Penn Treebank - Universal Dependency Treebanks

Feb 4 Th Probabilistic CFGs - PCFGs Parsing + Lexicalized PCFGs  J&M Chp. 14 - Berkeley Parser with demo!
10 Feb 9 Tu Representing Meaning and Semantic Analysis
J&M Chp. 17-18; book on Computational Semantics

TimeML

Semantic Parser (Cornell - Yoav Artzi)

Feb 11 Th Lexical Semantics J&M Chp. 19 - WordNet and YAGO (Wikipedia + WordNet + GeoNames). See also Probase and Freebase

- (Domain specific thesaurus) Medical Subject Headings (MeSH)
- FrameNet
- PropBank (adding semantic annotations to the Penn Treebank)

Feb 15 - 19 Mid-term Break
 Feb 23 Tu Computational Lexical Semantics (focus on Vector Semantics) J&M Chp. 20

J&M Chp. 19, 3rd Ed. draft

- word2vec 

- A systematic comparison of context-counting vs. context-predicting semantic vectors (predicting is clearly better!)

- generalization of skip-grams to sentences (skip-thought vectors) 2015

- SENSEVAL (Evaluation for WSD)

- WSD online public systems

- WSD with Deep Belief Networks
- Dependency-based word similarity demo
- TREC (Text REtrieval Conference)
- Semantic Labeling (ASSERT)

- Illinois Semantic Role Labeler

Assignment 3 on Connect (due March 15); needed files

Feb 25 Th Pragmatics: Discourse & Dialog J&M Chp. 21 & 24 - DAMSL
- RST annotation tool
 Mar 1 Tu Project Proposal Presentations -

see project work plan

Natural Language Generation (NLG): sample system: Generator of Evaluative Arguments (GEA)
- NLG systems book, STOP system, SimpleNLG
- NLG companies: data2text, CoGenTex

  READINGS (what to do?)   avg. year 2010.5
15  Mar 3  Th

Generic Topic Modeling (background reading Comm. ACM) and Topic Modeling in Synchronous Conversations

  • Steyvers, M. & Griffiths, T. (2006). Probabilistic topic models. In T. Landauer, D. McNamara, S. Dennis, and W. Kintsch (eds), Latent Semantic Analysis: A Road to Meaning. Lawrence Erlbaum. [pdf] or [pdf] [Mansfield]

  • Galley, M., McKeown, K., Fosler-Lussier, E., & Jing, H. (2003). Discourse segmentation of multi-party conversation. In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics - Volume 1, ACL ’03, Sapporo, Japan. ACL. [pdf] [Hossain Imtiaz]

16 Mar 8 Tu Topic Modeling and Labelling in Asynchronous Conversations:
  • S. Joty, G. Carenini and R. T. Ng (2013). Topic Segmentation and Labeling in Asynchronous Conversations. JAIR, Volume 47, pages 521-573 [pdf]

NOTE: Split into two presentations. One student will cover topic segmentation [Masrani], the other topic labelling [Chen Jiahong]


17  Mar 10 Th

Visual Text Analytics and Interactive Topic Modeling

  • E. Hoque and G. Carenini, ConVis: A Visual Text Analytic System for Exploring Blog Conversations, Journal of Computer Graphics Forum (Proc. EuroVis), 2014 [pdf] [Paul Bucci]

  • Jason Chuang, Sonal Gupta, Christopher D. Manning, Jeffrey Heer, Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment [pdf] [Mahdi Ghodsi]


   Mar 15 Tu

class cancelled

18 Mar 17 Th  Natural Language Generation (Sentence Planning)
Walker, M., Stent, A., Mairesse, F., & Prasad, R. (2007). Individual and domain adaptation in sentence planning for dialogue. Journal of Artificial Intelligence Research, 30, 413-456. (long, but lots of tables and figures) [pdf] [Jacob Chen]

Natural Language Generation (data2text)
F. Portet, E. Reiter, A. Gatt, J. Hunter, S. Sripada, Y. Freer, C. Sykes. Automatic Generation of Textual Summaries from Neonatal Intensive Care Data. Artificial Intelligence 173:789-816. 2009 [pdf] [Felipe Bañados Schwerter]

Mar 22 Tu

Summarization (1)

Regina Barzilay, Kathleen McKeown, "Sentence Fusion for Multidocument News Summarization", Computational Linguistics, 2005. [ps] [Laura Tammpere]


(background reading: textbook sec. 23.7 Summarization Evaluation)
Lidong Bing, Piji Li, Yi Liao, Wai Lam, Weiwei Guo, Rebecca J. Passonneau, "Abstractive Multi-Document Summarization via Phrase Selection and Merging", Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL 2015) [pdf] [Wenqiang Dong]

 Mar 24 Th

Summarization (2)

(Biographies) Fadi Biadsy, Julia Hirschberg, Elena Filatova, "An Unsupervised Approach to Biography Production using Wikipedia", ACL-08: HLT, Columbus, Ohio, Jun 2008 [pdf] [Dmitry Tebaykin]

Janara Christensen, Stephen Soderland, Gagan Bansal, and Mausam, "Hierarchical Summarization: Scaling Up Multi-Document Summarization", Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014) [pdf] [Hindalong Emily]


Mar 29 Tu

Subjectivity and Sentiment

Mar 31 Th  Sentiment + Graph Based WSD
  • Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher Manning, Andrew Ng and Christopher Potts, Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. Conference on Empirical Methods in Natural Language Processing (EMNLP 2013) [pdf] [Nejat]

  • Navigli, R. and Lapata, M. (2010). An experimental study of graph connectivity for unsupervised word sense disambiguation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(4), 678–692. [Mehrdad Ghomi]

Apr 5 Tu Discourse Parsing

24 Apr 7 Th More Discourse Parsing + Application
  • Yang Liu and Sujian Li. Implicit discourse relation classification via multi-task neural networks. In Proceedings of AAAI Conference (2016). [pdf] [Chen Jianhui]
  • Kelsey Allen, Giuseppe Carenini and Raymond Ng, Detecting Disagreement in Conversations using Pseudo-Monologic Rhetorical Structure - EMNLP-14 [pdf] [Michael Haaf]


25 Apr 12 Tu Project Update Presentations    
?Apr 25? Project Final Presentations

Final Project Report Hand in

carenini at cs.ubc.ca