CPSC 503 - Winter 2012 - Computational Linguistics

Readings, Syllabus, Assignments, Software & Data


Readings
Required
References


Syllabus, Assignments, Software & Data

1
Jan 8 Tu  Intro and Course Overview

We will communicate through Connect; to log in, use your CWL.

J&M
Chp. 1  
- ACL
- NLP demos
- Ambiguity
2
Jan 10 Th English Morphology and Finite State Machines: FSA and FST
J&M
Chp. 2&3
Applications of FSTs in NLP Lauri Karttunen, CIAA, 2000.
Assignment1 (due Jan 17)


3
Jan 15 Tu Finish FST + Stemming + Spelling
J&M
Chp. 3&4
4
Jan 17 Th Minimum Edit Distance + Probabilistic Models: N-grams
J&M
Chp. 4

Google ngrams model

Google books Ngrams viewer

An Empirical Study of Smoothing Techniques for Language Modeling. S.F. Chen, J. Goodman - TR CS Harvard Univ - 1998

5
Jan 22 Tu N-grams Evaluation - Markov Models - Part-of-speech Tagging
J&M
Chp. 4-5-6


- state of the art POS tagging

- why tagging can be challenging for humans: Penn tagging scheme

 

 

6
Jan 24 Th English Syntax and Context-free Grammars
J&M
Chp. 12
Interactive tutorials on English grammar, English Dept., University of Calgary.

Assignment2 on Connect (due Feb 7)
Corpora: wsj-p.txt  wsj-ps.txt  atis3.pos.tags.txt cmpt-hw2-3.txt

7
Jan 29 Tu  Parsing Algorithms / J&M Chp. 13


 - NLTK (demos) - look at *Getting Started*
 - Some public parsers (including Stanford and MINIPAR visualization tools)

 8
Jan 31 Th Chunking / Dependency Grammars/ Treebank - Start Probabilistic CFGs
 J&M Chp. 14
- Penn Treebank - Stanford Parser
- Popular Stat Parser

- MaltParser - State of the Art Dependency Parser
 9
Feb 5 Tu PCFGs Parsing + Lexicalized PCFGs   - Berkeley Parser with demo!
 10 Feb 7 Th
Representing Meaning and
Semantic Analysis
J&M Chp. 17-18 - book on Computational Semantics

TimeML

11
Feb 12 Tu
Lexical Semantics J&M Chp. 19 - WordNet and YAGO (Wikipedia + WordNet + GeoNames). See also Probase and Freebase

- (Domain specific thesaurus) Medical Subject Headings (MeSH)
- FrameNet
- PropBank (adding semantic annotations to the Penn Treebank)

12
Feb 14 Thu Computational Lexical Semantics J&M Chp. 20 - SENSEVAL (evaluation for WSD)

- WSD online public systems
- Dependency-based word similarity demo
- TREC (Text REtrieval Conference)
- Semantic Labeling (ASSERT)

- Illinois Semantic Role Labeler

Assignment3 on Connect (due Feb 28) - needed files

    Midterm break  Feb 18 - 24    
13
Feb 26 Tu Pragmatics: Discourse & Dialog J&M Chp. 21 & 24
- DAMSL
- RST annotation tool
  Buffer Natural Language Generation (NLG): sample system: Generator of Evaluative Arguments (GEA)
handout
- SIGGEN
- NLG systems book, STOP system, SimpleNLG
- NLG companies: data2text, CoGenTex

14
Feb 28 Thu Project Proposal Presentations -

 
15
Mar 5 Tu Project Proposal Presentations -
 

  READINGS (what to do?)    
16 Mar 7 Thu (data2text) Natural Language Generation (1)
F Portet, E Reiter, A Gatt, J Hunter, S Sripada, Y Freer, C Sykes. Automatic Generation of Textual Summaries from Neonatal Intensive Care Data. Artificial Intelligence 173:789-816. 2009 (pdf) [Matthew: up to 4.5 excluded] [Baipeng: from 4.5 to end]


Ryuichiro Higashinaka et al. Learning to generate naturalistic utterances using reviews in spoken dialogue systems. Proceedings of ACL 2006 (pdf) [Sanjana]

17 Mar 12 Tu

Summarization (1)
(Biographies) Fadi Biadsy, Julia Hirschberg, Elena Filatova, "An Unsupervised Approach to Biography Production using Wikipedia", ACL-08: HLT, Columbus, Ohio, Jun 2008 pdf [Suman]

 


(Evaluative Text, e.g., customer reviews) Carenini, G., Ng, R., & Pauls, A. (2006). Multi-document summarization of evaluative text. In Proceedings of EACL, 2006. (pdf) [Enamul]

18
Mar 14 Thu

Summarization (2)
Regina Barzilay, Kathleen McKeown "Sentence Fusion for Multidocument News Summarization",
Computational Linguistics, 2005. [ps] [Connor]

 

(read about ROUGE in my book)
Ani Nenkova et al. The Pyramid Method: Incorporating human content selection variation in summarization evaluation. ACM Trans. on Speech and Language Processing (TSLP), 2007 (pdf) [Kaya]

19
Mar 19 Tu Summarization (3)
Gabriel Murray and Giuseppe Carenini, Summarizing Spoken and Written Conversations, EMNLP 2008 [pdf] [Maryam]

Giuseppe Carenini, Raymond Ng, Xiaodong Zhou, Summarizing Emails with Conversational Cohesion and Subjectivity, ACL 2008 [pdf] [Tatsuro]

 
20
Mar 21 Thu

Subjectivity and Sentiment (1)

21
Mar 26 Tu Guest (postdoc Yashar Mehdad): Information Extraction + Textual Entailment

IE: Open Information Extraction from the Web, IJCAI 2007 [Mahsa]


 
TE: A Survey of Paraphrasing and Textual Entailment Methods (only up to page 18, i.e. Sections 1 and 2), JAIR 2010

 

22
Mar 28 Thu Guest (finishing PhD student Shafiq Joty): Discourse Parsing
  • R. Soricut and D. Marcu. 2003. Sentence Level Discourse Parsing Using Syntactic and Lexical Information. NAACL 2003 [ pdf ] [ Michael ]

  • S. Joty, G. Carenini, and R. T. Ng. 2012. A novel discriminative framework for sentence-level discourse analysis. EMNLP-CoNLL 2012 [ pdf ] [ Seong ]

     
23 Apr 2 Tu

Natural Language Generation (2) / Topic Modeling
Walker, M., Stent, A., Mairesse, F., & Prasad, R. (2007). Individual and domain adaptation in sentence planning for dialogue. Journal of Artificial Intelligence Research, 30, 413-456. (long, but lots of tables / figures) pdf [ Arni ]

 

Kino Coursey, Rada Mihalcea, and William Moen, Using Encyclopedic Knowledge for Automatic Topic Identification, in Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2009), pp. 210-218, Boulder, Colorado, June 2009. [ pdf ] [ Vincent ]

 

  BUFFER

Topic Modeling and Topic Identification (background reading Comm. ACM)

  • Y. Chali, S. R. Joty and S. A. Hasan (2009) "Complex Question Answering: Unsupervised Learning Approaches and Experiments", JAIR, Volume 35, pages 1-47, 2009  pdf  

  • Steyvers, M. & Griffiths, T. (2006). Probabilistic topic models. In T. Landauer, D. McNamara, S. Dennis, and W. Kintsch (eds), Latent Semantic Analysis: A Road to Meaning. Lawrence Erlbaum [ pdf ]

24 Apr 4 Thu Project Update Presentations    
25
Apr 15 Mon Project Final Presentations

Final Project Report Hand-in






carenini at cs.ubc.ca