CPSC 503 - Winter 2016 -
Computational Linguistics
Readings, Syllabus, Assignments, Software & Data
Natural Language Processing with Python. Bird, Steven; Klein, Ewan; Loper, Edward. O'Reilly, 2009. Free HTML version. You can order this book directly from O'Reilly.
Introduction to Information Retrieval by Manning, Raghavan, Schütze. webpage
Graph-based Natural Language Processing and Information Retrieval. Rada Mihalcea (Author), Dragomir Radev (Author)
Foundations of Statistical Natural Language Processing by Christopher D. Manning, Hinrich Schütze (M&S). In many cases the statistical approaches are covered in more detail in this book. However, it does not contain all the topics that we will cover in this course. This book also has a webpage. 680 pages, 1st edition (1999), M.I.T. Press/Triliteral, ISBN: 0262133601. This book will be useful in cases where you want a different presentation of the same material that is required reading from J&M.
Contemporary Linguistics: An introduction by W. O'Grady, J. Archibald, M. Aronoff, J. Rees-Miller. 684 pages 5th Edition (2004). ISBN: 0312419368. This book will be useful in cases where you want a more detailed description of linguistic theories. It also contains lots of clear examples of linguistic phenomena. This book also has a webpage.
Synthesis Lectures in Natural Language Processing webpage
Syllabus, Assignments, Software & Data
1
Jan 7 Th Intro and Course Overview. We will communicate through Connect: to log in use your CWL
J&M
Chp. 1
Intro
- ACL
- NLP demos/videos
- Ambiguity
- NLP toolkits: NLTK (Python), Stanford CoreNLP (Java)
2
Jan 12 Tu English Morphology and Finite State Machines: FSA and FST
J&M
Chp. 2&3
Applications of FSTs in NLP Lauri Karttunen, CIAA, 2000.
Assignment 1 on Connect (due Jan 21)
- Recent book and software
- Xerox: FiniteState Technology
- Finite State Utilities (Van Noord)
3
Jan 14 Th Finish FST + Stemming + Spelling
J&M
Chp. 3&4
- The Porter Stemmer (includes Perl implementation)
- ProbInfoTheory Handout
- min-edit-dist demo
- A spelling correction program based on a noisy channel model, Kernighan et al., COLING, 1990.
- minimal Python implementation of spelling correction (by P. Norvig)
4
Jan 19 Tu Minimum Edit Distance + Probabilistic Models: N-grams - N-grams Evaluation
J&M
Chp. 4
An empirical study of smoothing techniques for NLP S.F. Chen, J. Goodman - TR CS Harvard Univ - 1998
5
Jan 21 Th Intro - Neural Networks and Neural Language Models - Start Markov Models
J&M
Chp. 4-5-6
Neural Network Demos
6
Jan 26 Tu Sequence Labelling Models - Part-of-speech Tagging
J&M
Chp. 12
- state of the art POS tagging
why tagging can be challenging for humans: Penn tagging scheme
Part-of-Speech Tagging from 97% to 100% C. Manning 2011
Assignment 2 on Connect (due Feb 11)
Corpora: wsj-p.txt wsj-ps.txt atis3.pos.tags.txt cmpt-hw2-3.txt
7
Jan 28 Th Start English Syntax and Context-free Grammars -- Parsing Algorithms
J&M
Chp. 13
- Interactive tutorials on English grammar, English Dept., University of Calgary
- NLTK (demos) - look at *Getting Started*
- Some public parsers (including Stanford and MINIPAR visualization tools)
8
Feb 2 Tu Chunking / Dependency Grammars and Transition-based Dep. Parsing / Treebanks
- Stanford Parser - Popular Stat Parser
- MaltParser - State of the Art Dependency Parser
9
Feb 4 Th Probabilistic CFGs - PCFGs Parsing + Lexicalized PCFGs
J&M
Chp. 14
- Berkeley Parser with demo!
10
Feb 9 Tu Representing Meaning and Semantic Analysis
J&M
Chp. 17-18
- book on Computational Semantics
- Semantic Parser (Cornell - Yoav Artzi)
11
Feb 11 Th Lexical Semantics
J&M
Chp. 19
- WordNet and YAGO (Wikipedia + WordNet + GeoNames). See also Probase and Freebase
- (Domain specific thesaurus) Medical Subject Headings (MeSH)
- FrameNet
- PropBank (adding semantic annotations to the Penn Treebank)
Feb 15 - 19 mid-term Break
12
Feb 23 Tu Computational Lexical Semantics (focus on Vector Semantics)
J&M
Chp. 20; J&M Chp. 19, 3rd Ed. draft
- word2vec
- A systematic comparison of context-counting vs. context-predicting semantic vectors! (predicting is clearly better)
- generalization of skip-grams to sentences (skip-thought vectors) 2015
- SENSEVAL (Evaluation for WSD)
- WSD with Deep Belief Networks
- Dependency-based word similarity demo
- TREC (Text REtrieval Conference)
- Semantic Labeling (ASSERT)
- Illinois Semantic Role Labeler
Assignment 3 on Connect (due March 15th) needed files
13
Feb 25 Th Pragmatics: Discourse & Dialog
J&M
Chp. 21 & 24
- DAMSL
- RST annotation tool
14
Mar 1 Tu Project Proposal Presentations - Natural Language Generation (NLG): sample system: Generator of Evaluative Arguments (GEA)
handout
- SIGGEN
- NLG systems book, STOP system, SimpleNLG
- NLG companies: data2text, CoGenTex
READINGS (what to do?) avg. year 2010.5
15
Mar 3 Th Generic Topic Modeling (background reading Comm. ACM) and Topic Modeling in Synchronous Conversations
Steyvers, M. & Griffiths, T. (2006). Probabilistic topic models. In T. Landauer, D. McNamara, S. Dennis, and W. Kintsch (eds), Latent Semantic Analysis: A Road to Meaning. Lawrence Erlbaum. [pdf] or [pdf] [Mansfield]
Galley, M., McKeown, K., Fosler-Lussier, E., & Jing, H. (2003). Discourse segmentation of multi-party conversation. In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1, ACL ’03, Sapporo, Japan. [pdf] [Hossain Imtiaz]
16
Mar 8 Tu Topic Modeling and Labelling in Asynchronous Conversations
- S. Joty, G. Carenini and R. T. Ng (2013) Topic Segmentation and Labeling in Asynchronous Conversations. JAIR, Volume 47, pages 521-573 [pdf]
NOTE: Split in two presentations. One student will cover topic segmentation [Masrani], the other topic labelling [Chen Jiahong]
17
Mar 10 Th Visual Text Analytics and Interactive Topic Modeling
E. Hoque and G. Carenini, ConVis: A Visual Text Analytic System for Exploring Blog Conversations, Journal of Computer Graphics Forum (Proc. EuroVis), 2014 [pdf] [ Paul Bucci ]
Jason Chuang, Sonal Gupta, Christopher D. Manning, Jeffrey Heer, Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment [pdf] [Mahdi Ghodsi ]
Mar 15 Tu class cancelled
18
Mar 17 Th Natural Language Generation (Sentence Planning)
Walker, M., Stent, A., Mairesse, F., & Prasad, R. (2007). Individual and domain adaptation in sentence planning for dialogue. Journal of Artificial Intelligence Research, 30, 413-456. (long but lots of tables / figures) pdf [ Jacob Chen ]
Natural Language Generation (data2text)
F Portet, E Reiter, A Gatt, J Hunter, S Sripada, Y Freer, C Sykes Automatic Generation of Textual Summaries from Neonatal Intensive Care Data. Artificial Intelligence 173:789-816. 2009 (pdf) [ Felipe Bañados Schwerter ]
19
Mar 22 Tu Summarization (1)
Regina Barzilay, Kathleen McKeown "Sentence Fusion for Multidocument News Summarization", Computational Linguistics, 2005. [ps] [ Laura Tammpere ]
(background reading: textbook sec. 23.7 Summarization Evaluation)
Lidong Bing, Piji Li, Yi Liao, Wai Lam, Weiwei Guo, Rebecca J. Passonneau, Abstractive Multi-Document Summarization via Phrase Selection and Merging. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL 2015) pdf [ Wenqiang Dong ]
20
Mar 24 Th Summarization (2)
(Biographies) Fadi Biadsy, Julia Hirschberg, Elena Filatova, "An Unsupervised Approach to Biography Production using Wikipedia", ACL-08: HLT, Columbus, Ohio, Jun 2008 pdf [ Dmitry Tebaykin ]
Janara Christensen, Stephen Soderland, Gagan Bansal, and Mausam "Hierarchical Summarization: Scaling Up Multi-Document Summarization" Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014) [pdf] [ Hindalong Emily ]
21
Mar 29 Tu Subjectivity and Sentiment
Fangzhong Su and Katja Markert, Subjectivity recognition on word senses via semi-supervised mincuts. HLT-NAACL 2009 [ Yaashaar Hadadian ]
Theresa Wilson, Janyce Wiebe, and Paul Hoffmann (2009). Recognizing Contextual Polarity: An exploration of features for phrase-level sentiment analysis. Computational Linguistics, 35:3, pages 399-433. [ Moumita Roy Tora ]
22
Mar 31 Th Sentiment + Graph Based WSD
Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher Manning, Andrew Ng and Christopher Potts, Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. Conference on Empirical Methods in Natural Language Processing (EMNLP 2013) [pdf] [ Nejat ]
Navigli, R. and Lapata, M. (2010) An experimental study of graph connectivity for unsupervised word sense disambiguation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(4), 678–692. [ Mehrdad Ghomi ]
23
Apr 5 Tu Discourse Parsing
- R. Soricut and D. Marcu. 2003. Sentence Level Discourse Parsing Using Syntactic and Lexical Information. NAACL 2003 [ pdf ] [Jang]
- portions of CL paper CODRA: A Novel Discriminative Framework for Rhetorical Analysis. Computational Linguistics (2015) Vol. 41, No. 3: 385–435, MIT Press (only sections 1 - 4 are mandatory) DEMO [ Johnson ]
24
Apr 7 Th More Discourse Parsing + Application
- Yang Liu and Sujian Li. Implicit discourse relation classification via multi-task neural networks. In Proceedings of AAAI Conference, (2016). [pdf] [Chen Jianhui]
Kelsey Allen, Giuseppe Carenini and Raymond Ng, Detecting Disagreement in Conversations using Pseudo-Monologic Rhetorical Structure - EMNLP-14 [ pdf ] [ Michael Haaf ]
25
Apr 12 Tu Project Update Presentations
26
?Apr 25? Project Final Presentations; Final Project Report Hand in