+++++++++++++++++++++++++++++++++++++++
+++++++++++ DRAFT 01/04 +++++++++++++++
+++++++++++++++++++++++++++++++++++++++

>>>>>>> First Class: Wed, Jan 14th <<<<<<<

COURSE: 503 (201): Computational Linguistics

INSTRUCTOR: Giuseppe Carenini
  Office: CICSR #129
  Phone: 291-4933
  Email: carenini at cs.ubc.ca

W  8:30-10:00  FSC 1002
F 15:30-17:00  FSC 1611

*************** OVERVIEW ****************************

Computational Linguistics is the study of human language from a
computational perspective. This course examines algorithms used in the
automatic analysis and production of language. We will cover both
knowledge-based and statistical methods, and will look at how these
methods are used in a variety of applications, including information
retrieval, question answering, information extraction, spelling
correction, augmentative communication, dialog systems and explanation
generation. The course also provides an introduction to programming
with Perl.

(Please read the first chapter of the textbook for more information:
http://www.cs.colorado.edu/~martin/SLP/slp-ch1.pdf)

*************** PREREQUISITES **********************

- Intermediate Algorithm Design and Analysis (CPSC 320)
- Probability
- First-Order Logic

*************** SYLLABUS ****************************

Intro to linguistics and formal language theory

-WORDS
  Finite-state transducers and word morphology
  Review of probability theory and information theory
  Probabilistic models and edit distances for spelling correction
  Collocations
  Perl programming
  Probability models and language: n-grams

-SYNTAX
  Hidden Markov models
  Word classes and part-of-speech tagging
  Advanced Perl programming
  Context-free grammars for English
  Parsing algorithms
  Probabilistic parsing algorithms
  Feature structures and unification

-SEMANTICS
  Representing meaning and semantic analysis
  Lexical semantics and WordNet
  Word-sense disambiguation

-PRAGMATICS
  Discourse and dialog models
  Natural language generation

(Unfortunately, for lack of time, I will not be able to cover speech and
machine translation.)

*************** TEXTBOOKS ************************************

REQUIRED
  Speech and Language Processing: An Introduction to Natural Language
  Processing, Computational Linguistics, and Speech Recognition
  by Daniel Jurafsky and James H. Martin.
  934 pages, 1st edition (January 26, 2000), Prentice Hall,
  ISBN: 0130950696
  http://www.cs.colorado.edu/~martin/slp.html

RECOMMENDED
  A book on the Perl programming language (probably: Programming Perl
  by Larry Wall, Tom Christiansen, Jon Orwant. 400 pages, 3rd edition
  (August 30, 2000), O'Reilly & Associates, Inc., ISBN: 0596000278)

REFERENCE
  Foundations of Statistical Natural Language Processing
  by Christopher D. Manning, Hinrich Schütze.
  680 pages, 1st edition (1999), M.I.T. Press/Triliteral,
  ISBN: 0262133601

  This book is useful when you want a different presentation of
  material that is required reading from J&M. It often covers the
  statistical approaches in somewhat more detail. However, it does not
  contain all the topics that we will cover in this course.

**************** ASSIGNMENTS **********************************

Probably six ...

**************** PROJECT ************************************

Each student will work on a final project