CS 540 (Machine Learning), Fall 2008 (Term 1)

Projects

Click here.

Admin

Lectures: Tue/Thu 11:00-12:30. Room: MacMillan 154, opposite CS on Main Mall.

Office hours: Fri 1-2pm.

If you cannot register but feel you have the required background, please send your student ID number to Joyce Poon (poon@cs.ubc.ca). If you are from another UBC department, fill out this form.

Sign up at Google Groups to receive email announcements, etc.

Outline

This is a graduate class on machine learning, covering foundations such as (Bayesian) statistics and information theory, as well as supervised learning (classification, regression) and unsupervised learning (clustering, dimensionality reduction). (I will cover graphical models in Stat 521A in Spring 2009; note that CS 540 is highly recommended as a pre-requisite for Stat 521A.) Examples of applications in vision, speech/language, and biology will be used throughout.

Pre-requisites

This will be a fast-paced class, so prior exposure to machine learning at the undergraduate level (such as CS 340 or Stat 306) is highly desirable. However, the only official pre-requisites are linear algebra, probability theory, multivariate calculus, and programming skills (preferably Matlab or R).
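
If you want a quick self-check of the required background, the following Matlab snippet (a minimal, hypothetical example, not part of the course materials) should be easy to read: it simulates data from a linear model and fits ordinary least squares via the normal equations.

  % Hypothetical self-check (not from the course materials): ordinary
  % least squares on synthetic data, solved via the normal equations.
  n = 100; d = 3;
  X = [ones(n,1) randn(n,d)];     % design matrix with an intercept column
  wTrue = [1; 2; -1; 0.5];        % ground-truth weights
  y = X*wTrue + 0.1*randn(n,1);   % noisy linear observations
  wHat = (X'*X) \ (X'*y);         % solve the normal equations (X'X) w = X'y
  disp([wTrue wHat])              % the two columns should nearly match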

If you do not have the pre-requisites but are still interested in learning about machine learning, I recommend you take CS 340, the undergrad version of this class, taught by Nando de Freitas in Fall 2008.

Workload

This class will be quite time-consuming. Attending lectures: 3h. Weekly homeworks: about 6h. Weekly reading: about 6h. Total: about 15h/week.
If you cannot handle this load, I recommend you take CS 340, the undergrad version of this class.

Textbook

The textbook is Machine Learning: a probabilistic approach, which I am writing. Students will be able to buy a copy after Sept 8th from Copiesmart Centre, 103-5728 University Blvd, right next to McDonald's in the UBC Village.

If you find typos, please follow the procedure outlined here.

In addition to my book, you may find the following useful:

Grading

Midterm (open-book): 30%, Weekly assignments: 30%, Final project: 40%.

Homeworks

Homeworks are listed in the timetable below. Numbers refer to exercises in my book; (M) after an exercise indicates a Matlab exercise. Data and supporting code for the homeworks can be found by downloading PMTK; a sketch of the workflow follows.
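
As a rough sketch of the intended workflow (hypothetical paths and file names; PMTK's actual directory layout may differ), you might load a homework dataset in Matlab like this:

  % Sketch only: assumes PMTK has been unpacked into a folder named PMTK,
  % with hw2's prostate.mat somewhere inside it.
  addpath(genpath('PMTK'))    % put the toolkit and its data folders on the path
  S = load('prostate.mat');   % load the hw2 dataset into a struct
  disp(fieldnames(S))         % inspect which variables the file provides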

Tentative Timetable

Reading material refers to the 7 Sep 08 version of the book; "New" means the midterm (8 Oct 08) version.

L#  | Date       | Topic                                                        | Reading                            | Homework
L1  | Tue Sep 9  | Intro                                                        | Ch 1, Matlab tutorial              | hw1.pdf
L2  | Thu Sep 11 | Data visualization, probabilistic models, MLE                | Ch 2                               | .
L3  | Tue Sep 16 | Basic concepts                                               | New version of ch 2                | hw2.pdf, prostate.mat (same as in BLT/Data), hw2Sol.pdf
L4  | Thu Sep 18 | Linear regression                                            | 19.2, 19.3; review ch 38           | .
L5  | Tue Sep 23 | Linear algebra, ridge regression                             | 19.4; review ch 38                 | hw3.pdf, hw3Sol.pdf
L6  | Thu Sep 25 | Logistic regression                                          | 22.1, 22.2                         | .
L7  | Tue Sep 30 | MVN, LDA/QDA                                                 | 3.2, 4.2                           | hw4.pdf, naiveBayesExCode.zip, hw4Sol.pdf
L8  | Thu Oct 2  | Naive Bayes; Beta-Binomial model                             | Ch 4, 9.3                          | .
L9  | Tue Oct 7  | Bayesian concept learning; Beta-Binomial; Dirichlet-Multinomial | 8.1-8.3, 9.1-9.4                | hw5.pdf, NBLRcode.zip
L10 | Thu Oct 9  | Bayesian parameter estimation for Gaussians, generative classifiers, linear and logistic regression | 5.6, 22.1.3, 9.6 | .
L11 | Tue Oct 14 | Decision theory; model selection                             | New ch 5, new ch 6, new 3.3, new 8.6 | .
L12 | Thu Oct 16 | Midterm                                                      | .                                  | .
L13 | Tue Oct 21 | Feature selection                                            | 20.1-20.3, 21.1-21.3               | .
L14 | Thu Oct 23 | L1 regularization                                            | .                                  | .
L15 | Tue Oct 28 | Mixture models, EM, non-parametric models                    | 3.3-3.4, 14.1-14.5, 17.1-17.3      | HW6
L16 | Thu Oct 30 | Guest lecture by Matt Brown on applications of non-parametric regression | .                     | .
L17 | Tue Nov 4  | Directed graphical models                                    | .                                  | Project proposals due
L18 | Thu Nov 6  | Conditional mixture models, sparse Bayesian learning, EM as bound optimization | .               | .
L19 | Tue Nov 11 | Remembrance Day (no lecture)                                 | .                                  | .
L20 | Thu Nov 13 | Kalman filters                                               | .                                  | .
L21 | Tue Nov 18 | PCA                                                          | .                                  | .
L22 | Thu Nov 20 | Markov models                                                | .                                  | .
L23 | Tue Nov 25 | HMMs                                                         | .                                  | .
L24 | Thu Nov 27 | MCMC                                                         | .                                  | .
Final projects: presentations Thu Dec 4th; written reports due Mon Dec 15th.