CPSC 540 - Machine Learning (Winter 2016)

Lectures: 3:30-5pm Tuesdays and Thursdays (CHEM B150).

Office hours: 11:30-12:30 Wednesdays (ICICS 193).

Tutorials: 3:00-4:00 and 4:00-5:00 Fridays (DMP 101).

Help sessions: 3:30-5 Mondays on days before assignments are due (ICICS X836).

Instructor: Mark Schmidt.

Teaching Assistants: Reza Babanezhad, Alireza Shafaei, Sharan Vaswani.

Synopsis: This is a graduate-level course on machine learning, a field that focuses on using automated data analysis for tasks like pattern recognition and prediction. The course will move quickly and assumes a strong background in math and computer science as well as previous experience with statistics and/or machine learning (other students interested in machine learning should first register for CPSC 340). Topics will (roughly) include linear models, density estimation, graphical models, Bayesian methods, deep learning, online/active/causal learning, reinforcement learning, and learning theory.

Textbook: No textbook covers all of the topics above, but the one with the most extensive coverage is Kevin Murphy's Machine Learning: A Probabilistic Perspective (MLAPP). This book can be purchased from Amazon, is on reserve in the CS Reading Room (ICCS 262), and can be accessed through the library here. Optional readings will be given out of this textbook, in addition to other free online resources.

Registration and Prerequisites: Graduate and undergraduate students from any department are welcome to take the class. However, you can only register automatically if you are enrolled as a graduate student in CPSC or in EECE. If you are a graduate student from a different department or are an undergraduate student satisfying these requirements, you can register by following the instructions here and submitting the prerequisites form here. Graduate students in CPSC and EECE also need to submit the prerequisites form in the first two weeks of class to stay enrolled.

CPSC 340 vs. CPSC 540: In previous years, there was substantial overlap between CPSC 340 (the undergraduate machine learning and data mining course) and CPSC 540 (the graduate-level machine learning course). In the 2015-16 academic year, the two courses will be structured roughly as one full-year course. Both courses will cover a few core machine learning topics, but CPSC 340 will put a larger emphasis on data mining methods and applications of machine learning, while CPSC 540 will put a larger emphasis on research-level machine learning methods and theory. CPSC 540 will also not cover many important and practically very useful techniques like random forests, clustering, collaborative filtering, and high-dimensional visualization. Below are the planned topics for both courses (topics that appear in both lists will be covered at a faster pace and in more detail in CPSC 540).

CPSC 340 will cover the following topics:
  • Data representation/summarization.
  • Frequency-based supervised learning.
  • Data clustering, association rules, and outlier detection.
  • Linear models.
  • Latent-factor models.
  • Deep learning.
  • Sequences, time-series, and graphs.

CPSC 540 will cover the following topics:
  • Linear models.
  • Density estimation.
  • Graphical models.
  • Deep learning.
  • Bayesian methods.

CPSC 540 requires a stronger background (Coursera and programming experience is not enough) and will require substantially more work (including proofs and implementing many methods from scratch). Note that CPSC grad students typically only take 1-3 courses per term compared to 3-6 for undergraduate students, to give you an idea of the workload difference. If you do not have a strong computer science and math background, or are mainly interested in applying machine learning in your research, then CPSC 340 (in term 1) is the right course to take. You can always decide to take (or audit) CPSC 540 later.

Auditing: Instead of registering as a student, you can register as an auditor. This is a good option for students who may be missing some of the prerequisites or who don't have enough time to do the assignments, but who still want exposure to the material. The form for auditing the course is available here. I will describe the auditing requirements and sign these forms on the first day of class.

Grading: One third of the mark will be based on the 5 assignments, one third on the midterm (covering the first 4 topics), and one third on the final (group) project.
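As a rough illustration of the weighting above, the following sketch combines the three components into a single mark. The function name `final_mark` and the 0-100 scale are hypothetical, and the equal weighting of the 5 assignments within their third is an assumption that the grading description does not state.

```python
def final_mark(assignments, midterm, project):
    """Combine component marks (each on a 0-100 scale) with one-third weights.

    Hypothetical helper for illustration only.
    """
    if len(assignments) != 5:
        raise ValueError("expected marks for exactly 5 assignments")
    # Assumption: the 5 assignments count equally within their third.
    assignment_average = sum(assignments) / len(assignments)
    # Each of the three components contributes one third of the mark.
    return (assignment_average + midterm + project) / 3


# Example with made-up marks.
print(final_mark([80, 90, 85, 70, 75], midterm=72, project=88))
```

With these made-up marks the assignment average is 80, so the combined mark is (80 + 72 + 88) / 3 = 80.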

Piazza for course-related questions.

Timetable

Date | Lecture Slides | Related Readings and Links; Homework, Tutorials, and Notes
Tue Jan 5 | Syllabus, Linear Regression | MLAPP 1.1-1.2 and 7.1-3; ML vs. Stats (2001, 2015); Cultures of ML; Essentials of Linear Algebra; Assignment 1 (a1.zip); Linear Algebra
Thu Jan 7 | Nonlinear Bases, Validation and Regularization | MLAPP 1.4, 6.5, and 7.5; Tutorial 1; Matlab Commands; Big O
Tue Jan 12 | Loss Functions, MAP Estimation | MLAPP 7.4, 8.1-8.3, 14.5; Norms; Max and Argmax; Probability
Thu Jan 14 | Convex Functions, Gradient Methods | BV 2.1-2.3, 3.1-3.2, and 9.1-3; Tutorial 2
Tue Jan 19 | L1-Regularization, Coordinate Optimization | MLAPP 13.3-4; Coordinate Optimization; A1 due; Assignment 2 (a2.zip); Convexity
Thu Jan 21 | Structured Sparsity, Projected-Gradient | MLAPP 13.5; Structured Sparsity; Tutorial 3
Tue Jan 26 | Proximal-Gradient, Stochastic Subgradient | MLAPP 8.5; Proximal-Gradient
Thu Jan 28 | Stochastic Average Gradient, Kernel Trick | MLAPP 14.1-3; SAG; Tutorial 4
Tue Feb 2 | Fenchel Duality, Density Estimation | MLAPP 14.4-5; BV 3.3 and 5.1-5.2; A2 due; Assignment 3 (a3.zip); HighwayI
Thu Feb 4 | Mixture Models, Multivariate Gaussian | MLAPP 4.1-2, 11.1-2; MKL; Tutorial 5
Tue Feb 9 | Expectation Maximization, Kernel Density Estimation | MLAPP 11.3-4 and 11.6, 14.7; EM
Thu Feb 11 | Probabilistic PCA, Factor Analysis | MLAPP 12.1-2, 12.4; Tutorial 6
Reading Break
Tue Feb 23 | Directed Acyclic Graphical Models | MLAPP 10.1-5; A3 due; Assignment 4 (a4.zip)
Thu Feb 25 | Special Guest Lecture: Rich Sutton (DMP 110) | Reinforcement Learning; Tutorial 7
Tue Mar 1 | Undirected Graphical Models | MLAPP 19.1-4
Thu Mar 3 | Exact Inference in Graphical Models | MLAPP 17.1-4, 20.1-4; Tutorial 8
Tue Mar 8 | Neural Networks (340 Lecture) | MLAPP 16.5; A4 due
Thu Mar 10 | Deep Learning (340 Lecture) | MLAPP 28.3-4; Tutorial 9
Tue Mar 15 | CNNs (340 Lecture), Bayesian Statistics | MLAPP 3.1-4, 5.1-2
Thu Mar 17 | Midterm | Help Session: March 16 (3pm-5pm, ICICS 146); Friday Tutorials Cancelled
Tue Mar 22 | Empirical and Hierarchical Bayes | MLAPP 4.4-6, 5.3, 5.5-6, 7.6
Thu Mar 24 | Conjugate Priors, Monte Carlo Methods | MLAPP 5.4, 23.1-4
Tue Mar 29 | MCMC, Non-Parametric Bayes | MLAPP 15.1-2, 24.1-4, 25.2; Assignment 5 (a5.zip)
Thu Mar 31 | CRFs, Variational Inference | MLAPP 8.4, 15.3, 21.1-3, 21.5; Tutorial 10
Tue Apr 5 | More CRFs, Latent Dynamics | MLAPP 19.5-6
Thu Apr 7 | Deep Graphical Models, Recurrent Neural Networks | MLAPP 28.1-2
Tue Apr 12 | | A5 due
Tue Apr 26 | | Project due

Related Courses: Other closely-related courses available at UBC include:

Related courses that have online notes:
