CPSC 540 - Machine Learning (Winter 2017)

Lectures: Mondays and Wednesdays (4-5:30, Hugh Dempster Pavilion 110)

Tutorials: Fridays (4-5:30, Hugh Dempster Pavilion 110)

Instructor Office Hours / Help sessions: Tuesdays (2:30-3:30, ICICS 193) and Fridays (1-2:30, ICICS 238)

Instructor: Mark Schmidt.

Teaching Assistants: Jason Hartford, Robbie Rolin, Sharan Vaswani

Synopsis: This is a graduate-level course on machine learning, a field that focuses on using automated data analysis for tasks like pattern recognition and prediction. The course will move quickly and assumes a strong background in math and computer science as well as previous experience with statistics and/or machine learning. The class is intended as a continuation of CPSC 340, and it is strongly recommended that you take CPSC 340 before enrolling in CPSC 540. Topics will (roughly) include linear prediction, graphical models, Bayesian methods, deep learning, online/active/causal learning, and reinforcement learning.

Textbook: No textbook covers all of the topics above, but the one with the most extensive coverage is Kevin Murphy's Machine Learning: A Probabilistic Perspective (MLAPP). This book can be purchased from Amazon, is on reserve in the CS Reading Room (ICCS 262), and can be accessed through the library here. Optional readings will be given out of this textbook, in addition to other free online resources.

Registration and Prerequisites: Graduate and undergraduate students from any department are welcome to take the class, provided that they satisfy the prerequisites. However, you can only register automatically if you are enrolled as a graduate student in CPSC or in EECE. If you are a graduate student from a different department or are an undergraduate student satisfying these requirements, you can register by following the instructions here and submitting the prerequisites form here. Graduate students in CPSC and EECE also need to submit the prerequisites form in the first two weeks of class to stay enrolled. In any case, before registering please read the section below.

CPSC 340 vs. CPSC 540: CPSC 340 and CPSC 540 are roughly structured as one full-year course. CPSC 340 has more focus on data mining methods and applications of machine learning, while CPSC 540 puts more focus on research-level machine learning methods and theory. It is strongly recommended that you take CPSC 340 first, as it covers the most common and practically-useful techniques. Note that this year multivariate calculus has been added as a prerequisite to CPSC 340. This means that CPSC 340 will be more challenging than in previous years and will cover some topics that were previously covered in CPSC 540. If CPSC 340 is the more appropriate class for you but is full, you should still sign up for the CPSC 340 waiting list (not CPSC 540), as we may expand the class size: taking CPSC 540 because CPSC 340 is full is a terrible idea. In 540 it will be assumed that you are familiar with the material in the current offering of CPSC 340; note that the Coursera machine learning course is not an adequate replacement for CPSC 340. In 540 it will also be assumed that you have taken a proper class in algorithms and complexity (like CPSC 320) and a probability class like MATH 302 (STAT 200 is not enough); prior exposure to scientific computing (like CPSC 302) will also be helpful. Below are the planned topics for both courses:

CPSC 340 | CPSC 540
  • Supervised learning with frequencies and distances.
  • Data clustering, outlier detection, and association rules.
  • Linear prediction, regularization, and kernels.
  • Latent-factor models and collaborative filtering.
  • Neural networks and deep learning.
  • Density estimation and Markov models.
  • Large-scale machine learning.
  • Probabilistic graphical models.
  • Bayesian learning.
  • Recurrent neural networks.
  • Causal, active, and online learning.
  • Reinforcement learning.

CPSC 540 requires a stronger computer science and math background and will require substantially more work (including proofs and implementing methods from scratch). Note that CPSC grad students typically take only 1-3 courses per term, compared to 3-6 for undergraduate students, and this is one of the most challenging graduate courses: you should expect the workload to be 2-3 times higher than in typical courses. If you do not have a strong computer science and math background, or are mainly interested in applying machine learning in your research, then CPSC 340 is the right course to take. You can always decide to take (or audit) CPSC 540 later.

Auditing: Rather than registering as a student, an alternate option is to register as an auditor. This is a good option for students who may be missing some of the prerequisites or who don't have enough time to do the assignments, but who still want exposure to the material. For graduate students, the form for auditing the course is available here. For undergraduates, you need to fill out the form here and indicate on the course information section that you wish to "audit". I will describe the auditing requirements and sign these forms on the first day of class.

Grading: Assignments 40%, Final 30%, Project 30%.

Piazza for course-related questions


Date | Lecture Slides | Related Readings and Links | Homework, Tutorials, and Notes
Wed Jan 4 Syllabus
Matrix Notation
MLAPP 1.1-1.2, 1.4, 6.5, 7.1-3, 7.5
ML vs. Stats (2001, 2015) 3 Cultures of ML
Essence of Linear Algebra
Assignment 1
a1.zip a1.tex
Linear Algebra
Mon Jan 9 MAP Estimation
Convex Functions
MLAPP 7.4, 8.1-8.3, 14.5, Probability Primer
BV 2.1-2.3, 3.1-3.2
Probability Norms
Max and Argmax
Wed Jan 11 Gradient Descent
Newton-like Methods
BV 9.1-3, 9.5
Tutorial 1
Mon Jan 16 Optimization Zoo
Coordinate Optimization
BV 9.4, MLAPP 13.3-4
Coordinate Optimization
Assignment 1 due
Convexity Inequalities
Wed Jan 18 Group L1-Regularization
MLAPP 13.5
Assignment 2 a2.zip a2.tex
Tutorial 2
Mon Jan 23 Structured Sparsity
Stochastic Subgradient
Structured Sparsity
Wed Jan 25 Stochastic Average Gradient
Kernel Trick
MLAPP 14.1-5, BV 3.3 and 5.1-5.2
Tutorial 3
Mon Jan 30 Kernel Methods
Fenchel Dual
BV 3.3 and 5.1-5.2
Wed Feb 1 Density Estimation
Multivariate Gaussian
MLAPP 2.3-5
4.1-3, Covariance Marginal Conditional
Tutorial 4
Mon Feb 6 Mixture Models
Expectation Maximization
MLAPP 11.1-3
11.4 and 11.6
Assignment 2 due
Wed Feb 8 Guest Lecture: Derek Murray
TensorFlow (CHBE 101)
Tutorial 5
Wed Feb 15 Kernel Density Estimation
Factor Analysis
MLAPP 14.7
MLAPP 12.1-2 and 12.4
Assignment 3 a3.zip a3.tex
Tutorial 6 EM notes
Mon Feb 27 Independent Component Analysis
Markov Chains
MLAPP 12.6
MLAPP 17.1-2 and 23.1-2
Assignment 3 due
Assignment 4 a4.zip a4.tex
Wed Mar 1 Message Passing
Directed Acyclic Graphical Models
MLAPP 17.4
MLAPP 10.1-2, 10.5
Tutorial 7
Mon Mar 6 More DAGs
Undirected Graphical Models
MLAPP 10.3-4
MLAPP 19.1-2, 19.4
Wed Mar 8 Gibbs Sampling
Variational Inference
MLAPP 20.1-4, 24.1-2
MLAPP 21.1-3
Tutorial 8
Mon Mar 13 Hidden Markov Models
Boltzmann Machines
MLAPP 17.3-5, 18.1-4, 21.4, 22.1-2
MLAPP 27.7, 28.1-2
Wed Mar 15 Log-Linear Models
Conditional Random Fields
MLAPP 19.3-5
MLAPP 19.6
Tutorial 9
Mon Mar 20 Structure Learning
Structured SVMs
MLAPP 26.1-3, 26.7-8
MLAPP 19.7
Assignment 4 due
Wed Mar 22 Deep CRFs
Convolutional Neural Networks
MLAPP 28.3
Tutorial 10
Mon Mar 27 Fully-Convolutional Networks
Bayesian Statistics

MLAPP 3.1-2, 5.1-2, 5.7
Assignment 5 a5.zip a5.tex
Wed Mar 29 Empirical Bayes
Hierarchical Bayes
MLAPP 3.3-4 4.4-6, 5.3-4, 5.6-7, 7.6
Tutorial 11
Mon Apr 3 Topic Models
MLAPP 27.1, 27.3
MLAPP 21.1-4
Wed Apr 5 Recurrent Neural Networks
Generative Adversarial Networks
Reinforcement Learning
Tutorial 12

Related Courses: Besides CPSC 340, other closely-related courses available at UBC include EECE 360, EECE 592, EOSC 510, STAT 305/306/406, STAT 460/461/560/561, STAT 540, and CPSC 532P. A former student (Geoff Roeder) has written some discussion of how 340/540 relate to some of the STAT classes here.

Some related courses that have online notes are:
