CPSC 440/540: Advanced Machine Learning – 2022W2 (Jan-Apr 2023)

There's a more recent version of this course.

Instructor: Danica Sutherland (she): dsuth@cs.ubc.ca, ICICS X563.
Lecture info: Mondays/Wednesdays, 3:30 - 4:50pm, Swing 122.
Canvas; Piazza (other link); Gradescope. Office hour calendar and course recordings are linked from Piazza and Canvas.

Previous offerings by Mark Schmidt: 2021w2, 2020w2. This instance will be broadly similar, but not the same.

Schedule

First, miscellaneous notes mentioned in the lectures and homeworks, and assignment submission instructions.

Italicized entries are tentative; in particular, the timing and even number of assignments might change. Textbook acronyms are explained below.

Date | Topic/slides | Supplements
M Jan 9 | Syllabus; Binary density estimation | ML vs. Stats, 3 Cultures of ML; Math for ML, Essence of Linear Algebra; PML1 2.1-2.4 / MLAPP 2.1-2.3
Tu Jan 10 | Assignment 1 released: pdf, tex, zip |
W Jan 11 | MAP | MLAPP 3.1-3.3 / PML1 4.5, 4.6.2
M Jan 16 | Generative vs discriminative classifiers; neural nets | MLAPP 3.5, 8.1-8.3, 8.6, 16.5; PML1 9.3, 10.2, 13.2
W Jan 18 | Double descent; deep networks; autodiff | PML1 13.1-13.3; Double descent papers: 1 2 3
F Jan 20 | Add/drop deadline |
M Jan 23 | Assignment 1 due at noon (extended) |
M Jan 23 | Convolutions, auto-encoders, multi-label classifiers | PML1 14.1-14.3, 20.3; MLAPP 28.3
W Jan 25 | Fully-convolutional nets, categorical variables, Monte Carlo | PML1 14.4.2, 14.5.4; 2.5; PML2 11.2; MLAPP 2.7, 23.1
M Jan 30 | Bayesian learning | PML1 4.6.1-4.6.3, PML2 3.1-3.3 / MLAPP 5.1-5.3
Tu Jan 31 | Assignment 2 released: pdf, tex, zip |
W Feb 1 | More Bayesian learning; Empirical Bayes | PML2 3.9-3.10 / MLAPP 5.6
M Feb 6 | Hierarchical Bayes; multiclass classification | PML1 5.1, 10.3, PML2 3.8.1 / MLAPP 5.5-5.7
W Feb 8 | Recurrent networks and LSTMs | PML1 15.2
M Feb 13 | Attention, Transformers | PML1 15.4-15.7; PML2 16.2.7, 16.3.5
W Feb 15 | What do we learn? |
F Feb 17 | Assignment 2 due at 11:59pm |
M Feb 20 | Class cancelled: Family Day + midterm break |
W Feb 22 | Class cancelled: midterm break |
M Feb 27 | Gaussians | PML1 2.6, 3.2
W Mar 1 | Learning with Gaussians |
F Mar 3 | Project proposal guidelines released |
F Mar 3 | Withdrawal deadline |
M Mar 6 | Bayesian linear regression; approximate inference | PML1 11.7, PML2 11.4-11.5, PML2 7.4.3
W Mar 8 | Assignment 3 released: pdf, tex, zip |
W Mar 8 | End-to-end learning; exponential families | PML2 2.3
M Mar 13 | Markov chains | PML2 2.6
W Mar 15 | Message passing; MCMC | PML2 9.2, 12.1-12.2
M Mar 20 | More MCMC; directed graphical models | PML2 12.2-12.3, 4.2
W Mar 22 | More graphical models | PML2 4.2-4.3, bonus on 9
F Mar 24 | Project proposal due at 11:59pm |
M Mar 27 | Assignment 3 due at 11:59pm |
M Mar 27 | Log-linear UGMs, CRFs; start mixture models | PML2 4.4; PML1 3.5, 21.4; PML2 28.2
Tu Mar 28 | Assignment 4 released: pdf, tex, zip |
W Mar 29 | More mixture models; EM; KDE | PML1 8.7.2 / PML2 6.5; PML2 16.3
M Apr 3 | HMMs and topic models (+ RBMs) | PML2 9.2, 28.5 (+ 4.3.3)
W Apr 5 | Variational inference and VAEs | PML2 10.1-10.4, 21
M Apr 10 | Class cancelled: Easter Monday |
W Apr 12 | Generating images; diffusion models; course wrapup | PML2 20-25, especially 25
Th Apr 13 | Assignment 4 due at 11:59pm |
Sa Apr 22 | Final exam (in person, handwritten) at noon in Swing 122 |
F Apr 28 | Final project due at noon |
F Apr 28 | Lecture+assignment project due at noon |

Overview

This course is intended as a second or third university-level course on machine learning, a field that focuses on using automated data analysis for tasks like pattern recognition and prediction. The class is intended as a continuation of CPSC 340 (or 532M), and will assume a strong background in math and computer science. Topics will (roughly) include deep learning, generative models, latent-variable models, Markov models, probabilistic graphical models, and Bayesian methods.

Logistics

The course meets in person in Swing 122. I plan to release recordings, but can't guarantee their quality, so please come to class.

Grading scheme:

Further details in the syllabus slides.

Registration and Prerequisites

Registration: Graduate and undergraduate students from any department are welcome to take the class. Undergraduate students should enroll in CPSC 440, and graduate students should enroll in CPSC 540. Below are more details on registration for each course: My expectation (no guarantee) is that everyone on both waitlists will probably get in, and we should also have room for auditors. Join the waiting list by January 16th if you want to register.

Starting in the second week of classes, we'll have weekly tutorials run by the TAs. These will do things like go through provided assignment code, review background material, review big concepts, and/or do exercises. You can register for particular tutorial sections if you want to save a seat at a particular time, but note that you do not need to register in a tutorial section.

CPSC 340/532M vs. CPSC 440/540: CPSC 340 and CPSC 440 are roughly structured as one full-year course. CPSC 340 (which is occasionally listed as CPSC 532M for graduate students) covers more data mining methods and the methods that are most widely used in applications of machine learning, while CPSC 440 (listed as CPSC 540 for graduate students) focuses on probabilistic methods which appear in more niche applications. It is strongly recommended that you take CPSC 340 first, as it covers the most fundamental ideas as well as the most common and practically useful techniques. CPSC 440 will assume that you are familiar with all the material in the current offering of CPSC 340. Note that online machine learning courses (and courses from many other universities) are not an adequate replacement for CPSC 340; they typically have more overlap with our applied machine learning course, CPSC 330.

Prerequisites

Undergraduate students will not be able to take the class without these prerequisites. Graduate students may be asked to show how they satisfy prerequisites.

Resources

Textbook: There is no textbook for the course, but the textbook with the most extensive coverage of many of the course's topics is Kevin Murphy's Probabilistic Machine Learning series. The one-volume 2012 version is Machine Learning: A Probabilistic Perspective (MLAPP; you can access a PDF through the UBC library, read a hardcopy version in the CS Reading Room [ICCS 262], or buy a hardcopy). Alternatively, he has a very recent two-volume version (2022/2023), PML1 and PML2, both of which have free Creative Commons draft PDFs through those links. I'll try to refer to the relevant sections of both versions as we go, and to link to various other free online resources.

If you need to refresh your linear algebra or other areas of math, check out Mathematics for Machine Learning (Marc Deisenroth, Aldo Faisal, Cheng Soon Ong; 2020).

Related courses: Besides CPSC 340, there are several 500-level graduate courses in CPSC and STAT that are relevant: check out the graduate courses taught by people on the ML@UBC page and the MILD list. CPSC 422/425/436N, EECE 360/592, EOSC 510/550, and STAT 305/306/406/460/461 are also all relevant.

Some related courses that have online notes are:

A YouTube playlist covering in detail many of the core topics in the course: