CPSC 532D: Modern Statistical Learning Theory – 2022W1

Instructor: Danica Sutherland (she): dsuth@cs.ubc.ca, ICICS X563. TA: Milad Jalali.
Lecture info: Mondays/Wednesdays, 15:00 - 16:20, Orchard Commons 3004.
Office hours: Milad: Fridays 16:00-17:00, ICCS X150 table 3.
Piazza (or this other link if the Canvas one doesn't work for whatever reason).

Previously offered in 2021W2 (under the name CPSC 532S); this instance will be broadly similar.


Italicized entries are tentative. The books SSBD, MRT, and Tel are described further here.

W Sep 7 | Course intro; start finite hypothesis classes | intro slides | SSBD 1-2; MRT 2
Th Sep 8 | Assignment 1 posted: pdf, tex
M Sep 12 | Class canceled (sick)
W Sep 14 | PAC learning: definitions, finite hypothesis classes | slides | SSBD 2-4; MRT 2
M Sep 19 | No class: National Day of Mourning
T Sep 20 | Assignment 1 due at noon | solutions
T Sep 20 | Drop deadline
W Sep 21 | Uniform convergence, concentration inequalities | SSBD 4, B; MRT 2, D; Tel 12; Wainwright 2
Sa Sep 24 | Assignment 2 posted: pdf, tex
M Sep 26 | More uniform convergence: Rademacher complexity | MRT 3; SSBD 26; Tel 13
W Sep 28 | More on Rademacher complexity
M Oct 3 | VC dimension [guest lecturer: Nick Harvey]
W Oct 5 | The fundamental theorem of statistical learning [guest lecturer: Nick Harvey]
M Oct 10 | No class: Thanksgiving
W Oct 12 | Assignment 2 due at noon
W Oct 12 |
M Oct 17 |
W Oct 19 |
M Oct 24 |
W Oct 26 |
F Oct 28 | Withdrawal deadline
M Oct 31 |
W Nov 2 |
M Nov 7 |
W Nov 9 | No class: midterm break
M Nov 14 |
W Nov 16 |
M Nov 21 |
W Nov 23 |
M Nov 28 |
W Nov 30 |
M Dec 5 |
W Dec 7 |
W Dec 14 | Final exam (in person in ICCS 246, handwritten), 14:00 - 16:30


The course meets in person in Orchard Commons 3004. There are no lecture recordings due to limitations of the room.

Grading scheme: 70% assignments, 30% final.

The lowest assignment grade (not including the project) will be dropped. Assignments should be done in LaTeX, not handwritten or in a word processor, and handed in on Gradescope; more details on that to come soon.

One assignment, late in the term, will be a "mini-project": reading a paper, running a small exploratory experiment, exploring its assumptions in detail, etc. Suggested papers and details to come later.


The brief idea of the course: when should we expect machine learning algorithms to work? What kinds of assumptions do we need to rigorously prove that they will work?

Definitely covered: PAC learning, VC dimension, Rademacher complexity, concentration inequalities. Probably: PAC-Bayes, analysis of kernel methods, margin bounds, stability. Maybe: limitations of uniform convergence, analyzing deep nets via neural tangent kernels, provable gaps between kernel methods and deep learning, online learning, feasibility of private learning, compression-based bounds.
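As a small taste of one of these tools (not from the course materials; the threshold-classifier setup and all names here are illustrative assumptions), the empirical Rademacher complexity of a finite hypothesis class H on a sample S of size n, R̂_S(H) = E_σ [ sup_{h ∈ H} (1/n) Σ_i σ_i h(x_i) ] with i.i.d. uniform signs σ_i ∈ {±1}, can be estimated by Monte Carlo:

```python
import numpy as np

def empirical_rademacher(predictions, n_draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity.

    predictions: (k, n) array; row j holds hypothesis j's outputs on the
    n sample points. For a finite class, the sup is just a max over rows.
    """
    rng = np.random.default_rng(seed)
    k, n = predictions.shape
    total = 0.0
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)   # Rademacher signs
        total += np.max(predictions @ sigma) / n  # sup_h (1/n) sum_i sigma_i h(x_i)
    return total / n_draws

# Illustrative example: threshold classifiers h_t(x) = sign(x - t)
# evaluated on 50 uniform points, for 21 evenly spaced thresholds t.
x = np.random.default_rng(1).uniform(size=50)
thresholds = np.linspace(0, 1, 21)
preds = np.sign(x[None, :] - thresholds[:, None])
print(empirical_rademacher(preds))
```

For a class of k hypotheses with outputs in [-1, 1], Massart's finite-class lemma bounds this quantity by sqrt(2 log k / n), so the printed estimate should land comfortably below that ceiling.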


There are no formal prerequisites. I will roughly assume:

If you have any specific questions about your background, feel free to ask.


Books where we'll use sections in some detail:

Some other points of view you might like:

If you need to refresh your linear algebra or other areas of math:

Measure-theoretic probability is not required for this course, but there are instances and related areas where it could be helpful:

Similar courses: