CPSC 532D: Modern Statistical Learning Theory – Fall 2023 (2023W1)

Instructor: Danica Sutherland (she): dsuth@cs.ubc.ca, ICICS X539.
TA: Hamed Shirzad.
Lecture info: Tuesdays/Thursdays, 15:30 - 16:50, Swing 210.
Office hours: Wednesdays 11am-12pm and Fridays 2pm-3pm, ICICS X539 or Zoom (link on Piazza); or ask after class, on Piazza, or request another time (potential availability calendar if helpful).
Piazza (or more-direct link); Canvas; Gradescope.

Previously offered in 2022W1 and (with the name 532S) in 2021W2; this instance will be broadly similar.


Italicized entries are tentative. The book acronyms are described here.

Tu Sep 5: No class (Imagine Day)
Th Sep 7: Course intro, ERM [SSBD 1-2; MRT 2]
F Sep 8: Assignment 1 posted: pdf, tex
Tu Sep 12: Class canceled (sick)
Th Sep 14: Uniform convergence with finite classes [SSBD 2-4; MRT 2]
M Sep 18: Assignment 1 due at noon
M Sep 18: Drop deadline
Tu Sep 19: Concentration inequalities [SSBD B; MRT D; Zhang 2; Wainwright 2]
Th Sep 21: PAC learning; covering numbers [SSBD 3, MRT 2; Bach 4.4.4, Zhang 3.4/4/5 (much more detailed)]
Sa Sep 23: Assignment 2 posted: pdf, tex
Tu Sep 26: Rademacher complexity [MRT 3; SSBD 26; Bach 4.5; Zhang 6]
Th Sep 28: More Rademacher (same notes)
Tu Oct 3: VC dimension
Th Oct 5: No Free Lunch
Tu Oct 10
W Oct 11: Assignment 2 due at midnight
Th Oct 12: No class (UBC follows a Monday schedule)
Tu Oct 17
Th Oct 19
Tu Oct 24
Th Oct 26
F Oct 27: Withdrawal deadline
Tu Oct 31
Th Nov 2
Tu Nov 7
Th Nov 9
Tu Nov 14: No class (midterm break)
Th Nov 16
Tu Nov 21
Th Nov 23
Tu Nov 28
Th Nov 30
Tu Dec 5
Th Dec 7
? Dec ??: Final exam (in person, handwritten); date and time TBA, sometime Dec 11-22


The course meets in person in Swing 210, with possible rare exceptions (e.g. if I get sick but can still teach, I'll move it online). Note that this room does not have a recording setup.

Grading scheme: 70% assignments, 30% final.

There will be four or five written assignments through the term; answers should be written in LaTeX, and handed in on Gradescope. There will also be a small number (one or two) of assignments that involve reading a paper, reacting to it, and poking at it slightly further; details to come.


The brief idea of the course: when should we expect machine learning algorithms to work? What kinds of assumptions do we need in order to rigorously prove that they will work?

Definitely covered: PAC learning, VC dimension, Rademacher complexity, concentration inequalities, margin bounds, stability. Also, most of: PAC-Bayes, analysis of kernel methods, limitations of uniform convergence, analyzing deep nets via neural tangent kernels, provable gaps between kernel methods and deep learning, online learning, feasibility of private learning, compression-based bounds.
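To give a flavour of the "definitely covered" material, here is a minimal self-contained sketch (not course material; all parameter choices are illustrative) that empirically checks Hoeffding's inequality, one of the concentration inequalities above: for n i.i.d. draws in [0,1] with mean mu, Pr(|sample mean - mu| >= t) <= 2 exp(-2 n t^2).

```python
import math
import random

# Illustrative check of Hoeffding's inequality for Bernoulli(1/2) draws.
# Parameters below (n, t, number of trials) are arbitrary choices.
random.seed(0)
n, t, trials = 100, 0.1, 10_000
mu = 0.5  # true mean of a fair coin

deviations = 0
for _ in range(trials):
    # sample mean of n i.i.d. Bernoulli(mu) draws
    sample_mean = sum(random.random() < mu for _ in range(n)) / n
    if abs(sample_mean - mu) >= t:
        deviations += 1

empirical = deviations / trials          # estimated deviation probability
bound = 2 * math.exp(-2 * n * t ** 2)    # Hoeffding's upper bound
print(f"empirical: {empirical:.4f}, Hoeffding bound: {bound:.4f}")
```

The empirical deviation frequency comes out well below the bound (roughly 0.06 versus about 0.27 here), which is typical: Hoeffding is a worst-case guarantee over all bounded distributions, a theme that recurs throughout the course.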


There are no formal prerequisites. I will roughly assume:

If you have any specific questions about your background, feel free to ask.


Books that the course will definitely pull from:

New books where I may or may not pull from sections, TBD:

Some other points of view you might like:

If you need to refresh your linear algebra or other areas of math:

Measure-theoretic probability is not required for this course, but there are instances and related areas where it could be helpful:

Similar courses: