CPSC 532D: Modern Statistical Learning Theory – Fall 2023 (2023W1)

Instructor: Danica Sutherland (she): dsuth@cs.ubc.ca, ICICS X539.
TA: Hamed Shirzad.
Lecture info: Tuesdays/Thursdays, 15:30 - 16:50, Swing 210.
Office hours: Wednesdays 11am-12pm and Fridays 2pm-3pm, ICICS X539 or Zoom (link on Piazza).
Or ask after class, or on Piazza, or request another time (potential availability calendar if helpful).
Piazza (or more-direct link); Canvas; Gradescope.

Previously offered in 2022W1 and (with the name 532S) in 2021W2; this instance will be broadly similar.


Italicized entries are tentative. The book acronyms are described here. As a general rule:
Day | Date | Topic | Readings
Tu | Sep 5 | No class: Imagine Day |
Th | Sep 7 | Course intro, ERM | SSBD 1-2; MRT 2
F | Sep 8 | Assignment 1 posted: pdf, tex |
Tu | Sep 12 | Class canceled: sick |
Th | Sep 14 | Uniform convergence with finite classes [Online: sick] | SSBD 2-4; MRT 2
M | Sep 18 | Assignment 1 due at noon |
M | Sep 18 | Drop deadline |
Tu | Sep 19 | Concentration inequalities | SSBD B; MRT D; Zhang 2; Wainwright 2
Th | Sep 21 | PAC learning; covering numbers | SSBD 3, MRT 2; Bach 4.4.4, Zhang 3.4/4/5
Sa | Sep 23 | Assignment 2 posted: pdf, tex |
Tu | Sep 26 | Rademacher complexity | MRT 3.1; SSBD 26; Bach 4.5; Zhang 6
Th | Sep 28 | |
Tu | Oct 3 | VC dimension | SSBD 6; MRT 3.2-3.3
Th | Oct 5 | finish VC; No Free Lunch; “Fundamental Theorem” | SSBD 5; MRT 3.4; Bach 4.6 / 12; Zhang 12
Tu | Oct 10 | Structural Risk Minimization / Min Description Length | SSBD 7; MRT 4
W | Oct 11 | Assignment 2 due at midnight |
Th | Oct 12 | No class: UBC follows a Monday schedule |
Tu | Oct 17 | finish SRM, MDL; briefly start margins |
Th | Oct 19 | Margins, SVMs | MRT 5; SSBD 15, 26
M | Oct 23 | Assignment 3 posted: pdf, tex |
Tu | Oct 24 | More margins/SVMs | MRT 5; SSBD 15, 26
Th | Oct 26 | Kernels | Bach 7, MRT 6, SSBD 16
F | Oct 27 | Withdrawal deadline |
Tu | Oct 31 | More kernels |
Th | Nov 2 | Talk by Yejin Choi on limits of LLMs, Fred Kaiser Building 2020/2030 |
Tu | Nov 7 | Universal approximation | Telgarsky 2; SSBD 20; Bach 9.3; SC 4.6
Th | Nov 9 | Finish approximation; Is ERM enough? [Online: at a workshop] |
F | Nov 10 | Assignment 3 due at midnight |
Tu | Nov 14 | No class: midterm break |
Th | Nov 16 | Stability, regularization, convex problems | SSBD 12-13, MRT 14
Tu | Nov 21 | |
Th | Nov 23 | (Stochastic) gradient descent | SSBD 14, Bach 5
M | Nov 27 | Assignment 4 posted: pdf, tex |
Tu | Nov 28 | Nonconvex optimization, start neural tangent kernels |
Th | Nov 30 | Neural tangent kernels | Telgarsky 4, Bach 11.3
Tu | Dec 5 | Implicit regularization | Bach 11.1
Th | Dec 7 | Grab-bag |
M | Dec 18 | Final exam (in person, handwritten), 1-3:30pm, ICCS 246 |
W | Dec 20 | Assignment 4 due at midnight |


The course meets in person in Swing 210, with possible rare exceptions (e.g. if I get sick but can still teach, I'll move it online). Note that this room does not have a recording setup.

Grading scheme: 70% assignments, 30% final.

There will be four or five written assignments through the term; answers should be written in LaTeX, and handed in on Gradescope. There will also be a small number (one or two) of assignments that involve reading a paper, reacting to it, and poking at it slightly further; details to come.


The brief idea of the course: when should we expect machine learning algorithms to work? What kinds of assumptions do we need to be able to rigorously prove that they will work?

Definitely covered: PAC learning, VC dimension, Rademacher complexity, concentration inequalities, margin bounds, stability. Also, most of: PAC-Bayes, analysis of kernel methods, limitations of uniform convergence, analyzing deep nets via neural tangent kernels, provable gaps between kernel methods and deep learning, online learning, feasibility of private learning, compression-based bounds.


There are no formal prerequisites. I will roughly assume:

If you have any specific questions about your background, feel free to ask.


Books that the course will definitely pull from:

New books where I may or may not pull from sections, TBD:

Some other points of view you might like:

If you need to refresh your linear algebra or other areas of math:

Measure-theoretic probability is not required for this course, but there are instances and related areas where it could be helpful:

Similar courses: