SSBD below refers to the book of Shalev-Shwartz and Ben-David; MRT to that of Mohri, Rostamizadeh, and Talwalkar.
|1||Mon||Jan 10||Intro / overview||SSBD chap. 1-2; MRT chap. 2|
|Mon||Jan 10||Assignment 1 posted (and .tex)|
|2||Wed||Jan 12||PAC||SSBD chap. 2-3; MRT chap. 2|
|3||Mon||Jan 17||Probability / uniform convergence / more?||SSBD chap. 4; MRT chap. 2|
|Thu||Jan 20||Assignment 1 due, 11:59pm|
|Fri||Jan 21||Drop deadline|
|Mon||Feb 7||Planned shift to hybrid mode (rather than online-only) 🤞|
|Mon||Feb 21||Midterm break|
|Wed||Feb 23||Midterm break|
The course will initially meet on Zoom: the meeting link is available on Canvas and Piazza. Starting February 7th, we will hopefully meet in person in DMP 101. I currently plan to both livestream and record lectures throughout the term, either via Zoom (same link) or Panopto (link will be provided if so). Plans here are subject to change.
Recordings are available from both Canvas and Piazza.
Grading scheme: 70% assignments (including a small project), 30% final.
The lowest assignment grade (not including the project) will be dropped. The exact relative weight of assignments and the project is TBD. Assignments should be done in LaTeX – not handwritten or in a word processor. Hand-in procedure will be announced before the first deadline.
There will be one “big assignment” which serves as a (small) project: something on the scale of doing some experiments to explore a paper, doing a lit review in a particular area, extending / unifying a few papers, etc. A proposal will be due beforehand; details to come.
The final exam may be take-home, synchronous online, or in-person; TBD.
There may also be some paper presentations later in the course, in which case the paper presenters will be able to use that to replace part of an assignment grade. This is dependent on the COVID situation and other factors; TBD.
The brief idea of the course: when should we expect machine learning algorithms to work? What kinds of assumptions do we need to be able to rigorously prove that they will work?
Definitely covered: PAC learning, VC dimension, Rademacher complexity, concentration inequalities. Probably: PAC-Bayes, analysis of kernel methods, margin bounds, stability. Maybe: limitations of uniform convergence, analyzing deep nets via neural tangent kernels, provable gaps between kernel methods and deep learning, online learning, feasibility of private learning, compression-based bounds.
There will be some overlap with CPSC 531H: Machine Learning Theory (Nick Harvey's course, last taught in 2018), but if you've taken that course, you'll still get something out of this one. We'll cover less on optimization / online learning / bandits than that course did, and try to cover some more recent ideas used in contemporary deep learning theory.
(This course is unrelated to CPSC 532S: Multimodal Learning with Vision, Language, and Sound, from Leon Sigal.)
There are no formal prerequisites. I will roughly assume:
Learning theory textbooks and surveys:
If you need to refresh your linear algebra or other areas of math:
Resources on learning measure-theoretic probability (not required to know this stuff in detail, but you might find it helpful):