CPSC 531H: Machine Learning Theory

2018-19 Winter Term 1

General Information

Course Website: http://www.cs.ubc.ca/~nickhar/F18-531

Lecture Time: Monday & Wednesday 12-1:30pm

Lecture Room: DMP 101

Instructor: Prof. Nicholas Harvey, X851, nickhar@cs.ubc.ca
TA: Chris Liaw, cvliaw@cs.ubc.ca.


This is a graduate course on some theoretical aspects of machine learning. The emphasis is on foundations and on results with rigorous proofs. The viewpoint is much more computational than statistical.









·       Definitely: PAC learning, VC dimension, Perceptron, Stochastic Gradient Descent, Multiplicative Weights, Online Learning, Boosting, Online Convex Optimization,

·       Likely: Deep Neural Networks, Bandits

·       Possibly: Distribution Learning, Rademacher Complexity, Mirror Descent, Spectral clustering




Further reading

The PAC Model

·       Valiant “A Theory of the Learnable


Hardness of Proper PAC learning k-term DNFs

·       Pitassi’s Machine Learning Theory, Lecture 12


Hardness of Learning Halfspaces

·       Bshouty, Burroughs. Maximizing agreements and coagnostic learning.

·       Feldman et al. On agnostic learning of parities, monomials and halfspaces.

·       Guruswami, Raghavendra. Hardness of learning halfspaces with noise.



·       Boosting by Filtering: Schapire. The Strength of Weak Learnability.

·       Hedge and AdaBoost: Freund, Schapire. A decision-theoretic generalization of on-line learning and an application to Boosting. EuroCOLT ’95 and JCSS ’97.



·       Balcan, Blum. On a Theory of Learning with Similarity Functions.


Convex Optimization

·       Stephen Boyd’s video and slides on Lagrangians.


Faster Least Squares

·       Drineas, Mahoney, Muthukrishnan, Sarlos. Faster Least Squares Approximation.

·       Mahoney. Randomized Algorithms for Matrices and Data.


Online Learning

·       Deterministic & Randomized Weighted Majority: Littlestone, Warmuth. The Weighted Majority Algorithm. FOCS ’89, Inf and Comp ‘94.

·       Hedge and AdaBoost: Freund, Schapire. A decision-theoretic generalization of on-line learning and an application to Boosting. EuroCOLT ’95 and JCSS ’97.

·       Optimal bounds: Cesa-Bianchi, Freund, Haussler, Helmbold, Schapire, Warmuth. How to Use Expert Advice. STOC ’93, JACM ‘97.



·       Wikipedia. The multi-armed bandit problem.

·       Kleinberg. Online Decision Problems with Large Strategy Sets. PhD Thesis, ‘05.

·       Auer, Cesa-Bianchi, Freund, Schapire. The non-stochastic multi-armed bandit problem. FOCS ’95, SICOMP ’02.

·       Bubeck, Cesa-Bianchi. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems. FnTML ’12.


Spectral Clustering

·       Charles Martin’s blog post. Spectral Clustering: A Quick Overview.

·       SciKit-Learn’s Spectral clustering package, source code.

·       Shi, Malik. Normalized Cuts and Image Segmentation. CVPR ‘97, IEEE Trans PAMI ‘00.

·       Ng, Jordan, Weiss. On Spectral Clustering: Analysis and an Algorithm. NIPS ‘01.

·       Lee, Oveis Gharan, Trevisan. Multi-way spectral partitioning and higher-order Cheeger inequalities. STOC ’12, JACM ‘14.


Past offerings of this class

·       CPSC 531H “Machine Learning Theory” Winter 2014


Other UBC classes on machine learning

·       CPSC 340 “Machine Learning and Data Mining” Fall 2013

·       CPSC 532H “Machine Learning and Optimisation for Automated Algorithm Design” Fall 2013

·       CPSC 540 “Machine Learning” Winter 2013

·       STAT 447B “Methods for Statistical Learning” Fall 2013


Similar classes at other universities

·       Robert Schapire’s “Theoretical Machine Learning” at Princeton

·       Nina Balcan’s “Machine Learning Theory” at Georgia Tech

·       Avrim Blum’s “Machine Learning Theory” at Carnegie Mellon

·       Peter Bartlett’s “Statistical Learning Theory” at Berkeley

·       Toni Pitassi’s “Machine Learning Theory” at Toronto

·       Sham Kakade and Ambuj Tewari’s “Learning Theory” at U Chicago


Other moderately related courses

·       Csaba Szepesvari’s “Online Learning” at U Alberta

·       Sanjeev Arora’s “Is Learning Easy?” at Princeton

·       Ankur Moitra’s “Algorithmic Aspects of Machine Learning” at MIT