The main purpose of CPSC 531H is to bring students closer to the research frontier in the theory of machine learning. Towards that goal, students must complete a course project: an in-depth study of a topic relevant to the coursework.
· Read at least one research paper.
· Possible activities:
  o implement the paper's method and perform some small experiments
  o try to generalize the paper's results, or apply its techniques in a new context
  o compare the results of multiple research papers
Project proposal
A brief description of the topic that you plan to study and the paper(s) that you plan to read, between one-half and one page in length.
Due: TBD.
Written report
A 4-5 page written report (roughly 2000-2500 words), typeset in LaTeX. Word processors are not acceptable.
Due: TBD.
Presentation
A 10-15 minute presentation, with slides, discussing the context and contributions of your project.
Date: TBD, probably sometime around April 10th.
Some suggested topics
Boosting
· Mukherjee, Schapire. A Theory of Multiclass Boosting. JMLR 2013.
· Kalai, Servedio. Boosting in the presence of noise. JCSS 2005.
· Long, Servedio. Random Classification Noise Defeats All Convex Potential Boosters. Machine Learning 2010.
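If you pick this topic and want to run small experiments, the natural baseline is classical AdaBoost, which the papers above generalize or critique. A minimal sketch with decision stumps, assuming a numpy feature matrix X of shape (n, d) and labels y in {-1, +1}:

```python
import numpy as np

def adaboost_stumps(X, y, T=50):
    """Binary AdaBoost with decision stumps as weak learners."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)               # distribution over examples
    ensemble = []                         # (alpha, feature, threshold, sign)
    for _ in range(T):
        # Exhaustively find the stump s * sign(x[f] - theta) minimizing weighted error.
        best = None
        for f in range(d):
            for theta in np.unique(X[:, f]):
                for s in (+1, -1):
                    pred = s * np.sign(X[:, f] - theta + 1e-12)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, f, theta, s)
        err, f, theta, s = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)     # weight of this weak hypothesis
        pred = s * np.sign(X[:, f] - theta + 1e-12)
        w *= np.exp(-alpha * y * pred)            # up-weight misclassified examples
        w /= w.sum()
        ensemble.append((alpha, f, theta, s))
    return ensemble

def adaboost_predict(ensemble, X):
    votes = sum(a * s * np.sign(X[:, f] - th + 1e-12) for a, f, th, s in ensemble)
    return np.sign(votes)
```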
Clustering
· Kleinberg. An Impossibility Theorem for Clustering. NIPS 2002.
· Balcan, Blum, Vempala. A Discriminative Framework for Clustering via Similarity Functions. STOC 2008.
· Balcan, Blum, Gupta. Approximate clustering without the approximation. SODA 2009.
· Balcan, Gupta. Robust Hierarchical Clustering. COLT 2010.
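For experiments on these clustering questions, the standard Lloyd's heuristic for the k-means objective is the usual baseline (the folklore algorithm these papers reason about, not any one paper's method). A minimal sketch:

```python
import numpy as np

def lloyd_kmeans(X, k, iters=100, seed=0):
    """Lloyd's heuristic for k-means. X: (n, d) numpy array."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # random init
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each center moves to its cluster's mean.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels
```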
Mixtures of Gaussians
· Dasgupta. Learning mixtures of Gaussians. FOCS 1999.
· Vempala, Wang. A spectral algorithm for learning mixture models. JCSS 2004.
· Feldman, Servedio, O'Donnell. PAC Learning Mixtures of Gaussians with No Separation Assumption. COLT 2006.
· Achlioptas, McSherry. On Spectral Learning of Mixtures of Distributions. COLT 2005.
· Kannan, Salmasian, Vempala. The spectral method for general mixture models. SICOMP 2008.
· Brubaker, Vempala. Isotropic PCA and Affine-Invariant Clustering. FOCS 2008.
· Belkin, Sinha. Toward Learning Gaussian Mixtures with Arbitrary Separation. COLT 2010.
· Kalai, Moitra, Valiant. Efficiently learning mixtures of two Gaussians. STOC 2010.
· Kalai, Moitra, Valiant. Disentangling Gaussians. CACM 2012.
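For small experiments, the classical EM heuristic is the baseline against which the provable algorithms above are usually motivated. A minimal EM sketch for a one-dimensional mixture of two Gaussians (the standard heuristic, not the method-of-moments algorithms analyzed in these papers):

```python
import numpy as np

def em_two_gaussians(x, iters=200, seed=0):
    """EM for a 1-D mixture of two Gaussians. x: (n,) array of samples."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=2, replace=False)    # initial means
    var = np.array([x.var(), x.var()])           # initial variances
    pi = np.array([0.5, 0.5])                    # mixing weights
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        dens = (pi / np.sqrt(2 * np.pi * var)) * \
               np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: refit weights, means, variances from the responsibilities.
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return pi, mu, var
```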
Submodular Functions
· Feldman, Kothari, Vondrak. Representation, Approximation, and Learning of Submodular Functions Using Low-rank Decision Trees. COLT 2013.
· Feldman, Vondrak. Optimal Bounds on Approximation of Submodular and XOS Functions by Juntas. FOCS 2013.
· Cheraghchi, Klivans, Kothari, Lee. Submodular Functions Are Noise Stable. SODA 2012.
· Balcan, Harvey. Submodular Functions: Learnability, Structure, and Optimization. STOC 2011.
· Balcan, Constantin, Iwata, Wang. Learning Valuation Functions. COLT 2012.
· Nagano, Kawahara. Structured Convex Optimization under Submodular Constraints.
· Iyer, Jegelka, Bilmes. Fast Semidifferential-based Submodular Function Optimization. ICML 2013 (Best Paper Award).
· Iyer, Bilmes. Submodular Optimization with Submodular Cover and Submodular Knapsack Constraints. NIPS 2013 (Outstanding Paper Award).
· Bilmes. Deep Mathematical Properties of Submodularity with Applications to Machine Learning. NIPS 2013 Tutorial.
· Soma, Kakimura, Inaba, Kawarabayashi. Optimal Budget Allocation: Theoretical Guarantee and Efficient Algorithm. ICML 2014.
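Much of this literature presumes the classical greedy algorithm of Nemhauser, Wolsey, and Fisher, which achieves a (1 - 1/e)-approximation for maximizing a monotone submodular function under a cardinality constraint. A minimal sketch on a coverage objective:

```python
import numpy as np

def greedy_max_coverage(sets, k):
    """Greedy (1 - 1/e)-approximation for max coverage, a canonical
    monotone submodular objective. sets: list of Python sets; budget k."""
    covered, chosen = set(), []
    for _ in range(k):
        gains = [len(s - covered) for s in sets]   # marginal gains f(S+e) - f(S)
        best = int(np.argmax(gains))
        if gains[best] == 0:
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

# Example: four candidate sets over the universe {0, ..., 9}, budget 2.
sets = [set(range(5)), set(range(3, 8)), {8, 9}, {0, 9}]
print(greedy_max_coverage(sets, k=2))   # picks sets 0 and 1, covering {0..7}
```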
Sparse / Noisy Recovery
· Recht, Fazel, Parrilo. Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization. SIAM Review 2010.
· Candes, Recht. Exact Matrix Completion via Convex Optimization. FoCM 2009.
· Candes, Li, Ma, Wright. Robust Principal Component Analysis? JACM 2011.
· Recht. A Simpler Approach to Matrix Completion. JMLR 2011.
· Hardt, Moitra. Algorithms and Hardness for Robust Subspace Recovery. COLT 2013.
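For matrix completion experiments, a simple iterative singular-value soft-thresholding loop is easy to implement as a rough surrogate for the nuclear norm relaxation analyzed above (a heuristic sketch, not an exact solver for the convex program):

```python
import numpy as np

def complete_matrix(M, mask, tau=5.0, iters=200):
    """Fill in unobserved entries by repeatedly shrinking singular values.
    M: matrix with observed entries; mask: boolean array, True = observed."""
    X = np.where(mask, M, 0.0)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        s = np.maximum(s - tau, 0.0)      # soft-threshold the spectrum
        X = (U * s) @ Vt                  # encourages low rank
        X = np.where(mask, M, X)          # re-impose the observed entries
    return X

# Example: a random rank-2 matrix observed on 60% of its entries.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 50))
mask = rng.random(A.shape) < 0.6
A_hat = complete_matrix(A, mask)
print(np.linalg.norm(A_hat - A) / np.linalg.norm(A))   # relative error
```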
Regression
· Sarlos. Improved approximation algorithms for large matrices via random projections. FOCS 2006.
· Chin, Madry, Miller, Peng. Runtime Guarantees for Regression Problems. ITCS 2013.
· Woodruff, Zhang. Subspace Embeddings and L_p-Regression Using Exponential Random Variables. COLT 2013.
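The sketching approach in these papers compresses a tall least-squares problem with a random matrix and solves the small problem instead. A minimal sketch-and-solve example with a dense Gaussian sketch (the papers above use faster structured sketches, but the idea is the same):

```python
import numpy as np

def sketched_least_squares(A, b, m, seed=0):
    """Approximate argmin_x ||Ax - b||_2 by solving the m-row sketched
    problem argmin_x ||SAx - Sb||_2 with a Gaussian sketch S."""
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

# Example: 10000 equations, 20 unknowns, sketched down to 400 rows.
rng = np.random.default_rng(1)
A = rng.standard_normal((10000, 20))
b = A @ rng.standard_normal(20) + 0.01 * rng.standard_normal(10000)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_least_squares(A, b, m=400)
print(np.linalg.norm(x_sketch - x_exact))   # small when m >> 20
```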
Johnson-Lindenstrauss Dimensionality Reduction
Classification and Random Projection
· Balcan, Blum, Vempala. Kernels as Features: On Kernels, Margins, and Low-dimensional Mappings. Machine Learning 2006.
· Rahimi, Recht. Random Features for Large-Scale Kernel Machines. NIPS 2007.
· Rahimi, Recht. Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning. NIPS 2008.
· Zhang, Mahdavi, Jin, Yang, Zhu. Recovering the Optimal Solution by Dual Random Projection. COLT 2013.
· Le, Sarlos, Smola. Fastfood: Approximating Kernel Expansions in Loglinear Time. ICML 2013.
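The Rahimi-Recht papers replace a shift-invariant kernel with an explicit low-dimensional random feature map. A minimal sketch for the Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2):

```python
import numpy as np

def random_fourier_features(X, D, seed=0):
    """Map X (n, d) to Z (n, D) with Z[i] . Z[j] ~ exp(-||X[i] - X[j]||^2 / 2)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], D))   # frequencies ~ kernel's Fourier transform
    b = rng.uniform(0, 2 * np.pi, D)           # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Sanity check against the exact kernel on two random points.
rng = np.random.default_rng(2)
x, y = rng.standard_normal((2, 5))
Z = random_fourier_features(np.stack([x, y]), D=20000)
print(Z[0] @ Z[1], np.exp(-np.sum((x - y) ** 2) / 2))   # approximately equal
```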
Active Learning
· Dasgupta, Kalai, Monteleoni. Analysis of Perceptron-Based Active Learning. JMLR 2009.
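This paper analyzes a modified perceptron that queries labels only for points falling near the current hyperplane. A minimal sketch of that query rule, with a fixed margin threshold in place of the paper's adaptive schedule, assuming unit-norm data and a label oracle:

```python
import numpy as np

def active_perceptron(X, oracle, margin=0.1):
    """Stream over unit-norm rows of X; query the +/-1 `oracle` only when the
    current hypothesis is uncertain. Simplified from Dasgupta-Kalai-Monteleoni."""
    w = X[0].copy()                    # arbitrary (assumed nonzero) start
    queries = 0
    for x in X[1:]:
        if abs(w @ x) / np.linalg.norm(w) < margin:    # near the hyperplane?
            y = oracle(x)
            queries += 1
            if y * (w @ x) <= 0:
                w = w - 2 * (w @ x) * x    # reflection update of the modified perceptron
    return w / np.linalg.norm(w), queries
```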
Nearest Neighbors
Complexity of finding inconsistent linear separators
· Bshouty, Burroughs. Maximizing agreements and coagnostic learning. TCS 2006.
· Feldman et al. On agnostic learning of parities, monomials and halfspaces. SICOMP 2009.
· Guruswami, Raghavendra. Hardness of learning halfspaces with noise. SICOMP 2009.
Boolean Functions and Fourier Analysis
· De Wolf. A Brief Introduction to Fourier Analysis on the Boolean Cube. ToC 2008.
· Linial, Mansour, Nisan. Constant Depth Circuits, Fourier Transform, and Learnability. JACM 1993.
· Kushilevitz, Mansour. Learning decision trees using the Fourier spectrum. SICOMP 1993.
· Mansour. Learning Boolean Functions via the Fourier Transform. 1994.
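As a warm-up, the Fourier coefficients f_hat(S) = E_x[f(x) * prod_{i in S} x_i] of a small Boolean function can be computed by brute-force enumeration (roughly 4^n time, so for toy n only; the Kushilevitz-Mansour algorithm is the efficient version). A minimal sketch:

```python
import numpy as np
from itertools import combinations, product

def fourier_coefficients(f, n):
    """All coefficients of f: {-1,1}^n -> {-1,1} by direct enumeration:
    f_hat(S) = average over x of f(x) * prod_{i in S} x_i."""
    points = list(product([-1, 1], repeat=n))
    return {S: np.mean([f(x) * np.prod([x[i] for i in S]) for x in points])
            for k in range(n + 1) for S in combinations(range(n), k)}

# Example: Majority on 3 bits puts weight 1/2 on each singleton, -1/2 on {0,1,2}.
maj3 = lambda x: 1 if sum(x) > 0 else -1
for S, c in fourier_coefficients(maj3, 3).items():
    if abs(c) > 1e-9:
        print(S, c)
```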
Misc
· Chandrasekaran, Jordan. Computational and Statistical Tradeoffs via Convex Relaxation. PNAS 2013.