The main purpose of CPSC 531H is to bring students closer to the research frontier in the theory of machine learning. Towards that goal, students must complete a course project: an in-depth study of a topic relevant to the coursework.
· Read at least one research paper.
· Possible activities:
  o implement the paper's method and perform some small experiments
  o try to generalize the paper's results, or apply its techniques in a new context
  o compare the results of multiple research papers
Project proposal
A brief description of the topic that you plan to study and the paper(s) that you plan to read, between one-half and one page in length.
Due: TBD.
Written report
A 4-5 page written report (roughly 2000-2500 words), typeset in LaTeX. Word processors are not acceptable.
Due: TBD.
Presentation
A 10-15 minute presentation, with slides, discussing the context and contributions of your project.
Date: TBD, probably sometime around April 10th.
Some suggested topics
Boosting
· Mukherjee, Schapire. A Theory of Multiclass Boosting. JMLR 2013.
· Kalai, Servedio. Boosting in the presence of noise. JCSS 2005.
· Long, Servedio. Random Classification Noise Defeats All Convex Potential Boosters. Machine Learning 2010.
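If you pick this topic and want to run small experiments, the natural baseline is classical AdaBoost, which the papers above generalize or critique. A minimal sketch with decision stumps, assuming a numpy feature matrix X of shape (n, d) and labels y in {-1, +1}:

```python
import numpy as np

def adaboost_stumps(X, y, T=50):
    """Binary AdaBoost with decision stumps as weak learners."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)               # distribution over examples
    ensemble = []                         # (alpha, feature, threshold, sign)
    for _ in range(T):
        # Exhaustively find the stump s * sign(x[f] - theta) minimizing weighted error.
        best = None
        for f in range(d):
            for theta in np.unique(X[:, f]):
                for s in (+1, -1):
                    pred = s * np.sign(X[:, f] - theta + 1e-12)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, f, theta, s)
        err, f, theta, s = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)     # weight of this weak hypothesis
        pred = s * np.sign(X[:, f] - theta + 1e-12)
        w *= np.exp(-alpha * y * pred)            # up-weight misclassified examples
        w /= w.sum()
        ensemble.append((alpha, f, theta, s))
    return ensemble

def adaboost_predict(ensemble, X):
    votes = sum(a * s * np.sign(X[:, f] - th + 1e-12) for a, f, th, s in ensemble)
    return np.sign(votes)
```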
Clustering
· Kleinberg. An Impossibility Theorem for Clustering. NIPS 2002.
· Balcan, Blum, Vempala. A Discriminative Framework for Clustering via Similarity Functions. STOC 2008.
· Balcan, Blum, Gupta. Approximate clustering without the approximation. SODA 2009.
· Balcan, Gupta. Robust Hierarchical Clustering. COLT 2010.
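For experiments on these clustering questions, the standard Lloyd's heuristic for the k-means objective is the usual baseline (the folklore algorithm these papers reason about, not any one paper's method). A minimal sketch:

```python
import numpy as np

def lloyd_kmeans(X, k, iters=100, seed=0):
    """Lloyd's heuristic for k-means. X: (n, d) numpy array."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # random init
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each center moves to its cluster's mean.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels
```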
Mixtures of Gaussians
· Dasgupta. Learning mixtures of Gaussians. FOCS 1999.
· Vempala, Wang. A spectral algorithm for learning mixture models. JCSS 2004.
· Feldman, Servedio, O'Donnell. PAC Learning Mixtures of Gaussians with No Separation Assumption. COLT 2006.
· Achlioptas, McSherry. On Spectral Learning of Mixtures of Distributions. COLT 2005.
· Kannan, Salmasian, Vempala. The spectral method for general mixture models. SICOMP 2008.
· Brubaker, Vempala. Isotropic PCA and Affine-Invariant Clustering. FOCS 2008.
· Belkin, Sinha. Toward Learning Gaussian Mixtures with Arbitrary Separation. COLT 2010.
· Kalai, Moitra, Valiant. Efficiently learning mixtures of two Gaussians. STOC 2010.
· Kalai, Moitra, Valiant. Disentangling Gaussians. CACM 2012.
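For small experiments, the classical EM heuristic is the baseline against which the provable algorithms above are usually motivated. A minimal EM sketch for a one-dimensional mixture of two Gaussians (the standard heuristic, not the method-of-moments algorithms analyzed in these papers):

```python
import numpy as np

def em_two_gaussians(x, iters=200, seed=0):
    """EM for a 1-D mixture of two Gaussians. x: (n,) array of samples."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=2, replace=False)    # initial means
    var = np.array([x.var(), x.var()])           # initial variances
    pi = np.array([0.5, 0.5])                    # mixing weights
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        dens = (pi / np.sqrt(2 * np.pi * var)) * \
               np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: refit weights, means, variances from the responsibilities.
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return pi, mu, var
```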
Submodular Functions
· Feldman, Kothari, Vondrak. Representation, Approximation, and Learning of Submodular Functions Using Low-rank Decision Trees. COLT 2013.
· Feldman, Vondrak. Optimal Bounds on Approximation of Submodular and XOS Functions by Juntas. FOCS 2013.
· Cheraghchi, Klivans, Kothari, Lee. Submodular Functions Are Noise Stable. SODA 2012.
· Balcan, Harvey. Submodular Functions: Learnability, Structure, and Optimization. STOC 2011.
· Balcan, Constantin, Iwata, Wang. Learning Valuation Functions. COLT 2012.
· Nagano, Kawahara. Structured Convex Optimization under Submodular Constraints.
· Iyer, Jegelka, Bilmes. Fast Semidifferential-based Submodular Function Optimization. ICML 2013 (Best Paper Award).
· Iyer, Bilmes. Submodular Optimization with Submodular Cover and Submodular Knapsack Constraints. NIPS 2013 (Outstanding Paper Award).
· Bilmes. Deep Mathematical Properties of Submodularity with Applications to Machine Learning. NIPS 2013 Tutorial.
· Soma, Kakimura, Inaba, Kawarabayashi. Optimal Budget Allocation: Theoretical Guarantee and Efficient Algorithm. ICML 2014.
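Much of this literature presumes the classical greedy algorithm of Nemhauser, Wolsey, and Fisher, which achieves a (1 - 1/e)-approximation for maximizing a monotone submodular function under a cardinality constraint. A minimal sketch on a coverage objective:

```python
import numpy as np

def greedy_max_coverage(sets, k):
    """Greedy (1 - 1/e)-approximation for max coverage, a canonical
    monotone submodular objective. sets: list of Python sets; budget k."""
    covered, chosen = set(), []
    for _ in range(k):
        gains = [len(s - covered) for s in sets]   # marginal gains f(S+e) - f(S)
        best = int(np.argmax(gains))
        if gains[best] == 0:
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

# Example: four candidate sets over the universe {0, ..., 9}, budget 2.
sets = [set(range(5)), set(range(3, 8)), {8, 9}, {0, 9}]
print(greedy_max_coverage(sets, k=2))   # picks sets 0 and 1, covering {0..7}
```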
Sparse / Noisy Recovery
· Recht, Fazel, Parrilo. Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization. SIAM Review 2010.
· Candes, Recht. Exact Matrix Completion via Convex Optimization. FoCM 2009.
· Candes, Li, Ma, Wright. Robust Principal Component Analysis? JACM 2011.
· Recht. A Simpler Approach to Matrix Completion. JMLR 2011.
· Hardt, Moitra. Algorithms and Hardness for Robust Subspace Recovery. COLT 2013.
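For matrix completion experiments, a simple iterative singular-value soft-thresholding loop is easy to implement as a rough surrogate for the nuclear norm relaxation analyzed above (a heuristic sketch, not an exact solver for the convex program):

```python
import numpy as np

def complete_matrix(M, mask, tau=5.0, iters=200):
    """Fill in unobserved entries by repeatedly shrinking singular values.
    M: matrix with observed entries; mask: boolean array, True = observed."""
    X = np.where(mask, M, 0.0)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        s = np.maximum(s - tau, 0.0)      # soft-threshold the spectrum
        X = (U * s) @ Vt                  # encourages low rank
        X = np.where(mask, M, X)          # re-impose the observed entries
    return X

# Example: a random rank-2 matrix observed on 60% of its entries.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 50))
mask = rng.random(A.shape) < 0.6
A_hat = complete_matrix(A, mask)
print(np.linalg.norm(A_hat - A) / np.linalg.norm(A))   # relative error
```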
Regression
· Sarlos. Improved approximation algorithms for large matrices via random projections. FOCS 2006.
· Chin, Madry, Miller, Peng. Runtime Guarantees for Regression Problems. ITCS 2013.
· Woodruff, Zhang. Subspace Embeddings and L_p-Regression Using Exponential Random Variables. COLT 2013.
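The sketching approach in these papers compresses a tall least-squares problem with a random matrix and solves the small problem instead. A minimal sketch-and-solve example with a dense Gaussian sketch (the papers above use faster structured sketches, but the idea is the same):

```python
import numpy as np

def sketched_least_squares(A, b, m, seed=0):
    """Approximate argmin_x ||Ax - b||_2 by solving the m-row sketched
    problem argmin_x ||SAx - Sb||_2 with a Gaussian sketch S."""
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

# Example: 10000 equations, 20 unknowns, sketched down to 400 rows.
rng = np.random.default_rng(1)
A = rng.standard_normal((10000, 20))
b = A @ rng.standard_normal(20) + 0.01 * rng.standard_normal(10000)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_least_squares(A, b, m=400)
print(np.linalg.norm(x_sketch - x_exact))   # small when m >> 20
```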
Johnson-Lindenstrauss Dimensionality Reduction
Classification and Random Projection
· Balcan, Blum, Vempala. Kernels as Features: On Kernels, Margins, and Low-dimensional Mappings. Machine Learning 2006.
· Rahimi, Recht. Random Features for Large-Scale Kernel Machines. NIPS 2007.
· Rahimi, Recht. Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning. NIPS 2008.
· Zhang, Mahdavi, Jin, Yang, Zhu. Recovering the Optimal Solution by Dual Random Projection. COLT 2013.
· Le, Sarlos, Smola. Fastfood: Approximating Kernel Expansions in Loglinear Time. ICML 2013.
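The Rahimi-Recht papers replace a shift-invariant kernel with an explicit low-dimensional random feature map. A minimal sketch for the Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2):

```python
import numpy as np

def random_fourier_features(X, D, seed=0):
    """Map X (n, d) to Z (n, D) with Z[i] . Z[j] ~ exp(-||X[i] - X[j]||^2 / 2)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], D))   # frequencies ~ kernel's Fourier transform
    b = rng.uniform(0, 2 * np.pi, D)           # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Sanity check against the exact kernel on two random points.
rng = np.random.default_rng(2)
x, y = rng.standard_normal((2, 5))
Z = random_fourier_features(np.stack([x, y]), D=20000)
print(Z[0] @ Z[1], np.exp(-np.sum((x - y) ** 2) / 2))   # approximately equal
```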
Active Learning
· Dasgupta, Kalai, Monteleoni. Analysis of Perceptron-Based Active Learning. JMLR 2009.
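This paper analyzes a modified perceptron that queries labels only for points falling near the current hyperplane. A minimal sketch of that query rule, with a fixed margin threshold in place of the paper's adaptive schedule, assuming unit-norm data and a label oracle:

```python
import numpy as np

def active_perceptron(X, oracle, margin=0.1):
    """Stream over unit-norm rows of X; query the +/-1 `oracle` only when the
    current hypothesis is uncertain. Simplified from Dasgupta-Kalai-Monteleoni."""
    w = X[0].copy()                    # arbitrary (assumed nonzero) start
    queries = 0
    for x in X[1:]:
        if abs(w @ x) / np.linalg.norm(w) < margin:    # near the hyperplane?
            y = oracle(x)
            queries += 1
            if y * (w @ x) <= 0:
                w = w - 2 * (w @ x) * x    # reflection update of the modified perceptron
    return w / np.linalg.norm(w), queries
```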
Nearest Neighbors
Complexity of finding inconsistent linear separators
· Bshouty, Burroughs. Maximizing agreements and coagnostic learning. TCS 2006.
· Feldman et al. On agnostic learning of parities, monomials and halfspaces. SICOMP 2009.
· Guruswami, Raghavendra. Hardness of learning halfspaces with noise. SICOMP 2009.
Boolean Functions and Fourier Analysis
· De Wolf. A Brief Introduction to Fourier Analysis on the Boolean Cube. ToC 2008.
· Linial, Mansour, Nisan. Constant Depth Circuits, Fourier Transform, and Learnability. JACM 1993.
· Kushilevitz, Mansour. Learning decision trees using the Fourier spectrum. SICOMP 1993.
· Mansour. Learning Boolean Functions via the Fourier Transform. 1994.
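As a warm-up, the Fourier coefficients f_hat(S) = E_x[f(x) * prod_{i in S} x_i] of a small Boolean function can be computed by brute-force enumeration (roughly 4^n time, so for toy n only; the Kushilevitz-Mansour algorithm is the efficient version). A minimal sketch:

```python
import numpy as np
from itertools import combinations, product

def fourier_coefficients(f, n):
    """All coefficients of f: {-1,1}^n -> {-1,1} by direct enumeration:
    f_hat(S) = average over x of f(x) * prod_{i in S} x_i."""
    points = list(product([-1, 1], repeat=n))
    return {S: np.mean([f(x) * np.prod([x[i] for i in S]) for x in points])
            for k in range(n + 1) for S in combinations(range(n), k)}

# Example: Majority on 3 bits puts weight 1/2 on each singleton, -1/2 on {0,1,2}.
maj3 = lambda x: 1 if sum(x) > 0 else -1
for S, c in fourier_coefficients(maj3, 3).items():
    if abs(c) > 1e-9:
        print(S, c)
```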
Misc
· Chandrasekaran, Jordan. Computational and Statistical Tradeoffs via Convex Relaxation. PNAS 2013.