Selected Publications

A complete list of my publications is also available.

An Algorithmic Proof of the Lovasz Local Lemma via Resampling Oracles 2015/10
Nicholas J. A. Harvey, Jan Vondrak.
IEEE Symposium on Foundations of Computer Science (FOCS 15), Berkeley, CA, October 2015. PDF.
arXiv, April 2015. PDF.

The Lovasz Local Lemma is a seminal result in probabilistic combinatorics. It gives a sufficient condition on a probability space and a collection of events for the existence of an outcome that simultaneously avoids all of those events. Finding such an outcome by an efficient algorithm has been an active research topic for decades. Breakthrough work of Moser and Tardos (2009) presented an efficient algorithm for a general setting primarily characterized by a product structure on the probability space.

In this work we present an efficient algorithm for a much more general setting. Our main assumption is that there exist certain functions, called resampling oracles, that can be invoked to address the undesired occurrence of the events. We show that, in all scenarios to which the original Lovasz Local Lemma applies, there exist resampling oracles, although they are not necessarily efficient. Nevertheless, for essentially all known applications of the Lovasz Local Lemma and its generalizations, we have designed efficient resampling oracles. As applications of these techniques, we present new results for packings of Latin transversals, rainbow matchings and rainbow spanning trees.

Keywords: Lovasz local lemma, algorithmic proof, general probability spaces, resampling oracles

Characterizing Storage Workloads with Counter Stacks 2014/10
Jake Wires, Stephen Ingram, Zachary Drudi, Nicholas J. A. Harvey, Andrew Warfield.
Eleventh Symposium on Operating Systems Design and Implementation (OSDI 14), Broomfield, CO, October 2014. PDF.

Existing techniques for identifying working set sizes based on miss ratio curves (MRCs) have large memory overheads which make them impractical for storage workloads. We present a novel data structure, the counter stack, which can produce approximate MRCs while using sublinear space. We show how counter stacks can be checkpointed to produce workload representations that are many orders of magnitude smaller than full traces, and we describe techniques for estimating MRCs of arbitrary workload combinations over arbitrary windows in time. Finally, we show how online analysis using counter stacks can provide valuable insight into live workloads.

Keywords: Miss ratio curves, workload analysis, streaming algorithms

Submodular Functions: Learnability, Structure, and Optimization 2011/06
Maria-Florina Balcan, Nicholas J. A. Harvey.
43rd Annual ACM Symposium on Theory of Computing (STOC 11), San Jose, CA, June 2011. PDF.
arXiv, August 2010. PDF.

Submodular functions are discrete functions that model laws of diminishing returns and enjoy numerous algorithmic applications. They have been used in many areas, including combinatorial optimization, machine learning, and economics. In this work we study submodular functions from a learning theoretic angle. We provide algorithms for learning submodular functions, as well as lower bounds on their learnability. In doing so, we uncover several novel structural results revealing ways in which submodular functions can be both surprisingly structured and surprisingly unstructured. We provide several concrete implications of our work in other domains including algorithmic game theory and combinatorial optimization.

At a technical level, this research combines ideas from many areas, including learning theory (distributional learning and PAC-style analyses), combinatorics and optimization (matroids and submodular functions), and pseudorandomness (lossless expander graphs).

Keywords: Learning Theory, Submodular Functions, Matroids
Notes: In this paper we use Talagrand's inequality to prove a concentration inequality for Lipschitz, monotone submodular functions. We were unaware of prior work proving concentration for subadditive functions by Schechtman in 2003, using Talagrand's inequality, and by Hajiaghayi et al. in 2005, using self-bounding functions.

In 2009, Chekuri and Vondrak also proved a concentration inequality for Lipschitz, monotone submodular functions, using the FKG inequality. Their inequality achieves Chernoff-type concentration, which is somewhat better than the Talagrand-like concentration that we achieve. An earlier version of our paper only proved concentration for matroid rank functions. After learning of the work of Chekuri and Vondrak, we improved our inequality to its current form, in which it applies to arbitrary Lipschitz, monotone, submodular functions.

Perhaps the nicest way to prove such inequalities is using self-bounding functions. In particular, this approach yields concentration for non-monotone functions; it is not clear whether the FKG or Talagrand approaches do too. See the note by Vondrak from 2010.


Graph Sparsification by Edge-Connectivity and Random Spanning Trees 2011/06
Wai Shing Fung, Ramesh Hariharan, Nicholas J. A. Harvey, Debmalya Panigrahi.
43rd Annual ACM Symposium on Theory of Computing (STOC 11), San Jose, CA, June 2011. PDF.
Submitted to SIAM Journal on Computing. PDF.

We present two new approaches to constructing graph sparsifiers -- weighted subgraphs for which every cut has the same value as the original graph, up to a factor of (1 ± ε). The first approach independently samples each edge uv with probability inversely proportional to the edge-connectivity between u and v. The fact that this approach produces a sparsifier resolves an open question of Benczur and Karger. Concurrent work of Hariharan and Panigrahi (2010) also resolves this question. The second approach samples uniformly random spanning trees. Both of our approaches produce sparsifiers with O(n log2(n)/ε2) edges. Our proofs are based on extensions of the Karger-Stein contraction algorithm which allow it to compute minimum "Steiner cuts".

Keywords: Graph Algorithms, Random Sampling, Contraction Algorithm, Splitting Off, Random Walks
Notes: Many of the results in this paper were independently discovered by Fung & Harvey and by Hariharan & Panigrahi. Actually Hariharan & Panigrahi had many more algorithmic results than we did, including the linear time algorithm for unweighted graphs. They graciously offered to merge the papers for a joint STOC submission.

Algebraic Algorithms for Matching and Matroid Problems 2006/10
Nicholas J. A. Harvey.
IEEE Symposium on Foundations of Computer Science (FOCS 06), Berkeley, CA, October 2006. Received the Best Student Paper (Machtey) Award. PDF PS Slides.
SIAM Journal on Computing, 39(2):679-702, 2009. PDF.

We present new algebraic approaches for several well-known combinatorial problems, including non-bipartite matching, matroid intersection, and some of their generalizations. Our work yields new randomized algorithms that are the most efficient known. For non-bipartite matching, we obtain a simple, purely algebraic algorithm with running time O(nω) where n is the number of vertices and ω is the matrix multiplication exponent. This resolves the central open problem of Mucha and Sankowski (FOCS 2004). For matroid intersection, our algorithm has running time O(nrω-1) for matroids with n elements and rank r that satisfy some natural conditions.

Keywords: Non-bipartite Matching, Matroid Intersection, Algebraic Algorithms, Fast Matrix Multiplication
Notes: I recommend reading the journal version of this paper, not the conference version. The algorithms in the journal version are different and simpler. The conference paper sketched algorithms for the basic path-matching and independent matching problems. To keep the journal paper as simple as possible, these are not discussed there. The conference papers also claims that graphic matroid intersection can be solved in O(n2.575) time. When asked to provide the details, I could not reproduce them. Most likely I was mistaken, so I retract the claim.