Nearly-tight VC-dimension bounds for piecewise linear neural networks 2017/07
Nicholas J. A. Harvey, Chris Liaw, Abbas Mehrabian.
Conference on Learning Theory (COLT 17), Amsterdam, Netherlands, July 2017. PDF.
arXiv, March 2017. PDF.

We prove new upper and lower bounds on the VC-dimension of deep neural networks with the ReLU activation function. These bounds are tight for almost the entire range of parameters. Letting W be the number of weights and L be the number of layers, we prove that the VC-dimension is O(W L log(W)) and

Ω(W L log(W/L)). This improves both the previously known upper bounds and lower bounds. In terms of the number U of non-linear units, we prove a tight bound (WU) on the VC-dimension. All of these results generalize to arbitrary piecewise linear activation functions.

Keywords: Deep neural networks, learning theory, VC-dimension.

Computing the independence polynomial in Shearer's region for the LLL 2016/07
Nicholas J. A. Harvey, Piyush Srivastava, Jan Vondrak.
arXiv, July 2016. PDF.

The independence polynomial has been widely studied in algebraic graph theory, in statistical physics, and in algorithms for counting and sampling problems. Seminal results of Weitz (2006) and Sly (2010) have shown that in bounded-degree graphs the independence polynomial can be efficiently approximated if the argument is positive and below a certain threshold, whereas above that threshold the polynomial is hard to approximate. Furthermore, this threshold exactly corresponds to a phase transition in physics, which demarcates the region within which the Gibbs measure has correlation decay.

Evaluating the independence polynomial with negative or complex arguments may not have a counting interpretation, but it does have strong connections to combinatorics and to statistical physics. The independence polynomial with negative arguments determines the maximal region of probabilities to which the Lovasz Local Lemma (LLL) can be extended, and also gives a lower bound on the probability in the LLL's conclusion (Shearer 1985). In statistical physics, complex roots of the independence polynomial relate to existence of phase transitions, and there is a relation between negative and complex roots (Penrose 1963).

We study algorithms for approximating the independence polynomial with negative and complex arguments. Whereas many algorithms for computing combinatorial polynomials are restricted to the univariate setting, we consider the multivariate independence polynomial, since there is a natural multivariate region of interest -- Shearer's region for the LLL. Our main result is: for any n-vertex graph of degree at most d, any α ∈ (0,1], and any complex vector p such that (1+α)⋅p lies in Shearer's region, there is a deterministic algorithm to approximate the independence polynomial at p within (1+ε) multiplicative error and with runtime (n/εα)O(log(d)/α). Our results also extend to graphs of unbounded degree that have a bounded connective constant. Our analysis uses a novel multivariate form of the correlation decay technique.

On the hardness side, we prove that every algorithm must have some dependence on α, as it is #P-hard to evaluate the polynomial within any poly(n) factor at an arbitrary given point in Shearer's region. Similarly, deciding if a given point lies in Shearer's region is also #P-hard. Finally, it is NPhard to approximate the independence polynomial in a graph of degree at most d with a fixed negative argument that is only a constant factor outside Shearer's region.

Keywords: Independence polynomial, Lovasz Local Lemma, Shearer region, correlation decay, FPTAS

Generating Random Spanning Trees via Fast Matrix Multiplication 2016/04
Nicholas J. A. Harvey, Keyulu Xu.
Latin American Theoretical Informatics Symposium (LATIN 2016), Ensenada, Mexico, April 2016. PDF.

We consider the problem of sampling a uniformly random spanning tree of a graph. This is a classic algorithmic problem for which several exact and approximate algorithms are known. Random spanning trees have several connections to Laplacian matrices; this leads to algorithms based on fast matrix multiplication. The best algorithm for dense graphs can produce a uniformly random spanning tree of an n-vertex graph in time O(n2.38). This algorithm is intricate and requires explicitly computing the LU-decomposition of the Laplacian.

We present a new algorithm that also runs in time O(n2.38) but has several conceptual advantages. First, whereas previous algorithms need to introduce directed graphs, our algorithm works only with undirected graphs. Second, our algorithm uses fast matrix inversion as a black-box, thereby avoiding the intricate details of the LU-decomposition.

Keywords: Algebraic Algorithms, Random Spanning Trees, Fast Matrix Multiplication.

Sparse Sums of Positive Semidefinite Matrices 2015/12
Marcel K. de Carli Silva, Nicholas J. A. Harvey, Cristiane M. Sato.
ACM Transactions on Algorithms, 12(1), December 2015. PDF.

Recently there has been much interest in "sparsifying" sums of rank one matrices: modifying the coefficients such that only a few are nonzero, while approximately preserving the matrix that results from the sum. Results of this sort have found applications in many different areas, including sparsifying graphs. In this paper we consider the more general problem of sparsifying sums of positive semidefinite matrices that have arbitrary rank.

We give several algorithms for solving this problem. The first algorithm is based on the method of Batson, Spielman and Srivastava (2009). The second algorithm is based on the matrix multiplicative weights update method of Arora and Kale (2007). We also highlight an interesting connection between these two algorithms.

Our algorithms have numerous applications. We show how they can be used to construct graph sparsifiers with auxiliary constraints, sparsifiers of hypergraphs, and sparse solutions to semidefinite programs.

Keywords: Graph Algorithms, Sparsifiers, SDPs, Sparse Solutions, Multiplicative Weights Method.

Rainbow Hamilton cycles and lopsidependency 2015/11
Nicholas J. A. Harvey, Chris Liaw.
Accepted to Discrete Mathematics. PDF.

The Lovasz Local Lemma is a powerful probabilistic tool used to prove the existence of combinatorial structures which avoid a set of constraints. A standard way to apply the local lemma is to prove that the set of constraints satsify a lopsidependency condition and obtain a lopsidependency graph. For instance, Erdos and Spencer used this framework to posit the existence of Latin transversals in matrices provided no symbol appears too often in the matrix.

The local lemma has been used in various ways to infer the existence of rainbow Hamilton cycles in complete graphs when each colour is used at most O(n) times. However, the existence of a lopsidependency graph for Hamilton cycles has neither been proved nor refuted. All previous approaches have had to prove a variant of the local lemma or reduce the problem of finding Hamilton cycles to finding another combinatorial structure, such as Latin transversals. In this paper, we revisit the question of whether or not Hamilton cycles have a lopsidependency graph and give a positive answer for this question. We also use the resampling oracle framework of Harvey and Vondrak to give a polynomial time algorithm for finding rainbow Hamilton cycles in complete graphs.

Keywords: Lovasz Local Lemma, lopsidependency, Hamilton cycles, resampling oracles

An Algorithmic Proof of the Lovasz Local Lemma via Resampling Oracles 2015/10
Nicholas J. A. Harvey, Jan Vondrak.
IEEE Symposium on Foundations of Computer Science (FOCS 15), Berkeley, CA, October 2015. PDF.
arXiv, April 2015. PDF.

The Lovasz Local Lemma is a seminal result in probabilistic combinatorics. It gives a sufficient condition on a probability space and a collection of events for the existence of an outcome that simultaneously avoids all of those events. Finding such an outcome by an efficient algorithm has been an active research topic for decades. Breakthrough work of Moser and Tardos (2009) presented an efficient algorithm for a general setting primarily characterized by a product structure on the probability space.

In this work we present an efficient algorithm for a much more general setting. Our main assumption is that there exist certain functions, called resampling oracles, that can be invoked to address the undesired occurrence of the events. We show that, in all scenarios to which the original Lovasz Local Lemma applies, there exist resampling oracles, although they are not necessarily efficient. Nevertheless, for essentially all known applications of the Lovasz Local Lemma and its generalizations, we have designed efficient resampling oracles. As applications of these techniques, we present new results for packings of Latin transversals, rainbow matchings and rainbow spanning trees.

Keywords: Lovasz local lemma, algorithmic proof, general probability spaces, resampling oracles

Approximating Hit Rate Curves using Streaming Algorithms 2015/08
Zachary Drudi, Nicholas J. A. Harvey, Stephen Ingram, Andrew Warfield, Jake Wires.
International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), Princeton, NJ, August 2015. PDF.

A hit rate curve is a function that maps cache size to the proportion of requests that can be served from the cache. (The caching policy and sequence of requests are assumed to be fixed.) Hit rate curves have been studied for decades in the operating system, database and computer architecture communities. They are useful tools for designing appropriate cache sizes, dynamically allocating memory between competing caches, and for summarizing locality properties of the request sequence. In this paper we focus on the widely-used LRU caching policy.

Computing hit rate curves is very efficient from a runtime standpoint, but existing algorithms are not efficient in their space usage. For a stream of m requests for n cacheable objects, all existing algorithms that provably compute the hit rate curve use space linear in n. In the context of modern storage systems, n can easily be in the billions or trillions, so the space usage of these algorithms makes them impractical.

We present the first algorithm for approximating hit rate curves with sublinear space and guaranteed accuracy. Our algorithm uses O(p2 log(n) log2(m) / ε2) bits of space and approximates the hit rate curve at p uniformly-spaced points to within additive error ε . This is not far from optimal. Any single-pass algorithm with the same guarantees must use Ω(p2 + ε-2 + log n) bits of space. Furthermore, our use of additive error is necessary. Any single-pass algorithm achieving multiplicative error requires Ω(n) bits of space.

Keywords: Hit rate curves, streaming algorithms, sublinear space

A generalization of the Cauchy-Schwarz inequality involving four vectors 2015/06
Nicholas J. A. Harvey.
Journal of Mathematical Inequalities, 9(2):489-491, June 2015. PDF.

We generalize the well-known Cauchy-Schwarz inequality to an inequality involving four vectors. Although the statement is very simple and the proof is short, it does not seem to appear elsewhere in the literature.

Keywords: Cauchy-Schwarz Inequality

A note on the discrepancy of matrices with bounded row and column sums 2015/04
Nicholas J. A. Harvey.
Discrete Mathematics, 338:517-521, April 2015. PDF.

A folklore result uses the Lovasz local lemma to analyze the discrepancy of hypergraphs with bounded degree and edge size. We generalize this result to the context of real matrices with bounded row and column sums.

Keywords: Discrepancy, Lovasz Local Lemma.

Counter Stacks and the Elusive Working Set 2015/02
Jake Wires, Stephen Ingram, Zachary Drudi, Nicholas J. A. Harvey, Andrew Warfield.
;login: the Usenix magazine, 40(1), February 2015. PDF.

Counter stacks are a compact and effective data structure for summarizing access patterns in memory and storage workloads. They are a stream abstraction that efficiently characterizes the uniqueness of an access stream over time, and are sufficiently low overhead as to allow both new approaches to online decision-making (such as replacement or prefetching policies) and new applications of lightweight trace transmission and archiving.

Keywords: Miss ratio curves, workload analysis, streaming algorithms

Characterizing Storage Workloads with Counter Stacks 2014/10
Jake Wires, Stephen Ingram, Zachary Drudi, Nicholas J. A. Harvey, Andrew Warfield.
Eleventh Symposium on Operating Systems Design and Implementation (OSDI 14), Broomfield, CO, October 2014. PDF.

Existing techniques for identifying working set sizes based on miss ratio curves (MRCs) have large memory overheads which make them impractical for storage workloads. We present a novel data structure, the counter stack, which can produce approximate MRCs while using sublinear space. We show how counter stacks can be checkpointed to produce workload representations that are many orders of magnitude smaller than full traces, and we describe techniques for estimating MRCs of arbitrary workload combinations over arbitrary windows in time. Finally, we show how online analysis using counter stacks can provide valuable insight into live workloads.

Keywords: Miss ratio curves, workload analysis, streaming algorithms

Discrepancy Without Partial Colorings 2014/09
Nicholas J. A. Harvey, Roy Schwartz, Mohit Singh.
International Workshop on Approximation Algorithms for Combinatorial Optimization Problems (APPROX), Barcelona, Spain, September 2014. PDF.

Spencer's theorem asserts that, for any family of n subsets of ground set of size n, the elements of the ground set can be "colored" by the values {+1,-1} such that the sum of every set is O(n0.5) in absolute value. All existing proofs of this result recursively construct "partial colorings", which assign {+1,-1} values to half of the ground set. We devise the first algorithm for Spencer's theorem that directly computes a coloring, without recursively computing partial colorings.

Keywords: Discrepancy, Semidefinite Programs, Random Walks

Near-Optimal Herding 2014/06
Nicholas J. A. Harvey, Samira Samadi.
Conference on Learning Theory (COLT 14), Barcelona, Spain, June 2014. PDF Slides.

The Herding algorithm is an algorithm of recent interest in the machine learning community, motivated by inference in Markov random fields. It solves the following sampling problem: given a set X in Rd with mean μ, construct an infinite sequence of points from X such that, for every t≥1, the mean of the first t points in that sequence lies within Euclidean distance O(1/t) of μ. The classic Perceptron boundedness theorem implies that such a result actually holds for a wide class of algorithms, although the factors suppressed by the O(1/t) notation are exponential in d. Thus, to establish a non-trivial result for the sampling problem, one must carefully analyze the factors suppressed by the O(1/t) error bound.

This paper studies the best error that can be achieved for the sampling problem. Known analyses of the Herding algorithm give a error bound that depends on geometric properties of X but, even under favorable conditions, this bound depends linearly on d. We present a new polynomial-time algorithm that solves the sampling problem with error O(d0.5 log2.5|X|/t). Our algorithm is based on recent algorithmic results in discrepancy theory. We also show that any algorithm for the sampling problem must have error Ω(d0.5/t), implying that our algorithm is optimal to within logarithmic factors.

Keywords: Discrepancy, Deterministic Sampling.

Pipage Rounding, Pessimistic Estimators and Matrix Concentration 2014/01
Nicholas J. A. Harvey, Neil Olver.
ACM-SIAM Symposium on Discrete Algorithms (SODA 14), Portland, OR, January 2014. PDF.
arXiv, July 2013. PDF.

Pipage rounding is a dependent random sampling technique that has several interesting properties and diverse applications. One property that has been useful in applications is negative correlation of the resulting vector. There are some further properties that would be interesting to derive, but do not seem to follow from negative correlation. In particular, recent concentration results for sums of independent random matrices are not known to extend to a negatively dependent setting.

We introduce a simple but useful technique called concavity of pessimistic estimators. This technique allows us to show concentration of submodular functions and concentration of matrix sums under pipage rounding. The former result answers a question of Chekuri et al. (2009). To prove the latter result, we derive a new variant of Lieb's celebrated concavity theorem in matrix analysis.

We provide numerous applications of these results. One is to spectrally-thin trees, a spectral analog of the thin trees that played a crucial role in the recent breakthrough on the asymmetric traveling salesman problem. We show a polynomial time algorithm that, given a graph where every edge has effective conductance at least κ, returns an O(κ-1 log n / log log n)-spectrally-thin tree. There are further applications to rounding of semidefinite programs, to the column subset selection problem, and to a geometric question of extracting a nearly-orthonormal basis from an isotropic distribution.

Keywords: Spectral Graph Theory, Matrix Concentration, Thin Trees.

UNO is hard, even for a single player 2013/11
Erik D. Demaine, Martin L. Demaine, Nicholas J. A. Harvey, Ryuhei Uehara, Takeaki Uno, Yushi Uno.
Theoretical Computer Science, November 2013. PDF.

This paper investigates the popular card game UNO from the viewpoint of algorithmic combinatorial game theory. We define simple and concise mathematical models for the game, including both cooperative and uncooperative versions, and analyze their computational complexity. In particular, we prove that even a single-player version of UNO is NP-complete, although some restricted cases are in P. Surprisingly, we show that the uncooperative two-player version is also in P.

Keywords: Combinatorial Game Theory, Complexity

An introduction to the Kadison-Singer Problem and the Paving Conjecture 2013/07
Nicholas J. A. Harvey.
Not submitted for publication, July 2013. PDF.

We give a gentle introduction to the famous Kadison-Singer problem, aimed at readers without much background in functional analysis or operator theory. We explain why this problem is equivalent to some "paving" or "discrepancy" problems concerning finite-dimensional matrices.

Keywords: Kadison-Singer Problem, Paving Conjecture, Survey.

Learning symmetric non-monotone submodular functions 2012/12
Maria-Florina Balcan, Nicholas J. A. Harvey, Satoru Iwata.
NIPS Workshop on Discrete Optimization in Machine Learning (DISCML 12), Lake Tahoe, NV, December 2012. PDF.

We prove a new structural result for symmetric submodular functions. We use that result to obtain an efficient algorithm for approximately learning such functions in the passive, supervised learning setting. We also complement this result with a nearly matching lower bound. Our work provides the first results for learning a large class of non-monotone submodular functions under general distributions.

Keywords: Learning Theory, Submodular Functions, Symmetric Submodular Functions.

On disjoint common bases in two matroids 2011/12
Nicholas J. A. Harvey, Tamas Kiraly, Lap Chi Lau.
SIAM Journal on Discrete Mathematics, 25(4):1792-1803, Dec 2011. PDF.

We prove two results on packing common bases of two matroids. First, we show that the computational problem of common base packing reduces to the special case where one of the matroids is a partition matroid. Second, we give a counterexample to a conjecture of Chow, which proposed a sufficient condition for the existence of a common base packing. Chow's conjecture is a generalization of Rota's basis conjecture.

Keywords: Matroid Intersection, Common Bases, Packing, Dijoins, Rota's conjecture

Submodular Functions: Learnability, Structure, and Optimization 2011/06
Maria-Florina Balcan, Nicholas J. A. Harvey.
43rd Annual ACM Symposium on Theory of Computing (STOC 11), San Jose, CA, June 2011. PDF.
arXiv, August 2010. PDF.

Submodular functions are discrete functions that model laws of diminishing returns and enjoy numerous algorithmic applications. They have been used in many areas, including combinatorial optimization, machine learning, and economics. In this work we study submodular functions from a learning theoretic angle. We provide algorithms for learning submodular functions, as well as lower bounds on their learnability. In doing so, we uncover several novel structural results revealing ways in which submodular functions can be both surprisingly structured and surprisingly unstructured. We provide several concrete implications of our work in other domains including algorithmic game theory and combinatorial optimization.

At a technical level, this research combines ideas from many areas, including learning theory (distributional learning and PAC-style analyses), combinatorics and optimization (matroids and submodular functions), and pseudorandomness (lossless expander graphs).

Keywords: Learning Theory, Submodular Functions, Matroids
Notes: In this paper we use Talagrand's inequality to prove a concentration inequality for Lipschitz, monotone submodular functions. We were unaware of prior work proving concentration for subadditive functions by Schechtman in 2003, using Talagrand's inequality, and by Hajiaghayi et al. in 2005, using self-bounding functions.

In 2009, Chekuri and Vondrak also proved a concentration inequality for Lipschitz, monotone submodular functions, using the FKG inequality. Their inequality achieves Chernoff-type concentration, which is somewhat better than the Talagrand-like concentration that we achieve. An earlier version of our paper only proved concentration for matroid rank functions. After learning of the work of Chekuri and Vondrak, we improved our inequality to its current form, in which it applies to arbitrary Lipschitz, monotone, submodular functions.

Perhaps the nicest way to prove such inequalities is using self-bounding functions. In particular, this approach yields concentration for non-monotone functions; it is not clear whether the FKG or Talagrand approaches do too. See the note by Vondrak from 2010.

Graph Sparsification by Edge-Connectivity and Random Spanning Trees 2011/06
Wai Shing Fung, Ramesh Hariharan, Nicholas J. A. Harvey, Debmalya Panigrahi.
43rd Annual ACM Symposium on Theory of Computing (STOC 11), San Jose, CA, June 2011. PDF.
Submitted to SIAM Journal on Computing. PDF.

We present two new approaches to constructing graph sparsifiers -- weighted subgraphs for which every cut has the same value as the original graph, up to a factor of (1 ± ε). The first approach independently samples each edge uv with probability inversely proportional to the edge-connectivity between u and v. The fact that this approach produces a sparsifier resolves an open question of Benczur and Karger. Concurrent work of Hariharan and Panigrahi (2010) also resolves this question. The second approach samples uniformly random spanning trees. Both of our approaches produce sparsifiers with O(n log2(n)/ε2) edges. Our proofs are based on extensions of the Karger-Stein contraction algorithm which allow it to compute minimum "Steiner cuts".

Keywords: Graph Algorithms, Random Sampling, Contraction Algorithm, Splitting Off, Random Walks
Notes: Many of the results in this paper were independently discovered by Fung & Harvey and by Hariharan & Panigrahi. Actually Hariharan & Panigrahi had many more algorithmic results than we did, including the linear time algorithm for unweighted graphs. They graciously offered to merge the papers for a joint STOC submission.

Approximating Submodular Functions Everywhere 2009/01
Michel X. Goemans, Nicholas J. A. Harvey, Satoru Iwata, Vahab Mirrokni.
ACM-SIAM Symposium on Discrete Algorithms (SODA 09), New York, NY, January 2009. PDF Slides.
Draft of full version, October 2010. PDF.

Submodular functions are a key concept in combinatorial optimization. Algorithms that involve submodular functions usually assume that they are given by a (value) oracle. Many interesting problems involving submodular functions can be solved using only polynomially many queries to the oracle, e.g., exact minimization or approximate maximization.

In this paper, we consider the problem of approximating a monotone submodular function f on a ground set of size n everywhere, after only poly(n) oracle queries. Our main result is a deterministic algorithm that makes poly(n) oracle queries and derives a function g such that, for every set S, g(S) approximates f(S) within a factor α(n), where α(n)=sqrt(n+1) for rank functions of matroids and α(n)=O(sqrt(n) log n) for general monotone submodular functions. Our result is based on approximately finding a maximum volume inscribed ellipsoid in a symmetrized polymatroid, and the analysis involves various properties of submodular functions and polymatroids.

Our algorithm is tight up to logarithmic factors. Indeed, we show that no algorithm can achieve a factor better than Ω(sqrt(n) / log n), even for rank functions of a matroid.

Keywords: Submodular Functions, Approximation, Maximum Volume Ellipsoids

On the Complexity of Reconfiguration Problems 2008/12
Takehiro Ito, Erik D. Demaine, Nicholas J. A. Harvey, Christos Papadimitriou, Martha Sideri, Ryuhei Uehara, Yushi Uno.
19th International Symposium on Algorithms and Computation (ISAAC 2008), Gold Coast, Australia, December 2008. PDF.
Theoretical Computer Science, March 2011. PDF.

Reconfiguration problems arise when we wish to find a step-by-step transformation between two feasible solutions of a problem such that all intermediate results are also feasible. We demonstrate that a host of reconfiguration problems derived from NP-complete problems are PSPACE-complete, while some are also NP-hard to approximate. In contrast, several reconfiguration versions of problems in P are solvable in polynomial time.

Keywords: Reconfiguration, PSPACE complete

Sketching and Streaming Entropy via Approximation Theory 2008/10
Nicholas J. A. Harvey, Jelani Nelson, Krzysztof Onak.
IEEE Information Theory Workshop (ITW 08), Porto, Portugal, May 2008. PDF Slides.
IEEE Symposium on Foundations of Computer Science (FOCS 08), Philadelphia, PA, October 2008. PDF.

We conclude a sequence of work by giving near-optimal sketching and streaming algorithms for estimating Shannon entropy in the most general streaming model, with arbitrary insertions and deletions. This improves on prior results that obtain suboptimal space bounds in the general model, and near-optimal bounds in the insertion-only model without sketching. Our high-level approach is simple: we give algorithms to estimate Rényi and Tsallis entropy, and use them to extrapolate an estimate of Shannon entropy. The accuracy of our estimates is proven using approximation theory arguments and extremal properties of Chebyshev polynomials, a technique which may be useful for other problems. Our work also yields the best-known and near-optimal additive approximations for entropy, and hence also for conditional entropy and mutual information.

Keywords: Streaming Algorithms, Shannon Entropy, Renyi Entropy, Stable Distributions, Chebyshev Polynomials

Matroid Intersection, Pointer Chasing, and Young's Seminormal Representation of Sn 2008/01
Nicholas J. A. Harvey.
ACM-SIAM Symposium on Discrete Algorithms (SODA 08), San Francisco, CA, January 2008. PDF Slides.
RIMS Kokyuroku Bessatsu, B23(2):81-106, 2010. PDF.

We consider the number of queries needed to solve the matroid intersection problem, a question raised by Welsh (1976). Given two matroids of rank r on n elements, it is known that O(nr1.5) independence queries suffice. However, no non-trivial lower bounds are known for this problem.

We make the first progress on this question. We describe a family of instances of rank r=n/2 based on a pointer chasing problem, and prove that (log2 3) n - o(n) queries are necessary to solve these instances. This gives a constant factor improvement over the trivial lower bound of n for matroids of this rank.

Our proof uses methods from communication complexity and group representation theory. We analyze the communication matrix by viewing it as an operator in the group algebra of the symmetric group and explicitly computing its spectrum.

Keywords: Matroid Intersection, Lower Bound, Communication Complexity, Group Representations, Symmetric Group, Jucys-Murphy Elements
Notes: The abstract is somewhat misleading. For matroids of rank 1, there is a trivial lower bound of 2n-2 queries. Another trivial argument gives a lower bound of n queries for matroids of rank n/2. The matroids in this paper also have rank n/2, but our lower bound is roughly 1.58n queries.

Iteratively Constructing Preconditioners via the Conjugate Gradient Method 2007/06
John Dunagan, Nicholas J. A. Harvey.
39th Annual ACM Symposium on Theory of Computing (STOC 2007), San Diego, CA, June 2007. PDF PS.

We consider the problem of solving a symmetric, positive definite system of linear equations. The most well-known and widely-used method for solving such systems is the preconditioned Conjugate Gradient method. The performance of this method depends crucially on knowing a good preconditioner matrix. We show that the Conjugate Gradient method itself can produce good preconditioners as a by-product. These preconditioners allow us to derive new asymptotic bounds on the time to solve multiple related linear systems.

Keywords: Conjugate Gradient Method, Preconditioning

A "Chicken and Egg" Network Coding Problem 2007/06
Nicholas J. A. Harvey, Robert D. Kleinberg, Chandra Nair, Yunnan Wu.
IEEE International Symposium on Information Theory (ISIT 2007), Nice, France, June 2007. PDF.

We consider the multi-source network coding problem in cyclic networks. This problem involves several difficulties not found in acyclic networks, due to additional causality requirements. This paper highlights the difficulty of these causality conditions by analyzing two example cyclic networks. The networks appear quite similar at first glance, and indeed both have invalid rate-1 network codes that violate causality. However, the two networks are actually quite different: one also has a valid rate-1 network code obeying causality, whereas the other does not. This unachievability result is proven by a new information inequality for causal coding schemes in a simple cyclic network.

Keywords: Network Coding, Causality, Rate, Information Inequalities

Non-Adaptive Fault Diagnosis for All-Optical Networks via Combinatorial Group Testing on Graphs 2007/05
Nicholas J. A. Harvey, Mihai Patrascu, Yonggang Wen, Sergey Yekhanin, Vincent Chan.
26th Annual IEEE Conference on Communications (INFOCOM), Anchorage, Alaska, May 2007. PDF.

We consider the problem of detecting network faults. Our focus is on detection schemes that send probes both proactively and non-adaptively. Such schemes are particularly relevant to all-optical networks, due to these networks' operational characteristics and strict performance requirements. This fault diagnosis problem motivates a new technical framework that we introduce: group testing with graph-based constraints. Using this framework, we develop several new probing schemes to detect network faults. The efficiency of our schemes often depends on the network topology; in many cases we can show that our schemes are optimal or near-optimal by providing tight lower bounds.

Keywords: Networks, Faults, Failures, Detection, Group Testing, Superimposed Codes, All-Optical

An Algebraic Algorithm for Weighted Linear Matroid Intersection 2007/01
Nicholas J. A. Harvey.
ACM-SIAM Symposium on Discrete Algorithms (SODA 07), New Orleans, LA, January 2007. PDF Slides.

We present a new algebraic algorithm for the classical problem of weighted matroid intersection. This problem generalizes numerous well-known problems, such as bipartite matching, network flow, etc. Our algorithm has running time O(nr^{w-1} W^{1+eps}) for linear matroids with n elements and rank r, where w is the matrix multiplication exponent, and W denotes the maximum weight of any element. This algorithm is the fastest known when W is small. Our approach builds on the recent work of Sankowski (2006) for Weighted Bipartite Matching and Harvey (2006) for Unweighted Linear Matroid Intersection.

Keywords: Matroid Intersection, Algebraic Algorithms, Fast Matrix Multiplication

Algebraic Algorithms for Matching and Matroid Problems 2006/10
Nicholas J. A. Harvey.
IEEE Symposium on Foundations of Computer Science (FOCS 06), Berkeley, CA, October 2006. Received the Best Student Paper (Machtey) Award. PDF PS Slides.
SIAM Journal on Computing, 39(2):679-702, 2009. PDF.

We present new algebraic approaches for several well-known combinatorial problems, including non-bipartite matching, matroid intersection, and some of their generalizations. Our work yields new randomized algorithms that are the most efficient known. For non-bipartite matching, we obtain a simple, purely algebraic algorithm with running time O(nω) where n is the number of vertices and ω is the matrix multiplication exponent. This resolves the central open problem of Mucha and Sankowski (FOCS 2004). For matroid intersection, our algorithm has running time O(nrω-1) for matroids with n elements and rank r that satisfy some natural conditions.

Keywords: Non-bipartite Matching, Matroid Intersection, Algebraic Algorithms, Fast Matrix Multiplication
Notes: I recommend reading the journal version of this paper, not the conference version. The algorithms in the journal version are different and simpler. The conference paper sketched algorithms for the basic path-matching and independent matching problems. To keep the journal paper as simple as possible, these are not discussed there. The conference papers also claims that graphic matroid intersection can be solved in O(n2.575) time. When asked to provide the details, I could not reproduce them. Most likely I was mistaken, so I retract the claim.

Conservative Network Coding 2006/09
Nicholas J. A. Harvey, Kamal Jain, Lap Chi Lau, Chandra Nair, Yunnan Wu.
44th Annual Allerton Conference on Communication, Control, and Computing (Allerton 2006), Monticello, IL, September 2006. PDF Slides.

Motivated by practical networking scenarios, we introduce a notion of restricted communication called "conservative networking". Consider a network of lossless links and a number of independent sources. Each node needs to recover a certain subset of the sources. However, each node is "conservative" in that all information it receives can only be a function of the sources it will ultimately recover. For acyclic networks, we show that conservative networking admits a clean characterization: (i) The rates achievable by integer routing, factional routing, and network coding are equal, and (ii) this rate is determined by a simple cut bound. However, this clean characterization does not extend to cyclic networks. We present cyclic examples showing that (i) fractional routing can be strictly better than integer routing, (ii) network coding can be strictly better than fractional routing, and (iii) the cut bound cannot be achieved in general. This work underscores the difficulties generally encountered in cyclic networks.

Keywords: Network Coding, Capacity, Rate, Cuts

Methods for Efficient Network Coding 2006/09
Petar Maymounkov, Nicholas J. A. Harvey, Desmond S. Lun.
44th Annual Allerton Conference on Communication, Control, and Computing (Allerton 2006), Monticello, IL, September 2006. PDF Slides.

Random linear network coding is a capacity-achieving communication scheme for single sessions over lossy wireline or wireless packet networks. From the point of view of efficiently utilizing transmissions, random linear network coding is an attractive strategy. However, it is not presently attractive from the point of view of efficiently utilizing computational resources. A natural code design question arises: can we design a network code that preserves the communication efficiency of a random linear code and achieves better computational efficiency? In this paper, we give an affirmative answer to this question.

The code that we present as our answer achieves significantly better computational efficiency that existing schemes, despite being based primarily on standard techniques in the literature on erasure codes. We define and use the language of "adversarial schedules" to compare the performance of our coding scheme against other schemes with respect to a broad class of network environments in a concise manner.

Keywords: Network Coding, Rateless Codes, Precodes

Lower Bounds for Asymmetric Communication Channels and Distributed Source Coding 2006/01
Micah Adler, Erik D. Demaine, Nicholas J. A. Harvey, Mihai Patrascu.
ACM-SIAM Symposium on Discrete Algorithms (SODA 06), Miami, FL, January 2006. PDF PS.

We prove nearly tight lower bounds on the number of rounds of communication required by efficient protocols over asymmetric channels between a server (with high sending capacity) and one or more clients (with low sending capacity). This scenario captures the common asymmetric communication bandwidth between broadband Internet providers and home users, as well as sensor networks where sensors (clients) have limited capacity because of the high power requirements for long-range transmissions. An efficient protocol in this setting communicates n bits from each of the k clients to the server, where the clients' bits are sampled from a joint distribution D that is known to the server but not the clients, with the clients sending only O(H(D)+k) bits total, where H(D) is the entropy of distribution D. In the single-client case, there are efficient protocols using O(1) rounds in expectation and O(lg n) rounds in the worst case. We prove that this is essentially best possible: with probability 1/2^O(t lg t), any efficient protocol can be forced to use t rounds. In the multi-client case, there are efficient protocols using O(lg k) rounds in expectation. We prove that this is essentially best possible: with probability Omega(1), any efficient protocol can be forced to use Omega(lg k / lg lg k) rounds. Along the way, we develop new techniques of independent interest for proving lower bounds in communication complexity.

Keywords: Sensor Networks, Asymmetric, Lower Bound, Communication Complexity, Round Elimination

On the Capacity of Information Networks 2006/01
Micah Adler, Nicholas J. A. Harvey, Kamal Jain, Robert D. Kleinberg, April Rasala Lehman.
ACM-SIAM Symposium on Discrete Algorithms (SODA 06), Miami, FL, January 2006. PDF PS Slides.
Special Issue of the IEEE Transactions on Information Theory and the IEEE/ACM Transactions on Networking, 52(6):2345-2364, June 2006. PDF.

We consider the capacity of information networks in the absence of interference and noise. An outer bound on the rate region of these networks is presented, answering a question of Song, Yeung and Cai. This outer bound combines properties of entropy with a strong information inequality derived from the structure of the network. This blend of information theoretic and graph theoretic arguments generates many interesting results. For example, we exactly characterize the capacity of directed cycles, solving a question of Kramer and Savari. We also give the first known proof of a gap between the sparsity of an undirected graph and its capacity. Extending this result, we show that multicommodity flow solutions achieve the capacity in an infinite class of undirected graphs, thereby making important progress on a conjecture of Li and Li. This result is in sharp contrast to the situation with directed graphs, where we present a family of graphs in which the gap between the capacity and the rate achievable using multicommodity flows is linear in the size of the graph.

Keywords: Network Coding, Information Theory, Capacity, Multicommodity Flow, Sparsity, Gaps, k-pairs Communication Problems, Multiple Unicast Sessions, Okamura-Seymour Graph, Infomational Dominance, Matrix Transposition Problem
Notes: Very similar results were independently obtained in Jain, Vazirani, Yuval. "On the capacity of multiple unicast sessions in undirected graphs". IEEE/ACM Transactions on Networking, 2006. The paper in SODA is a merge of these two papers, plus additional results obtained with Micah Adler.

The Complexity of Matrix Completion 2006/01
Nicholas J. A. Harvey, David R. Karger, Sergey Yekhanin.
ACM-SIAM Symposium on Discrete Algorithms (SODA 06), Miami, FL, January 2006. PDF PS Slides.

Given a matrix whose entries are a mixture of numeric values and symbolic variables, the matrix completion problem is to assign values to the variables so as to maximize the resulting matrix rank. This problem has deep connections to computational complexity and numerous important algorithmic applications. Determining the complexity of this problem is a fundamental open question in computational complexity. Under different settings of parameters, the problem is variously in P, in RP, or NP-hard. We shed new light on this landscape by demonstrating a new region of NP-hard scenarios. As a special case, we obtain the first known hardness result for matrices in which each variable appears only twice.

Another particular scenario that we consider is the simultaneous matrix completion problem, where one must simultaneously maximize the rank for several matrices that share variables. This problem has important applications in the field of network coding. Recent work has given a simple, greedy, deterministic algorithm for this problem, assuming that the algorithm works over a sufficiently large field. We show an exact threshold for the field size required to find a simultaneous completion efficiently. This result implies that, surprisingly, the simple greedy algorithm is optimal: finding a simultaneous completion over any smaller field is NP-hard.

Keywords: Matrix Completion, Mixed Matrices, NP-hard, Polynomial Identity Testing

Tighter Cut-based Bounds for k-pairs Communication Problems 2005/09
Nicholas J. A. Harvey, Robert D. Kleinberg.
43rd Annual Allerton Conference on Communication, Control, and Computing (Allerton 2005), Monticello, IL, September 2005. PDF PS Slides.

We study the extent to which combinatorial cut conditions determine the maximum network coding rate of k-pairs communication problems. We seek a combinatorial parameter of directed networks which constitutes a valid upper bound on the network coding rate but exceeds this rate by only a small factor in the worst case. (This worst-case ratio is called the gap of the parameter.) We begin by considering vertex-sparsity and meagerness, showing that both of these parameters have a gap which is linear in the network size. Due to the weakness of these bounds, we propose a new bound called informational meagerness. This bound generalizes both vertex-sparsity and meagerness and is potentially the first known combinatorial cut condition with a sublinear gap. However, we prove that informational meagerness does not tightly characterize the network coding rate: its gap can be super-constant.

Keywords: Network Coding, Information Theory, Capacity, Bounds, Sparsity, Informational Dominance

Deterministic Network Coding by Matrix Completion 2005/01
Nicholas J. A. Harvey, David R. Karger, Kazuo Murota.
ACM-SIAM Symposium on Discrete Algorithms (SODA 05), Vancouver, Canada, January 2005. PDF PS Slides.

We present a new deterministic algorithm to construct network codes for multicast problems, a particular class of network information flow problems. Our algorithm easily generalizes to several variants of multicast problems. Our approach is based on a new algorithm for maximum-rank completion of mixed matrices---taking a matrix whose entries are a mixture of numeric values and symbolic variables, and assigning values to the variables so as to maximize the resulting matrix rank. Our algorithm is faster than existing deterministic algorithms and can operate over a smaller field.

Keywords: Network Coding, Multicast, Deterministic Algorithm, Matrix Completion, Maximum Rank, Mixed Matrices
Notes: After this research was done, we discovered that a deterministic algorithm for network coding multicast had already been obtained in Jaggi, Sanders, Chou, Effros, Egner, Jain, and Tolhuizen. "Polynomial time algorithms for multicast network code construction", IEEE Transactions on Information Theory 2005. Their algorithm is quite a bit simpler, especially if you're not familiar with matroids.

My master's thesis is an extended version of this work, containing a much more detailed explanation than the conference version.

FUSE: Lightweight Guaranteed Distributed Failure Notification 2004/12
John Dunagan, Nicholas J. A. Harvey, Michael B. Jones, Dejan Kostic, Marvin Theimer, Alec Wolman.
Sixth Symposium on Operating Systems Design and Implementation (OSDI '04), San Francisco, CA, December 2004. PDF.

FUSE is a lightweight failure notification service for building distributed systems. Distributed systems built with FUSE are guaranteed that failure notifications never fail. Whenever a failure notification is triggered, all live members of the FUSE group will hear a notification within a bounded period of time, irrespective of node or communication failures. In contrast to previous work on failure detection, the responsibility for deciding that a failure has occurred is shared between the FUSE service and the distributed application. This allows applications to implement their own definitions of failure. Our experience building a scalable distributed event delivery system on an overlay network has convinced us of the usefulness of this service. Our results demonstrate that the network costs of each FUSE group can be small; in particular, our overlay network implementation requires no additional liveness-verifying ping traffic beyond that already needed to maintain the overlay, making the steady state network load independent of the number of active FUSE groups.

Keywords: Failure notification, overlay networks

Comparing Network Coding with Multicommodity Flow for the k-pairs Communication Problem 2004/09
Nicholas J. A. Harvey, Robert D. Kleinberg, April Rasala Lehman.
MIT LCS Technical Report 964, September 2004. PDF.

Given a graph G = (V,E) and k source-sink pairs of vertices, this paper investigates the maximum rate r at which all pairs can simultaneously communicate. We view this problem from two perspectives and compare their advantages. In the multicommodity flow formulation, a solution provides dedicated bandwidth r between each source-sink pair. In the information flow formulation, a vertex can transmit a function of the information it received thereby allowing multiple source-sink pairs to share bandwidth. For directed acyclic graphs with n vertices, we show that the rate achievable in the information flow formulation can be a multiplicative factor n larger than the rate achievable in the multicommodity flow formulation. It is well known that for undirected graphs with n vertices, in the multicommodity flow formulation, the maximum rate achievable can be an O(1/log |V|) multiplicative factor smaller than the value of the sparsest cut. We extend this result to show that the maximum rate achievable in the information flow setting can be an O(1/log |V|) multiplicative factor smaller than the sparsest cut value. For directed acyclic graphs G, we define a parameter called the value of the most meager cut which is an upper bound for the maximum rate achievable in the information flow setting. We also present an example illustrating that this upper bound is not always tight.

Keywords: Network Coding, Multicommodity Flow, Gaps

The Post-Order Heap 2004/05
Nicholas J. A. Harvey, Kevin C. Zatloukal.
Third International Conference on Fun with Algorithms (FUN 2004), Elba, Italy, May 2004. PDF PS Slides.
Source code is available.

We propose the post-order heap, a new data structure for implementing priority queues. Our structure is a simple variant of the binary heap that requires only O(1) amortized time for Insert operations and O(log n) time in the worst case for Delete-Min operations. Like the binary heap, the post-order heap is an implicit data structure, meaning that a structure containing n elements can be stored using only the first n locations of an array and O(1) additional words of storage.

Keywords: Priority Queue, Heap, Amortized Analysis, Implicit Data Structure
Notes: After writing the initial version of this paper, we discovered that an implicit data structure with better runtime bounds had been obtained 15 years earlier in S. Carlsson, J. I. Munro, and P. V. Poblete. "An implicit binomial queue with constant insertion time". SWAT 1988, 1-13. Our structure is somewhat simpler since we only implement a binary heap, not a binomial queue.

Family Trees: An ordered dictionary with optimal congestion, locality, degree, and search time 2004/01
Kevin C. Zatloukal, Nicholas J. A. Harvey.
ACM-SIAM Symposium on Discrete Algorithms (SODA 04), New Orleans, LA, January 2004. PDF PS Slides.

We consider the problem of storing an ordered dictionary data structure over a distributed set of nodes. In contrast to traditional sequential data structures, distributed data structures should ideally have low congestion. We present a novel randomized data structure, called a family tree, to solve this problem. A family tree has optimal expected congestion, uses only a constant amount of state per node, and supports searches and node insertion/deletion in expected O(log n) time on a system with n nodes. Furthermore, a family tree supports keys from any ordered domain. Because the keys are not hashed, searches have good locality in the sense that intermediate nodes on the search path have keys that are not far outside of the range between the source and destination.

Keywords: Distributed Data Structure, Ordered Dictionary, Peer-to-peer, Congestion, Locality, Randomization
Notes: A related and simpler design can be found in Goodrich, Nelson, and Sun. "The rainbow skip graph: A fault-tolerant constant-degree distributed data structure", SODA 2006.

Deterministic SkipNet 2003/07
Nicholas J. A. Harvey, J. Ian Munro.
ACM Symposium on Principles of Distributed Computing (PODC 2003), Boston, MA, July 2003. PDF PS.
Information Processing Letters, 90(4):205-208, May 2004. PDF.

We present a deterministic scalable overlay network. In contrast, most previous overlays use randomness or hashing (pseudo-randomness) to achieve a uniform distribution of data and routing traffic.

Keywords: Peer-to-Peer, Deterministic, Scalable, Locality, Self-Configuring, Distributed System

Semi-Matchings for Bipartite Graphs and Load Balancing 2003/07
Nicholas J. A. Harvey, Richard E. Ladner, Laszlo Lovasz, Tami Tamir.
Workshop on Algorithms and Data Structures (WADS 2003), Ottawa, Canada, July 2003. PDF PS Slides.
Journal of Algorithms, 59(1):53-78, 2006. PDF PS.

We consider the problem of fairly matching the left-hand vertices of a bipartite graph to the right-hand vertices. We refer to this problem as the semi-matching problem; it is a relaxation of the known bipartite matching problem. We present a way to evaluate the quality of a given semi-matching and show that, under this measure, an optimal semi-matching balances the load on the right hand vertices with respect to any Lp-norm. In particular, when modeling a job assignment system, an optimal semi-matching achieves the minimal makespan and the minimal flow time for the system.

The problem of finding optimal semi-matchings is a special case of certain scheduling problems for which known solutions exist. However, these known solutions are based on general network optimization algorithms, and are not the most efficient way to solve the optimal semi-matching problem. To compute optimal semi-matchings efficiently, we present and analyze two new algorithms. The first algorithm generalizes the Hungarian method for computing maximum bipartite matchings, while the second, more efficient algorithm is based on a new notion of cost-reducing paths. Our experimental results demonstrate that the second algorithm is vastly superior to using known network optimization algorithms to solve the optimal semi-matching problem. Furthermore, this same algorithm can also be used to find maximum bipartite matchings and is shown to be roughly as efficient as the best known algorithms for this goal.

Keywords: Graph Algorithms, Bipartite Matching, Load Balancing, Scheduling, Assignment, Flow Time
Notes: The analysis of our algorithm ASM2 is due to David Bruce Wilson. A much nicer algorithm for the Optimal Semi Matching problem appears in Jittat Fakcharoenphol, Bundit Laekhanukit, Danupon Nanongkai. "Faster Algorithms for Semi-Matching Problems", ICALP 2010.

SkipNet: A Scalable Overlay Network with Practical Locality Properties 2003/03
Nicholas J. A. Harvey, Michael B. Jones, Stefan Saroiu, Marvin Theimer, Alec Wolman.
Fourth USENIX Symposium on Internet Technologies and Systems (USITS '03), Seattle, WA, March 2003. Received the Best Paper Award. PDF PS HTML Slides.
Long version, October 2003. PDF PS.
Source code is available.

Scalable overlay networks such as Chord, CAN, Pastry, and Tapestry have recently emerged as flexible infrastructure for building large peer-to-peer systems. In practice, such systems have two disadvantages: They provide no control over where data is stored and no guarantee that routing paths remain within an administrative domain whenever possible. SkipNet is a scalable overlay network that provides controlled data placement and guaranteed routing locality by organizing data primarily by string names. SkipNet allows for both fine-grained and coarse-grained control over data placement: Content can be placed either on a pre-determined node or distributed uniformly across the nodes of a hierarchical naming subtree. An additional useful consequence of SkipNet's locality properties is that partition failures, in which an entire organization disconnects from the rest of the system, can result in two disjoint, but well-connected overlay networks.

Keywords: Peer-to-Peer, Scalable, Locality, Self-Configuring, Range Query, Distributed System
Notes: Essentially the same idea was independently discovered by Aspnes and Shah, and published in James Aspnes, Gauri Shah, "Skip graphs". SODA 2003, 384-393.

Efficient Recovery From Organizational Disconnect in SkipNet 2003/02
Nicholas J. A. Harvey, Michael B. Jones, Marvin Theimer, Alec Wolman.
2nd International Workshop on Peer-to-Peer Systems (IPTPS '03), Berkeley, CA, February 2003. PDF PS.

SkipNet is a scalable overlay network that provides controlled data placement and routing locality guarantees by organizing data primarily by lexicographic ordering of string names. A key side-effect of the SkipNet design is that all nodes from an organization form one or a few contiguous overlay segments. When an entire organization disconnects from the rest of the system, repair of only a few pointers quickly enables efficient routing throughout the disconnected organization; full repair is done as a subsequent background task. These same operations can be later used to efficiently reconnect an organization's SkipNet back into the global one.

Keywords: Peer-to-Peer, Fault Tolerance, Failures, Locality, Self-Configuring