NIPS 2004 - Fast N-Body Learning

Nonparametric Belief Propagation with Fast Methods. Alexander Ihler (MIT)

Nonparametric Belief Propagation (NBP) provides a Monte Carlo (sample-based) approximation to Belief Propagation (BP) for continuous, non-Gaussian distributions defined on arbitrary graphical models. NBP approximates each BP message using a kernel density estimate, specifically a Gaussian mixture distribution (and is thus similar to regularized particle filtering on Markov chains). We have applied NBP successfully to several problems, including distributed inference in sensor networks and video tracking of articulated models. Because NBP makes heavy use of kernel density estimates, fast methods and approximations play a key role in its practical use. In particular, drawing samples from the product of d Gaussian mixtures is an extremely costly procedure (naively O(N^d)). Using multipole-like methods, one may reduce this cost to (asymptotically) O(N). Fast density approximation techniques may also be applied to render the algorithm more computationally tractable.
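The O(N^d) blow-up is easy to see by writing the product mixture out explicitly. Below is a minimal one-dimensional sketch (toy sizes, not the NBP sampler itself) that enumerates all N^d product components, from which exact samples can then be drawn:

```python
import itertools
import math

import numpy as np

def product_mixture(means, variances, weights):
    """Exhaustively form the product of d one-dimensional Gaussian mixtures.

    Each argument is a list of d equal-length sequences (one per mixture).
    The product of d mixtures with N components each is itself a Gaussian
    mixture with N**d components -- enumerating them is the naive O(N^d)
    cost quoted above.  Returns (weights, means, variances) of the product.
    """
    d = len(means)
    w_out, m_out, v_out = [], [], []
    for idx in itertools.product(*(range(len(m)) for m in means)):
        mu = np.array([means[j][idx[j]] for j in range(d)], dtype=float)
        var = np.array([variances[j][idx[j]] for j in range(d)], dtype=float)
        v = 1.0 / np.sum(1.0 / var)            # product component variance
        m = v * np.sum(mu / var)               # product component mean
        # z = integral over x of the product of the d Gaussian densities
        log_z = (np.sum(-0.5 * np.log(2 * np.pi * var) - 0.5 * (m - mu) ** 2 / var)
                 + 0.5 * np.log(2 * np.pi * v))
        w = math.prod(weights[j][idx[j]] for j in range(d)) * np.exp(log_z)
        w_out.append(w); m_out.append(m); v_out.append(v)
    w_out = np.array(w_out)
    return w_out / w_out.sum(), np.array(m_out), np.array(v_out)

# Exact sampling: pick a component by weight, then draw one Gaussian variate.
w, m, v = product_mixture([[-1.0, 1.0]] * 3, [[1.0, 1.0]] * 3, [[0.5, 0.5]] * 3)
rng = np.random.default_rng(0)
i = rng.choice(len(w), p=w)
sample = rng.normal(m[i], np.sqrt(v[i]))
```

The multipole-like methods from the talk avoid exactly this enumeration by sampling without ever materializing the N^d components.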

Distance Transforms of Sampled Functions. Daniel Huttenlocher (Cornell University)

A distance transform of a binary image specifies for each 0-valued pixel the distance to the nearest 1-valued pixel (or vice versa). This tutorial will describe a generalization of distance transforms to arbitrary sampled functions rather than binary functions and will then provide simple, efficient algorithms for computing them. These algorithms are applicable to computing classical distance transforms of binary images as well as to solving a broad class of minimization problems involving a cost function with both local and spatial terms. Alternatively, the techniques can be viewed in terms of the minimum convolution of two functions, which is an important operation in grayscale morphology. The methods are also applicable to Viterbi decoding, belief propagation and optimal control.
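For the squared Euclidean case, the one-dimensional transform d(p) = min_q ((p - q)^2 + f(q)) can be computed in linear time by maintaining the lower envelope of parabolas; the sketch below follows the published Felzenszwalb-Huttenlocher algorithm this tutorial is based on:

```python
def dt1d(f):
    """Linear-time 1-D distance transform of a sampled function:
    d[p] = min_q ((p - q)**2 + f[q]).

    Maintains the lower envelope of the n parabolas (p - q)**2 + f[q];
    v holds the parabola locations in the envelope and z the boundaries
    between consecutive parabolas.  Applying this along rows and then
    columns yields the 2-D squared Euclidean transform.
    """
    n = len(f)
    d = [0.0] * n
    v = [0] * n                 # envelope parabola locations
    z = [0.0] * (n + 1)         # envelope boundaries
    k = 0
    z[0], z[1] = float('-inf'), float('inf')
    for q in range(1, n):
        while True:
            # horizontal position where parabola q overtakes parabola v[k]
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
            if s <= z[k]:
                k -= 1          # parabola v[k] is hidden; discard it
            else:
                break
        k += 1
        v[k] = q
        z[k], z[k + 1] = s, float('inf')
    k = 0
    for p in range(n):
        while z[k + 1] < p:
            k += 1
        d[p] = (p - v[k]) ** 2 + f[v[k]]
    return d

# Binary-image special case: f is 0 on feature pixels, "infinite" elsewhere.
BIG = 1e9
print(dt1d([BIG, 0, BIG, BIG]))   # -> [1, 0, 1, 4]
```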

Fast High-Dimensional Feature Indexing for Object Recognition. David Lowe (UBC)

The best current methods for object recognition extract many high-dimensional local features from each image and then compare each feature against a large database of features from many training images. This talk will discuss practical methods for solving this high-dimensional indexing problem. The most successful methods found so far have been those using approximate variants of k-d tree lookup. These can find almost all useful matches while running at least four orders of magnitude faster than the best known methods for exact matching. One reason for this success is that matches are useful only when they are significantly closer than would be predicted by the local density of non-matching neighbors. The result is that the high-dimensional feature indexing problem can be solved in practice, even for real-time systems and very large databases.
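The approximate k-d tree lookup plus "significantly closer" test can be sketched with SciPy's cKDTree; this is a generic illustration, not the talk's implementation, and the 0.8 ratio threshold and eps value are illustrative choices:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
database = rng.standard_normal((1000, 16))   # stand-in for a feature database
tree = cKDTree(database)

def match(feature, ratio=0.8, eps=0.5):
    """Approximate best-match lookup with a distance-ratio test.

    eps > 0 makes the k-d tree query approximate: branches that cannot
    beat (1 + eps) times the current best distance are pruned.  A match
    is kept only when the nearest neighbor is significantly closer than
    the second nearest, discarding matches explained by the local
    density of non-matching neighbors.
    """
    dist, idx = tree.query(feature, k=2, eps=eps)
    return int(idx[0]) if dist[0] < ratio * dist[1] else None
```

A database point always matches itself (distance 0 to itself, positive distance to its second neighbor), while a far-away random query typically fails the ratio test and returns None.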

Tutorial on Statistical N-Body Problems and Proximity Data Structures. Alex Gray (CMU)

First I will give an overview of adaptive tree data structures which can be used to make statistical computations go fast. I will describe the results of a recently completed, thorough experimental comparison of the best-known data structures for nearest-neighbor search, comparing over 30 different methods. I'll survey some important generalized N-body problems which arise in statistical learning, including kernel density estimation, all-nearest-neighbors, Gaussian process regression, local polynomial regression, n-point correlations, nonparametric Bayes classification, and several others. Then I will describe the dual-tree (or multi-tree) framework for developing fast linear-time algorithms for statistical N-body problems, including how it generalizes and extends work in computational geometry, spatial databases, and computational physics. I'll also explain why this new approach is better suited to statistical problems than the classical approaches used for N-body problems in physics. I'll then give several examples of how to use the framework to create fast methods for the various problems above, each requiring a different twist, and highlight the new idea needed in each case. These examples will also illustrate several exciting open questions which I will pose for the workshop.
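The flavor of a dual-tree computation can be conveyed with a small one-dimensional Gaussian kernel summation: when the kernel is provably near-constant over a (query node, reference node) pair, the whole pair is handled at once. This is a simplified sketch of the idea, not the framework from the talk:

```python
import numpy as np

def build(points, leaf=8):
    """Sorted array plus a binary tree of [lo, hi) index ranges with bounds."""
    pts = np.sort(np.asarray(points, dtype=float))
    def make(lo, hi):
        node = {'lo': lo, 'hi': hi, 'min': pts[lo], 'max': pts[hi - 1]}
        if hi - lo > leaf:
            mid = (lo + hi) // 2
            node['children'] = (make(lo, mid), make(mid, hi))
        return node
    return pts, make(0, len(pts))

def dual_tree_sum(qpts, qnode, rpts, rnode, h, out, tol=1e-3):
    """Accumulate out[i] += sum_j exp(-((qpts[i] - rpts[j]) / h)**2).

    If the kernel provably varies by less than tol across the whole
    (query node, reference node) pair, a single value serves the entire
    pair -- the dual-tree prune.  Otherwise recurse on the larger node.
    """
    dmin = max(0.0, qnode['min'] - rnode['max'], rnode['min'] - qnode['max'])
    dmax = max(qnode['max'] - rnode['min'], rnode['max'] - qnode['min'])
    kmax, kmin = np.exp(-(dmin / h) ** 2), np.exp(-(dmax / h) ** 2)
    nref = rnode['hi'] - rnode['lo']
    if kmax - kmin < tol:                         # prune the whole pair
        out[qnode['lo']:qnode['hi']] += nref * 0.5 * (kmax + kmin)
        return
    qleaf, rleaf = 'children' not in qnode, 'children' not in rnode
    if qleaf and rleaf:                           # base case: exhaustive sum
        q = qpts[qnode['lo']:qnode['hi'], None]
        r = rpts[None, rnode['lo']:rnode['hi']]
        out[qnode['lo']:qnode['hi']] += np.exp(-((q - r) / h) ** 2).sum(axis=1)
    elif rleaf or (not qleaf and qnode['hi'] - qnode['lo'] >= nref):
        for child in qnode['children']:
            dual_tree_sum(qpts, child, rpts, rnode, h, out, tol)
    else:
        for child in rnode['children']:
            dual_tree_sum(qpts, qnode, rpts, child, h, out, tol)
```

The same recursion skeleton, with different bounds and base cases, covers the other generalized N-body problems listed above.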

Fast High-dimensional Function Integration, without Markov Chains. Alex Gray (CMU)

I will present a new method for the integration of high-dimensional functions which is not based on Markov chain theory. Instead, I develop from 'first principles' (of Monte Carlo theory) a new framework in which the integration problem is cast explicitly as a problem of statistical estimation, or 'reconstruction', of the function from samples. The method uses a nonparametric regression estimator which directly minimizes integration error, rather than a loss function disconnected from that goal, as others have proposed. The main computational sub-problem is approached using custom-designed N-body algorithms based on a recently developed generalized framework for such problems. The resulting method addresses the two main issues of the Metropolis-Hastings method: its parameter selection is automatic and optimal, and it produces iid samples, making its behavior easier to understand. It is also more general - for example, it can directly estimate the normalizing constant arising in Bayesian inference and statistical mechanics. A head-to-head experimental comparison with MCMC on canonical challenge functions will illustrate the new method's practical viability. More broadly, characterizing integration as statistical learning opens the door to a host of powerful mathematical and computational tools beyond those of Markov chain theory.
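The talk's method is not spelled out here, but the contrast with Markov chains can be illustrated by the simplest iid alternative: an importance sampler, which unlike a Metropolis-Hastings chain produces independent samples and estimates the normalizing constant directly. A generic baseline sketch (the target and proposal below are illustrative toys, not the talk's benchmarks):

```python
import numpy as np

def integrate_iid(f, sample, pdf, n=100_000):
    """Estimate Z = integral of f(x) dx from iid draws of a known proposal.

    The draws are independent, so the error behaves like that of an
    ordinary sample mean, and the unnormalized integral (e.g. a Bayesian
    normalizing constant) is estimated directly -- two things a plain
    Metropolis-Hastings chain does not provide.
    """
    x = sample(n)                          # iid samples -- no Markov chain
    w = f(x) / pdf(x)                      # importance weights
    return w.mean(), w.std() / np.sqrt(n)  # estimate and its standard error

rng = np.random.default_rng(0)
# Toy target: unnormalized Gaussian; true integral is sqrt(2*pi) ~= 2.5066
est, se = integrate_iid(lambda x: np.exp(-x ** 2 / 2),
                        lambda n: rng.uniform(-8.0, 8.0, n),
                        lambda x: np.full_like(x, 1.0 / 16.0))
```

The estimation-theoretic framing in the abstract goes further than this baseline: it chooses the estimator to minimize integration error rather than fixing a proposal in advance.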

Tutorial on Fast Multipole Methods and the FGT. Ramani Duraiswami (University of Maryland)

The fast multipole method has been called one of the ten most significant numerical algorithms discovered in the 20th century, and won its inventors, Vladimir Rokhlin and Leslie Greengard, the Steele Prize. The algorithm allows the product of particular dense matrices with a vector to be evaluated in O(N log N) operations, whereas direct multiplication requires O(N^2) operations. Coupled with advances in iterative methods for the solution of linear systems, fast multipole methods allow O(N log N) time complexity and O(N) memory complexity for problems that hitherto required O(N^3) time and O(N^2) memory. I will discuss the general ideas of the method and provide some specific examples of its application.
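The core trick, condensing N sources into a short series expansion that can then be evaluated cheaply at any target, is easiest to see in the Gauss-transform instance discussed next. A single-box, one-dimensional Hermite-expansion sketch (the truncation order p is an illustrative choice):

```python
import math

import numpy as np

def hermite_fgt(sources, q, targets, h, center, p=20):
    """Single-box fast Gauss transform sketch in one dimension.

    Approximates G(y) = sum_j q_j exp(-((y - x_j) / h)**2) by truncating
    the Hermite expansion about `center`: the N sources are condensed
    into p moments once, then each target is read off the moments, so
    the O(N*M) double sum becomes O(p*(N + M)).  Accurate when the
    sources lie within roughly h of the center.
    """
    t = (np.asarray(sources) - center) / h
    # moments A_n = sum_j q_j t_j^n / n!
    A = np.array([np.sum(q * t ** n) / math.factorial(n) for n in range(p)])
    s = (np.asarray(targets) - center) / h
    # Hermite functions h_n(s) = (-1)^n d^n/ds^n exp(-s^2), built with the
    # recurrence h_{n+1}(s) = 2 s h_n(s) - 2 n h_{n-1}(s)
    H = np.empty((p, len(targets)))
    H[0] = np.exp(-s ** 2)
    if p > 1:
        H[1] = 2 * s * H[0]
    for n in range(1, p - 1):
        H[n + 1] = 2 * s * H[n] - 2 * n * H[n - 1]
    return A @ H
```

The full method tiles space into boxes, forms one expansion per box, and translates expansions between boxes; this sketch shows only the single-box condense-and-evaluate step.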

One particular application, and the one most relevant to density estimation techniques, is the fast Gauss transform (FGT). Unfortunately, the cost of a direct extension of the FGT to higher-dimensional problems grows exponentially with dimension (the curse of dimensionality). We introduce an improved version of the fast Gauss transform to efficiently estimate sums of Gaussians in higher dimensions. Our proposed algorithm incorporates three innovations that dramatically improve performance: a multivariate Taylor expansion scheme, a pyramid-like data structure to store and manipulate the expansions, and an adaptive space subdivision technique based on the k-center algorithm. These innovations result in an algorithm that is much faster in practice and has been demonstrated in dimensions up to ten.
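The k-center subdivision step can be sketched with the standard greedy farthest-point algorithm, which is a 2-approximation to the optimal k-center clustering; this is a generic sketch, not the paper's code:

```python
import numpy as np

def k_center(points, k, seed=0):
    """Greedy farthest-point k-center clustering (a 2-approximation).

    Picks an arbitrary first center, then repeatedly promotes the point
    farthest from all chosen centers.  The result is a set of k compact
    cells, suitable for forming one series expansion per cell.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    centers = [int(rng.integers(len(points)))]
    dist = np.linalg.norm(points - points[centers[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))        # farthest point becomes a center
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    c = points[centers]
    labels = np.argmin(np.linalg.norm(points[:, None] - c[None, :], axis=2),
                       axis=1)
    return c, labels
```

Unlike a uniform grid, the number of cells here is independent of the dimension, which is what makes the subdivision adaptive.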

The fast Gauss transform has been applied to kernel density estimation (KDE), to the mean shift algorithm for clustering, where it removes the quadratic cost of each iteration, and to kernel machines. We present results from these applications.
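For concreteness, here is a plain O(N^2)-per-iteration Gaussian mean shift; both sums in its update are Gauss transforms, which is exactly what the FGT accelerates. A sketch of the baseline, not the accelerated implementation:

```python
import numpy as np

def mean_shift(points, h=1.0, iters=50):
    """Gaussian-kernel mean shift, naive O(N^2) per iteration.

    Every point moves to the kernel-weighted mean of the data; the
    weighted sums in that update are Gauss transforms, so replacing them
    with the FGT removes the quadratic per-iteration cost.
    """
    data = np.asarray(points, dtype=float)
    x = data.copy()
    for _ in range(iters):
        d2 = ((x[:, None, :] - data[None, :, :]) ** 2).sum(axis=-1)
        w = np.exp(-d2 / (2.0 * h * h))        # kernel weights to the data
        x = (w @ data) / w.sum(axis=1, keepdims=True)
    return x                                   # points collapsed onto modes
```

Points belonging to the same density mode converge to the same location, which is how the procedure yields a clustering.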