Empirical Testing of Fast Kernel Density Estimation Algorithms

ID
TR-2005-03
Authors
Dustin Lang, Mike Klaas and Nando de Freitas
Publishing date
May 19, 2005
Length
6 pages
Abstract
We present results of experiments testing the Fast Gauss Transform, Improved Fast Gauss Transform, and Dual-Tree methods (using kd-tree and Anchors Hierarchy data structures) for fast Kernel Density Estimation (KDE). We examine the performance of these methods with respect to data set size, dimension, allowable error, and data set structure ("clumpiness"), measured in terms of CPU time and memory usage. This is the first multi-method comparison in the literature. The results are striking, challenging several claims that are commonly made about these methods. The results are useful for researchers considering fast methods for KDE problems. Along the way, we provide a corrected error bound and a parameter-selection regime for the IFGT algorithm.