There are two main categories of distance functions in measuring
similarity of time serein data . The first category consists of
the Lp-norms. They are metric distance functions but cannot
support local time shifting. The second category consists of
distance functions which are capable of handling local time
shifting. Unfortunately, they are all non-metric distance
functions. The first contribution of this paper is the proposal
of a new distance function, which we call ERP (``Edit distance
with Real Penalty''). ERP represents a marriage of L1-norm and the
edit distance, which can support local time shifting, and is a
metric distance function.
The second contribution of the paper is the development of pruning
strategies for large time series databases. Given that ERP is a
metric, one way to prune is to apply the triangle inequality.
Another way to prune is to develop a lower bound on the ERP
distance. We propose such a lower bound, which has the nice
computational property that it can be efficiently indexed with a
standard B+-tree. Moreover, we show that these two ways of pruning
can be used simultaneously for ERP distances. Specifically, the
false positives obtained from the B+-tree can be further minimized
by applying the triangle inequality. Based on extensive
experimentation with existing benchmarks and techniques, we show
that this combination delivers superb pruning power and search
time performance.