- - -
What is multidimensional scaling?
Multidimensional scaling (MDS) is a dimensionality reduction technique for
exploratory data analysis. Its primary effect is to "squash" data into a
visualizable space. For example, your data may be 7 dimensional, but be
distributed in a clustered way. One can process the data with MDS to produce a
set of two-dimensional coordinates that can be plotted and visually analyzed.
There are essentially two ways to delinate
- metric and nonmetric scaling
- classical and distance scaling
Glimmer doesn't handle nonmetric scaling, so I only describe the
classical/distance dichotomy. Classical scaling is a method for
linearly projecting the data into a subspace and is essentially equivalent to principal
components analysis. The projection is optimal in the least-squared error sense and
Distance scaling is a little more nuanced than classical scaling. Rather than
finding the least-squares projection, distance scaling seeks to minimize the total
least-squared differences between the distances in the source coordinates of the data and
the low-dimensional "squashed" coordinates.
In many cases this strategy offers qualitative
advantages to classical scaling, but the optimal solution is NP-Hard. Worse,
the objective function (called "stress") is multi-modal, so local minima exist.
What is Glimmer?
Glimmer is an algorithm for performing
metric distance scaling. It uses the GPU to reduce the total computation
time and it employs a hierarchical approach to improve the quality of the final
solution. For more details, please consult the journal article
How do I get Glimmer?
Glimmer can be downloaded here
. It is offered as a
compressed visual c++ project. The "cpusources" directory contains the CPU
code written in c++ and the "gpusources" directory contains the GPU shader code
written in GLSL.
How do I build/use Glimmer?
You will need to have an need to have a somewhat recent NVIDIA card, with the
latest drivers and the latest version (as of
Visual C++. The README has instructions on using glimmer. The
download contains sample datasets and command lines including DOS scripts
to reproduce the figures in the paper.
- - -