Christopher G. Healey

Effective Visualization of Large, Multidimensional Datasets


Degree: Ph.D.
Type: thesis
Year: 1996
Supervisors: Kellogg S. Booth and James T. Enns
Electronic: [PDF], 1285811 bytes
Hardcopy: 252 pages

Abstract

A new method for assisting with the visualization of large multidimensional datasets is proposed. We classify datasets with more than one million elements as large. Multidimensional data elements are elements with two or more dimensions, each of which is at least binary. Multidimensional data visualization involves representation of multidimensional data elements in a low dimensional environment, such as a computer screen or printed media. Traditional visualization techniques are not well suited to solving this problem.

Our data visualization techniques are based in large part on a field of cognitive psychology called preattentive processing. Preattentive processing is the study of visual features that are detected rapidly and with little effort by the human visual system. Examples include hue, orientation, form, intensity, and motion. We studied ways of extending and applying research results from preattentive processing to address our visualization requirements. We used our investigations to build visualization tools that allow a user to very rapidly and accurately perform exploratory analysis tasks. These tasks include searching for target elements, identifying boundaries between groups of common elements, and estimating the number of elements that have a specific visual feature. Our experimental results were positive, suggesting that dynamic sequences of frames can be used to explore large amounts of data in a relatively short period of time.

Recent work in both scientific visualization and database systems has started to address the problems inherent in managing large scientific datasets. One promising technique is knowledge discovery, "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data". We hypothesise that knowledge discovery can be used as a filter to reduce the amount of data sent to the visualization tool. Data elements that do not belong to a user-chosen group of interest can be discarded, the dimensionality of individual data elements can be compressed, and previously unknown trends and relationships can be discovered and explored.

We illustrate how our techniques can be used by applying them to real-world data and tasks. This includes the visualization of simulated salmon migration results, computerized tomography medical slices, and environmental datasets that track ocean and atmospheric conditions.


@PhdThesis{Healey1996,
	author = {Christopher G. Healey},
	title = {Effective Visualization of Large, Multidimensional Datasets},
	school = {University of British Columbia},
	year = {1996},
	supervisor = {Kellogg S. Booth and James T. Enns},
}