Image Databases
There is considerable research interest in efficiently indexing,
querying, and retrieving images stored in databases.
Images can be queried in a number of ways:
- using identifiers (e.g., the keyword "Rembrandt" when searching
for paintings by Rembrandt)
- using "features" that appear in the image, such as:
- colours (and perhaps the relative locations of those colours
in the image)
- objects (e.g., houses, faces)
- shapes (e.g., circularity, eccentricity, large/small area)
- texture (e.g., rough vs. smooth -- green tree vs. green
car)
Queries in relational databases rely on exact matches, whereas
queries in image databases rely on approximate matches; hence, we
need some notion of similarity. This can be achieved through the
use of histograms. A k-bin colour histogram defines k colour ranges
and stores a count of the number of pixels in the image falling
into each colour range. A k-bin histogram can be represented by
a k-dimensional vector, and a similarity measure such as an L-1
norm (Manhattan distance) or an L-2 norm (Euclidean distance) can
be used.
We can view a k-bin histogram as a point in a k-dimensional space.
A "fixed radius" or "nearest neighbour" search can then be used
to identify the top-n candidates. A naive approach would be to compare
the histograms of all images in the database to the target (query)
histogram. If the radius is too large, we may get too many hits;
if the radius is too small, we may dismiss some good candidates.
With respect to correctness, we do not want false dismissals, but
are prepared to deal with false positives.
A natural question to the above approach concerns the number of
bins to use. This also introduces a trade-off between accuracy and
efficiency. For example, expensive, fine-grained computations may
utilize 256 bins; however, it would be good to use a suitable abstraction
(e.g., 3 bins), and then use a filtering technique. Ideally, the
filtered candidates should be a small subset of the database, but
also a superset of the answer set. Fine-grained computations can
then be limited to the filtered list, thus realizing potentially
large savings since the time to perform the filtering plus the time
to perform the fine-grained computations would typically be far
less than a naive approach.
Other methods besides histograms can be used, such as Principal
Component Analysis. PCA uses eigenvalues and is related to the Singular
Value Decomposition. There are several motivations to this work.
Firstly, some dimension of the search space may be dependent on
others. Secondly, we want to index on properties other than just
colour. With PCA, we can use a new coordinate system to capture
the most varied dimension as our first axis, the second most varied
dimension as our second axis, and so on. Whereas our previous approach
may require a 256-dimensional space, PCA results in a compressed
space with typically far fewer dimensions.
At this point, we should point out another area of concern with
respect to indexing, namely that of high dimensionality. Studies
have shown that when the number of dimensions exceeds approximately
20, then indexing is not very useful.
Variables to Consider
When performing image database queries, some or all of the following
variables come into play:
- M = number of images in the database
- N = number of bins in the full histogram (e.g., 256)
- time to compute L-x norm for 2 N-bin histograms
- time to compute L-x norm for 2 k-bin histograms (k << N)
- page size
- number of N-bin histograms in 1 page
- number of k-bin histograms in 1 page
- number of page accesses
- amount of buffering
- value r for a fixed radius search
- value n for desired number of nearest neighbours
- fraction of images that survive a given level of filtering
- number of levels of filtering
- type of index structure (e.g., R-tree, k-d tree, TV-tree)
- density of the image search space. Some databases contain many
images of the same kind; others contain images from a single domain
(e.g., flowers, cars).
On-going or Future Work
- query language issues (constructs for composing queries; expressiveness;
user-friendliness; human-computer interaction; etc.)
- query processing and optimization issues (e.g., range queries,
incremental ("like this") processing)
- applications to video: content-based querying and retrieval
|