Computational Architectures for Responsive Vision: the Vision Engine

ID
TR-91-25
Authors
James J. Little, Rod Barman, Stewart Kingdon and Jiping Lu
Publishing date
November 1991
Length
10 pages
Abstract

To respond actively to a dynamic environment, a vision system must process perceptual data in real time and in multiple modalities. The structure of the computational load varies across the levels of vision, so no single architecture suits them all. We describe the Vision Engine, a system that couples a pipelined early vision architecture (Datacube image processors) to a MIMD intermediate vision system (a set of Transputers). The system uses a controllable eye/head for tasks involving motion, stereo and tracking.

A simple pipeline model describes how an image is transformed through multiple functional stages in early vision. Later processing (e.g., segmentation, edge linking, perceptual organization) does not map easily onto a pipeline architecture. A MIMD architecture is more appropriate for the irregular data and functional parallelism of later visual processing.
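
As a rough illustration of the pipeline model only, the sketch below composes an image transformation from a fixed sequence of functional stages, each consuming the whole image produced by the previous one. The stage functions (smooth, gradient_x) and the helper run_pipeline are hypothetical names chosen for this example; they are not taken from the report and do not reflect the Datacube programming interface.

```python
from functools import reduce
from typing import Callable, List, Sequence

# Assumption for illustration: an image is a list of non-empty rows of floats.
Image = List[List[float]]
Stage = Callable[[Image], Image]


def smooth(img: Image) -> Image:
    """3-point horizontal box filter; border pixels are copied unchanged."""
    out = [row[:] for row in img]
    for r, row in enumerate(img):
        for c in range(1, len(row) - 1):
            out[r][c] = (row[c - 1] + row[c] + row[c + 1]) / 3.0
    return out


def gradient_x(img: Image) -> Image:
    """Forward difference along each row; the last column is set to 0."""
    return [
        [row[c + 1] - row[c] for c in range(len(row) - 1)] + [0.0]
        for row in img
    ]


def run_pipeline(stages: Sequence[Stage], img: Image) -> Image:
    """Feed the image through each stage in order, as in a fixed pipeline."""
    return reduce(lambda data, stage: stage(data), stages, img)


if __name__ == "__main__":
    step_edge = [[0.0, 0.0, 3.0, 3.0, 3.0]]
    print(run_pipeline([smooth, gradient_x], step_edge))
```

Every stage here does the same regular work at every pixel, which is what makes a pipeline a natural fit. Tasks such as edge linking follow data-dependent paths through the image and do not decompose into stages of this form.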

The Vision Engine is designed for general vision tasks. Early vision processing, both optical flow and stereo, is implemented in near real time on the Datacube, which produces dense vector fields with confidence measures and transfers them at near video rates to the Transputer subsystem. We describe a simple implementation in which the Transputer system combines the stereo and motion information from the Datacube.
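
The abstract does not say how the stereo and motion fields are combined. Purely as an assumed example of what a dense field with confidence measures allows, the sketch below keeps a pixel's flow vector and disparity only where both confidences reach a threshold. The function name combine_fields, the data layout, and the threshold value are all hypothetical and should not be read as the report's method.

```python
from typing import List, Optional, Tuple

Flow = Tuple[float, float]                 # (u, v) image motion at a pixel
Combined = Optional[Tuple[float, float, float]]  # (u, v, disparity) or None


def combine_fields(
    flow: List[List[Flow]],
    flow_conf: List[List[float]],
    disparity: List[List[float]],
    disp_conf: List[List[float]],
    threshold: float = 0.5,  # assumed cut-off, not from the report
) -> List[List[Combined]]:
    """Merge two dense fields, keeping only pixels where both are trusted."""
    combined: List[List[Combined]] = []
    for r, flow_row in enumerate(flow):
        out_row: List[Combined] = []
        for c, (u, v) in enumerate(flow_row):
            if flow_conf[r][c] >= threshold and disp_conf[r][c] >= threshold:
                out_row.append((u, v, disparity[r][c]))
            else:
                out_row.append(None)  # at least one measurement is unreliable
        combined.append(out_row)
    return combined


if __name__ == "__main__":
    merged = combine_fields(
        flow=[[(1.0, 0.0), (0.5, -0.5)]],
        flow_conf=[[0.9, 0.2]],
        disparity=[[4.0, 7.0]],
        disp_conf=[[0.8, 0.9]],
    )
    print(merged)
```

Because each pixel is handled independently, a computation like this could be split across processors by image region, though whether the Vision Engine partitions work that way is not stated here.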