
Related research on model-based vision

The methods used in the SCERPO system are based on a considerable body of previous research in model-based vision. The pathbreaking early work of Roberts [24] demonstrated the recognition of certain polyhedral objects by exactly solving for viewpoint and object parameters. Matching was performed by searching for correspondences between junctions found in the scene and junctions of model edges. Verification was then based upon exact solution of viewpoint and model parameters using a method that required seven point-to-point correspondences. Unfortunately, this work was poorly incorporated into later vision research, which instead tended to emphasize non-quantitative and much less robust methods such as line-labeling.

The ACRONYM system of Brooks [6] used a general symbolic constraint solver to calculate bounds on viewpoint and model parameters from image measurements. Matching was performed by looking for particular sizes of elongated structures in the image (known as ribbons) and matching them to potentially corresponding parts of the model. The bounds given by the constraint solver were then used to check the consistency of all potential matches of ribbons to object components. While providing an influential and very general framework, the actual calculation of bounds for such general constraints was mathematically difficult, and approximations had to be used that did not lead to exact solutions for viewpoint. In practice, prior bounds on viewpoint were required, which prevented application of the system to full three-dimensional ranges of viewpoints.

Goad [12] has described the use of automatic programming methods to precompute a highly efficient search path and viewpoint-solving technique for each object to be recognized. Recognition is performed largely through exhaustive search, but precomputation of selected parameter ranges allows each match to place tight viewpoint constraints on the possible locations of further matches. Although the search tree is broad at the highest levels, after about three levels of matching the viewpoint is essentially constrained to a single position and little further search is required. The precomputation not only allows the fast computation of the viewpoint constraints at runtime, but can also be used at the lowest levels to perform edge-detection only within the predicted bounds and at the minimum required resolution. This research has been incorporated in an industrial computer vision system by Silma Inc., which has the remarkable capability of performing all aspects of three-dimensional object recognition within as little as one second on a single microprocessor. Because of their extreme runtime efficiency, these precomputation techniques are likely to remain the method of choice for industrial systems dealing with small numbers of objects.
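The way each match narrows the viewpoint and prunes further search can be illustrated with a small sketch. This is not Goad's system: it assumes a single viewpoint angle in degrees, and a hypothetical precomputed table (`VIEW_TABLE`) giving, for each pairing of a model edge with an image edge, the viewpoint interval in which that pairing is possible. All names and numbers are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]
Match = Tuple[str, str]  # (model edge, image edge)

# Hypothetical precomputed table: viewpoint interval (degrees) in which
# a given model edge could appear as a given image edge.
VIEW_TABLE: Dict[Match, Interval] = {
    ("m1", "a"): (0.0, 180.0), ("m1", "b"): (90.0, 270.0),
    ("m2", "a"): (200.0, 220.0), ("m2", "b"): (100.0, 120.0),
    ("m2", "c"): (10.0, 40.0),
    ("m3", "b"): (300.0, 310.0), ("m3", "c"): (105.0, 115.0),
}

def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the overlap of two closed intervals, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def search(model_edges: List[str], image_edges: List[str],
           table: Dict[Match, Interval],
           view: Interval = (0.0, 360.0),
           matches: Tuple[Match, ...] = ()
           ) -> Optional[Tuple[List[Match], Interval]]:
    """Depth-first matching; each accepted match tightens the viewpoint interval."""
    if len(matches) == len(model_edges):
        return list(matches), view
    model_edge = model_edges[len(matches)]
    used = {img for _, img in matches}
    for img in image_edges:
        if img in used:
            continue
        bounds = table.get((model_edge, img))
        if bounds is None:
            continue  # pairing never possible according to the table
        narrowed = intersect(view, bounds)
        if narrowed is None:
            continue  # inconsistent with the viewpoint fixed by earlier matches
        result = search(model_edges, image_edges, table,
                        narrowed, matches + ((model_edge, img),))
        if result is not None:
            return result
    return None

result = search(["m1", "m2", "m3"], ["a", "b", "c"], VIEW_TABLE)
```

In this toy table the interval shrinks from 360 degrees to a 10-degree range after three matches, which mirrors the observation that little further search is needed once a few matches have been made.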

Other closely related research on model-based vision has been performed by Shirai [26] and Walter & Tropf [28]. There has also been a substantial amount of research on the interpretation of range data and matching within the three-dimensional domain. While we have argued here that most instances of recognition can be performed without the preliminary reconstruction of depth, there may be industrial applications in which the measurement of many precise three-dimensional coordinates is of sufficient importance to require the use of a scanning depth sensor. Grimson & Lozano-Pérez [13] have described the use of three-dimensional search techniques to recognize objects from range data, and describe how these same methods could be used with tactile data, which naturally occurs in three-dimensional form. Further significant research on recognition from range data has been carried out by Bolles et al. [5] and Faugeras [10]. Schwartz & Sharir [25] have described a fast algorithm for finding an optimal least-squares match between arbitrary curve segments in two or three dimensions. This method has been combined with the efficient indexing of models to demonstrate the recognition of large numbers of two-dimensional models from their partially obscured silhouettes. The approach also shows much promise for extension to the three-dimensional domain using range data.
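To give a sense of what a least-squares match involves, the following sketch solves a much simpler problem than the one addressed by Schwartz & Sharir: it assumes the curves have already been sampled into two-dimensional points with known one-to-one correspondence, and finds the rotation and translation that minimize the summed squared distance between them. The function name `fit_rigid_2d` is hypothetical, and this standard closed-form solution is a stand-in, not their algorithm.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def fit_rigid_2d(src: List[Point], dst: List[Point]) -> Tuple[float, Point]:
    """Least-squares rotation angle and translation mapping src onto dst.

    Minimizes sum |R(theta) * p_i + t - q_i|^2 over theta and t, assuming
    src[i] corresponds to dst[i].
    """
    n = len(src)
    if n == 0 or n != len(dst):
        raise ValueError("point lists must be non-empty and of equal length")
    # Centroids of both point sets.
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    # Accumulate dot and cross products of the centered points.
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy
        bx, by = qx - dx, qy - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    # The optimal angle maximizes cos(theta) * dot + sin(theta) * cross.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The optimal translation carries the rotated source centroid onto the
    # destination centroid.
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, (tx, ty)
```

Matching partially obscured curves is harder than this, because the correspondence between points is not known in advance and only part of each curve may be visible.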


David Lowe
Fri Feb 6 14:13:00 PST 1998