An Implementation of the Parallelism in Visual Object Interpretation

Roger A. Browse and David B. Skillicorn


Abstract

Image understanding requires specifications of object structure and appearance in terms of extractable image properties. In this paper we describe experiments with a "schemata-based" approach to visual interpretation. The approach requires the construction of a recognition network, which is then coded in the programming language OCCAM in order to obtain a full expression of the parallelism inherent in the interpretation process.

Introduction

The task of computational vision may be separated into two stages. In the first stage, a representative set of image structures, or features (such as Marr's primal sketch), is extracted from the digitized image. In the second stage, the requirements of object models are matched against the image features in order to obtain an interpretation of the scene.

There are several important technical issues associated with this second stage of processing.

An initial problem in the pursuit of these issues is the definition of the set of image features upon which interpretation is to be based. Some approaches develop the context-free image structures into the parameters of generalized 3-D shape primitives before considering object models. The "Connectionist" movement, on the other hand, explores the notion that much simpler image structures may activate object and scene models.

One approach to scene interpretation is found in "schemata-based" vision systems. These systems rely on line-drawing inputs to provide a discrete feature level from which issues of object model matching may be investigated.

One further characteristic of schemata-based systems is their use of parallelism across the active model possibilities. This form of parallelism is clearly required when a feature is encountered that could be part of a large number of modeled objects. This type of "interpretation possibility" parallelism also has a counterpart in the observed behaviour of humans. Its implementation has largely taken the form of a "suspend and resume" process control structure. While implementations of this type are adequate for investigating some interpretation issues, what is still required is an implementation that can be expressed directly on a physically implemented parallel architecture.

Given a hierarchy of object models, we create a recognition network consisting of nodes, corresponding to the objects, and edges, indicating their component hierarchy relationships. At the lowest level, features such as lines are presented to object nodes that recognize simple aggregates such as squares. The existence of these objects causes messages to be passed to higher-level object recognition nodes, which determine whether the criteria for the existence of more complicated objects are met. Eventually, the nodes at the top of the object recognition network indicate the presence of complex objects.
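The bottom-up flow of messages through such a network can be illustrated with a minimal sequential sketch. The object models below ("square", "triangle", "house"), the component counts, and the function names are hypothetical and chosen only for illustration. The sketch checks component counts alone, ignores geometric constraints, and does not consume components, so a single line may support several competing object hypotheses at once.

```python
from collections import defaultdict, deque

# Hypothetical object models: each object lists the components it requires
# and how many of each. Names and counts are illustrative assumptions.
MODELS = {
    "square": {"line": 4},
    "triangle": {"line": 3},
    "house": {"square": 1, "triangle": 1},
}


def build_network(models):
    """Return the network edges: component name -> objects that use it."""
    parents = defaultdict(list)
    for obj, parts in models.items():
        for part in parts:
            parents[part].append(obj)
    return parents


def recognize(models, features):
    """Propagate existence messages upward and return the recognized objects."""
    parents = build_network(models)
    received = {obj: defaultdict(int) for obj in models}
    recognized = []
    messages = deque(features)  # each message names a feature or object that exists
    while messages:
        item = messages.popleft()
        for obj in parents[item]:
            received[obj][item] += 1
            criteria_met = all(
                received[obj][part] >= count for part, count in models[obj].items()
            )
            if criteria_met and obj not in recognized:
                recognized.append(obj)
                messages.append(obj)  # notify higher-level nodes
    return recognized


if __name__ == "__main__":
    print(recognize(MODELS, ["line"] * 4))
```

In this sketch, four lines are enough to activate both the square and the triangle nodes, which in turn activate the house node; three lines activate only the triangle node.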

The recognition network is implemented as a set of OCCAM processes; OCCAM is a programming language based on Hoare's Communicating Sequential Processes. Implemented in this way, the recognition network exploits the maximum potential parallelism, provided there is sufficient hardware to support it, since OCCAM processes can be as small as single operations. For example, checking a single feature against all the other features to which it may relate can be done in constant time, because OCCAM allows a fully parallel search.
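The structure of such a fully parallel match can be sketched as follows, with one worker per candidate feature and a queue standing in for a channel. This is an illustration in Python rather than OCCAM: the relation `shares_endpoint`, the representation of a line as a pair of endpoints, and the function `match_all` are assumptions made for the example. Python threads mirror only the process structure; the constant-time behaviour described above would need one physical processor per comparison.

```python
import queue
import threading


def shares_endpoint(a, b):
    """Hypothetical relation: two line segments meet at a common endpoint."""
    return bool(set(a) & set(b))


def match_all(feature, others, relation=shares_endpoint):
    """Check `feature` against every entry of `others`, one worker per comparison."""
    results = queue.Queue()  # stands in for a channel back to the collector

    def worker(index, other):
        # Each worker performs a single comparison and reports its result.
        results.put((index, relation(feature, other)))

    threads = [
        threading.Thread(target=worker, args=(i, other))
        for i, other in enumerate(others)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Collect exactly one result per worker; arrival order is not guaranteed.
    outcomes = [results.get() for _ in others]
    return sorted(index for index, matched in outcomes if matched)


if __name__ == "__main__":
    base = ((0, 0), (1, 0))
    candidates = [((1, 0), (1, 1)), ((5, 5), (6, 6)), ((0, 0), (0, 1))]
    print(match_all(base, candidates))
```

The results are sorted by index because the workers may finish in any order.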

The translation from the object models to the corresponding processes is sufficiently algorithmic to be implemented as a description compiler. A database of compiled descriptions could also be interfaced to the real-time system, allowing recognition of objects to proceed more quickly once a probability of their existence is known.
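As a rough illustration of what such a description compiler might produce, the sketch below turns each object model into a process specification that lists its input channels, output channels, and firing requirements. The `ProcessSpec` structure, the channel naming scheme, and the example models are all assumptions made for this sketch; it does not generate OCCAM code and is not the compiler described above.

```python
from dataclasses import dataclass

# Hypothetical object models, as component -> required count.
MODELS = {
    "square": {"line": 4},
    "triangle": {"line": 3},
    "house": {"square": 1, "triangle": 1},
}


@dataclass(frozen=True)
class ProcessSpec:
    name: str        # object recognized by this process
    inputs: tuple    # channels on which component messages arrive
    outputs: tuple   # channels to higher-level recognition processes
    required: tuple  # (component, count) pairs that must be satisfied


def compile_models(models):
    """Translate each object model into a process specification."""
    specs = {}
    for obj, parts in models.items():
        # One input channel per component type the object depends on.
        inputs = tuple(f"{part}->{obj}" for part in sorted(parts))
        # One output channel per higher-level object that uses this one.
        users = sorted(name for name, needs in models.items() if obj in needs)
        outputs = tuple(f"{obj}->{user}" for user in users)
        specs[obj] = ProcessSpec(obj, inputs, outputs, tuple(sorted(parts.items())))
    return specs


if __name__ == "__main__":
    for spec in compile_models(MODELS).values():
        print(spec)
```

Because the output depends only on the model hierarchy, specifications of this kind could be compiled once and stored, in the spirit of the database of compiled descriptions mentioned above.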