The Rapid Recovery of Three-Dimensional Structure From Line Drawings
Ronald A. Rensink, Dept. of Computer Science, University of British Columbia, Vancouver, Canada.
PhD Dissertation, Computer Science Department, University of British Columbia, 1992. UBC CS Technical Report 92-25 (September 1992).

A computational theory is developed that explains how line drawings of polyhedral
objects can be interpreted rapidly and in parallel at early levels of human
vision. The key idea is that a time-limited process can correctly recover much
of the three-dimensional structure of these objects when split into concurrent
streams, each concerned with a single aspect of scene structure.

The work proceeds in five stages. The first extends the framework of Marr to allow
a process to be analyzed in terms of resource limitations. Two main concerns are
identified: (i) reducing the amount of nonlocal
information needed, and (ii) making effective use of whatever information is
obtained. The second stage traces the difficulty of line interpretation to a
small set of constraints. When these are removed, the remaining constraints can
be grouped into several relatively independent sets. It is shown that each set
can be rapidly solved by a separate processing stream, and that co-ordinating
these streams can yield a low-complexity "approximation" that captures much
of the structure of the original constraints. In particular, complete recovery
is possible in logarithmic time when objects have rectangular corners and the
scene-to-image projection is orthographic. The third stage is concerned with
making good use of the available information when a fixed time limit exists.
This limit is motivated by the need to obtain results within a time independent
of image content, and by the need to limit the propagation of inconsistencies.
A minimal architecture is assumed, viz., a spatiotopic
mesh of simple processors. Constraints are developed to guide the course of the
process itself, so that candidate interpretations are considered in order of
their likelihood. The fourth stage provides a specific algorithm for the recovery
process, showing how it can be implemented on a cellular automaton. Finally,
the theory itself is tested on various line drawings. It is shown that much of
the three-dimensional structure of a polyhedral scene can indeed be recovered
in very little time. It is also shown that the theory can explain the rapid
interpretation of line drawings at early levels of human vision.
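
As a rough illustration of why a fixed time limit matters on a mesh of simple processors, the following Python sketch runs a generic synchronous label-filtering process on a grid. It is not the algorithm developed in the dissertation: the function names (`neighbours`, `relax`), the label sets, and the `same` compatibility rule are hypothetical and chosen only for illustration. Each cell sees only its four neighbours, so information travels one cell per step, and a tight step budget leaves distant cells ambiguous.

```python
from typing import Callable, Dict, Iterator, Set, Tuple

Cell = Tuple[int, int]


def neighbours(cell: Cell, rows: int, cols: int) -> Iterator[Cell]:
    """Yield the 4-connected neighbours of a cell that lie inside the grid."""
    r, c = cell
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield (nr, nc)


def relax(
    candidates: Dict[Cell, Set[str]],
    rows: int,
    cols: int,
    compatible: Callable[[str, str], bool],
    max_steps: int,
) -> Dict[Cell, Set[str]]:
    """Filter candidate labels using only local information, for at most max_steps.

    Every cell is updated from the previous step's state (synchronous update),
    as if each cell were an independent simple processor. This is a generic
    discrete-relaxation sketch, not the dissertation's procedure.
    """
    state = {cell: set(labels) for cell, labels in candidates.items()}
    for _ in range(max_steps):
        new_state = {}
        for cell, labels in state.items():
            # Keep a label only if every neighbour still holds a compatible label.
            new_state[cell] = {
                label
                for label in labels
                if all(
                    any(compatible(label, other) for other in state[n])
                    for n in neighbours(cell, rows, cols)
                )
            }
        if new_state == state:  # nothing changed, so further steps are pointless
            break
        state = new_state
    return state


def same(a: str, b: str) -> bool:
    """Hypothetical constraint: neighbouring cells must carry the same label."""
    return a == b


# A 1x4 strip: only the leftmost cell starts out unambiguous.
cands = {(0, 0): {"a"}, (0, 1): {"a", "b"}, (0, 2): {"a", "b"}, (0, 3): {"a", "b"}}

partial = relax(cands, 1, 4, same, max_steps=1)  # tight time limit
full = relax(cands, 1, 4, same, max_steps=3)     # enough steps to cross the strip
```

Here `partial` still leaves the cell at (0, 3) with both labels, while `full` resolves every cell to a single label.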