Computing Reviews
Integration of visual modules: an extension of the Marr paradigm
Aloimonos J., Shulman D., Academic Press Prof., Inc., San Diego, CA, 1989. Type: Book (9780120530205)
Date Reviewed: May 1 1990

The authors have produced a rare book in the area of computer image analysis. Rather than a survey of the field or a collection of papers, it is a research monograph. The book presents several important ideas but suffers from major problems in editing and content. As indicated by the title, the major theme of the book is how combining different modules (or algorithms) is useful for computational vision. The authors present two approaches for combining modules: first, given an algorithm that performs a specific task, generalize to a theory, and second, given a theory, find a specific implementation. They refer to these approaches as bottom-up and top-down, respectively, which is confusing since these terms are generally used in the computer vision literature to denote image-driven (data-driven) processing and model-driven (goal-driven) processing.

The first two chapters and the final chapter provide a good overview of the authors’ proposal for future research. Chapters 3, 4, and 5 provide a good discussion of techniques to combine two specific modules to more reliably extract three-dimensional shape information from images. These chapters cover extracting shape from shading and motion, extracting shape from texture and motion, and extracting shape from contour and stereo. They give a mathematical discussion of the combination techniques with the additional constraints derived from using two modules. Chapter 6 discusses general combination issues for existing analysis algorithms. These four chapters constitute the details of the bottom-up approach, that is, how to derive a theory from implementations.

Chapter 7 is taken mostly from an earlier journal paper [1] that discusses an approach called active vision--the observer or camera moves in a controlled way to best explore and analyze the environment. Some of this material is redundant because the focus of the previous chapters was on using active vision in the form of known motion.

Chapter 8, containing roughly one-fourth of the book, describes the top-down approach--given a theory of combinations, how would it be implemented? This theoretical discussion focuses mostly on regularization as a unifying theory of combining data, though other techniques are discussed. The authors address important issues such as discontinuous regularization and using approximations to the best solution.

This book is highly theoretical and primarily a proposal of what should be done rather than a discussion of results, so an evaluation of the content is difficult. Only after 30 years of research (the time frame suggested in the book) can we determine if the proposal was right. The general idea of combining modules is not unique to this book and is common in all proposals for research in the field, so the issue becomes whether the proposed theories for combination will be most beneficial.

Some of the terminology used is problematic. The confusing use of “top-down” and “bottom-up” was mentioned earlier. The authors also use “motion without correspondence” where they really mean regional (or global) feature analysis for motion prediction assuming that a regional (or global) correspondence is given. Using “without correspondence” for this kind of analysis is misleading, even though it is common in some of the recent literature in the field.

Apparently the publisher provided little or no editorial guidance or assistance for this book. A hardcover book should not have the kinds of problems in grammar, style, and layout that are exhibited here. The apparent title of the book is “Integration of visual modules” but the running head is “Integration of visual models.” The latter title might be more appropriate for the content--combining models rather than modules. Many of the page numbers in the table of contents are wrong by one or two pages (for example, all but one of the 15 entries for chapter 8). Also, some section titles are different in the table of contents from those found in the text. How these problems could occur in a computer-formatted book is not clear, but they represent a lack of care by the authors and the publisher. Individually, these problems are merely distracting and would be expected in a rough draft, not a finished document. Similar problems begin to affect the content, as in Figure 2.1 (p. 25), which shows proposed combinations of modules; connections are clearly left out, and retinal motion is to be derived from stereo alone.

This book contains an important idea--combination of different processing techniques gives more constraints and simplifies the problem. This concept is important, but it may not require an entire book. Since the book is a theoretical proposal for long-term research, with no examples of the application of these techniques to realistic images, it has little immediate utility. Whether this proposal is important will not be determined for many years. The authors agree that other lines of research are important, so the book will not radically change research directions in the field.

Reviewer: Keith Price
Review #: CR114123

[1] Aloimonos, J.; Weiss, I.; and Bandopadhay, A. Active vision. Int. J. Comput. Vision 1 (1987), 333–356.