Computing Reviews
Protocols from perceptual observations
Needham C., Ferreira L., Magee D., Devin V., Hogg D., Cohn A.  Artificial Intelligence 167 (1-2): 103-136, 2005. Type: Article
Date Reviewed: May 8 2006

This is a very interesting paper on the integration of sub-symbolic and symbolic systems. One of the main features of the described system is its ability to learn under both unsupervised and supervised training. The authors have taken an important step in the quest for artificial intelligence: from visual and acoustic inputs, the system learns to correlate what is important in the sequence it is shown, and then selects an appropriate action for the perceived input signals. This was achieved in real time, with real data, on inexpensive hardware: two personal computers, Web cameras, and a microphone.

Simple real-world scenarios were used to demonstrate the principles: card games played with a pack of cards showing pictures of objects with different attributes (for instance, color and shape). The system works directly from raw visual and acoustic data (color, shape, and single-word utterances), using Prolog as a high-level formalism to represent objects and relationships and PROGOL for inductive learning. An attention mechanism based on motion analysis selects key frames, and objects’ attributes are clustered using unsupervised learning, with the different classes denoted by attribute labels. Finally, a supervised learning method, a vector quantization-based nearest neighbor classifier, is applied to the objects’ attributes.

Audio signals are processed in a similar fashion, using K-means clustering over the set of utterances. For each utterance, a symbolic data stream is created. To relate an object’s attributes to the uttered word, it is necessary to keep track of time: once an utterance is classified, it is traced back to the corresponding video segment so that audio and visual symbols can be correlated. On the issue of linking perception to action, actions are defined as utterances; the system chooses to play back a video sequence showing a person speaking the selected word, according to the currently perceived visual signals.
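As a rough illustration of two steps described above, the following Python sketch labels a feature vector by its nearest prototype and then pairs each classified utterance with a video key frame by timestamp. It is not the authors' implementation. The prototype vectors, the symbol names, the function names `classify` and `align`, and the pairing rule (the latest key frame at or before the utterance) are all assumptions made for the example.

```python
from bisect import bisect_right
from math import dist

# Hypothetical prototype (codebook) vectors, standing in for cluster
# centres that an earlier unsupervised step would have produced.
PROTOTYPES = {
    "red": (0.9, 0.1, 0.1),
    "green": (0.1, 0.8, 0.1),
    "blue": (0.1, 0.1, 0.9),
}


def classify(feature, prototypes=PROTOTYPES):
    """Return the label of the prototype nearest to `feature` (Euclidean)."""
    return min(prototypes, key=lambda label: dist(feature, prototypes[label]))


def align(utterances, key_frames):
    """Pair each utterance with a visual symbol by timestamp.

    `utterances` is a list of (time, word_symbol) tuples.
    `key_frames` is a list of (time, visual_symbol) tuples sorted by time.
    Assumed rule: an utterance is matched to the latest key frame at or
    before it; utterances preceding every key frame are skipped.
    """
    frame_times = [t for t, _ in key_frames]
    pairs = []
    for t, word in utterances:
        i = bisect_right(frame_times, t) - 1
        if i >= 0:
            pairs.append((key_frames[i][1], word))
    return pairs


# Made-up observations: two key frames and two utterance symbols.
key_frames = [
    (1.0, classify((0.8, 0.2, 0.1))),
    (5.0, classify((0.2, 0.1, 0.7))),
]
utterances = [(2.3, "w1"), (6.1, "w2")]
print(align(utterances, key_frames))
```

The paired symbols are the kind of input an inductive learner could then generalize over; how the paper actually encodes them for PROGOL is not stated in this review.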

The authors should be commended for tackling the difficult issue of symbol grounding. The burning question is how sensory projections can give rise to iconic representations, such that symbols can be attached to them to provide a semantic interpretation of the world. The paper provides a clear answer and highlights its limitations.

Reviewer: Marcos Rodrigues
Review #: CR132749 (0703-0299)
Perceptual Reasoning (I.2.10 ... )
Modeling And Recovery Of Physical Attributes (I.2.10 ... )
Clustering (I.5.3 )
Vision And Scene Understanding (I.2.10 )

Reproduction in whole or in part without permission is prohibited.   Copyright © 2000-2022 ThinkLoud, Inc.