Computing Reviews

Zero-shot visual recognition via bidirectional latent embedding
Wang Q., Chen K. International Journal of Computer Vision 124(3): 356-383, 2017. Type: Article
Date Reviewed: 07/18/18

Humans are remarkably good at learning to recognize new object categories from just a few examples, a feat still out of reach for machines. Whereas state-of-the-art visual recognition systems typically require thousands of examples to learn a new category, zero-shot learning aims to emulate this human ability by learning to recognize classes unseen during training.

To do so, the human brain exploits the intrinsic semantic relatedness between different classes, which allows for the proper relation of visual representations to the underlying semantics. In computer vision, one of the most successful approaches for bridging the semantic gap between visual and semantic features is to learn a common representation space, commonly known as embedding space, where both types of features are projected.

Following this approach, the authors propose a stagewise bidirectional framework, consisting of a bottom-up stage and a top-down stage, to learn the embedding space. The bottom-up stage creates a latent space that preserves the intrinsic structure of the visual data while promoting discriminative capability. The top-down stage embeds the semantic representations of unseen classes in the same latent space. This embedding relies on landmarks defined in the bottom-up stage, which are the coordinates of the class labels in the latent space.
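
To make the role of landmarks concrete, the following is a minimal sketch of the general idea, not the authors' algorithm. It assumes that a landmark is the mean latent vector of a seen class, and it swaps in a plain ridge-regression map from semantic vectors to landmarks in place of whatever top-down embedding the paper actually uses. All function names and parameters here are hypothetical.

```python
import numpy as np

def class_landmarks(latent, labels):
    """Return class ids and one landmark per class.

    Assumption: a landmark is the mean latent vector of a class.
    """
    classes = np.unique(labels)
    means = np.stack([latent[labels == c].mean(axis=0) for c in classes])
    return classes, means

def fit_semantic_to_latent(sem_seen, landmarks_seen, reg=1e-3):
    """Ridge regression: find W so that sem_seen @ W approximates landmarks_seen.

    Stand-in for the top-down stage; the paper's method may differ.
    """
    dim = sem_seen.shape[1]
    gram = sem_seen.T @ sem_seen + reg * np.eye(dim)
    return np.linalg.solve(gram, sem_seen.T @ landmarks_seen)

def nearest_landmark(latent_points, landmarks, class_ids):
    """Assign each latent point to the class of its closest landmark."""
    diffs = latent_points[:, None, :] - landmarks[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    return class_ids[np.argmin(dists, axis=1)]
```

In this sketch, landmarks for unseen classes would be obtained as `sem_unseen @ W`, and a test image, once projected into the latent space, would be labeled by `nearest_landmark` against those projected landmarks.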

Comparative evaluation on several benchmark datasets, covering both object and action recognition, demonstrates the strength of the proposed approach.

Besides the technical contribution, the paper provides a concise and systematic review of the state of the art on zero-shot learning, giving the reader a clear view of where and how the proposed approach fits into this big picture. This paper is worth reading.

Reviewer:  Mariella Dimiccoli Review #: CR146157 (1811-0602)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™