Computing Reviews, the leading online review service for computing literature.

A discriminatively learned CNN embedding for person reidentification
Zheng Z., Zheng L., Yang Y. ACM Transactions on Multimedia Computing, Communications, and Applications 14(1): 1-20, 2017. Type: Article

Date Reviewed: May 4 2018

To determine whether the same person appears in two images (reidentification), one can use either an identification model (which classifies a person image by identity) or a verification model (which classifies whether two images show the same person). Each approach has its own advantages and disadvantages. The two models differ in their inputs, feature extraction, and the loss function used to train them. The verification model forces two images of the same person to be mapped to nearby points in the resulting feature space. In contrast, the identification network tries to identify each person rather than to discriminate one person from another. A verification network does not consider the relationship between the given image pair and the other images in the dataset, whereas the identification model tunes its features to classify each person accurately. As a first result, the authors show that using the verification model alone performs worse on the reidentification task than using the identification model alone. The paper then combines the two models to obtain more discriminative features. Specifically, the authors take well-known image classification networks (such as CaffeNet, VGG16, and ResNet-50), use the input to their last layers as nonlinear embedding functions of the images, and feed these embeddings to both models, simultaneously minimizing the identification loss and the verification loss. Thus, for a pair of images, the network predicts the identity of each image and whether the two belong to the same person. The authors show that their method improves Rank 1 accuracy by up to 5 to 11 percent (depending on the network) compared to using only the identification model, and by up to 8 to 21 percent compared to using only the verification model.
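Rank 1 accuracy, the metric quoted above, is commonly computed as the fraction of query images whose nearest gallery image in embedding space shares the query's identity. The following is a minimal sketch of that idea, not the paper's evaluation code: the function names, the Euclidean distance, and the toy two-dimensional embeddings are all assumptions made for illustration, and real benchmarks add protocol details that are omitted here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank1_accuracy(query, gallery):
    """Fraction of queries whose closest gallery embedding has the same identity.

    query, gallery: lists of (identity, embedding) pairs (hypothetical layout).
    """
    hits = 0
    for q_id, q_emb in query:
        # Retrieve the single nearest gallery entry for this query.
        nearest_id, _ = min(gallery, key=lambda item: euclidean(q_emb, item[1]))
        if nearest_id == q_id:
            hits += 1
    return hits / len(query)


# Toy data: three gallery identities and four queries, one of which
# (the first "C" query) lies closer to identity "B" and is therefore a miss.
gallery = [("A", [0.0, 0.0]), ("B", [1.0, 1.0]), ("C", [3.0, 0.0])]
query = [
    ("A", [0.1, -0.1]),
    ("B", [0.9, 1.2]),
    ("C", [0.8, 0.9]),
    ("C", [2.9, 0.1]),
]

print(rank1_accuracy(query, gallery))  # 0.75
```

In this toy setup three of the four queries retrieve the correct identity first, so the score is 0.75; a more discriminative embedding would pull the misplaced query closer to its own identity.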
The paper is well written, with a clearly articulated problem statement, differences from prior work, evaluation metrics, results, and discussion of different aspects of the results. The authors describe both models in detail, along with their salient points, and compare the loss functions used by the two models: cross-entropy loss (as the identification loss) and contrastive loss (as the verification loss). The results show that the method achieves 45 percent Rank 1 accuracy even on images from low-resolution cameras. The presentation of the formulas could have been improved; they are stated without the explanation a general reader would need. Overall, though, this is nice work with good results.
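For readers who want the two loss terms spelled out, here is a rough sketch of how an identification loss and a verification loss might be summed for one image pair. It follows the pairing given in this review (cross-entropy for identification, contrastive loss for verification); the paper's exact formulation may differ. The function names, the `margin` and `weight` parameters, and the equal default weighting are assumptions introduced for illustration.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def contrastive(emb_a, emb_b, same, margin=1.0):
    """Standard contrastive loss on the Euclidean distance of two embeddings.

    Same-identity pairs are pulled together; different-identity pairs are
    pushed apart until they are at least `margin` away from each other.
    """
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)))
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def joint_loss(logits_a, id_a, logits_b, id_b, emb_a, emb_b,
               weight=1.0, margin=1.0):
    """Identification loss on both images plus a weighted verification loss."""
    ident = cross_entropy(logits_a, id_a) + cross_entropy(logits_b, id_b)
    verif = contrastive(emb_a, emb_b, id_a == id_b, margin)
    return ident + weight * verif


# Hypothetical pair showing two different people (identities 0 and 2).
loss = joint_loss(
    logits_a=[2.0, 0.5, -1.0], id_a=0,
    logits_b=[0.1, 0.2, 1.5], id_b=2,
    emb_a=[0.0, 0.0], emb_b=[0.3, 0.4],
)
print(loss)
```

The identification term looks at each image on its own, while the verification term depends only on the distance between the two embeddings, which is why the two signals can complement each other.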

Reviewer: Rajeev Gupta	Review #: CR146017 (1807-0403)

Image Representation (I.4.10)

