CARF-Net: CNN attention and RNN fusion network for video-based person reidentification
Kajal Kansal, Subramanyam Venkata, Dilip K. Prasad, Mohan Kankanhalli
Abstract
Video-based person reidentification is a challenging and important task in surveillance applications. To this end, several shallow and deep networks have been proposed. However, the performance of existing shallow networks does not generalize well to large datasets. To improve generalization, we propose a shallow end-to-end network that combines a two-stream convolutional neural network, discriminative visual attention, and a recurrent neural network, trained with triplet and softmax losses to learn spatiotemporal fusion features. To exploit both spatial and temporal information, we apply spatial, temporal, and spatiotemporal pooling. In addition, we contribute a large dataset of airborne videos for person reidentification, named DJI01. It covers challenging conditions such as occlusion, illumination changes, people wearing similar clothes, and the same people recorded on different days. We perform elaborate qualitative and quantitative analyses to demonstrate the robust performance of the proposed model.
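The three pooling operations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it only assumes per-frame CNN feature maps of shape (T, C, H, W) and uses average pooling, which is one common choice:

```python
import numpy as np

def pool_clip_features(feats):
    """feats: (T, C, H, W) per-frame CNN feature maps for one video clip.

    Returns three descriptors mirroring the abstract's spatial, temporal,
    and spatiotemporal pooling (average pooling assumed, not confirmed
    by the paper's abstract).
    """
    spatial = feats.mean(axis=(2, 3))            # (T, C): pool H, W per frame
    temporal = feats.mean(axis=0)                # (C, H, W): pool across frames
    spatiotemporal = feats.mean(axis=(0, 2, 3))  # (C,): pool frames and space
    return spatial, temporal, spatiotemporal

# toy clip: 8 frames, 128 channels, 7x7 feature maps
clip = np.random.rand(8, 128, 7, 7)
s, t, st = pool_clip_features(clip)
print(s.shape, t.shape, st.shape)  # (8, 128) (128, 7, 7) (128,)
```

How the three descriptors are fused into the final spatiotemporal fusion feature (e.g., concatenation versus weighted combination) is a design detail of the full paper and is not specified here.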
© 2019 SPIE and IS&T 1017-9909/2019/$25.00
Kajal Kansal, Subramanyam Venkata, Dilip K. Prasad, and Mohan Kankanhalli "CARF-Net: CNN attention and RNN fusion network for video-based person reidentification," Journal of Electronic Imaging 28(2), 023036 (25 April 2019). https://doi.org/10.1117/1.JEI.28.2.023036
Received: 27 December 2018; Accepted: 2 April 2019; Published: 25 April 2019
CITATIONS
Cited by 3 scholarly publications.
KEYWORDS
Video, Mars, Optical flow, RGB color model, Cameras, Visualization, Performance modeling
