Video-based person re-identification with parallel spatial–temporal attention module
Jun Kong, Zhende Teng, Min Jiang, Hongtao Huo
Abstract

We propose a parallel network with spatial–temporal attention for video-based person re-identification. Many previous video-based person re-identification methods use two-dimensional convolutional neural networks to extract spatial features and then extract temporal features with temporal pooling or recurrent neural networks. Unfortunately, such serial networks lose spatial information while extracting temporal information. Unlike these methods, our parallel network extracts temporal and spatial features simultaneously, which effectively reduces the loss of spatial information. In addition, we design a global temporal attention module that obtains attention weights from the correlation between the current frame and all frames in the sequence. The temporal module also guides the feature extraction of the spatial module, strengthening the temporal and spatial constraints. Experiments show that our method effectively improves re-identification accuracy and outperforms state-of-the-art methods.
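The abstract does not include an implementation, but the global temporal attention idea can be illustrated. Below is a minimal PyTorch sketch of one plausible reading of that module, in which each frame's attention weight is derived from its correlation with every frame in the sequence and the weighted frames are pooled into a clip-level feature. The class name GlobalTemporalAttention, the learned query/key projections, and all tensor shapes are illustrative assumptions, not the authors' design.

```python
# Hypothetical sketch of a global temporal attention module; not the
# authors' code. Assumes per-frame feature vectors of shape (B, T, D)
# produced by a spatial (2-D CNN) branch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalTemporalAttention(nn.Module):
    """Weights each frame by its correlation with all frames in the clip."""

    def __init__(self, feat_dim: int):
        super().__init__()
        # Learned projections for computing frame-to-frame correlations
        # (an assumption; the paper may correlate raw features directly).
        self.query = nn.Linear(feat_dim, feat_dim)
        self.key = nn.Linear(feat_dim, feat_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, D) per-frame features.
        q = self.query(x)                               # (B, T, D)
        k = self.key(x)                                 # (B, T, D)
        # Correlation of every frame with every frame: (B, T, T).
        corr = torch.bmm(q, k.transpose(1, 2)) / (x.size(-1) ** 0.5)
        # Each frame's scalar weight aggregates its correlation
        # with the whole sequence.
        weights = F.softmax(corr.mean(dim=2), dim=1)    # (B, T)
        # Weighted temporal pooling into one clip-level descriptor.
        return (weights.unsqueeze(-1) * x).sum(dim=1)   # (B, D)

# Example: a batch of 4 clips, 8 frames each, 2048-D frame features.
if __name__ == "__main__":
    attn = GlobalTemporalAttention(feat_dim=2048)
    clip_feats = torch.randn(4, 8, 2048)
    print(attn(clip_feats).shape)  # torch.Size([4, 2048])
```

In this sketch the frame-to-frame correlation matrix is averaged over its second axis so each frame receives a single scalar weight before softmax; the paper's actual weighting and its coupling to the spatial branch may differ.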

© 2020 SPIE and IS&T | 1017-9909/2020/$28.00
Jun Kong, Zhende Teng, Min Jiang, and Hongtao Huo "Video-based person re-identification with parallel spatial–temporal attention module," Journal of Electronic Imaging 29(1), 013001 (3 January 2020). https://doi.org/10.1117/1.JEI.29.1.013001
Received: 6 August 2019; Accepted: 13 December 2019; Published: 3 January 2020
KEYWORDS: Video, Mars, Feature extraction, Convolution, 3D modeling, Optical flow, Image processing