Scale variation and background interference remain open problems for crowd counting algorithms in practical applications. To address them, we propose the Attention-guided Feature Fusion Network (AFFNet), which learns the mapping from a crowd image to its density map. Within the network, the Channel-attentive Receptive Field Block (CRFB) uses parallel convolutional layers with different dilation rates to extract multi-scale features. The Feature Fusion Module (FFM) uses attention masks generated from high-level features to adjust low-level features, alleviating background interference at the feature level. In addition, the Double Branch Module (DBM) generates the density estimation map and further suppresses background interference at the density level. Extensive experiments on several challenging benchmark datasets, including ShanghaiTech, UCF-QNRF, and JHU-CROWD++, demonstrate that the proposed method outperforms state-of-the-art approaches.
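As a rough illustration of the attention-guided adjustment attributed to the FFM, the following minimal sketch gates a low-level feature map with a mask derived from a coarser high-level map. It is not the AFFNet implementation: the single-channel maps, the nearest-neighbour upsampling, the sigmoid mask, and the names `upsample_nearest` and `attention_fuse` are all assumptions made for illustration only.

```python
import math


def sigmoid(x):
    """Logistic function, used here (as an assumption) to squash features into a (0, 1) mask."""
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(fmap, factor):
    """Repeat each value `factor` times along both axes (nearest-neighbour upsampling)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def attention_fuse(low, high, factor=2):
    """Gate low-level features with a mask computed from high-level features.

    Hypothetical helper: the mask is sigmoid(upsampled high-level map),
    applied by element-wise multiplication.
    """
    mask = [[sigmoid(v) for v in row] for row in upsample_nearest(high, factor)]
    return [[l * m for l, m in zip(lrow, mrow)] for lrow, mrow in zip(low, mask)]


# Toy data: a 4x4 low-level map of ones and a 2x2 high-level map whose
# strongly positive entries mark "crowd" regions and negative entries "background".
low = [[1.0] * 4 for _ in range(4)]
high = [[6.0, -6.0],
        [-6.0, 6.0]]
fused = attention_fuse(low, high)
```

In this toy case, low-level responses under strongly negative high-level activations are scaled close to zero, while those under strongly positive activations pass nearly unchanged, which is the sense in which a mask can suppress background at the feature level.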