Recently, Transformer-based methods have achieved excellent results in various computer vision tasks, including Single Image Super-Resolution (SISR). SwinIR introduces the cross-window connection and local self-attention mechanisms of Swin Transformer into the SISR task, achieving breakthrough improvements. However, the local self-attention of Swin Transformer draws on input pixels from only a limited spatial range, which restricts the ability of the super-resolution network to extract features over a wide area. To address this problem, we design an enhanced CNN and Transformer hybrid module for feature extraction that combines self-attention, spatial attention and channel attention. Exploiting their complementary strengths expands the range of activated pixels while maintaining a strong capability for local feature characterization. Simply extending the range of activated pixels without constraints, however, is not conducive to reconstruction. We therefore integrate Neural Window Fully-connected Conditional Random Fields (NeW FC-CRFs) for feature fusion. Shallow features are fed into the NeW FC-CRFs together with deep features, so that multi-level information is used during fusion. In summary, we propose the Hybrid Attention Super Resolution Network with Conditional Random Field (HANCRF). Extensive experiments show that HANCRF achieves competitive results with a small number of parameters.
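To illustrate why channel attention widens the range of activated pixels, the following is a minimal NumPy sketch of a squeeze-and-excitation style channel gate. It is not the HANCRF module itself: the abstract does not specify the design, so the gating form, the function name `channel_attention`, and the weight matrices `w1` and `w2` are all assumptions made for illustration. The point it shows is that the gate is computed from a global average over the whole feature map, so every output pixel depends on every input pixel, unlike window-based self-attention.

```python
import numpy as np


def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel gate (illustrative assumption).

    x  : feature map of shape (C, H, W)
    w1 : hypothetical reduction weights of shape (C // r, C)
    w2 : hypothetical expansion weights of shape (C, C // r)
    """
    # Squeeze: one global statistic per channel, computed over all pixels.
    squeeze = x.mean(axis=(1, 2))                      # shape (C,)
    # Excitation: small bottleneck MLP followed by a sigmoid gate.
    hidden = np.maximum(w1 @ squeeze, 0.0)             # ReLU, shape (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))        # sigmoid, shape (C,)
    # Rescale every pixel of each channel by that channel's gate.
    return x * gate[:, None, None]


# Toy inputs: 4 channels, 8x8 pixels, reduction ratio r = 2.
x_demo = np.ones((4, 8, 8))
w1_demo = np.full((2, 4), 0.1)
w2_demo = np.full((4, 2), 0.1)
y_demo = channel_attention(x_demo, w1_demo, w2_demo)

# Perturb a single pixel in the bottom-right corner of channel 0.
x_far = x_demo.copy()
x_far[0, 7, 7] = 100.0
y_far = channel_attention(x_far, w1_demo, w2_demo)
```

Comparing `y_demo[1, 0, 0]` with `y_far[1, 0, 0]` shows that a change at one corner of the map alters the output at the opposite corner and in a different channel, which is the global dependence that a fixed attention window cannot provide on its own.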
Visible-infrared person re-identification aims to match target persons across the visible and infrared modalities. In this paper, we propose a visible-infrared person re-identification method based on a modal-identity dual-central loss. This loss constrains the network to extract modality-shared features by pulling together the infrared and visible modal centers of the same identity, while pushing apart the identity centers of different persons to maintain inter-class discriminability. In addition, to extract more discriminative information, we propose a feature pyramid integration network based on efficient channel attention. Specifically, the network fuses high-level features with fine-grained low-level features to build a multi-scale feature map, and introduces an efficient channel attention module to enhance the salient features of each person. Extensive experiments are conducted on the SYSU-MM01 and RegDB datasets to validate the proposed method.
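The pull and push behavior of the modal-identity dual-central loss can be sketched in NumPy as below. This is only a sketch under stated assumptions, not the authors' formulation: the abstract gives no equations, so the squared Euclidean pull term, the hinge push term with a `margin`, the identity center taken as the midpoint of the two modal centers, and the function name `dual_center_loss` are all hypothetical choices.

```python
import numpy as np


def dual_center_loss(feats, ids, mods, margin=1.0):
    """Illustrative pull/push center loss (assumed form, not the paper's).

    feats : (N, D) feature vectors
    ids   : (N,) identity labels
    mods  : (N,) modality labels, 0 = visible, 1 = infrared
    """
    feats = np.asarray(feats, dtype=float)
    ids = np.asarray(ids)
    mods = np.asarray(mods)

    pull_terms, id_centers = [], []
    for pid in np.unique(ids):
        vis = feats[(ids == pid) & (mods == 0)]
        ir = feats[(ids == pid) & (mods == 1)]
        if len(vis) == 0 or len(ir) == 0:
            continue  # an identity needs samples from both modalities
        c_vis, c_ir = vis.mean(axis=0), ir.mean(axis=0)
        # Pull: shrink the gap between the two modal centers of one identity.
        pull_terms.append(np.sum((c_vis - c_ir) ** 2))
        # Assumed identity center: midpoint of the two modal centers.
        id_centers.append(0.5 * (c_vis + c_ir))

    push_terms = []
    for i in range(len(id_centers)):
        for j in range(i + 1, len(id_centers)):
            dist = np.linalg.norm(id_centers[i] - id_centers[j])
            # Push: penalize identity centers closer than the margin.
            push_terms.append(max(0.0, margin - dist))

    pull = float(np.mean(pull_terms)) if pull_terms else 0.0
    push = float(np.mean(push_terms)) if push_terms else 0.0
    return pull + push


# Two identities, one visible and one infrared sample each.
ids_demo = [0, 0, 1, 1]
mods_demo = [0, 1, 0, 1]
aligned = dual_center_loss([[0, 0], [0, 0], [5, 0], [5, 0]], ids_demo, mods_demo)
modal_gap = dual_center_loss([[0, 0], [2, 0], [5, 0], [5, 0]], ids_demo, mods_demo)
too_close = dual_center_loss([[0, 0], [0, 0], [0.5, 0], [0.5, 0]], ids_demo, mods_demo)
```

In this toy setup, `aligned` is zero because the modal centers coincide and the identities are far apart; `modal_gap` is positive only through the pull term; and `too_close` is positive only through the push term.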