As the trustworthiness of multimedia data is challenged by increasingly capable editing tools, image forgery localization aims to identify the regions of an image that have been modified. Although existing techniques provide reasonably good localization results, they must be retrained whenever new editing techniques emerge and depend heavily on ground-truth tampering localization maps. In this paper, we propose an attention-based fusion network that combines the RGB image with its noise residual and yields excellent results. The noise residual is commonly regarded as a camera model fingerprint, so forgeries can be detected as deviations from its expected regular pattern. The model consists of three parts: feature extraction, attentional feature fusion, and feature output. The feature extraction module extracts RGB image features and noise residuals separately; the attentional feature fusion module combines these features to suppress high-frequency components and to supplement and enhance model-related artifacts. Finally, the output module generates a single-channel image serving as the camera model fingerprint. To avoid dependence on tampering localization maps, the model is trained as a Siamese network on pairs of image patches taken from the same or different camera sensors. Experimental results on several datasets show that the proposed technique successfully identifies modified regions, improves the quality of camera model fingerprints, and achieves significantly better performance than existing techniques.
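The abstract does not state the exact pairwise objective used for Siamese training. A common choice for same/different-source patch pairs is a contrastive loss, sketched below; the function name, the margin value, and the use of plain Euclidean distance are illustrative assumptions, not details from the paper.

```python
import numpy as np

def contrastive_loss(f1, f2, same_camera, margin=1.0):
    """Contrastive loss on a pair of fingerprint patch embeddings.

    Pairs from the same camera sensor are pulled together; pairs from
    different sensors are pushed apart until they are at least `margin`
    apart. `margin=1.0` is an assumed hyperparameter, not from the paper.
    """
    d = np.linalg.norm(f1 - f2)
    if same_camera:
        return 0.5 * d ** 2                     # penalize any distance
    return 0.5 * max(margin - d, 0.0) ** 2      # penalize only if too close
```

Training on such pairs lets the network learn a consistent per-camera fingerprint without ever seeing a tampering localization map, since the supervision signal is only "same sensor or not".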
Accurate crowd counting in congested scenes remains challenging because of the trade-off between efficiency and generalization. To address this issue, we propose a mobile-friendly solution for deploying networks in scenarios that demand high response speed. To bring the potential of global crowd representations to a lightweight counting model, this work proposes a novel mobile vision transformer architecture for crowd counting (CCMTNet), which aims to improve model efficiency and universality in real-time crowd counting tasks on resource-constrained computing devices. A linear CNN backbone interleaved with self-attention blocks endows the model with both local feature extraction and global high-dimensional crowd information processing at low computational cost. In addition, several experimental networks of different scales based on the proposed architecture are comprehensively evaluated to balance accuracy loss against reduced computing cost. Extensive experiments on three mainstream crowd counting datasets demonstrate the effectiveness of the proposed network. In particular, CCMTNet reconciles counting accuracy and efficiency in comparison with traditional lightweight CNN networks.
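The global-context component the abstract attributes to the self-attention blocks can be illustrated with a minimal single-head self-attention over a sequence of feature tokens. This is a generic sketch, not CCMTNet's actual block: the identity query/key/value projections and the absence of multi-head splitting are simplifications for brevity.

```python
import numpy as np

def self_attention(x):
    """Single-head self-attention over token vectors.

    x: (n_tokens, dim) array of flattened feature-map patches.
    Uses identity Q/K/V projections (an assumed simplification);
    real transformer blocks learn these projections.
    """
    scores = x @ x.T / np.sqrt(x.shape[1])       # scaled dot-product
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)            # rows sum to 1
    return w @ x                                 # each token mixes all others
```

Because every output token is a weighted mix of all input tokens, a few such blocks give the lightweight CNN backbone a view of the whole scene, which is exactly the global crowd representation the abstract refers to.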
Makeup transfer aims to extract a specific makeup style from one face and transfer it to another, with wide applications in portrait beautification and cosmetics marketing. Existing methods can transfer entire facial makeup, but transfer quality suffers when the two images are mismatched. In this paper, we propose a facial makeup transfer network based on the Laplacian pyramid, which better preserves the facial structure of the source image and achieves high-quality transfer results. The model consists of three parts: makeup feature extraction, facial structure feature extraction, and makeup fusion. The makeup feature extraction part extracts the facial makeup from the reference image. The facial structure feature extraction part extracts the facial structure from the source image; to avoid losing facial details during this extraction, we adopt a method based on the Laplacian pyramid. The makeup fusion part then fuses the extracted makeup with the facial structure features. Extensive experiments on the MT dataset show that this method transfers makeup successfully without altering the original facial structure and achieves strong performance across various makeup styles.
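The Laplacian pyramid itself is a standard decomposition: each level stores the detail lost by one round of blurring and downsampling, and the levels sum back to the original image exactly. The sketch below illustrates this with 2x2 block averaging standing in for Gaussian blur and nearest-neighbour repetition standing in for proper upsampling; these filter choices are simplifying assumptions, not the paper's.

```python
import numpy as np

def down(img):
    """Downsample by 2 via 2x2 block averaging (stand-in for blur + decimate)."""
    h, w = img.shape
    return img[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(img, shape):
    """Upsample by nearest-neighbour repetition, cropped to `shape`."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=3):
    """Decompose `img` into band-pass detail levels plus a low-pass residual."""
    pyr, cur = [], img
    for _ in range(levels):
        low = down(cur)
        pyr.append(cur - up(low, cur.shape))  # high-frequency facial detail
        cur = low
    pyr.append(cur)                           # coarse residual (overall structure)
    return pyr

def reconstruct(pyr):
    """Invert the decomposition exactly by summing levels back up."""
    cur = pyr[-1]
    for detail in reversed(pyr[:-1]):
        cur = detail + up(cur, detail.shape)
    return cur
```

Because the fine-detail levels are carried separately from the coarse structure, a network operating on this representation can recolor the coarse levels with the reference makeup while leaving the source's high-frequency facial details intact.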
In recent years, great progress has been made in crowd counting. Although the crowd counting networks proposed to solve different problems achieve satisfactory results, differences in crowd density and scale within the same scene still degrade overall counting performance. To deal with this problem, we propose a Multi-Scale Attention Grading Crowd Counting Network (MSAGNet), which uses an attention mechanism to focus on different crowd densities in a scene and fuses multi-scale information to reduce scale differences. Specifically, the grading attention feature obtaining module attends to regions of different crowd density and adaptively assigns corresponding weights to them: dense regions receive larger weights, so the model focuses more on those regions and trains on them more accurately and effectively. In addition, the multi-scale density feature fusion module fuses feature maps containing density information to generate the final feature maps, which carry attention information at different scales and are then mapped to the estimated density maps. By jointly attending to regions of different density in the same scene and fusing multi-scale information with attention weights, the method effectively addresses the difficulty of counting in dense regions. Extensive experiments on existing crowd counting datasets (UCF_CC_50, ShanghaiTech, UCF-QNRF) show that our method effectively improves counting performance.
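The abstract does not give MSAGNet's exact fusion rule. One plausible reading of "adaptively assigns corresponding weights to different density regions" is a per-pixel softmax over scales, so that whichever scale responds most strongly at a location dominates the fused map; the sketch below implements that assumed rule, and the function name is invented for illustration.

```python
import numpy as np

def grade_and_fuse(scale_maps):
    """Fuse same-sized density feature maps from different scales.

    scale_maps: list of (H, W) arrays. A per-pixel softmax over scales
    (an assumed fusion rule, not from the paper) gives each scale an
    attention weight at each location; stronger density responses get
    larger weights, and the weighted sum forms the fused feature map.
    """
    stack = np.stack(scale_maps)               # (S, H, W)
    logits = stack - stack.max(axis=0)         # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=0, keepdims=True)          # per-pixel softmax over scales
    return (w * stack).sum(axis=0)             # attention-weighted fusion
```

Under this rule the fused value at each pixel is a convex combination of the per-scale responses, biased toward the strongest one, which matches the stated goal of giving dense regions more weight.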