Deep-learning-based human pose estimation faces two persistent challenges: large numbers of network parameters and high computational complexity. This paper proposes a lightweight network that reduces both the parameter count and the computational complexity while improving the accuracy of human pose estimation. The method takes the high-resolution network HRNet-32 as its basic framework and replaces the basic module with the lightweight MBConv module. An attention mechanism is incorporated into the network to model context information, which improves the perception and feature extraction ability of the module and the accuracy of human pose estimation. Experimental results on COCO2017 show that the proposed network detects human keypoints with high precision even when the number of parameters is reduced by 56%, which verifies that the proposed method achieves good lightweight performance.
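To illustrate why swapping a standard residual block for an MBConv block reduces the parameter count, the sketch below counts convolution weights for both designs. It is not the paper's architecture: the expansion factor, the squeeze-and-excitation ratio, and the choice of squeeze-and-excitation as the attention mechanism are assumptions made for illustration, and batch-normalization parameters and biases are ignored.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution without bias."""
    return (c_in // groups) * c_out * k * k


def basic_block_params(c):
    """Two 3x3 convolutions, as in a ResNet-style basic block."""
    return 2 * conv_params(c, c, 3)


def mbconv_params(c, expand=4, se_ratio=0.25):
    """1x1 expand, 3x3 depthwise, squeeze-and-excitation, 1x1 project.

    expand and se_ratio are assumed values, not taken from the paper.
    """
    hidden = c * expand
    expand_pw = conv_params(c, hidden, 1)
    depthwise = conv_params(hidden, hidden, 3, groups=hidden)
    squeezed = max(1, int(c * se_ratio))
    se = 2 * hidden * squeezed  # reduce and restore layers, biases ignored
    project_pw = conv_params(hidden, c, 1)
    return expand_pw + depthwise + se + project_pw


if __name__ == "__main__":
    channels = 32  # width of the highest-resolution branch, assumed here
    basic = basic_block_params(channels)
    light = mbconv_params(channels)
    print(f"basic block: {basic} weights, MBConv: {light} weights")
    print(f"reduction: {1 - light / basic:.1%}")
```

The saving comes mainly from the depthwise convolution, whose weight count grows linearly rather than quadratically with the channel width. The per-block figure from this sketch should not be compared with the 56% network-level reduction reported above, which depends on the full architecture.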
Human pose estimation from depth images currently faces several challenges. Deep learning methods perform well but rely on massive amounts of data, while traditional machine learning methods are simple to implement but depend on feature extraction and have low accuracy. To address these problems, this paper proposes a novel method based on the Manifold Gaussian Process, which combines tomographic image denoising and feature fusion to estimate human pose from depth images. On the ITOP dataset, the prediction accuracy exceeds that of other machine learning methods, reaching 83.3% and 77.9% for the full body from the front view and top view respectively, which demonstrates the effectiveness of the Manifold Gaussian Process for human pose estimation with depth images.
Most existing facial expression recognition methods emphasize the facial features extracted from expression images but ignore the coupling between facial expression features and identity features. This paper proposes a novel expression recognition method based on spatial feature disentanglement. Expression features and identity features are encoded independently with deep neural networks under a multi-task framework. A latent space discriminator is designed to disentangle spatial features and weaken the impact of identity features on expression recognition. The recognition accuracy on the CK+ and RaFD datasets reaches 99.69% and 97.64% respectively, which verifies that the proposed method has better generalization ability and strong robustness.
To improve the recognition rate for faces in various poses, this paper proposes a face correction method based on the Gaussian Process, which builds a nonlinear regression model between the frontal and the side face using a combined kernel function. Face images with horizontal angles from -45° to +45° can be properly corrected to frontal faces. Finally, a Support Vector Machine is employed for face recognition. Experiments on the CAS-PEAL-R1 face database show that the Gaussian Process can weaken the influence of pose changes and improve the accuracy of face recognition to a certain extent.
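As a rough illustration of Gaussian Process regression with a combined kernel, the sketch below fits a one-dimensional toy function in pure Python. The abstract does not state which kernels are combined, so the weighted sum of an RBF and a linear kernel, the weights, the length scale, and the noise level are all assumptions. In the face correction setting, the inputs would be side-face features and one such regressor would be fitted per frontal-face output dimension.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two feature vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * length ** 2))


def linear(a, b):
    """Linear (dot-product) kernel."""
    return sum(x * y for x, y in zip(a, b))


def combined(a, b, w_rbf=1.0, w_lin=0.1):
    """Assumed combination: weighted sum of RBF and linear kernels."""
    return w_rbf * rbf(a, b) + w_lin * linear(a, b)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_fit(X, y, noise=1e-6):
    """Return alpha = (K + noise * I)^-1 y for the training set."""
    K = [[combined(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    return solve(K, y)


def gp_predict(X, alpha, x_new):
    """Posterior mean at x_new: sum_i alpha_i * k(x_i, x_new)."""
    return sum(a * combined(x, x_new) for a, x in zip(alpha, X))


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [3.0]]      # toy inputs
    y = [math.sin(x[0]) for x in X]       # toy targets
    alpha = gp_fit(X, y)
    print(gp_predict(X, alpha, [1.5]))
```

Summing kernels lets the model capture a global linear trend and local nonlinear variation at the same time. A practical system would use a tested library and a Cholesky factorization instead of the hand-written solver shown here.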