Complete 3D information about an object is required in many fields, yet single-view observation inevitably loses part of that information. We introduce a learning-based approach that simultaneously estimates the pose and shape of an object from a single depth map. The depth map is first converted to the corresponding partial point cloud, and an autoencoder-based network is then proposed to learn both the pose estimation and the shape completion process. Within this learning paradigm, we utilize a novel pose representation, the structured point list (SPL), to describe the object's pose, which enables the network to understand the pose of the input object relative to the viewpoint. Compared with direct shape reconstruction, we find that adding SPL estimation as intermediate supervision both improves reconstruction accuracy and accelerates training convergence. Our method achieves state-of-the-art results on both rigid and non-rigid object reconstruction.
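The abstract does not specify how the depth map is converted to a partial point cloud; a minimal sketch of the standard back-projection with pinhole intrinsics (the function name and parameters below are illustrative, not from the paper) is:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth map (H x W, in metres) into a partial point cloud
    in the camera frame, given pinhole intrinsics (fx, fy, cx, cy).
    This is a generic sketch, not the authors' preprocessing code."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0                        # keep only pixels with a depth reading
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=-1)      # (N, 3) partial point cloud
```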