Humanoid robotic assistants need reliable and stable visual perception to perform complex grasping and manipulation tasks, which require both recognizing an object and estimating its precise 6D pose. This paper addresses the challenge of detecting and localizing a known textureless object by estimating its complete 6D pose in cluttered scenes. We propose a 3D perception system that robustly recognizes objects described by CAD models in cluttered scenes so that a mobile manipulator can grasp them. Our approach combines two camera technologies, Time-of-Flight (TOF) and RGB, to segment the scene and extract objects: the depth image and the gray image are used together to recognize instances of a 3D object in the world and estimate their 3D poses. The full pose estimation process is based on depth image segmentation and efficient shape-based matching. First, the depth image is used to separate the supporting plane of the objects from the cluttered background, which removes background clutter and greatly reduces the search space. In an offline stage, a hierarchical model is generated from the geometry of the a priori CAD model of the object. Using this hierarchical model, we then perform shape-based matching in the 2D gray images. Finally, we validate the proposed method in a number of experiments. The results show that using depth and gray images together meets the demands of a time-critical application and significantly reduces the object recognition error rate.
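
To illustrate the first step, separating the supporting plane from the rest of the scene, the following is a minimal sketch and not the implementation used in this work. It assumes the depth image has already been converted to an N x 3 point cloud in the camera frame with the camera at the origin, and it uses a plain RANSAC plane fit, which is our assumption since the segmentation algorithm is not specified here. The function names `fit_plane_ransac` and `segment_objects` and all thresholds are hypothetical.

```python
import numpy as np


def fit_plane_ransac(points, threshold=0.01, iterations=200, seed=0):
    """Fit the plane n.x + d = 0 that has the most inliers among `points`.

    Returns (normal, d, inlier_mask). Illustrative helper, not from the paper.
    """
    rng = np.random.default_rng(seed)
    best_normal, best_d = np.array([0.0, 0.0, 1.0]), 0.0
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(iterations):
        # Three random points define a candidate plane.
        p0, p1, p2 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:  # skip (nearly) collinear samples
            continue
        normal = normal / norm
        d = -normal.dot(p0)
        # Points closer to the plane than `threshold` count as inliers.
        mask = np.abs(points @ normal + d) < threshold
        if mask.sum() > best_mask.sum():
            best_normal, best_d, best_mask = normal, d, mask
    return best_normal, best_d, best_mask


def segment_objects(points, threshold=0.01):
    """Keep only points lying between the camera and the supporting plane."""
    normal, d, _ = fit_plane_ransac(points, threshold=threshold)
    # Orient the normal toward the camera (assumed to sit at the origin),
    # whose signed distance to the plane is simply d.
    if d < 0:
        normal, d = -normal, -d
    signed_distance = points @ normal + d
    # Points on the camera side of the plane are object candidates;
    # the plane itself and everything behind it are discarded.
    return points[signed_distance > threshold]
```

Only the points returned by `segment_objects` would then need to be projected back into the gray image, which is how a step of this kind shrinks the region that shape-based matching has to search.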