PasteFusion: innovating multimodal sensor fusion for enhanced 3D object detection
13 June 2024
Yuhong Yuan, Kai Zhang, Mingbo Yang, Shuxiang Li, Yu Liang
Proceedings Volume 13180, International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2024); 131803Q (2024) https://doi.org/10.1117/12.3034162
Event: International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2024), 2024, Guangzhou, China
Abstract
In autonomous driving, integrating diverse sensors is essential for precise 3D object detection. Each modality contributes something distinct: cameras provide semantic richness, LiDAR provides spatial accuracy, and radar provides velocity measurements. Fusing these data is vital but challenging because the modalities differ in format, scale, and perspective. Existing methods, such as aligning multi-sensor data to a common perspective or building a unified bird's-eye view (BEV) representation, attempt to mitigate these challenges while preserving data integrity. To advance this line of work, we introduce PasteFusion, a multimodal fusion framework for 3D object detection in autonomous driving systems. The framework merges LiDAR and image data using a deformable attention mechanism combined with convolutional fusion, which removes the need for intricate transformations and monocular depth estimation. We also present Paste-Aug, an algorithm that harmonizes data augmentation across images and point clouds. It effectively addresses occlusions in the image domain, thus reducing computational demands. Our approach significantly improves 3D object detection, achieving a mean average precision (mAP) of 69.4% and a nuScenes detection score (NDS) of 72.4% on the nuScenes validation set.
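The abstract does not describe how Paste-Aug works internally. As a rough illustration of the general idea of keeping image and point cloud augmentation consistent while handling occlusion, the sketch below pastes sampled objects into both modalities and draws image patches from far to near, so closer objects cover farther ones. It is not the authors' algorithm: the `PastedObject` fields, the `paste_objects` function, and the depth-ordering rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class PastedObject:
    """Hypothetical container for one object sampled for augmentation."""
    points: list   # LiDAR points (x, y, z) belonging to the object
    patch: list    # 2D image crop, stored as rows of pixel values
    top: int       # image row of the patch's top-left corner
    left: int      # image column of the patch's top-left corner
    depth: float   # distance from the camera, in metres


def paste_objects(scene_points, image, objects):
    """Paste objects into both modalities; nearer objects occlude farther ones."""
    out_points = list(scene_points)
    out_image = [row[:] for row in image]  # copy so the input stays untouched
    height, width = len(out_image), len(out_image[0])

    # Assumption: draw far-to-near, so a closer patch overwrites a farther
    # one wherever the two overlap in the image plane.
    for obj in sorted(objects, key=lambda o: o.depth, reverse=True):
        out_points.extend(obj.points)
        for r, patch_row in enumerate(obj.patch):
            for c, value in enumerate(patch_row):
                y, x = obj.top + r, obj.left + c
                if 0 <= y < height and 0 <= x < width:  # clip to image bounds
                    out_image[y][x] = value
    return out_points, out_image
```

Called with a near object and a far object whose patches overlap, the overlapping pixels take the near object's values, while the points of both objects are appended to the LiDAR scene.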
© (2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Yuhong Yuan, Kai Zhang, Mingbo Yang, Shuxiang Li, and Yu Liang "PasteFusion: innovating multimodal sensor fusion for enhanced 3D object detection", Proc. SPIE 13180, International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2024), 131803Q (13 June 2024); https://doi.org/10.1117/12.3034162
KEYWORDS
Object detection
Point clouds
LIDAR
Image fusion
Cameras
Deformation
3D image processing