A semantic segmentation method for road scenes based on small samples
21 June 2024
Fang Zuo, Zhenxing Luo, Zhihua Gan
Proceedings Volume 13167, International Conference on Remote Sensing, Mapping, and Image Processing (RSMIP 2024); 131671F (2024) https://doi.org/10.1117/12.3029815
Event: International Conference on Remote Sensing, Mapping and Image Processing (RSMIP 2024), 2024, Xiamen, China
Abstract
Previous convolutional neural networks (CNNs) for semantic segmentation, particularly of road scenes, suffer from overfitting, insensitivity to positional information, and limited robustness, all of which stem from their convolution and pooling operations. In this paper, we propose a multi-scale multi-feature fusion self-attention network (MSMA-Net) based on the U-Net architecture. The Decoder stage of the U-Net structure is removed, and only the first four sampling layers of the Encoder are retained. The final features from each of these layers are fed simultaneously into a multi-scale pyramid pooling structure, where pooling operations at different scales merge the features into a unified dimension. The output features are then passed through a Transformer Encoder stage and an MLP Head to produce the final classification results. The proposed method is trained on the Cityscapes and CamVid datasets, with half of the data randomly selected for training. The resulting mean intersection over union (mIoU) scores are 77.9% on Cityscapes and 72.2% on CamVid, demonstrating significant advantages over other networks.
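For readers unfamiliar with the metric, the following is a minimal sketch of how mIoU is commonly computed from a confusion matrix. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou` are hypothetical, the labels are assumed to be flattened per-pixel class indices, and the ignore value of 255 is an assumption borrowed from the usual Cityscapes convention.

```python
from typing import List, Sequence


def confusion_matrix(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "void" label; not specified in the abstract
) -> List[List[int]]:
    """Count pixels by (ground-truth class, predicted class)."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:
            continue  # skip pixels without a valid ground-truth label
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # class c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both labels and predictions
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and one ignored pixel.
y_true = [0, 0, 1, 1, 2, 255]
y_pred = [0, 1, 1, 1, 2, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(round(mean_iou(cm), 4))
```

In the toy example the per-class IoU values are 1/2, 2/3, and 1, so the mean is roughly 0.72. Skipping classes that never appear is one common choice; other evaluation scripts may handle absent classes differently.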
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Fang Zuo, Zhenxing Luo, and Zhihua Gan "A semantic segmentation method for road scenes based on small samples", Proc. SPIE 13167, International Conference on Remote Sensing, Mapping, and Image Processing (RSMIP 2024), 131671F (21 June 2024); https://doi.org/10.1117/12.3029815
KEYWORDS
Image segmentation
Semantics
Education and training
Transformers
Roads
Image processing
Image classification