A semantic segmentation method for road scenes based on small samples

Fang Zuo; Zhenxing Luo; Zhihua Gan

doi:10.1117/12.3029815

21 June 2024

Proceedings Volume 13167, International Conference on Remote Sensing, Mapping, and Image Processing (RSMIP 2024); 131671F (2024) https://doi.org/10.1117/12.3029815
Event: International Conference on Remote Sensing, Mapping and Image Processing (RSMIP 2024), 2024, Xiamen, China

Abstract

Convolutional neural networks (CNNs) used for semantic segmentation, particularly of road scenes, suffer from overfitting, insensitivity to positional information, and limited robustness, owing to their convolution and pooling operations. In this paper, we propose a multi-scale multi-feature fusion self-attention network (MSMA-Net) based on the U-Net architecture. The decoder stage of U-Net is removed, and only the first four sampling layers of the encoder are retained. The final features of each layer are fed simultaneously into a multi-scale pyramid pooling structure, in which pooling operations at different scales merge the features into a unified dimension. The output features then pass through a Transformer encoder stage and an MLP head to produce the final classification results. The method is trained on the Cityscapes and CamVid datasets, using a randomly selected half of the data for training. It achieves mean intersection over union (mIoU) scores of 77.9% on Cityscapes and 72.2% on CamVid, demonstrating significant advantages over other networks.
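The abstract reports results as mIoU after training on a randomly selected half of each dataset. As a minimal sketch of those two steps, the Python below draws a random half of a sample list and computes mIoU from flat label sequences. It is not the authors' code. The function names `select_half` and `mean_iou`, the fixed seed, and the ignore label 255 are illustrative assumptions, and the paper's exact evaluation protocol is not stated in the abstract.

```python
import random


def select_half(items, seed=0):
    """Randomly pick half of the samples (rounded down) for training.

    Hypothetical helper: the abstract only says half of the data is
    randomly selected, not how.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[: len(shuffled) // 2]


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or ground truth.

    pred and target are flat sequences of integer class labels of equal
    length. ignore_index=255 is an assumed label for unannotated pixels.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 3 classes, 6 pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5. Benchmark scores are normally accumulated over all pixels of the evaluation set rather than averaged per image, so a real evaluation would sum the three counters across images before dividing.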

(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.

Citation

Fang Zuo, Zhenxing Luo, and Zhihua Gan "A semantic segmentation method for road scenes based on small samples", Proc. SPIE 13167, International Conference on Remote Sensing, Mapping, and Image Processing (RSMIP 2024), 131671F (21 June 2024); https://doi.org/10.1117/12.3029815

Proceedings paper, 9 pages

KEYWORDS

Image segmentation

Semantics

Education and training

Transformers

Roads

Image processing

Image classification
