Recently, high spatial resolution remote sensing image scene classification has found a wide range of applications and has become a research hotspot in the remote sensing field. Because of the complexity of the scenes in remote sensing images, it is impossible to annotate all ground object classes at once. To adapt to different application scenarios, high spatial resolution remote sensing image scene classification models therefore need zero-shot generalization ability for unseen classes. To improve this ability, existing methods often start from the perspective of image features alone, ignoring the higher-order semantic information in the scene. In fact, the associations among higher-order semantic information in the scene are very important for the generalization ability of a classification model: people typically combine image information with its corresponding higher-order semantic information to understand remote sensing scenes. Therefore, this work proposes a text-guided remote sensing image pre-training model for zero-shot classification of high spatial resolution remote sensing image scenes. First, a transformer model is used to extract the embedded features of the text and the remote sensing images. Then, based on aligned text and remote sensing image data, a contrastive learning method is used to train the model to learn the correspondence between text and image features. After training is complete, the nearest-neighbor method is used to perform zero-shot classification on the target data. The effectiveness of the proposed method was verified on three remote sensing image scene classification benchmark datasets.
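The training objective and zero-shot inference step described above can be illustrated with a minimal PyTorch sketch. The abstract gives no implementation details, so the symmetric InfoNCE loss, the temperature value, and the function names below are assumptions chosen to match a standard contrastive image-text setup; at inference time, class-name text embeddings stand in for the aligned captions.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of aligned image-text pairs."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # (B, B) similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    # Matching pairs lie on the diagonal; all other pairs act as negatives.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

@torch.no_grad()
def zero_shot_classify(image_emb, class_text_emb):
    """Nearest-neighbor assignment: each image gets the closest class text."""
    image_emb = F.normalize(image_emb, dim=-1)
    class_text_emb = F.normalize(class_text_emb, dim=-1)
    return (image_emb @ class_text_emb.t()).argmax(dim=-1)
```

Because the class embeddings come from the text encoder rather than from a fixed classification head, unseen classes can be handled simply by embedding their names, which is what enables the zero-shot setting.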
In recent years, deep-learning-based hyperspectral image (HSI) processing and analysis have made significant progress. However, high-performance models require sufficient training samples, and scarce labeled samples limit their generalization ability. To solve this problem, we adopt a self-supervised learning strategy and self-train a neural network model on different views of the same sample (positive pairs). As a result, the network can learn representative features for classification from unlabeled samples. In addition, to enlarge the spatial receptive field beyond that of conventional convolutions, we use a transformer to capture long-distance dependencies for feature enhancement, combining the advantages of both. Experimental results on two publicly available HSI datasets demonstrate that the proposed method can extract robust features through self-training on unlabeled samples and can be adapted to HSI classification tasks under small-sample conditions.
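The abstract only states that different views of the same sample serve as positive pairs. A common way to turn such pairs into a training signal is the normalized temperature-scaled cross-entropy (NT-Xent) loss used by SimCLR-style methods; the sketch below is that standard formulation, offered as an assumption rather than the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent: two augmented views of each sample form the positive pair;
    the other samples in the batch act as negatives."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=-1)   # (2B, D)
    sim = z @ z.t() / temperature                         # (2B, 2B)
    n = z1.size(0)
    # Mask self-similarity so a sample is never its own negative.
    sim.fill_diagonal_(float('-inf'))
    # The positive of sample i is its other view at index (i + n) mod 2n.
    targets = torch.cat([torch.arange(n, 2 * n),
                         torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```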
To fully use the contextual information of hyperspectral images (HSIs), we propose a U-shaped network model combined with an attention mechanism to achieve image-level HSI classification. First, the entire HSI is input into the network for end-to-end training, and the classification results for the whole scene are output directly; the contextual information improves the classification accuracy while eliminating many redundant calculations. Second, to further improve the classification accuracy, a hybrid attention module mixing two dimensions (i.e., space and channel) is designed. Third, three datasets, the University of Pavia, Indian Pines, and Salinas, are selected for the classification experiments. The experimental results show that, compared with other methods, the proposed method obtains higher classification accuracy with higher training and testing efficiency.
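The abstract does not specify how the spatial and channel branches are combined, so the sketch below follows the common CBAM-style pattern of channel attention followed by spatial attention; the module name and reduction ratio are hypothetical.

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Channel attention followed by spatial attention (CBAM-style sketch)."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):                      # x: (B, C, H, W)
        b, c, _, _ = x.shape
        # Channel attention from globally pooled descriptors.
        avg = self.channel_mlp(x.mean(dim=(2, 3)))
        mx = self.channel_mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # Spatial attention from channel-pooled maps.
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial_conv(s))
```

Such a module can be dropped between encoder or decoder stages of the U-shaped network without changing the feature map shape.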
Recently, convolutional neural networks have greatly improved the performance of hyperspectral image (HSI) classification. However, these methods mainly use local spatial–spectral information for HSI classification and require a large number of labeled samples to ensure high classification accuracy. In our study, we propose a multiscale nested U-Net (MsNU-Net) to capture global context information and improve the HSI classification accuracy with a small number of labeled samples. We take an HSI as the input and construct a nested U-Net to perform the classification. Because scale is very important for image recognition, we propose a simple but effective multiscale loss function. Rather than introducing multiscale features into the network architecture, this method uses Gaussian filters to construct multiscale data, inputs the multiscale data into the nested U-Net with shared parameters, and takes the sum of the loss functions at the different scales as the final loss. This introduces global context information at different scales and thus improves the classification accuracy. To demonstrate its effectiveness, we carried out classification experiments on four widely used HSIs. The results show that this method achieves higher classification accuracy than the compared methods when only a small number of labeled samples are available. The code of the proposed method is freely available at https://github.com/liubing220524/MsNU-Net.
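The multiscale loss is simple enough to sketch directly: Gaussian-smoothed copies of the input are pushed through the same shared-parameter network and the per-scale losses are summed. The kernel size and sigma values below are assumptions; only the overall scheme follows the abstract.

```python
import torch
import torch.nn.functional as F
from torchvision.transforms.functional import gaussian_blur

def multiscale_loss(model, x, labels, sigmas=(0.0, 1.0, 2.0)):
    """Sum of classification losses over Gaussian-smoothed copies of the
    input, all passed through the same (shared-parameter) network."""
    total = 0.0
    for sigma in sigmas:
        # sigma == 0.0 means the original, unsmoothed input.
        xs = x if sigma == 0.0 else gaussian_blur(x, kernel_size=5, sigma=sigma)
        total = total + F.cross_entropy(model(xs), labels)
    return total
```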
Recently, deep learning models based on convolutional neural networks (CNNs) remain dominant in hyperspectral image (HSI) classification. However, CNN models have some limitations, such as difficulty modeling long-distance dependencies and obtaining global context information. Unlike the existing CNN-based models, an innovative classification method based on the transformer model is proposed to further improve the classification accuracy of HSI. Specifically, the proposed method first extracts the extended morphological profile (EMP) features of the HSI to make full use of the spatial and spectral information while effectively reducing the number of bands. Next, a deep network model is constructed by introducing transformer-in-transformer (TNT) modules to carry out end-to-end classification. The outer and inner transformer models in the TNT module extract patch-level and pixel-level features, respectively, to make full use of the global and local information in the input EMP cubes. Experimental results on three public HSI datasets show that the proposed method achieves better classification performance than the existing CNN-based models. In addition, using a convolution-free transformer-based deep model to classify HSI provides a new idea for related research.
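The interaction between the inner (pixel-level) and outer (patch-level) transformers can be sketched with standard PyTorch layers. The projection that folds pixel tokens into patch tokens follows the TNT design in spirit only; the dimensions, head counts, and class name here are assumptions (and each token dimension must be divisible by the head count).

```python
import torch
import torch.nn as nn

class TNTBlock(nn.Module):
    """One transformer-in-transformer block: an inner transformer refines
    pixel-level tokens inside each patch, and the flattened result is folded
    into the patch-level tokens before the outer transformer runs."""
    def __init__(self, pixel_dim, patch_dim, pixels_per_patch, heads=4):
        super().__init__()
        self.inner = nn.TransformerEncoderLayer(pixel_dim, heads,
                                                batch_first=True)
        self.outer = nn.TransformerEncoderLayer(patch_dim, heads,
                                                batch_first=True)
        self.proj = nn.Linear(pixel_dim * pixels_per_patch, patch_dim)

    def forward(self, pixel_tokens, patch_tokens):
        # pixel_tokens: (B * num_patches, pixels_per_patch, pixel_dim)
        # patch_tokens: (B, num_patches, patch_dim)
        pixel_tokens = self.inner(pixel_tokens)        # local (pixel) mixing
        b, n, _ = patch_tokens.shape
        fused = self.proj(pixel_tokens.reshape(b, n, -1))
        patch_tokens = self.outer(patch_tokens + fused)  # global (patch) mixing
        return pixel_tokens, patch_tokens
```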
Deep learning has been widely used in hyperspectral image (HSI) classification. However, deep learning models are data-driven and need a large amount of labeled data, while collecting labeled data for an HSI classification task is quite time-consuming, so such models cannot deal well with the small-sample problem. We explore the small-sample classification problem of HSI with a graph convolutional network (GCN). First, an HSI with a small number of labeled samples is treated as a graph. Then, the GCN (an efficient variant of convolutional neural networks) operates directly on the graph constructed from the HSI. The GCN uses the adjacent nodes in the graph to approximate the convolution; in other words, graph convolution can use both labeled and unlabeled nodes, so our method is semisupervised. Three HSIs are used to assess the performance of the proposed method. The experimental results show that the proposed method outperforms traditional semisupervised methods and advanced deep learning methods.
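A graph-convolution layer of the kind described here has a well-known closed form, H' = act(D^-1/2 (A + I) D^-1/2 H W). The sketch below implements it densely for brevity; the dense adjacency matrix and the layer name are assumptions. The semisupervised character comes from evaluating the loss only at labeled nodes while the propagation still draws on unlabeled neighbors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    """One graph-convolution layer: H' = relu(D^-1/2 (A+I) D^-1/2 H W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h, adj):
        # Add self-loops, then symmetrically normalize the adjacency matrix.
        a_hat = adj + torch.eye(adj.size(0), device=adj.device)
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
        a_norm = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
        return F.relu(a_norm @ self.weight(h))
```

During training, the cross-entropy loss would be computed only over the indices of the labeled nodes, e.g. `F.cross_entropy(out[labeled_idx], y[labeled_idx])`.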
Deep learning methods have recently been successfully explored for hyperspectral image classification. However, they may not perform well when training samples are scarce. A deep transfer learning method is proposed to improve hyperspectral image classification performance in the situation of limited training samples. First, a Siamese network composed of two convolutional neural networks is designed for local image descriptor extraction. Subsequently, the pretrained Siamese network model is reused to transfer knowledge to the hyperspectral image classification task by feeding the deep features extracted from each band into a recurrent neural network; in this way, a deep convolutional recurrent neural network is constructed for hyperspectral image classification. Finally, the entire network is fine-tuned with a small number of labeled samples. An important characteristic of the designed model is that the deep convolutional recurrent neural network provides a way of utilizing the spatial–spectral features without dimension reduction. Furthermore, the transfer learning method provides an opportunity to train such a deep model with limited labeled samples. Experiments on three widely used hyperspectral datasets demonstrate that the proposed transfer learning method improves the classification performance, and competitive classification results are achieved compared with state-of-the-art methods.
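The band-by-band pipeline can be sketched as follows: a CNN branch (e.g., one arm of the pretrained Siamese network) encodes the spatial patch of each band, and a recurrent network aggregates the resulting sequence along the spectral axis. The GRU choice, the single-channel band input, and the class and argument names are assumptions.

```python
import torch
import torch.nn as nn

class ConvRecurrentClassifier(nn.Module):
    """Sketch: a (pretrained) CNN encodes the spatial patch of each band,
    and a GRU aggregates the per-band descriptors along the spectrum."""
    def __init__(self, cnn, feat_dim, hidden_dim, num_classes):
        super().__init__()
        self.cnn = cnn    # assumed to map a (1, H, W) patch to feat_dim
        self.gru = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):                   # x: (B, bands, H, W)
        b, n, h, w = x.shape
        feats = self.cnn(x.reshape(b * n, 1, h, w))   # one band at a time
        _, last = self.gru(feats.reshape(b, n, -1))   # spectral aggregation
        return self.head(last.squeeze(0))
```

Because every band keeps its own descriptor, no dimensionality reduction is applied before the spectral aggregation step, matching the characteristic highlighted in the abstract.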
Recently, hyperspectral image (HSI) classification has become a focus of research. However, the complex structure of an HSI makes feature extraction difficult. Most current methods build classifiers on complex handcrafted features computed from the raw inputs. Here, the design of an improved 3-D convolutional neural network (3D-CNN) model for HSI classification is described. This model extracts features from both the spectral and spatial dimensions through 3-D convolutions, thereby capturing the important discriminative information encoded in multiple adjacent bands. The designed model processes the HSI cube as a whole without relying on any pre- or postprocessing, and it is trained in an end-to-end fashion without any handcrafted features. The designed model was applied to three widely used HSI datasets. The experimental results demonstrate that the 3D-CNN-based method outperforms conventional methods even with limited labeled training samples.
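A minimal 3-D CNN of the kind described takes a small spatio-spectral patch of shape (B, 1, bands, H, W) and convolves jointly over the band and spatial axes. The layer widths, kernel sizes, and default patch dimensions below are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class Simple3DCNN(nn.Module):
    """Minimal 3-D CNN sketch: 3-D convolutions slide jointly over the
    spectral and spatial axes of an HSI patch (B, 1, bands, H, W)."""
    def __init__(self, num_classes, bands=30, patch=9):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3)), nn.ReLU(inplace=True),
            nn.Conv3d(8, 16, kernel_size=(5, 3, 3)), nn.ReLU(inplace=True),
        )
        # Output size after the two valid convolutions above.
        feat = 16 * (bands - 10) * (patch - 4) * (patch - 4)
        self.classifier = nn.Linear(feat, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))
```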