Self-supervised learning based few-shot remote sensing scene image classification
Citation: Zhang Rui, Yang Yixin, Li Yang, Wang Jiabao, Miao Zhuang, Li Hang, Wang Ziqi. Self-supervised learning based few-shot remote sensing scene image classification[J]. Journal of Image and Graphics, 2022, 27(11): 3371-3381.
Authors: Zhang Rui  Yang Yixin  Li Yang  Wang Jiabao  Miao Zhuang  Li Hang  Wang Ziqi
Affiliation: Command and Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China
Funding: National Natural Science Foundation of China Youth Program (61806220); Natural Science Foundation of Jiangsu Province (BK20200581)
Abstract: Objective Convolutional neural networks (CNNs) are widely used in remote sensing scene image classification, but the lack of training data remains a problem that cannot be ignored. Few-shot remote sensing scene classification means that a model can complete the classification task after training on only a small number of samples. Although existing meta-learning-based few-shot methods reduce the dependence on large-scale training data, the generalization ability of the resulting models remains weak. To address this problem, this paper proposes a self-supervised-learning-based few-shot remote sensing scene image classification method that improves generalization. Method The method has two stages. First, a teacher network is trained with meta-learning until convergence; then, dual student networks and the teacher network make predictions on the same input, and the teacher's predictions guide the training of the dual student networks through a distillation loss. In addition, before image features enter the classifier, self-supervised contrastive learning measures the class-center distances of same-class samples, so that the model learns clearer inter-class boundaries. The two self-supervision mechanisms enable the model to learn richer inter-class relationships and thus improve generalization. Result Experiments are conducted on NWPU-RESISC45 (North Western Polytechnical University-remote sensing image scene classification), AID (aerial ima...

Keywords: few-shot learning  remote sensing scene classification  self-supervised learning  distillation learning  contrastive learning
Received: 23 June 2021
Revised: 17 October 2021

Self-supervised learning based few-shot remote sensing scene image classification
Zhang Rui, Yang Yixin, Li Yang, Wang Jiabao, Miao Zhuang, Li Hang, Wang Ziqi. Self-supervised learning based few-shot remote sensing scene image classification[J]. Journal of Image and Graphics, 2022, 27(11): 3371-3381.
Authors:Zhang Rui  Yang Yixin  Li Yang  Wang Jiabao  Miao Zhuang  Li Hang  Wang Ziqi
Affiliation:Command and Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China
Abstract: Objective Convolutional neural networks (CNNs) have been widely used in remote sensing scene image classification, but such data-driven models suffer from overfitting and low robustness when labeled data are scarce. Training a model for remote sensing scene classification with few labeled samples is still challenging, so an effective algorithm that adapts to small-scale data is needed. Few-shot learning can improve the generalization ability of such models. Current meta-learning-based few-shot remote sensing scene classification methods can reduce the dependence on large-scale training data, but their robustness remains limited. A further challenge of remote sensing scenes is their small inter-class variation and large intra-class variation, which lowers the robustness of few-shot learning. This paper presents a novel self-supervised learning framework for few-shot remote sensing scene image classification, which improves the generalization ability of the model by learning richer inter-class relationships. Method The proposed framework consists of three modules: data preprocessing, feature extraction, and loss functions. 1) The data preprocessing module resizes and normalizes all inputs and constructs the support and query sets for few-shot learning. The support set contains a small number of labeled images, while the query set is unlabeled; the few-shot task is to classify the query samples using a support set drawn from the same group of classes. The module can construct many such support/query pairs (episodes). 2) The feature extraction module extracts support features and query features from the inputs; for knowledge distillation it comprises a teacher network and dual student networks.
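The episodic support/query construction described above can be sketched as follows. This is a minimal illustration, not the authors' code; the catalog layout, function name, and the default of 15 query images per class are assumptions.

```python
import random

def sample_episode(labels_to_images, n_way=5, k_shot=1, n_query=15, rng=None):
    """Sample one N-way K-shot episode: a small labeled support set and a
    query set drawn from the same n_way randomly chosen classes."""
    rng = rng or random.Random()
    classes = rng.sample(sorted(labels_to_images), n_way)
    support, query = [], []
    for cls in classes:
        images = rng.sample(labels_to_images[cls], k_shot + n_query)
        support += [(img, cls) for img in images[:k_shot]]   # labeled examples
        query += [(img, cls) for img in images[k_shot:]]     # to be classified
    return support, query

# Toy catalog: 10 classes with 20 image ids each (a stand-in for a real dataset).
catalog = {f"class_{c}": [f"img_{c}_{i}" for i in range(20)] for c in range(10)}
support, query = sample_episode(catalog, n_way=5, k_shot=1, n_query=15)
print(len(support), len(query))  # 5 support images, 75 query images
```

Repeating this sampling yields the many episodes used for meta-training and evaluation.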
The teacher feature extraction network is based on ResNet-50, and the dual student module consists of two Conv-64 networks. 3) The loss function module produces three losses: a few-shot loss, a knowledge distillation loss, and a self-supervised contrastive loss. The few-shot loss, produced by metric-based meta-learning, uses the ground-truth labels to update the parameters of the student networks. The knowledge distillation loss is a Kullback-Leibler (KL) divergence that measures the similarity between the probability distributions of the dual student networks and of the teacher network through the teacher's soft labels. Distillation follows a two-stage training process: first, the teacher network is trained with metric-based meta-learning; then, the student networks and the teacher network are fed the same data, and the teacher's outputs guide the students through the distillation loss. In addition, the self-supervised contrastive loss is computed from the distances between class centers: it performs an instance discrimination pretext task by reducing distances within the same class and enlarging distances between different classes. Together, the two self-supervision mechanisms enable the model to learn richer inter-class relationships, which improves generalization. Result Our method is evaluated on the North Western Polytechnical University remote sensing image scene classification (NWPU-RESISC45) dataset, the aerial image dataset (AID), and the UC Merced land use (UCMerced LandUse) dataset. A 5-way 1-shot task and a 5-way 5-shot task are carried out on each dataset. Our method is also compared with five other methods; the baseline is Relation Net*, a metric-based meta-learning method.
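A minimal NumPy sketch of the two self-supervised losses described above. The paper gives no code, so the function names, temperature values, and the cosine-similarity formulation of the class-center contrastive term are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(z, t=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(z, dtype=float) / t
    z -= z.max(axis=-1, keepdims=True)          # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, t=4.0):
    """KL divergence from the teacher's softened distribution (soft labels)
    to the student's, averaged over the batch."""
    p = softmax(teacher_logits, t)              # teacher soft labels
    q = softmax(student_logits, t)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

def contrastive_center_loss(features, labels, t=0.5):
    """Pull each sample toward its own class center and push apart centers of
    different classes, using cosine similarity with a temperature."""
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    classes = sorted(set(labels))
    centers = np.stack([feats[[i for i, y in enumerate(labels) if y == c]].mean(0)
                        for c in classes])
    centers /= np.linalg.norm(centers, axis=1, keepdims=True)
    sim = centers @ feats.T / t                 # (n_class, n_sample) similarities
    logp = sim - np.log(np.exp(sim).sum(0, keepdims=True))   # log-softmax over classes
    idx = [classes.index(y) for y in labels]
    return float(-np.mean(logp[idx, range(len(labels))]))
```

When student and teacher logits agree, the distillation loss is zero; the contrastive term shrinks as each feature moves closer to its own class center than to the others.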
For the 5-way 1-shot task, our method achieves 72.72% ± 0.15%, 68.62% ± 0.76%, and 68.21% ± 0.65% on the three datasets, respectively, which is 4.43%, 1.93%, and 0.68% higher than Relation Net*. For the 5-way 5-shot task, our results are 3.89%, 2.99%, and 1.25% higher than Relation Net*. Confusion matrices are also visualized on AID and UCMerced LandUse; they show that the self-supervised method reduces misclassification among hard-to-distinguish classes. Conclusion We develop a self-supervised method, consisting of a dual-student knowledge distillation mechanism and a self-supervised contrastive learning mechanism, to address the low robustness caused by data scarcity. Dual-student knowledge distillation uses the soft labels of the teacher network as supervision for the student networks, improving the robustness of few-shot learning through richer inter-class and intra-class relationships. Self-supervised contrastive learning evaluates the similarity of different class centers in a representation space, helping the model learn better class centers. The feasibility of self-supervised distillation and contrastive learning is demonstrated. Integrating self-supervised transfer learning tasks with few-shot remote sensing scene image classification further is a promising direction.
Keywords:few-shot learning  remote sensing scene classification  self-supervised learning  distillation learning  contrastive learning