Semantic Segmentation of Multi-source Remote Sensing Images Based on Visual Attention Mechanism
Citation: TAN Daning, LIU Yu, YAO Libo, DING Ziran, LU Xingqiang. Semantic Segmentation of Multi-source Remote Sensing Images Based on Visual Attention Mechanism[J]. Journal of Signal Processing, 2022, 38(6): 1180-1191. DOI: 10.16798/j.issn.1003-0530.2022.06.005
Authors: TAN Daning, LIU Yu, YAO Libo, DING Ziran, LU Xingqiang
Affiliation: 1. Institute of Information Fusion, Naval Aviation University, Yantai, Shandong 264001, China
Funding: National Natural Science Foundation of China (62022092); China Postdoctoral Science Foundation (2020M680631)
Abstract: In recent years, with the continuous development of spatial sensing technology, the demand for fused processing of multi-source remote sensing images has grown steadily, and how to effectively extract complementary information from multi-source images for specific tasks has become a research hotspot. To address the problems of information redundancy and difficult global feature extraction in fusion-based semantic segmentation of multi-source remote sensing images, this paper proposes Transformer U-Net (TU-Net), a Transformer-based semantic segmentation model that fuses multispectral (MS), panchromatic (PAN), and synthetic aperture radar (SAR) images. The model uses a Channel-Exchanging-Network (CEN) to exchange channels among the multi-source feature maps in the fusion branches, obtaining better information complementarity and reducing data redundancy. After the feature maps are concatenated, a Transformer module with an attention mechanism models the global context of the fused feature map, extracts the global features of the multi-source images, and segments them in an end-to-end manner. Training and validation on the MSAW dataset show that, compared with current multi-source fusion semantic segmentation algorithms, TU-Net improves the F1 score by 3.31%~11.47% and the Dice coefficient by 4.87%~8.55%, with a clear improvement in building segmentation.
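The two evaluation metrics reported above can be made concrete with a minimal sketch. For binary segmentation masks (building vs. background), Dice and F1 are both computed from the same true-positive/false-positive/false-negative pixel counts; the function name and list-of-labels interface are illustrative, not from the paper:

```python
def dice_and_f1(pred, target):
    """Compute the Dice coefficient and F1 score for flat lists of 0/1 pixel labels."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)  # true positives
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)  # false positives
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    dice = 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0
    return dice, f1
```

Note that for a binary mask the two formulas coincide, so separate F1 and Dice figures in the paper most likely reflect different averaging or per-class conventions.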

Keywords: multi-source remote sensing image; semantic segmentation; image fusion; attention mechanism
Received: 2021-08-30

Semantic Segmentation of Multi-source Remote Sensing Images Based on Visual Attention Mechanism
Affiliation: 1. The Institute of Information Fusion, Naval Aviation University, Yantai, Shandong 264001, China; 2. Zhongke Satellite (Shandong) Technology Group Co., Ltd., Jinan, Shandong 250199, China
Abstract: In recent years, with the continuous development of spatial sensing technology, the demand for fusion processing of multi-source remote sensing images has gradually increased, and how to effectively extract complementary information from multi-source images to complete specific tasks has become a research hotspot. To address the problems of information redundancy and global feature extraction of multi-source images in the semantic segmentation task, this paper proposes Transformer U-Net (TU-Net), a Transformer-based model for fused segmentation of multispectral (MS), panchromatic (PAN), and synthetic aperture radar (SAR) images. The model uses a Channel-Exchanging-Network (CEN) to exchange channels among the multi-source remote sensing feature maps in the fusion branches, so as to obtain better information complementarity and reduce data redundancy. After the feature maps are concatenated, the global context of the fused feature map is modeled by a Transformer module with an attention mechanism, the global features of the multi-source remote sensing images are extracted, and the multi-source images are segmented in an end-to-end manner. Training and validation results on the MSAW dataset show that, compared with current multi-source fusion semantic segmentation algorithms, the F1 score and Dice coefficient are improved by 3.31%~11.47% and 4.87%~8.55% respectively, significantly improving building segmentation.
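The channel-exchange step at the core of CEN can be illustrated with a minimal sketch. The assumption here (following the usual Channel-Exchanging-Network formulation) is that a channel whose BatchNorm scaling factor (gamma) falls below a sparsity threshold is treated as uninformative and replaced by the corresponding channel from the other modality branch; the threshold value, function name, and list-based representation are illustrative, not from the paper:

```python
def channel_exchange(feats_a, feats_b, gamma_a, gamma_b, threshold=0.02):
    """Exchange low-importance channels between two modality branches.

    feats_a, feats_b : per-channel feature maps of two modalities (any objects)
    gamma_a, gamma_b : per-channel BatchNorm scaling factors for each branch
    A branch keeps its own channel when its gamma is >= threshold, and
    otherwise takes the other modality's channel at the same position.
    """
    out_a, out_b = [], []
    for fa, fb, ga, gb in zip(feats_a, feats_b, gamma_a, gamma_b):
        out_a.append(fb if ga < threshold else fa)  # branch A: swap in B's channel if A's gamma is tiny
        out_b.append(fa if gb < threshold else fb)  # branch B: swap in A's channel if B's gamma is tiny
    return out_a, out_b
```

Because the exchange is gated by learned BatchNorm scales rather than a fixed mask, each branch only imports channels where its own representation carries little information, which is how the paper's fusion branches gain complementarity without duplicating data.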
Keywords: multi-source remote sensing image; semantic segmentation; image fusion; attention mechanism