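Semantic Segmentation of Multi-Scale Fusion Aerial Image Based on Attention Mechanism
Cite this article: ZHENG Guping, WANG Min, LI Gang. Semantic Segmentation of Multi-Scale Fusion Aerial Image Based on Attention Mechanism[J]. Journal of Graphics, 2018, 39(6): 1069. DOI: 10.11996/JG.j.2095-302X.2018061069
Authors: ZHENG Guping, WANG Min, LI Gang
Affiliation: School of Control and Computer Engineering, North China Electric Power University, Baoding, Hebei 071003, China
Funding: National Natural Science Foundation of China (51407076); Fundamental Research Funds for the Central Universities (2018MS075)
Abstract: In aerial images, objects within the same scene differ greatly in scale, so segmentation at a single scale often fails to achieve the best classification results. To address this problem, a multi-scale fusion model based on an attention mechanism is proposed. First, dilated convolutions with different sampling rates extract features of the aerial image at multiple scales. Then, an attention mechanism is introduced in the multi-scale fusion stage so that the model automatically focuses on the appropriate scale and assigns a weight to every scale at each pixel position. Finally, the weighted, fused feature map is upsampled to the original image size, and every pixel of the aerial image is semantically labeled. Experimental results show that, compared with the conventional FCN and DeepLab semantic segmentation models and with other aerial image segmentation models, the attention-based multi-scale fusion model not only achieves higher segmentation accuracy but also allows the importance of different scales and pixel positions to be analyzed by visualizing the weight map corresponding to each scale's features.

Keywords: semantic segmentation; multi-scale fusion; attention mechanism; convolutional neural network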

Semantic Segmentation of Multi-Scale Fusion Aerial Image Based on Attention Mechanism
ZHENG Guping, WANG Min, LI Gang. Semantic Segmentation of Multi-Scale Fusion Aerial Image Based on Attention Mechanism[J]. Journal of Graphics, 2018, 39(6): 1069. DOI: 10.11996/JG.j.2095-302X.2018061069
Authors: ZHENG Guping, WANG Min, LI Gang
Affiliation: School of Computer and Control Engineering, North China Electric Power University, Baoding, Hebei 071003, China
Abstract: In aerial images, objects in the same scene differ significantly in scale, so single-scale segmentation often fails to achieve the best classification results. To solve this problem, we propose a multi-scale fusion model based on an attention mechanism. First, multi-scale features of the aerial image are extracted using dilated convolutions with different sampling rates. Then, an attention mechanism is applied in the multi-scale fusion stage so that the model automatically focuses on the appropriate scale and learns to weight every scale at each pixel location. Finally, the weighted sum of the feature maps is upsampled to the original image size, and each pixel of the aerial image is semantically labeled. Experiments demonstrate that, compared with the traditional FCN and DeepLab methods and with other aerial image segmentation models, the attention-based multi-scale fusion model not only achieves higher segmentation accuracy but also makes it possible to analyze the importance of different scales and pixel locations by visualizing the weight map corresponding to each scale's features.
Keywords: semantic segmentation; multi-scale fusion; attention mechanism; convolutional neural network
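
The fusion idea summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it works on a 1D signal instead of a 2D feature map, the kernel and the attention scores are hand-picked placeholders (in the paper they would be learned), and the final upsampling step is omitted. The function names `dilated_conv1d`, `softmax_over_scales`, and `attention_fuse` are hypothetical. The sketch only shows how dilated convolutions at several rates yield per-scale features and how a softmax across scales gives one weight per scale at each position.

```python
import math


def dilated_conv1d(signal, kernel, rate):
    """Zero-padded, same-length 1D dilated convolution (cross-correlation form)."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * rate  # taps are spaced `rate` samples apart
            if 0 <= idx < len(signal):     # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def softmax_over_scales(scores):
    """scores[s][i] is the score of scale s at position i.

    Returns weights of the same shape that sum to 1 over s at every i.
    """
    n_scales, n_pos = len(scores), len(scores[0])
    weights = [[0.0] * n_pos for _ in range(n_scales)]
    for i in range(n_pos):
        m = max(scores[s][i] for s in range(n_scales))  # subtract max for stability
        exps = [math.exp(scores[s][i] - m) for s in range(n_scales)]
        total = sum(exps)
        for s in range(n_scales):
            weights[s][i] = exps[s] / total
    return weights


def attention_fuse(features, scores):
    """Weighted sum of per-scale features using per-position softmax weights."""
    weights = softmax_over_scales(scores)
    n_scales, n_pos = len(features), len(features[0])
    fused = [
        sum(weights[s][i] * features[s][i] for s in range(n_scales))
        for i in range(n_pos)
    ]
    return fused, weights


if __name__ == "__main__":
    signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    kernel = [1.0, 1.0, 1.0]  # placeholder kernel, not a learned filter
    rates = [1, 2, 4]         # assumed dilation rates, one per scale
    features = [dilated_conv1d(signal, kernel, r) for r in rates]

    # Placeholder attention scores; a real model would predict these per pixel.
    scores = [[0.0] * len(signal), [1.0] * len(signal), [2.0] * len(signal)]
    fused, weights = attention_fuse(features, scores)
    print("fused:", fused)
    print("weights at position 0:", [w[0] for w in weights])
```

Because the weights form one map per scale, they can be inspected or plotted directly, which is the kind of per-scale, per-position analysis the abstract describes.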
This article is indexed in CNKI and other databases.