Semantic Segmentation of Multi-Scale Fusion Aerial Image Based on Attention Mechanism

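Cite this article:	ZHENG Guping, WANG Min, LI Gang. Semantic Segmentation of Multi-Scale Fusion Aerial Image Based on Attention Mechanism[J]. Journal of Graphics, 2018, 39(6): 1069. DOI: 10.11996/JG.j.2095-302X.2018061069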

Authors:	ZHENG Guping, WANG Min, LI Gang

Affiliation:	School of Control and Computer Engineering, North China Electric Power University, Baoding, Hebei 071003, China

Funding:	National Natural Science Foundation of China (51407076); Fundamental Research Funds for the Central Universities (2018MS075)

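Abstract:	Objects in the same aerial scene vary widely in scale, so segmentation at a single scale often cannot achieve the best classification performance. To address this, a multi-scale fusion model based on an attention mechanism is proposed. First, dilated convolutions with different sampling rates extract features of the aerial image at multiple scales. Next, an attention mechanism is introduced at the multi-scale fusion stage, so that the model automatically focuses on the appropriate scale and assigns a weight to every scale at every pixel location. Finally, the weighted, fused feature map is upsampled to the size of the original image, and each pixel of the aerial image is given a semantic label. Experimental results show that, compared with the traditional FCN and DeepLab semantic segmentation models and with other aerial image segmentation models, the attention-based multi-scale fusion model achieves higher segmentation accuracy and also makes it possible to analyze the importance of different scales and pixel locations by visualizing the weight map associated with each scale's features.
Keywords:	semantic segmentation; multi-scale fusion; attention mechanism; convolutional neural network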
Semantic Segmentation of Multi-Scale Fusion Aerial Image Based on Attention Mechanism

ZHENG Guping, WANG Min, LI Gang. Semantic Segmentation of Multi-Scale Fusion Aerial Image Based on Attention Mechanism[J]. Journal of Graphics, 2018, 39(6): 1069. DOI: 10.11996/JG.j.2095-302X.2018061069

Authors:	ZHENG Guping, WANG Min, LI Gang

Affiliation:	School of Computer and Control Engineering, North China Electric Power University, Baoding, Hebei 071003, China

Abstract:	In aerial images, the scales of different objects in the same scene differ significantly, so single-scale segmentation often fails to achieve the best classification results. To solve this problem, we propose a multi-scale fusion model based on an attention mechanism. First, multi-scale features of the aerial image are extracted using dilated convolutions with different sampling rates. Then, an attention mechanism is applied in the multi-scale fusion stage, so that the model can automatically focus on the appropriate scale and learn to assign weights to every scale and every pixel location. Finally, the weighted sum of the feature maps is upsampled to the original image size, and each pixel of the aerial image is semantically labeled. Experiments demonstrate that, compared with the traditional FCN and DeepLab methods and other aerial image segmentation models, the multi-scale fusion model based on the attention mechanism not only achieves higher segmentation accuracy but also allows the importance of different scales and pixel locations to be analyzed by visualizing the weight map corresponding to each scale's features.
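
The two core steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the attention weights are normalized, so a per-pixel softmax across scales is assumed here, and the function names `dilated_conv1d`, `softmax`, and `attention_fuse` are hypothetical. The dilated convolution is shown in one dimension only to make the role of the sampling rate visible.

```python
import math


def dilated_conv1d(signal, kernel, rate):
    """1-D dilated convolution without padding (cross-correlation form).

    Kernel taps are spaced `rate` samples apart, so a larger rate covers
    a wider context with the same number of weights.
    """
    span = (len(kernel) - 1) * rate
    return [
        sum(k * signal[i + j * rate] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(score_maps, attention_logits):
    """Fuse per-scale maps with per-pixel attention weights.

    score_maps[s][y][x]       : value predicted at scale s for pixel (y, x)
    attention_logits[s][y][x] : unnormalized attention for the same entry
    Assumption: weights are a softmax across scales at each pixel.
    Returns (fused, weights); weights[s] is the weight map for scale s,
    which is the quantity one would visualize.
    """
    n_scales = len(score_maps)
    height, width = len(score_maps[0]), len(score_maps[0][0])
    fused = [[0.0] * width for _ in range(height)]
    weights = [[[0.0] * width for _ in range(height)] for _ in range(n_scales)]
    for y in range(height):
        for x in range(width):
            w = softmax([attention_logits[s][y][x] for s in range(n_scales)])
            for s in range(n_scales):
                weights[s][y][x] = w[s]
                fused[y][x] += w[s] * score_maps[s][y][x]
    return fused, weights
```

Calling `dilated_conv1d(signal, kernel, rate)` with several values of `rate` yields one feature sequence per scale. In a real model each scale would produce a two-dimensional map per class, the attention logits would come from a learned branch, and the fused map would then be upsampled to the input resolution.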

Keywords:	semantic segmentation; multi-scale fusion; attention mechanism; convolutional neural network
This article is indexed in CNKI and other databases.