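Urban street view semantic segmentation based on height-driven effective attention and multi-stage feature fusion (PKU Core, CSCD)
Cite this article: ZHAO Di, SUN Peng, CHEN Yibo, XIONG Wei, LIU Yue, LI Lirong. Urban street view semantic segmentation based on height-driven effective attention and multi-stage feature fusion [J]. Journal of Optoelectronics·Laser, 2022(10): 1038-1046.
Authors: ZHAO Di  SUN Peng  CHEN Yibo  XIONG Wei  LIU Yue  LI Lirong
Affiliations (listed in author order):
1. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068
2. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068
3. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068
4. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068; Xiangyang Industrial Research Institute of Hubei University of Technology, Xiangyang, Hubei 441100; Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29201, USA
5. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068
6. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068; Xiangyang Industrial Research Institute of Hubei University of Technology, Xiangyang, Hubei 441100
Funding: National Natural Science Foundation of China (61571182, 61601177), China Scholarship Council (201808420418), Natural Science Foundation of Hubei Province (2019CFB530), Major Special Project of the Department of Science and Technology of Hubei Province (2019ZYYD020), and the Scientific Research Project of the Xiangyang Industrial Research Institute of Hubei University of Technology (XYYJ2022C05)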
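Abstract: When applied to urban street view image segmentation, the DeepLabv3+ network does not make full use of the multi-level feature information available in the network, so its results show shortcomings such as holes in large targets and imprecise segmentation of edge targets. Considering also that urban street view data has an inherent spatial-position structure, this paper introduces a height-driven efficient attention model (HEAM) and a multi-stage feature fusion model (MFFM) into DeepLabv3+. HEAM is embedded in the feature extraction network and in the atrous spatial pyramid pooling (ASPP) structure so that the network attends more to spatial position information along the vertical direction. MFFM fuses feature maps from multiple layers, forming several fusion branches that are connected in turn to the decoder, and uses successive upsampling to improve pixel-level continuity during decoding. The improved network is validated on the CamVid urban street view dataset. Experimental results show that it effectively remedies these shortcomings of DeepLabv3+, makes reasonable use of the positional prior in the dataset, and improves segmentation quality, reaching a mean intersection over union (MIoU) of 68.2% on the CamVid test set.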

Keywords: DeepLabv3+  urban street view  attention mechanism  semantic segmentation  feature fusion
Received: 2022/1/15
Revised: 2022/3/3
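
The abstract states that HEAM steers the network toward vertical (height-wise) position information but does not describe its internals. The following is a minimal sketch of one generic way row-wise attention can be built, not the authors' HEAM: pool each row across width, mix neighbouring rows with a 1D filter, squash with a sigmoid, and rescale the rows. The function name `height_attention`, the fixed `kernel`, and the averaging over channels are all illustrative assumptions; in a trained network the filter would be learned.

```python
import math

def height_attention(feature, kernel=(0.25, 0.5, 0.25)):
    """Rescale each row of a C x H x W feature map by a height-wise weight.

    feature: nested list indexed as [channel][row][col].
    kernel:  hypothetical fixed 1D filter over neighbouring rows
             (an assumption; a real module would learn it).
    """
    channels = len(feature)
    height = len(feature[0])
    width = len(feature[0][0])

    # 1) Pool across width and channels: one scalar descriptor per row.
    row_desc = [
        sum(sum(feature[c][h]) for c in range(channels)) / (channels * width)
        for h in range(height)
    ]

    # 2) 1D convolution along the height axis, zero-padded at the borders.
    half = len(kernel) // 2
    mixed = []
    for h in range(height):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = h + k - half
            if 0 <= idx < height:
                acc += w * row_desc[idx]
        mixed.append(acc)

    # 3) Sigmoid maps each row score to an attention weight in (0, 1).
    weights = [1.0 / (1.0 + math.exp(-m)) for m in mixed]

    # 4) Multiply every value in row h by that row's weight.
    out = [
        [[v * weights[h] for v in feature[c][h]] for h in range(height)]
        for c in range(channels)
    ]
    return out, weights
```

Because the weight depends only on the row index, every pixel at the same image height is scaled identically, which is how a positional prior such as "sky at the top, road at the bottom" could be expressed.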

Urban street view semantic segmentation based on height-driven effective attention and multi-stage feature fusion
ZHAO Di, SUN Peng, CHEN Yibo, XIONG Wei, LIU Yue and LI Lirong. Urban street view semantic segmentation based on height-driven effective attention and multi-stage feature fusion [J]. Journal of Optoelectronics·Laser, 2022(10): 1038-1046.
Authors:ZHAO Di  SUN Peng  CHEN Yibo  XIONG Wei  LIU Yue and LI Lirong
Affiliations (listed in author order):
1. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China
2. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China
3. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China
4. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China; Xiangyang Industrial Research Institute, Hubei University of Technology, Xiangyang, Hubei 441003, China; Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29201, USA
5. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China
6. School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China; Xiangyang Industrial Research Institute, Hubei University of Technology, Xiangyang, Hubei 441003, China
Abstract: The DeepLabv3+ network does not make full use of multi-stage feature information in urban street view image segmentation, which leads to shortcomings such as holes in large targets and imprecise segmentation of edge targets. Considering the inherent spatial-position structure of urban street view data, this paper introduces a height-driven effective attention model (HEAM) and a multi-stage feature fusion model (MFFM) into the DeepLabv3+ network. HEAM is embedded in the feature extraction network and the atrous spatial pyramid pooling (ASPP) structure so that the network pays more attention to spatial position information in the vertical direction. MFFM fuses multi-layer feature maps to form multiple branches in the network, connects them in turn to the decoder, and uses successive upsampling to improve the continuity of pixels during decoding. The improved network is validated on the CamVid urban street view dataset. The results show that the network effectively remedies the deficiencies of DeepLabv3+ and makes proper use of the location prior of the dataset to improve segmentation. The mean intersection over union (MIoU) on the CamVid test set reaches 68.2%.
Keywords: DeepLabv3+  urban street view  attention mechanism  semantic segmentation  feature fusion
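
For readers unfamiliar with the reported metric, the sketch below shows how mean intersection over union (MIoU) is commonly computed from a confusion matrix. It is a generic illustration, not the authors' evaluation script; the function name `mean_iou`, the flat-list input format, and the choice to skip classes that appear in neither prediction nor ground truth are assumptions.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over classes, given flat lists of integer class labels."""
    # confusion[t][p] counts pixels of true class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # e.g. unlabeled / void pixels
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoUs are 1/3, 2/3 and 1/2, so `score` is their average, 0.5. A reported figure such as 68.2% is this same average taken over the dataset's classes on the full test set.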
This article is indexed in VIP (维普) and other databases.