An image semantic segmentation method effectively fusing multi-scale features
Citation: XU Guangyu, TANG Weijian. An image semantic segmentation method effectively fusing multi-scale features [J]. Journal of Optoelectronics·Laser, 2022, 33(3): 264-271.
Authors: XU Guangyu, TANG Weijian
Affiliation: School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001, China
Funding: National Natural Science Foundation of China (61471004); Doctoral Special Fund of Anhui University of Science and Technology (ZX942)
Received: 7 June 2021
Abstract: Convolutional neural networks show strong feature-learning ability in high-level computer vision tasks and have achieved remarkable results in image semantic segmentation. However, exploiting multi-scale feature information effectively remains difficult. This paper proposes an image semantic segmentation method that effectively fuses multi-scale features. The method consists of four basic modules: a feature fusion module (FFM), a spatial information module (SIM), a global pooling module (GPM) and a boundary refinement module (BRM). The FFM adopts an attention mechanism and a residual structure to improve the efficiency of multi-scale feature fusion. The SIM consists of convolution and average pooling operations and supplies additional spatial detail that helps locate object edges. The GPM extracts global image information, which significantly improves model performance. The BRM, built around a residual structure, refines the boundaries of the feature maps. The four modules are added to a fully convolutional network (FCN) so that multi-scale feature information is exploited effectively. Experimental results on the PASCAL VOC 2012 dataset show that the mean intersection over union (mIoU) of the proposed method is 8.7% higher than that of the FCN, and comparisons with other methods in the same framework also verify its effectiveness.
Keywords: convolutional neural network; image semantic segmentation; multi-scale feature; feature fusion; attention mechanism
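The abstract names the four modules but does not spell out their layer configurations here. The sketch below is a minimal, hypothetical PyTorch rendering of how blocks matching those descriptions are commonly built (an attention-plus-residual fusion block, a convolution/average-pooling spatial branch, a global-pooling context block and a residual boundary-refinement block); every layer choice, channel count and class name is an illustrative assumption, not the authors' implementation.

# Hypothetical sketch only: module names follow the abstract, but all layer
# choices and shapes are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFM(nn.Module):
    """Feature fusion module: fuses two scales with channel attention plus a residual path."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(2 * channels, channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.attn = nn.Sequential(              # channel-attention weights in [0, 1]
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid())

    def forward(self, low, high):               # both inputs assumed to have `channels` channels
        high = F.interpolate(high, size=low.shape[2:], mode="bilinear", align_corners=False)
        fused = F.relu(self.bn(self.conv(torch.cat([low, high], dim=1))))
        return fused + fused * self.attn(fused)  # residual, attention-weighted fusion

class SIM(nn.Module):
    """Spatial information module: convolution + average pooling for spatial detail."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1)
        self.pool = nn.AvgPool2d(3, stride=2, padding=1)

    def forward(self, x):
        return F.relu(self.pool(self.conv(x)))

class GPM(nn.Module):
    """Global pooling module: adds a global context descriptor back onto the feature map."""
    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        g = self.fc(F.adaptive_avg_pool2d(x, 1))  # N x C x 1 x 1 global descriptor
        return x + g                               # broadcast-add global context

class BRM(nn.Module):
    """Boundary refinement module: small residual block that sharpens boundaries."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)                    # residual boundary refinement

In the paper the modules are attached to an FCN backbone; in a sketch like this they would sit on backbone feature maps of matching channel width, e.g. BRM(256)(GPM(256)(FFM(256)(low, high))).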

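For reference, the reported 8.7% gain is measured in mean intersection over union (mIoU), the standard PASCAL VOC segmentation metric. Below is a minimal NumPy sketch of that metric; it is not the authors' evaluation code, and the function name and interface are assumed for illustration.

import numpy as np

def mean_iou(pred, target, num_classes):
    """pred, target: integer label maps of identical shape with values in [0, num_classes)."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, t in zip(pred.ravel(), target.ravel()):
        conf[t, p] += 1                                   # rows: ground truth, cols: prediction
    inter = np.diag(conf).astype(np.float64)              # true positives per class
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter   # TP + FP + FN per class
    ious = inter / np.maximum(union, 1)                   # guard against empty classes
    return ious[union > 0].mean()                         # average IoU over classes present

For PASCAL VOC 2012, mean_iou(pred, target, num_classes=21) would average over the 20 object classes plus background.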