Image Semantic Segmentation Method Based on Context and Shallow Space Encoder-decoder Network
Cite this article: LUO Hui-Lan, LI Xiao. Image semantic segmentation method based on context and shallow space encoder-decoder network [J]. Acta Automatica Sinica, 2022, 48(7): 1834-1846.
Authors: LUO Hui-Lan, LI Xiao
Affiliation: 1. School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000
Funding: Supported by National Natural Science Foundation of China (61862031, 61462035), Natural Science Foundation of Jiangxi Province (20171BAB202014), and the Program for Cultivating Academic and Technical Leaders of Major Disciplines in Jiangxi Province (20213BCJ22004)
Abstract: Current research on image semantic segmentation mainly revolves around two factors: extracting effective semantic context information and restoring spatial detail information. Among existing segmentation models, some adopt a fully convolutional network structure to obtain effective semantic context information but ignore the spatial detail information in the shallow layers of the network; others adopt a U-shaped structure that exploits the encoder's spatial details through complex network connections but fails to obtain high-quality semantic context features. To address this problem, this paper proposes a new semantic segmentation solution based on a context and shallow space encoder-decoder network. On the encoder side, a two-branch strategy is adopted: the context branch contains a newly designed semantic context module that obtains high-quality semantic context information, while the spatial branch is designed as an inverse U-shaped structure combined with chain-reverse residual modules, enhancing semantic information while preserving spatial detail information. On the decoder side, a refinement module is designed to further refine the fused context and spatial information. The proposed method achieves competitive results on three benchmark datasets: CamVid, SUN RGB-D, and Cityscapes.
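To illustrate the general two-branch encode-then-fuse idea described in the abstract, the sketch below fuses low-resolution context features with high-resolution spatial features into per-pixel class scores. All names, channel sizes, the nearest-neighbour upsampling, and the random weights are illustrative assumptions, not the paper's actual modules:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature maps (channels, height, width); sizes are illustrative.
H, W = 8, 8                                   # full-resolution spatial grid
ctx = rng.random((64, H // 4, W // 4))        # context branch: low-res, semantically rich
spa = rng.random((32, H, W))                  # spatial branch: high-res, detail-preserving

# Upsample context features to the spatial resolution (nearest neighbour
# stands in for the learned or bilinear upsampling a real decoder would use).
ctx_up = ctx.repeat(4, axis=1).repeat(4, axis=2)      # (64, 8, 8)

# Fuse by channel concatenation; a 1x1 convolution (here a plain matrix
# multiply over the channel axis) plays the role of a refinement step.
fused = np.concatenate([ctx_up, spa], axis=0)         # (96, 8, 8)
w = rng.random((19, fused.shape[0])) * 0.01           # 19 classes, e.g. Cityscapes
logits = np.einsum('oc,chw->ohw', w, fused)           # per-pixel class scores

print(logits.shape)  # (19, 8, 8)
```

In a real network the fusion and refinement would be learned layers; the sketch only shows how the two branches' resolutions are reconciled before per-pixel classification.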

Keywords: semantic segmentation    two-branch strategy    semantic context information    shallow spatial detail information    inverse U-shaped structure
Received: 2019-05-15
