
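Monocular depth estimation network with a light-weight pyramid decoder structure
Cite this article: Jia Ruiming, Li Tong. Monocular depth estimation network with a light-weight pyramid decoder structure[J]. Application Research of Computers, 2021, 38(1): 293-297.
Authors: Jia Ruiming, Li Tong
Affiliation: School of Information Science and Technology, North China University of Technology, Beijing 100144, China
Funding: North China University of Technology Student Science and Technology Activities Funding Project; National Natural Science Foundation of China (General Program)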
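Abstract: To address the large parameter count and computational cost of monocular depth estimation networks, this paper proposes a monocular depth estimation network with a light-weight pyramid decoder structure, which reduces model complexity and computation time while maintaining estimation accuracy. The network is based on an encoder-decoder structure and estimates the depth map of a monocular image in an end-to-end manner. The encoder uses the ResNet50 architecture. For the decoder, a light-weight pyramid decoder module is proposed: it uses depth-wise dilated separable convolutions and group convolutions to enlarge the receptive field while reducing the number of parameters, and it uses a pyramid structure to fuse feature maps from different receptive fields to improve the decoder module's performance. In addition, skip connections are added between decoder modules to share knowledge and improve the network's estimation accuracy. Experiments on the NYUD v2 dataset show that, compared with the structured attention guided network, the proposed network reduces the RMS error by about 11.0% and improves computational efficiency by about 84.6%.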
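
The parameter savings described in the abstract can be illustrated with simple arithmetic. The following is a minimal sketch, not the authors' implementation: the channel counts, the group count of 4, and the dilation rates are hypothetical values chosen for illustration, and the helper names are invented here. It counts the weights (biases omitted) of a standard convolution, a grouped convolution, and a depth-wise separable convolution, and shows how dilation widens the effective kernel without adding weights.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k 2-D convolution split into `groups` groups (no bias)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depth-wise k x k convolution (groups == c_in) followed by a 1 x 1 point-wise one."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)
    pointwise = conv_params(c_in, c_out, 1)
    return depthwise + pointwise


def effective_kernel(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer sizes; the abstract does not state the real ones.
c_in, c_out, k = 256, 256, 3

standard = conv_params(c_in, c_out, k)                  # 589824 weights
grouped = conv_params(c_in, c_out, k, groups=4)         # 147456 weights
separable = depthwise_separable_params(c_in, c_out, k)  # 67840 weights

for name, count in [("standard", standard), ("grouped", grouped), ("separable", separable)]:
    print(f"{name:10s} {count:7d} weights")

# Dilation enlarges the covered area while the weight count stays the same.
for rate in (1, 2, 4):  # hypothetical dilation rates
    print(f"dilation {rate}: 3x3 kernel covers {effective_kernel(3, rate)} pixels per side")
```

Under these assumed sizes the separable layer needs roughly a ninth of the weights of the standard layer, which is the kind of trade-off the decoder module relies on; the actual ratios in the paper depend on its real layer configuration.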

Keywords: monocular depth estimation  convolutional neural network  encoder-decoder structure  light-weight pyramid decoder
Received: 2019/9/28
Revised: 2020/12/12

Monocular depth estimation based on light-weight pyramid decoder convolution neural network
Jia Ruiming and Li Tong. Monocular depth estimation based on light-weight pyramid decoder convolution neural network[J]. Application Research of Computers, 2021, 38(1): 293-297.
Authors: Jia Ruiming and Li Tong
Affiliation: School of Information Science & Technology, North China University of Technology, Beijing 100144, China
Abstract: This paper proposes a light-weight pyramid decoder convolutional neural network (LPDNet) for monocular depth estimation, which reduces the complexity and computation time of the network model while maintaining estimation accuracy. LPDNet uses an encoder-decoder structure to estimate the depth map of a monocular image in an end-to-end manner. The encoder network adopts ResNet50. The main part of the decoder network is the light-weight pyramid decoder (LPD) module, which learns representations from a large receptive field with fewer parameters by using depth-wise dilated separable convolutions and group convolutions. The LPD module fuses feature maps of different receptive fields through a pyramid structure. In addition, to share knowledge better and improve estimation accuracy, deconvolution skip connections are added between adjacent decoder modules. Experiments on the NYUD v2 dataset show that, compared with the structured attention guided network from CVPR 2018, LPDNet reduces the RMS error by about 11.0% and improves computational efficiency by about 84.6%.
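
The RMS figure quoted above can be made concrete with a short sketch. The abstract does not define the metric, so this assumes the usual root-mean-square error between predicted and ground-truth depth, skips pixels whose ground-truth depth is not positive (an assumption about how invalid pixels are handled), and treats "reduced by about 11.0%" as a relative reduction. The function names and the toy depth values are hypothetical and are not results from the paper.

```python
import math


def rms_error(pred, target):
    """Root-mean-square error over pixels with a valid (positive) ground-truth depth."""
    pairs = [(p, t) for p, t in zip(pred, target) if t > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    return math.sqrt(sum((p - t) ** 2 for p, t in pairs) / len(pairs))


def relative_reduction(baseline, new):
    """Fractional drop of `new` relative to `baseline` (0.11 would mean 11%)."""
    return (baseline - new) / baseline


# Toy depth values in metres, flattened to 1-D; the second pixel has no ground truth.
predicted = [1.0, 9.0, 4.0]
ground_truth = [2.0, 0.0, 4.0]

print(f"RMS error: {rms_error(predicted, ground_truth):.4f} m")
print(f"relative reduction: {relative_reduction(0.5, 0.4):.1%}")
```

In practice the same computation would run over every valid pixel of every test image, typically with array operations rather than Python lists.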
Keywords: monocular depth estimation  convolution neural network  encoder-decoder  light-weight pyramid decoder
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).