Single Image Depth Estimation Based on Encoder-Decoder Convolution Neural Network
Cite this article: JIA Rui-ming, LIU Li-qiang, LIU Sheng-jie, CUI Jia-li. Single Image Depth Estimation Based on Encoder-Decoder Convolution Neural Network[J]. Journal of Graphics (图学学报), 2019, 40(4): 718. DOI: 10.11996/JG.j.2095-302X.2019040718
Authors: JIA Rui-ming, LIU Li-qiang, LIU Sheng-jie, CUI Jia-li
Affiliation: School of Information Science and Technology, North China University of Technology, Beijing 100144, China (all four authors)
Funding: General Program of the Beijing Municipal Education Commission (KM201510009005); Student Science and Technology Activity Project of North China University of Technology (110051360007)
Abstract: Traditional methods for estimating depth from monocular images suffer from poor robustness and low accuracy. To address this, a single-image depth estimation method based on a convolutional neural network (CNN) is proposed. First, a fused-layers encoder-decoder network is presented, which improves on the end-to-end encoder-decoder structure: a fused-layers block is added to the encoder, and fusing features from multiple layers improves the network's use of multi-scale information. Second, a multi-receptive field res-block is proposed as the main component of the decoder, where it estimates depth from high-level semantic information. Because this block can flexibly adjust the size of the receptive field, it also strengthens the network's ability to extract multi-scale features. The network is validated on the NYUD v2 dataset. Compared with the multi-scale convolutional neural network, the proposed method improves accuracy at δ<1.25 by about 4.4% and reduces the mean relative error by about 8.2%, demonstrating its feasibility for single-image depth estimation.

Keywords: CNN; encoder-decoder; depth estimation; monocular vision
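
The abstract reports results as accuracy at δ<1.25 and mean relative error but does not define either metric. The sketch below assumes the definitions conventionally used for depth benchmarks such as NYUD v2: the threshold accuracy is the fraction of pixels where max(pred/gt, gt/pred) < 1.25, and the mean relative error is the average of |pred - gt| / gt. The function names, the flat-list input format, and the rule of skipping pixels with non-positive ground truth are illustrative assumptions, not details taken from the paper.

```python
def _valid_pairs(pred, gt):
    """Pair predictions with ground truth, keeping only pixels whose
    ground-truth depth is positive (assumed convention for missing depth)."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no pixels with valid ground-truth depth")
    return pairs


def threshold_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold."""
    pairs = _valid_pairs(pred, gt)
    # A non-positive prediction can never satisfy the ratio test, so it
    # counts as a miss instead of being dropped.
    hits = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < threshold)
    return hits / len(pairs)


def mean_relative_error(pred, gt):
    """Average of |pred - gt| / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy depth values in metres, flattened to one list per image.
    predicted = [1.0, 2.2, 3.0, 5.0]
    ground_truth = [1.0, 2.0, 4.0, 4.0]
    print("accuracy at delta < 1.25:", threshold_accuracy(predicted, ground_truth))
    print("mean relative error:", mean_relative_error(predicted, ground_truth))
```

Under these definitions a higher threshold accuracy and a lower mean relative error both indicate better depth estimates, which matches the direction of the improvements quoted in the abstract.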
This article is indexed in Wanfang Data and other databases.