Depth Estimation Method for Road Scenes Based on a Pyramid Pooling Network

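Cite as:	ZHOU Wujie, PAN Ting, GU Pengli, ZHAI Zhinian. Depth Estimation Method for Road Scenes Based on a Pyramid Pooling Network[J]. Journal of Electronics & Information Technology, 2019, 41(10): 2509-2515.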

Authors:	ZHOU Wujie, PAN Ting, GU Pengli, ZHAI Zhinian

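Affiliations:	School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023; College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027; School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023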

Funding:	National Natural Science Foundation of China; Natural Science Foundation of Zhejiang Province

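Abstract:	To address the insufficient prediction accuracy when estimating depth from monocular images, this paper proposes a road scene depth estimation method based on a pyramid pooling network. The method uses a combination of four residual network blocks to extract features from road scene images and then gradually restores the feature maps to the original image size through upsampling; the multiple residual blocks increase the depth of the network model. Because information at different scales is diverse during upsampling, feature maps of each size from the feature extraction stage are fused with the same-size feature maps from the upsampling stage, improving depth estimation accuracy. In addition, a pyramid pooling network block performs scene parsing on the high-level features extracted by the four residual blocks. Finally, the feature map output by the pyramid pooling block is restored to the original image size and fed to the prediction layer together with the output of the upsampling module. Experiments on the KITTI dataset show that the proposed method outperforms existing estimation methods.
Keywords:	monocular vision; depth estimation; neural network; pyramid pooling network
Received:	2018-10-12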
Depth Estimation of Monocular Road Images Based on Pyramid Scene Analysis Network

Wujie ZHOU, Ting PAN, Pengli GU, Zhinian ZHAI. Depth Estimation of Monocular Road Images Based on Pyramid Scene Analysis Network[J]. Journal of Electronics & Information Technology, 2019, 41(10): 2509-2515.

Authors:	Wujie ZHOU, Ting PAN, Pengli GU, Zhinian ZHAI

Affiliation:	1. School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China; 2. College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China

Abstract:	To address the insufficient accuracy of depth estimation from monocular images, a road scene depth estimation method based on a pyramid pooling network is proposed. First, a combination of four residual network blocks extracts features from the road scene image, and upsampling then gradually restores the feature maps to the original image size; adding multiple residual blocks increases the depth of the network model. Considering the diversity of information at different scales during upsampling, the feature maps of each size produced during feature extraction are fused with the feature maps of the same size produced during upsampling, which improves the accuracy of depth estimation. In addition, a pyramid pooling network block performs scene parsing on the high-level features extracted by the four residual network blocks; the feature map output by the pyramid pooling block is then restored to the original image size and fed into the prediction layer together with the output of the upsampling module. Experiments on the KITTI dataset show that the proposed method outperforms existing methods.
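
The pyramid pooling step described in the abstract can be illustrated with a small sketch. The version below is a simplified toy in plain Python, not the authors' implementation: it works on a single-channel feature map stored as nested lists, uses bin sizes (1, 2, 4) and nearest-neighbour upsampling chosen only for illustration, and omits the convolutions a real network would apply to each pooled map. All function names are hypothetical.

```python
# Toy pyramid pooling on a single-channel feature map (nested lists).
# Bin sizes and nearest-neighbour upsampling are illustrative assumptions,
# not details taken from the paper.

def _bounds(i, size, bins):
    """Start (floor) and end (ceiling) index of bin i along one axis."""
    start = (i * size) // bins
    end = ((i + 1) * size + bins - 1) // bins  # ceiling division
    return start, end

def adaptive_avg_pool(fmap, bins):
    """Average-pool an H x W map into a bins x bins grid."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        r0, r1 = _bounds(i, h, bins)
        row = []
        for j in range(bins):
            c0, c1 = _bounds(j, w, bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled

def upsample_nearest(grid, h, w):
    """Resize a small grid back to H x W by nearest-neighbour lookup."""
    gh, gw = len(grid), len(grid[0])
    return [[grid[(r * gh) // h][(c * gw) // w] for c in range(w)]
            for r in range(h)]

def pyramid_pool(fmap, bin_sizes=(1, 2, 4)):
    """Return the input map plus one pooled-and-upsampled map per bin size."""
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(fmap, bins)
        channels.append(upsample_nearest(pooled, h, w))
    return channels

feature_map = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
stacked = pyramid_pool(feature_map)
```

Here `stacked` holds the original map followed by one context map per bin size: the coarsest summarizes the whole scene, while finer bins keep more local detail. In a full network these maps would be concatenated along the channel axis and, as the abstract describes, passed to the prediction layer together with the output of the upsampling module.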

Keywords:	monocular vision; depth estimation; neural network; pyramid pooling network
This article is indexed in Wanfang Data and other databases.