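Depth Estimation Based on Scene Object Attention and Depth Map Fusion
Cite this article: WEN Jing, YANG Jie. Depth Estimation Based on Scene Object Attention and Depth Map Fusion[J]. Computer Engineering, 2023, 49(2): 222-230.
Authors: WEN Jing  YANG Jie
Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
Funding: Basic Research Program of Shanxi Province (201901D211176).
Abstract: Existing monocular depth estimation algorithms mainly recover stereo information from a single image and suffer from blurred details at adjacent depth edges and conspicuously missing objects. A monocular depth estimation algorithm based on a scene object attention mechanism and weighted depth map fusion is proposed. Similarity feature vectors between any two positions of the feature map are computed by feature matrix multiplication, which quickly captures long-range dependencies and enhances the context information used to estimate regions of similar depth, thereby addressing the incomplete depth information of objects in natural scenes. Building on the advantages of multi-scale feature map fusion, a weighted depth map fusion module is designed that assigns different weights to depth maps of multiple visual granularities, each carrying different depth information, and fuses them. The fused depth map contains both depth information and rich scene object information, which effectively resolves the problem of blurred details. Experimental results on the KITTI dataset show that the algorithm achieves an accuracy of 0.879 at σ<1.25 when predicting target images, with absolute relative error, squared relative error, and logarithmic root mean square error of 0.110, 0.765, and 0.185, respectively. The predicted depth maps have more complete scene object contours and more accurate depth information.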

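Keywords: scene object attention  weighted depth map fusion  context information  depth estimation  three-dimensional reconstruction
Received: 2022-03-22
Revised: 2022-05-27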
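
The two mechanisms named in the abstract, pairwise position attention computed by matrix multiplication and weighted fusion of several depth maps, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `position_attention` and `fuse_depth_maps`, the softmax normalization, the residual addition, and the use of one scalar weight per depth map are all assumptions made for illustration, and the depth maps are assumed to have already been resized to a common resolution.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def position_attention(feat):
    """Hypothetical position attention over a (C, H, W) feature map.

    Every position attends to every other position, so long-range
    dependencies are captured with a single matrix product.
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)      # (C, N) with N = H * W
    sim = flat.T @ flat                # (N, N) pairwise dot-product similarity
    attn = softmax(sim, axis=-1)       # each row sums to 1
    # context[:, i] = sum_j attn[i, j] * flat[:, j]
    context = flat @ attn.T            # (C, N)
    out = (flat + context).reshape(c, h, w)  # residual addition (assumption)
    return out, attn


def fuse_depth_maps(depth_maps, logits):
    """Hypothetical weighted fusion of K same-sized (H, W) depth maps.

    `logits` holds one illustrative score per map; softmax turns the
    scores into positive weights that sum to 1.
    """
    weights = softmax(np.asarray(logits, dtype=float))
    stacked = np.stack(depth_maps, axis=0)             # (K, H, W)
    fused = np.tensordot(weights, stacked, axes=1)     # (H, W)
    return fused, weights
```

The similarity matrix has N × N entries for N = H × W positions, so a sketch like this is only practical on downsampled feature maps.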

Depth Estimation Based on Scene Object Attention and Depth Map Fusion
WEN Jing, YANG Jie. Depth Estimation Based on Scene Object Attention and Depth Map Fusion[J]. Computer Engineering, 2023, 49(2): 222-230.
Authors: WEN Jing  YANG Jie
Affiliation: School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
Abstract: Existing monocular depth estimation algorithms mainly obtain stereo information from a single image, which leads to blurred details at adjacent depth edges and conspicuously missing objects. A monocular depth estimation algorithm based on a scene object attention mechanism and weighted depth map fusion is proposed. The similarity feature vector between any two positions of the feature map is calculated by feature matrix multiplication, which rapidly captures long-range dependencies. These dependencies enhance the context information used to estimate regions of similar depth, thus addressing incomplete object depth information in natural scenes. Building on the advantages of multi-scale feature map fusion, a weighted depth map fusion module is designed: depth maps of multiple visual granularities, each carrying different depth information, are assigned different weights and fused. The fused depth map contains both depth information and rich scene object information, effectively resolving the problem of blurred details. Experimental results on the KITTI dataset show that the proposed algorithm achieves an accuracy of 0.879 at σ<1.25 for target image prediction, with absolute relative error, squared relative error, and logarithmic root mean square error of 0.110, 0.765, and 0.185, respectively. The predicted depth maps have more complete scene object contours and more accurate depth information.
Keywords: scene object attention  weighted depth map fusion  context information  depth estimation  three-dimensional reconstruction
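
The four figures reported in the abstract correspond to evaluation metrics that are widely used for depth estimation on KITTI. The sketch below computes them with their commonly used definitions; the abstract does not spell the formulas out, so these definitions are an assumption about what the paper reports. The threshold accuracy written σ<1.25 in the abstract is often denoted δ<1.25 elsewhere. The function name `depth_metrics` and the convention that ground-truth values of 0 mark missing pixels are illustrative choices, not taken from the paper.

```python
import numpy as np


def depth_metrics(pred, gt, threshold=1.25):
    """Compute common depth-estimation metrics (assumed definitions).

    pred, gt: arrays of predicted and ground-truth depth, same shape.
    Pixels with gt <= 0 are treated as having no ground truth and skipped.
    Predictions are assumed to be strictly positive.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]

    # Threshold accuracy: share of pixels with max(pred/gt, gt/pred) < threshold
    ratio = np.maximum(pred / gt, gt / pred)
    acc = float(np.mean(ratio < threshold))

    abs_rel = float(np.mean(np.abs(pred - gt) / gt))    # absolute relative error
    sq_rel = float(np.mean((pred - gt) ** 2 / gt))      # squared relative error
    rmse_log = float(np.sqrt(np.mean((np.log(pred) - np.log(gt)) ** 2)))

    return {"acc": acc, "abs_rel": abs_rel, "sq_rel": sq_rel, "rmse_log": rmse_log}
```

Higher is better for the threshold accuracy, while lower is better for the three error terms.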