RGB-D Scene Parsing Based on Spatial Structured Inference Deep Fusion Networks
Cite this article: WANG Ze-yu, WU Yan-xia, ZHANG Guo-yin, BU Shu-hui. RGB-D Scene Parsing Based on Spatial Structured Inference Deep Fusion Networks[J]. Acta Electronica Sinica, 2018, 46(5): 1253-1258.
Authors: WANG Ze-yu  WU Yan-xia  ZHANG Guo-yin  BU Shu-hui
Affiliation: 1. College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001, China; 2. School of Aeronautics, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
Abstract: To compensate for the limited spatial structured learning ability of convolutional neural networks in RGB-D scene parsing, we propose spatial structured inference deep fusion networks (SSIDFNs), built on deep learning. The embedded structural inference layer combines conditional random fields (CRFs) with a spatial structured inference model, allowing it to learn the three-dimensional spatial distribution of objects and the three-dimensional spatial relationships among them more comprehensively and accurately. On this basis, the feature fusion layer draws on both deep belief networks and improved CRFs to perform deep structured learning from the fused comprehensive semantic information of objects and the semantic correlation information among objects. Experimental results show that the proposed SSIDFNs achieve the best mean accuracy on the standard RGB-D datasets NYUDv2 and SUNRGBD, 53.8% and 54.6% respectively, which should help in intelligent computer vision tasks such as robot task planning and self-driving cars.

Keywords: RGB-D scene parsing  deep learning  convolutional neural networks  conditional random fields  spatial structured inference model  deep belief networks  computer vision  robot task planning  self-driving cars
Received: 2017-05-19
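
The abstract reports results as "mean accuracy". A minimal sketch of how such a score could be computed follows, assuming the term denotes mean per-class pixel accuracy (the average over classes of the fraction of that class's pixels labelled correctly), which is a common convention in scene parsing but is not confirmed by the abstract. The function name `mean_class_accuracy` and the `ignore_label` parameter are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional


def mean_class_accuracy(
    y_true: Iterable[int],
    y_pred: Iterable[int],
    ignore_label: Optional[int] = None,
) -> float:
    """Average the per-class pixel accuracies over classes present in y_true.

    Assumption: this is one common reading of "mean accuracy"; the paper's
    own evaluation protocol may differ.
    """
    correct: Dict[int, int] = defaultdict(int)
    total: Dict[int, int] = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == ignore_label:
            continue  # skip unlabeled pixels
        total[t] += 1
        if t == p:
            correct[t] += 1
    if not total:
        raise ValueError("no labeled pixels to evaluate")
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy example: flattened ground-truth and predicted labels for six pixels.
truth = [0, 0, 0, 0, 1, 1]
pred = [0, 0, 0, 1, 1, 0]
score = mean_class_accuracy(truth, pred)
```

In this toy case class 0 is 3/4 correct and class 1 is 1/2 correct, so the score is 0.625, whereas plain pixel accuracy would be 4/6. Averaging per class keeps frequent classes from dominating the result.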