
Stereo matching based on deformable and depth separable convolution
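Cite this article: Gao Huimin, Xu Zhijing. Stereo matching based on deformable and depth separable convolution[J]. Journal of Optoelectronics·Laser (光电子.激光), 2021, 32(11): 1180-1187
Authors: Gao Huimin (高会敏), Xu Zhijing (徐志京)
Affiliation: College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
Funding: Supported by the National Natural Science Foundation of China (61673259) and the Aeronautical Science Foundation of China (201955015001)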
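Abstract: To address the information loss and long running time of traditional convolutional neural networks (CNNs) in stereo matching, a stereo matching algorithm based on deformable and depth separable convolution is proposed. In the feature extraction stage, deformable convolution and deformable convolution kernels are used to build a residual network that learns adaptively and enlarges the effective receptive field, so that it adapts to different object deformations, captures more detailed features, reduces information loss, and improves matching accuracy. In the feature aggregation stage, depth separable convolution is used to build a depth separable aggregation network that convolves over the spatial dimension and the channel dimension separately, which reduces the number of parameters and the computational complexity and keeps matching real-time. In tests on the relevant datasets, the network running time is reduced to 1.60 s, the three-pixel error rates on the KITTI 2015 and KITTI 2012 datasets are 2.84% and 2.79%, respectively, and the endpoint error on the SceneFlow dataset is 1.59%. Compared with other baseline networks, the algorithm reduces the computational cost of the network model while greatly improving accuracy.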
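The parameter saving claimed for depth separable (often called depthwise separable) convolution can be illustrated with a small counting sketch. It assumes a plain 2D layer with a k x k kernel and no bias; the channel counts and kernel size are hypothetical values chosen for illustration and are not taken from the paper, whose aggregation network may be configured differently.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depth_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k   # one k x k filter per input channel (spatial dimension)
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels (channel dimension)
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 32, 32, 3
std = standard_conv_params(c_in, c_out, k)
sep = depth_separable_params(c_in, c_out, k)
print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

Under these assumptions the ratio of separable to standard weights equals 1/c_out + 1/k², so the saving grows with the number of output channels and the kernel size.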

Keywords: deep learning   deformable convolution   depth separable convolution   convolutional neural network (CNN)   stereo matching
Received: 2021-04-06

Stereo matching based on deformable and depth separable convolution
Affiliation: College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
Abstract: To solve the problems of information loss and long running time in stereo matching with traditional convolutional neural networks (CNNs), a stereo matching algorithm based on deformable and depth separable convolution is designed in this paper. In the feature extraction stage, deformable convolution and deformable convolution kernels are used to construct a residual network that learns adaptively and expands the effective receptive field, so that it adapts to different deformations of the object, obtains more detailed features, reduces information loss, and improves matching accuracy. In the feature aggregation stage, depth separable convolution is used to construct a depth separable aggregation network in which the convolution is carried out in the spatial dimension and the channel dimension separately, which reduces the number of parameters and the computational complexity and ensures real-time matching. Tests on the relevant datasets show that the network running time of the algorithm is reduced to 1.60 s, and the three-pixel error rates are 2.84% and 2.79% on the KITTI 2015 and KITTI 2012 datasets, respectively. The endpoint error is 1.59% on the SceneFlow dataset. Compared with other benchmark networks, the computational cost of the network model is reduced and the accuracy of the algorithm is greatly improved.
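The two metrics reported in the abstract can be sketched as follows. This is a simplified illustration, assuming the endpoint error is the mean absolute disparity error and the three-pixel error is the percentage of pixels whose disparity error exceeds 3 pixels; the official KITTI 2015 evaluation additionally applies a relative 5% criterion, which is omitted here. The function names and disparity values are hypothetical and are not data from the paper.

```python
def endpoint_error(est, gt):
    """Mean absolute disparity error over all pixels."""
    errs = [abs(e - g) for e, g in zip(est, gt)]
    return sum(errs) / len(errs)


def three_pixel_error(est, gt, threshold=3.0):
    """Percentage of pixels whose absolute disparity error exceeds the threshold."""
    errs = [abs(e - g) for e, g in zip(est, gt)]
    bad = sum(1 for err in errs if err > threshold)
    return 100.0 * bad / len(errs)


# Made-up disparities for four pixels, for illustration only.
est = [10.0, 20.5, 33.0, 47.0]
gt = [10.5, 20.0, 30.0, 41.0]
epe = endpoint_error(est, gt)
bad3 = three_pixel_error(est, gt)
print(f"endpoint error: {epe}, three-pixel error: {bad3}%")
```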
Keywords: deep learning   deformable convolution   depth separable convolution   convolutional neural network (CNN)   stereo matching
This article is indexed in Wanfang Data (万方数据) and other databases.