
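基于多尺度融合网络的视频快照压缩感知重建 (Video Snapshot Compressed Sensing Reconstruction Based on Multi-scale Fusion Network)
Cite this article: 陈勋豪, 杨莹, 黄俊茹, 孙玉宝. 基于多尺度融合网络的视频快照压缩感知重建[J]. 计算机与现代化, 2021, 0(12): 58-64. DOI: 10.3969/j.issn.1006-2475.2021.12.010
Authors: 陈勋豪 (CHEN Xun-hao), 杨莹 (YANG Ying), 黄俊茹 (HUANG Jun-ru), 孙玉宝 (SUN Yu-bao)
Affiliation: Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu, China
Funding: National Natural Science Foundation of China (U2001211, 61672292)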
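Abstract: Video snapshot compressive sensing builds on compressed sensing theory: within a single exposure it projects multiple frames onto one two-dimensional snapshot measurement, which enables high-speed imaging. To recover the original video from the two-dimensional snapshot measurement, classical reconstruction algorithms solve an iterative optimization problem based on a sparsity prior of the video, but their reconstruction quality is low and they take too long. Deep learning has attracted wide attention for its strong learning ability, and deep-learning-based methods for video snapshot compressive reconstruction have drawn attention as well. However, existing deep methods lack an effective representation of spatiotemporal features, and reconstruction quality still needs further improvement. This paper proposes a multi-scale fusion reconstruction network (MSF-Net) for video snapshot compressed sensing reconstruction. The network unfolds along two dimensions: horizontally, the convolution depth, and vertically, the resolution. Along the resolution dimension, three-dimensional convolutions extract video features at different scales. Along the horizontal dimension, pseudo-three-dimensional convolution residual modules extract features hierarchically from feature maps at the same resolution scale. Cross-fusion of features across scales is then used to learn the spatiotemporal features of the video. Experimental results show that the proposed method improves both reconstruction quality and reconstruction speed.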

Keywords: video snapshot; compressed sensing; deep learning; multi-scale fusion
Received: 2021-12-24

Video Snapshot Compressed Sensing Reconstruction Based on Multi-scale Fusion Network
CHEN Xun-hao, YANG Ying, HUANG Jun-ru, SUN Yu-bao. Video Snapshot Compressed Sensing Reconstruction Based on Multi-scale Fusion Network[J]. Computer and Modernization, 2021, 0(12): 58-64. DOI: 10.3969/j.issn.1006-2475.2021.12.010
Authors: CHEN Xun-hao, YANG Ying, HUANG Jun-ru, SUN Yu-bao
Abstract: Video snapshot compressed sensing is based on compressed sensing theory: it projects multiple frames onto a single two-dimensional snapshot measurement during one exposure, which enables high-speed imaging. To recover the original video signal from the two-dimensional snapshot measurement, classical reconstruction algorithms rely on iterative optimization with a sparsity prior on the video, but their reconstruction quality is low and they are time-consuming. Deep learning has attracted much attention because of its excellent learning ability, and so have the video snapshot compressive reconstruction methods built on it. However, existing deep methods lack an effective representation of spatiotemporal features, and the reconstruction quality still needs to be improved. This paper proposes a multi-scale fusion reconstruction network (MSF-Net) for video snapshot compressed sensing reconstruction. The network expands along two dimensions: the horizontal convolution depth and the vertical resolution. Along the resolution dimension, three-dimensional convolution extracts video features at different scales. Along the horizontal dimension, a pseudo-three-dimensional convolution residual module hierarchically extracts features from feature maps at the same resolution scale. The network learns the spatiotemporal features of the video through cross-fusion of features at different scales. Experimental results show that the method improves reconstruction quality and reconstruction speed at the same time.
Keywords: video snapshot; compressed sensing; deep learning; multi-scale fusion
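
The abstract describes the measurement step only in words: several frames are projected onto one two-dimensional snapshot within a single exposure. A minimal sketch of that step follows, assuming the mask-modulated summation that is commonly used to model snapshot compressive imaging, Y = Σ_t M_t ⊙ X_t. The abstract does not state the measurement model, so the binary random masks, the toy frame sizes, and the function names `snapshot_measurement` and `random_binary_masks` are all illustrative assumptions and not taken from the paper.

```python
import random


def snapshot_measurement(frames, masks):
    """Collapse T frames into one 2D snapshot.

    Assumed model: snapshot[i][j] = sum over t of masks[t][i][j] * frames[t][i][j].
    """
    if len(frames) != len(masks):
        raise ValueError("need exactly one mask per frame")
    height, width = len(frames[0]), len(frames[0][0])
    snapshot = [[0.0] * width for _ in range(height)]
    for frame, mask in zip(frames, masks):
        for i in range(height):
            for j in range(width):
                # Each frame is modulated pixel-wise by its mask, then accumulated.
                snapshot[i][j] += mask[i][j] * frame[i][j]
    return snapshot


def random_binary_masks(num_frames, height, width, seed=0):
    """Hypothetical per-frame binary masks (an assumption, not from the paper)."""
    rng = random.Random(seed)
    return [
        [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
        for _ in range(num_frames)
    ]


# Toy example: 8 frames of 4x4 pixels, where frame t has constant intensity t.
T, H, W = 8, 4, 4
frames = [[[float(t) for _ in range(W)] for _ in range(H)] for t in range(T)]
masks = random_binary_masks(T, H, W)
y = snapshot_measurement(frames, masks)  # a single H x W measurement
```

Under this assumed model, reconstruction is the inverse problem: T·H·W unknown pixel values must be estimated from only H·W measured values, which is why either a prior (as in the classical iterative methods) or a learned network (as in MSF-Net) is needed.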