Affiliation: Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu, China
Funding: National Natural Science Foundation of China (U2001211, 61672292)

Received: 2021-12-24

Video Snapshot Compressed Sensing Reconstruction Based on Multi-scale Fusion Network
CHEN Xun-hao,YANG Ying,HUANG Jun-ru,SUN Yu-bao.Video Snapshot Compressed Sensing Reconstruction Based on Multi-scale Fusion Network[J].Computer and Modernization,2021,0(12):58-64.
Authors:CHEN Xun-hao  YANG Ying  HUANG Jun-ru  SUN Yu-bao
Abstract: Video snapshot compressed sensing, built on compressed sensing theory, projects multiple frames onto a single two-dimensional snapshot measurement during one exposure, thereby achieving high-speed imaging. To recover the original video signal from the two-dimensional snapshot measurement, classical reconstruction algorithms iteratively optimize a solution based on sparsity priors of the video, but their reconstruction quality is low and they are time-consuming. Deep learning has attracted wide attention for its excellent learning ability, and deep-learning-based video snapshot compressive reconstruction methods have been developed accordingly. However, existing deep methods lack an effective representation of spatiotemporal features, and reconstruction quality still needs further improvement. This paper proposes a multi-scale fusion reconstruction network (MSF-Net) for video snapshot compressed sensing reconstruction. The network is organized along two dimensions: horizontal convolution depth and vertical resolution. Along the resolution dimension, three-dimensional convolutions extract video features at different scales; along the horizontal dimension, pseudo-three-dimensional convolution residual modules hierarchically extract feature maps at the same resolution scale, and the spatiotemporal features of the video are learned by cross-fusing features across scales. Experimental results show that the proposed method improves both reconstruction quality and reconstruction speed.
Keywords: video snapshot; compressed sensing; deep learning; multi-scale fusion
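As a hedged illustration of the snapshot measurement model described in the abstract, the forward process modulates each frame with a per-frame binary mask and sums the results into one two-dimensional measurement. All names and sizes below are assumptions for illustration, not details taken from the paper:

```python
import numpy as np

# Illustrative sketch of the video snapshot compressive sensing forward
# model: T high-speed frames X_t are modulated by per-frame binary masks
# M_t and summed into a single 2-D snapshot measurement Y during one
# exposure. Sizes here are assumed for illustration only.

rng = np.random.default_rng(0)
T, H, W = 8, 4, 4                        # frames, height, width
video = rng.random((T, H, W))            # original high-speed frames X_t
masks = rng.integers(0, 2, (T, H, W))    # binary modulation masks M_t

# Y = sum_t M_t * X_t : element-wise modulation, then summation over time
snapshot = (masks * video).sum(axis=0)

print(snapshot.shape)  # (4, 4): one 2-D measurement encodes all 8 frames
```

Reconstruction methods such as MSF-Net then invert this many-to-one mapping, recovering the T frames from the single measurement and the known masks.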
Indexed by Wanfang Data and other databases.
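The pseudo-three-dimensional residual modules mentioned in the abstract follow the common P3D idea of factorizing a full 3×3×3 convolution into a 1×3×3 spatial convolution followed by a 3×1×1 temporal one. A quick parameter count, with an assumed channel width since the abstract does not give MSF-Net's exact layer sizes, shows why the factorized form is cheaper:

```python
# Parameter counts for one convolutional layer with C input and C output
# channels (biases ignored). C = 64 is an assumed width for illustration.
C = 64
full_3d = C * C * 3 * 3 * 3                      # full 3x3x3 kernel bank
p3d = C * C * (1 * 3 * 3) + C * C * (3 * 1 * 1)  # 1x3x3 spatial + 3x1x1 temporal

print(full_3d, p3d)  # 110592 49152: the factorized form needs 2.25x fewer
```

The same factorization also splits spatial and temporal feature extraction into separate steps, which is one common motivation for pseudo-3-D designs beyond the parameter savings.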