Attention fusion network based video super-resolution reconstruction
Citation: BIAN Pengcheng, ZHENG Zhonglong, LI Minglu, HE Yiran, WANG Tianxiang, ZHANG Dawei, CHEN Liyuan. Attention fusion network based video super-resolution reconstruction[J]. Journal of Computer Applications, 2021, 41(4): 1012-1019. DOI: 10.11772/j.issn.1001-9081.2020081292
Authors: BIAN Pengcheng  ZHENG Zhonglong  LI Minglu  HE Yiran  WANG Tianxiang  ZHANG Dawei  CHEN Liyuan
Affiliation: College of Mathematics and Computer Science, Zhejiang Normal University, Jinhua, Zhejiang 321004, China
Foundation items: National Natural Science Foundation of China; Natural Science Foundation of Zhejiang Province
Keywords: super-resolution  attention mechanism  feature fusion  back-projection  video reconstruction
Received: 2020-08-24
Revised: 2020-09-18

Abstract: Video super-resolution methods based on deep learning mainly focus on the inter-frame and intra-frame spatio-temporal relationships in a video, but previous methods suffer from shortcomings in the alignment and fusion of video-frame features, such as inaccurate motion estimation and insufficient feature fusion. To address these problems, a video super-resolution model based on an Attention Fusion Network (AFN) was constructed using the back-projection principle combined with multiple attention mechanisms and fusion strategies. First, at the feature extraction stage, to handle the multiple motions between neighboring frames and the reference frame, the back-projection architecture was used to obtain error feedback on the motion information. Then, a temporal, spatial, and channel attention fusion module was used to perform multi-dimensional feature mining and fusion. Finally, at the reconstruction stage, the obtained high-dimensional features were convolved to reconstruct high-resolution video frames. By learning different weights for features within and between video frames, the correlations between video frames were fully exploited, and an iterative network structure was adopted to refine the extracted features from coarse to fine. Experimental results on two public benchmark datasets show that AFN can effectively process videos with multiple motions and occlusions, achieving significant improvements in quantitative metrics over several mainstream methods. For example, on the 4× reconstruction task, the Peak Signal-to-Noise Ratio (PSNR) of frames reconstructed by AFN is 13.2% higher than that of the Frame-Recurrent Video Super-Resolution network (FRVSR) on the Vid4 dataset and 15.3% higher than that of the Video Super-Resolution network using Dynamic Upsampling Filters (VSR-DUF) on the SPMCS dataset.
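The sequential temporal, channel, and spatial attention gating described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `attention_fuse`, the cosine-similarity temporal weighting, and the mean-pooling gating formulas are all assumptions chosen for brevity; the paper's module is a learned convolutional network.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fuse(frame_feats, ref_idx):
    """Fuse per-frame features of shape (T, C, H, W) into one (C, H, W)
    map by applying temporal, then channel, then spatial attention."""
    T, C, H, W = frame_feats.shape
    ref = frame_feats[ref_idx]  # reference-frame features

    # Temporal attention: weight each frame by cosine similarity to the reference.
    sim = np.array([np.vdot(f, ref) /
                    (np.linalg.norm(f) * np.linalg.norm(ref) + 1e-8)
                    for f in frame_feats])
    t_w = softmax(sim)                               # (T,) frame weights
    fused = np.tensordot(t_w, frame_feats, axes=1)   # weighted sum -> (C, H, W)

    # Channel attention: gate each channel by its global average response.
    c_gate = sigmoid(fused.mean(axis=(1, 2)))        # (C,)
    fused = fused * c_gate[:, None, None]

    # Spatial attention: gate each location by the channel-pooled activation.
    s_gate = sigmoid(fused.mean(axis=0))             # (H, W)
    return fused * s_gate[None, :, :]
```

In the actual model these three gates would be produced by learned convolutions rather than fixed pooling, but the data flow (frame weighting, then per-channel and per-location reweighting of the fused map) is the same.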
Indexed in: Wanfang Data and other databases.