Funding:National Key Research and Development Program of China (2016YFB0401503); Science and Technology Major Project of Guangdong Province (2016B090906001); Science and Technology Major Project of Fujian Province (2014HZ0003-1)
Received:2021-05-17
Revised:2021-10-14

Video playback speed recognition based on deep neural network
Rongyuan CHEN, Jianmin YAO, Qun YAN, Zhixian LIN. Video playback speed recognition based on deep neural network[J]. Journal of Computer Applications, 2022, 42(7): 2043-2051.
Authors:Rongyuan CHEN  Jianmin YAO  Qun YAN  Zhixian LIN
Affiliation:College of Physics and Information Engineering,Fuzhou University,Fuzhou Fujian 350108,China
Jinjiang RichSense Electronic Technology Company Limited,Jinjiang Fujian 362201,China
Abstract:Most current video playback speed recognition algorithms suffer from poor extraction accuracy and very large numbers of model parameters. To address these problems, a dual-branch lightweight video playback speed recognition network was proposed. First, the network was built as a three-dimensional (3D) convolutional network on the basis of the SlowFast dual-branch architecture. Second, to address the large parameter count and high number of floating-point operations of the S3D-G (Separable 3D convolutions network with Gating mechanism) network in video playback speed recognition tasks, the network structure was adjusted to make it lightweight. Finally, an Efficient Channel Attention (ECA) module was introduced into the network structure, using channel attention to identify the range of channels corresponding to the content of interest, which helps improve the accuracy of video feature extraction. In experiments, the proposed network was compared with the S3D-G and SlowFast networks on the Kinetics-400 dataset. The results show that, at similar accuracy, the proposed network reduces both model size and parameter count by about 96% compared with SlowFast, and its floating-point operations are reduced to 5.36 GFLOPs, which significantly increases running speed.
Keywords:deep neural network  video playback speed recognition  dual-branch network  channel attention  lightweight model  
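
The channel attention step mentioned in the abstract can be illustrated with a minimal sketch of an ECA-style block in plain Python. This is not the authors' implementation: the function names are hypothetical, the kernel-size rule (gamma=2, b=1) is the default from the original ECA-Net design and may differ from what this paper uses, and the kernel weights, which would normally be learned, are supplied by hand. Each channel is given as a flat list of values, standing in for a flattened T x H x W feature volume.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size used by ECA-Net (assumed defaults gamma=2, b=1)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1  # force an odd kernel size


def eca_weights(channel_means, kernel):
    """Slide a shared 1D kernel across the channel axis (zero padded), then apply a sigmoid."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for c in range(len(channel_means)):
        s = sum(kernel[j] * padded[c + j] for j in range(k))
        weights.append(1.0 / (1.0 + math.exp(-s)))
    return weights


def eca(features, kernel):
    """features: list of channels, each a flat list of activations.

    Steps: global average pool per channel, 1D convolution across
    channels, sigmoid gate, then rescale every channel by its gate.
    """
    means = [sum(ch) / len(ch) for ch in features]
    gates = eca_weights(means, kernel)
    return [[g * v for v in ch] for g, ch in zip(gates, features)]


if __name__ == "__main__":
    feats = [[2.0, 4.0], [6.0, 8.0], [1.0, 1.0], [0.0, 2.0]]
    kernel = [0.1, 0.5, 0.1]  # hypothetical weights; learned in a real network
    print(eca_kernel_size(64), eca(feats, kernel))
```

The block adds only `k` weights per attention module regardless of the channel count, which is why this style of attention fits a lightweight design.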