
A Self-Attention-Based Video Summarization Model

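Cite this article:	Li Yiyi, Wang Jilong. A Self-Attention-Based Video Summarization Model [J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(4): 652-659.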

Authors:	Li Yiyi, Wang Jilong

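Affiliations:	Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing 100084; Beijing No.4 High School, Beijing 100034; Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing 100084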

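Funding:	National Development and Reform Commission special project on next-generation Internet technology R&D, industrialization, and large-scale commercial deployment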

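Abstract:	To address the problem of efficiently identifying the representative content of a video, this paper proposes a video summarization algorithm that assigns different importance to different video frames. First, a long short-term memory (LSTM) network models the temporal relationships in the video sequence. Next, a self-attention mechanism models the importance of the different frames and extracts global features. Finally, frames are sampled according to the importance score regressed for each frame, and the model parameters are optimized with a reinforcement learning strategy. In the reinforcement learning formulation, the action is whether or not to select each frame, the state is the current selection over the video, and the reward signal uses diversity and representativeness costs. Video summarization experiments are conducted on two public datasets, SumMe and TVSum, and the F-measure is used to evaluate the accuracy of different video summarization algorithms on both datasets. The experimental results show that the proposed algorithm outperforms the other algorithms.
Keywords:	video summarization; self-attention mechanism; recurrent neural network; reinforcement learning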
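
The scoring and sampling steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the input `features` merely stand in for the LSTM outputs, the attention uses identity query/key/value projections instead of learned ones, the scoring weights `w` and `b` are hypothetical, and Bernoulli sampling is an assumed reading of "sampling by importance score".

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention over frame features.

    Assumption: identity projections (no learned Q/K/V matrices).
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    attended = []
    for query in features:
        # Similarity of this frame to every frame in the video.
        logits = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in features]
        weights = softmax(logits)
        # Each output is a weighted average of all frame features.
        context = [sum(wt * frame[d] for wt, frame in zip(weights, features))
                   for d in range(dim)]
        attended.append(context)
    return attended


def frame_scores(attended, w, b):
    """Regress one importance score in (0, 1) per frame (linear + sigmoid).

    `w` and `b` are hypothetical parameters used only for illustration.
    """
    scores = []
    for x in attended:
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


def sample_actions(scores, rng):
    """Sample a select (1) / skip (0) action per frame, Bernoulli(score)."""
    return [1 if rng.random() < p else 0 for p in scores]
```

For example, `sample_actions(frame_scores(self_attention(feats), w, b), random.Random(0))` yields one 0/1 decision per frame; in a trained model the sampled actions would then be scored by the reward and used for a policy-gradient update.
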
Self-Attention Based Video Summarization

Li Yiyi, Wang Jilong. Self-Attention Based Video Summarization [J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(4): 652-659.

Authors:	Li Yiyi, Wang Jilong

Affiliation:	Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing 100084; Beijing No.4 High School, Beijing 100034

Abstract:	Video summarization aims to identify the most representative content in a video. In this paper, we propose a new video summarization method that assigns different importance to different video frames. Specifically, we use bidirectional LSTMs to capture the temporal information of video frames and then employ a self-attention mechanism to weight the frames differently and extract their global features. Finally, we sample an action for each frame using its regressed importance score and apply a reinforcement learning strategy to optimize the model parameters. Here, an action is defined as selecting or not selecting the current frame, the state is defined as the set of actions taken over the whole video, and the reward is defined as the sum of the representativeness and diversity costs. We conduct experiments on two public video summarization datasets, SumMe and TVSum, and evaluate performance using the F-measure. The experimental results show that the proposed method outperforms state-of-the-art methods.
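
The abstract defines the reward as the sum of representativeness and diversity terms but does not give their formulas. The sketch below uses one formulation that is common in the unsupervised video summarization literature, and it is an assumption rather than the paper's definition: diversity is the mean pairwise cosine dissimilarity among selected frames, and representativeness is the exponential of the negative mean distance from each frame to its nearest selected frame. All function names are hypothetical.

```python
import math


def cosine_dissimilarity(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def diversity_reward(features, selected):
    """Mean pairwise dissimilarity among the selected frames (assumed form)."""
    n = len(selected)
    if n < 2:
        return 0.0
    total = sum(cosine_dissimilarity(features[i], features[j])
                for i in selected for j in selected if i != j)
    return total / (n * (n - 1))


def representativeness_reward(features, selected):
    """exp(-mean distance from each frame to its nearest selected frame)."""
    if not selected:
        return 0.0
    total = sum(min(math.dist(x, features[j]) for j in selected)
                for x in features)
    return math.exp(-total / len(features))


def reward(features, actions):
    """Sum of the two terms for a 0/1 action per frame."""
    selected = [i for i, a in enumerate(actions) if a == 1]
    return (diversity_reward(features, selected)
            + representativeness_reward(features, selected))
```

Under this formulation, a selection that covers every distinct frame with mutually dissimilar picks scores highest, while an empty selection scores zero.
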

Keywords:	video summarization; self-attention mechanism; recurrent neural network; reinforcement learning
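
The F-measure used for evaluation can be sketched as the harmonic mean of precision and recall over frames shared by a predicted summary and a reference summary. This is a minimal illustration on binary per-frame selections; the function name `f_measure` is hypothetical, and any dataset-specific protocol details are not covered.

```python
def f_measure(predicted, ground_truth):
    """F-measure between two 0/1 frame-selection lists of equal length."""
    if len(predicted) != len(ground_truth):
        raise ValueError("selections must cover the same number of frames")
    # Frames chosen by both the prediction and the reference.
    overlap = sum(1 for p, g in zip(predicted, ground_truth)
                  if p == 1 and g == 1)
    if overlap == 0:
        return 0.0
    precision = overlap / sum(predicted)
    recall = overlap / sum(ground_truth)
    return 2.0 * precision * recall / (precision + recall)
```

For instance, a prediction and a reference that each select three frames and share two of them have precision and recall of 2/3, so the F-measure is also 2/3.
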
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
