Multi-Head Attention and Semantic Video Captioning

SHI Kai, HU Yan. Multi-Head Attention and Semantic Video Captioning[J]. Computer Engineering and Applications, 2020, 56(6): 133-139. DOI: 10.3778/j.issn.1002-8331.1811-0306

Authors:	SHI Kai, HU Yan

Affiliation:	School of Computer, Wuhan University of Technology, Wuhan 430070, China

Funding:	Key Program of the Natural Science Foundation of Hubei Province

Abstract:	In sequence-to-sequence video captioning models, the video information is heavily compressed during encoding, so the decoder cannot make full use of it. To address this problem, a multi-head attention mechanism and semantic information are introduced into the model. Multi-head attention lets the model focus on different parts of the encoded video information when generating different words. The semantic information is produced by a semantic detection unit, which uses multi-label classification to generate semantic probabilities for the video and gives the decoder additional guidance. The modified model is still trained end to end. Experimental results show that the modified model captions videos significantly better, and that the proposed modifications clearly improve captioning ability.

Keywords:	video captioning; multi-head attention; semantic information
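
The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no architectural details, so the scaled dot-product form of the attention, the mean-pooled video feature fed to the semantic detector, and every name, shape, and weight below (`multi_head_attention`, `semantic_probabilities`, `w_q`, `w_s`, and so on) are assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_attention(query, enc_out, w_q, w_k, w_v, w_o, num_heads):
    """Attend from one decoder state to all encoded video frames.

    query:   (d_model,)    current decoder state (assumed shape)
    enc_out: (T, d_model)  encoder outputs, one row per frame (assumed shape)
    Returns the context vector (d_model,) and the weights (num_heads, T).
    """
    T, d_model = enc_out.shape
    d_head = d_model // num_heads
    q = (query @ w_q).reshape(num_heads, d_head)       # (H, d_head)
    k = (enc_out @ w_k).reshape(T, num_heads, d_head)  # (T, H, d_head)
    v = (enc_out @ w_v).reshape(T, num_heads, d_head)  # (T, H, d_head)
    # Each head scores every frame independently (scaled dot product).
    scores = np.einsum("hd,thd->ht", q, k) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)                 # each row sums to 1
    heads = np.einsum("ht,thd->hd", weights, v)        # (H, d_head)
    # Concatenate the heads and mix them with the output projection.
    context = heads.reshape(d_model) @ w_o
    return context, weights


def semantic_probabilities(video_feature, w_s, b_s):
    """Multi-label classification: one independent sigmoid per semantic tag."""
    return 1.0 / (1.0 + np.exp(-(video_feature @ w_s + b_s)))


# Toy dimensions and random weights, for illustration only.
rng = np.random.default_rng(0)
T, d_model, num_heads, num_tags = 8, 16, 4, 5
enc_out = rng.normal(size=(T, d_model))
dec_state = rng.normal(size=d_model)
w_q, w_k, w_v, w_o = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4))

context, weights = multi_head_attention(
    dec_state, enc_out, w_q, w_k, w_v, w_o, num_heads
)
tag_probs = semantic_probabilities(
    enc_out.mean(axis=0),  # assumption: mean-pooled frame features
    rng.normal(size=(d_model, num_tags)) * 0.1,
    np.zeros(num_tags),
)
```

Because each head has its own row of `weights`, different heads can concentrate on different frames for the same decoding step, which is the behaviour the abstract attributes to multi-head attention. The sigmoid outputs in `tag_probs` are independent per tag, which is what distinguishes multi-label classification from a single softmax over classes.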
This article is indexed in databases including VIP and Wanfang Data.