
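多头注意力与语义视频标注 (Multi-Head Attention and Semantic Video Captioning)
Cite this article: SHI Kai (石开), HU Yan (胡燕). 多头注意力与语义视频标注[J]. 计算机工程与应用 (Computer Engineering and Applications), 2020, 56(6): 133-139. DOI: 10.3778/j.issn.1002-8331.1811-0306
Authors: SHI Kai (石开), HU Yan (胡燕)
Affiliation: School of Computer (计算机学院), Wuhan University of Technology (武汉理工大学), Wuhan 430070
Funding: Key Program of the Natural Science Foundation of Hubei Province (湖北省自然科学基金重点类项目)
Abstract (translated from the Chinese): In sequence-to-sequence video captioning models, the video information is heavily compressed after encoding, so the decoder cannot make full use of it. To solve this problem, a multi-head attention mechanism and semantic information are introduced into the model. Multi-head attention allows the model to focus on different parts of the encoder-side video information when generating different words. The semantic information is introduced through a semantic detection unit, which generates semantic probability information for the video by multi-label classification and gives the decoder additional guidance; the improved model remains end-to-end. Experimental results show that the improved model achieves a significant gain in captioning quality, and that the improvements adopted clearly raise its captioning ability.

Keywords: video captioning; multi-head attention; semantic information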

Multi-Head Attention and Semantic Video Captioning
SHI Kai, HU Yan. Multi-Head Attention and Semantic Video Captioning[J]. Computer Engineering and Applications, 2020, 56(6): 133-139. DOI: 10.3778/j.issn.1002-8331.1811-0306
Authors: SHI Kai, HU Yan
Affiliation: School of Computer, Wuhan University of Technology, Wuhan 430070, China
Abstract: In sequence-to-sequence video captioning models, the video information is heavily compressed during encoding, so the decoder cannot fully exploit it. To address this problem, a multi-head attention mechanism and semantic information are introduced into the model. Multi-head attention lets the model focus on different parts of the encoded video information when generating different words. The semantic information is produced by a semantic detection unit, which uses multi-label classification to generate semantic probability information for the video and provides additional guidance to the decoder. The modified model is still trained end to end. Experimental results show that the modified model captions videos significantly better, and that the proposed modifications clearly improve captioning ability.
Keywords: video captioning; multi-head attention; semantic information
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).
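
The abstract names two mechanisms without giving their equations: multi-head attention over the encoded video, and a semantic detection unit that performs multi-label classification. The sketch below illustrates both in their generic textbook form and is not the authors' implementation. It assumes standard scaled dot-product attention and one independent sigmoid per semantic tag; all names, dimensions, and the random weights are hypothetical, and the mean-pooled frame feature used as the detector input is an assumption for illustration only.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def multi_head_attention(query, frames, heads):
    """Attend over encoded frames with several independent heads.

    query  : decoder state, length d_model
    frames : encoded frame vectors, each of length d_model
    heads  : one (W_q, W_k, W_v) projection triple per head
    Returns the concatenated context vector and the per-head weights.
    """
    context, all_weights = [], []
    for w_q, w_k, w_v in heads:
        q = matvec(w_q, query)
        keys = [matvec(w_k, f) for f in frames]
        values = [matvec(w_v, f) for f in frames]
        scale = math.sqrt(len(q))
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)  # one weight per frame, summing to 1
        # Weighted sum of this head's value vectors.
        head_out = [sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))]
        context.extend(head_out)
        all_weights.append(weights)
    return context, all_weights


def semantic_probabilities(video_feature, tag_weights, tag_bias):
    """Multi-label classification: an independent sigmoid per tag."""
    logits = [z + b for z, b in zip(matvec(tag_weights, video_feature), tag_bias)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy dimensions, chosen only for illustration.
rng = random.Random(0)
d_model, d_head, num_heads, num_frames, num_tags = 8, 4, 2, 5, 3

frames = [[rng.uniform(-1.0, 1.0) for _ in range(d_model)] for _ in range(num_frames)]
query = [rng.uniform(-1.0, 1.0) for _ in range(d_model)]
heads = [tuple(random_matrix(d_head, d_model, rng) for _ in range(3))
         for _ in range(num_heads)]

context, weights = multi_head_attention(query, frames, heads)

# Assumption: the detector sees the mean of the encoded frames.
mean_frame = [sum(col) / num_frames for col in zip(*frames)]
tag_probs = semantic_probabilities(
    mean_frame, random_matrix(num_tags, d_model, rng), [0.0] * num_tags)
```

Each head produces its own distribution over the frames, which is what lets different heads attend to different parts of the video when a given word is generated. Unlike a softmax classifier, the per-tag sigmoids do not compete, so several semantic tags can be likely at once. Full implementations usually apply a learned output projection to the concatenated context; the abstract does not say how the tag probabilities are fed to the decoder, so the sketch stops at computing them.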