
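Real-time visual tracking based on dual attention siamese network
Cite this article:YANG Kang, SONG Huihui, ZHANG Kaihua. Real-time visual tracking based on dual attention siamese network[J]. Journal of Computer Applications, 2019, 39(6): 1652-1656. DOI: 10.11772/j.issn.1001-9081.2018112419
Authors:YANG Kang  SONG Huihui  ZHANG Kaihua
Affiliation:Jiangsu Key Laboratory of Big Data Analysis Technology (Nanjing University of Information Science and Technology), Nanjing 211800, China; Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (Nanjing University of Information Science and Technology), Nanjing 211800, China
Funding:National Natural Science Foundation of China (61872189, 61876088); Natural Science Foundation of Jiangsu Province (BK20170040); Postgraduate Research and Practice Innovation Program of Jiangsu Province (SJCX19_0311).
Abstract:The fully-convolutional siamese network (SiamFC) tracking algorithm is prone to model drift, and therefore to tracking failure, when the tracked target undergoes dramatic appearance changes. To solve this problem, a dual attention siamese network (DASiam) is proposed that adapts the network model without requiring online updating. First, a modified VGG network, which is more expressive and suited to the target tracking task, is used as the backbone. Then, a new dual attention mechanism is added to the middle layer of the network to extract features dynamically; it consists of a channel attention mechanism and a spatial attention mechanism, which transform the channel dimension and the spatial dimension of the feature maps respectively to produce dual attention feature maps. Finally, the feature maps of the two attention mechanisms are fused to further improve the representation ability of the model. Experiments were carried out on three challenging tracking benchmarks: OTB2013, OTB100 and the real-time challenge of the 2017 Visual Object Tracking benchmark (VOT2017). The results show that, running at 40 frame/s, the proposed algorithm exceeds the baseline SiamFC in success rate by 3.5 percentage points on OTB2013 and by 3 percentage points on OTB100, and that it surpasses SiamFC, the 2017 champion, in the VOT2017 real-time challenge, which verifies the effectiveness of the proposed algorithm.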
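
The abstract does not give the formulas for the two attention branches or for their fusion, so the following is only an illustrative sketch under assumed choices: a squeeze-and-excitation style channel gate, a channel-mean spatial gate, and fusion by element-wise addition. The function names, the weight matrices `w1` and `w2`, and the reduction ratio `r` are hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Rescale each channel of a (C, H, W) feature map.

    Assumed design: global average pooling, a two-layer bottleneck,
    then a sigmoid gate per channel.
    """
    squeezed = feat.mean(axis=(1, 2))           # (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)     # (C // r,), ReLU
    gate = sigmoid(w2 @ hidden)                 # (C,), values in (0, 1)
    return feat * gate[:, None, None]


def spatial_attention(feat):
    """Rescale each spatial location of a (C, H, W) feature map.

    Assumed design: average over channels, then a sigmoid gate per pixel.
    """
    saliency = feat.mean(axis=0)                # (H, W)
    mask = sigmoid(saliency)                    # (H, W), values in (0, 1)
    return feat * mask[None, :, :]


def dual_attention(feat, w1, w2):
    """Fuse the two attended maps; addition is an assumption."""
    return channel_attention(feat, w1, w2) + spatial_attention(feat)


# Toy input standing in for a mid-layer backbone feature map.
rng = np.random.default_rng(0)
C, H, W, r = 8, 6, 6, 4
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1    # hypothetical weights
w2 = rng.standard_normal((C, C // r)) * 0.1    # hypothetical weights
out = dual_attention(feat, w1, w2)
print(out.shape)  # (8, 6, 6)
```

Because both gates only rescale the input, the fused map keeps the shape of the original feature map, so under these assumptions it could be passed to the later layers of a siamese tracker without changing them.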

Keywords:convolutional neural network   visual tracking   attention mechanism   siamese network
Received:2018-12-07
Revised:2019-01-10

Real-time visual tracking based on dual attention siamese network
YANG Kang,SONG Huihui,ZHANG Kaihua. Real-time visual tracking based on dual attention siamese network[J]. Journal of Computer Applications, 2019, 39(6): 1652-1656. DOI: 10.11772/j.issn.1001-9081.2018112419
Authors:YANG Kang  SONG Huihui  ZHANG Kaihua
Affiliation:1. Jiangsu Key Laboratory of Big Data Analysis Technology (Nanjing University of Information Science and Technology), Nanjing Jiangsu 211800, China; 2. Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (Nanjing University of Information Science and Technology), Nanjing Jiangsu 211800, China
Abstract:The Fully-Convolutional Siamese network (SiamFC) tracking algorithm is prone to model drift, and hence to tracking failure, when the target undergoes dramatic appearance changes. To address this problem, a new Dual Attention Siamese network (DASiam) was proposed to adapt the network model without online updating. Firstly, a modified Visual Geometry Group (VGG) network, which is more expressive and better suited to the target tracking task, was used as the backbone network. Then, a novel dual attention mechanism was added to the middle layer of the network to extract features dynamically. This mechanism consists of a channel attention mechanism and a spatial attention mechanism, which transform the channel dimension and the spatial dimension of the feature maps respectively to obtain the dual attention feature maps. Finally, the feature representation of the model was further improved by fusing the feature maps of the two attention mechanisms. Experiments were conducted on three challenging tracking benchmarks: OTB2013, OTB100 and the real-time challenge of the 2017 Visual Object Tracking challenge (VOT2017). The experimental results show that, running at 40 frame/s, the proposed algorithm achieves success rates on OTB2013 and OTB100 that are 3.5 and 3 percentage points higher, respectively, than those of the baseline SiamFC, and that it surpasses SiamFC, the 2017 champion, in the VOT2017 real-time challenge, which verifies the effectiveness of the proposed algorithm.
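
The abstract reports success rates without defining them. As a rough sketch, assuming the usual OTB convention (the area under the success plot, that is, the fraction of frames whose bounding-box overlap exceeds a threshold, averaged over thresholds from 0 to 1 in steps of 0.05), the score could be computed as below. The function names and the toy boxes are hypothetical, and this is not the evaluation code used by the authors.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(predicted, ground_truth, num_thresholds=21):
    """Mean success rate over evenly spaced overlap thresholds in [0, 1]."""
    overlaps = [iou(p, g) for p, g in zip(predicted, ground_truth)]
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [
        sum(o > t for o in overlaps) / len(overlaps)
        for t in thresholds
    ]
    return sum(rates) / len(rates)


# Two toy frames: one perfect prediction, one shifted by half the box width.
ground_truth = [(0, 0, 10, 10), (0, 0, 10, 10)]
predicted = [(0, 0, 10, 10), (5, 0, 10, 10)]
score = success_auc(predicted, ground_truth)
print(round(score, 3))
```

Under this convention, a gain of 3.5 percentage points means that the averaged score rises by 0.035 on a 0 to 1 scale, not by 3.5% of the baseline value.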
Keywords:convolutional neural network   visual tracking   attention mechanism   siamese network
This article is indexed in VIP, Wanfang Data and other databases.