Adaptive hybrid attention hashing for deep cross-modal retrieval
Citation: Xinghua LIU, Guitao CAO, Qiubin LIN, Wenming CAO. Adaptive hybrid attention hashing for deep cross-modal retrieval[J]. Journal of Computer Applications, 2022, 42(12): 3663-3670.
Authors: Xinghua LIU  Guitao CAO  Qiubin LIN  Wenming CAO
Affiliation: College of Electronics and Information Engineering, Shenzhen University, Shenzhen, Guangdong 518060, China
Guangdong Multimedia Information Service Engineering Technology Research Center (Shenzhen University), Shenzhen, Guangdong 518060, China
Software Engineering Institute, East China Normal University, Shanghai 200062, China
Fund: National Natural Science Foundation of China (61771322)
Abstract: Existing hashing methods cannot distinguish the importance of the feature information in different regions during feature learning, nor can they fully exploit label information to mine the correlation between modalities in depth. To address these problems, an Adaptive Hybrid Attention Hashing (AHAH) model for deep cross-modal retrieval was proposed. Firstly, channel attention and spatial attention were combined through weights obtained by self-learning, strengthening the attention paid to relevant target regions in the feature map while weakening the attention paid to irrelevant regions. Secondly, the modality labels were analyzed statistically and the proposed similarity measurement was used to quantify the similarity as a number between 0 and 1, representing the similarity between modalities more finely. On four commonly used datasets, MIRFLICKR-25K, NUS-WIDE, MSCOCO and IAPR TC-12, with a hash code length of 16 bit, the retrieval mean Average Precision (mAP) of the proposed method is 2.25%, 1.75%, 6.8% and 2.15% higher, respectively, than that of the state-of-the-art method Multi-Label Semantics Preserving Hashing (MLSPH). In addition, ablation experiments and an efficiency analysis also demonstrate the effectiveness of the proposed method.

Keywords: cross-modal retrieval  hashing method  deep neural network  adaptive  hybrid attention
Received: 2021-10-22
Revised: 2021-12-20
