
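Deep asymmetric discrete cross-modal hashing method
Cite this article:WANG Xiaoyu, WANG Zhanqing, XIONG Wei. Deep asymmetric discrete cross-modal hashing method[J]. Journal of Computer Applications, 2022, 42(8): 2461-2470.
Authors:WANG Xiaoyu  WANG Zhanqing  XIONG Wei
Affiliation:School of Science, Wuhan University of Technology, Wuhan 430070
Funding:Fundamental Research Funds for the Central Universities (2019ZY232)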
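Abstract:Most deep supervised cross-modal hashing methods learn hash codes in a symmetric way, which prevents them from effectively using the supervision information in large-scale datasets. In addition, the relaxation-based strategies commonly used to handle the discrete constraints on hash codes produce large quantization errors, leading to sub-optimal hash codes. To address these problems, a Deep Asymmetric Discrete Cross-modal Hashing (DADCH) method is proposed. First, an asymmetric learning framework combining deep neural networks and dictionary learning is constructed to learn the hash codes of query instances and database instances, so that the supervision information of the data is mined more effectively and the training time of the model is reduced. Then, a discrete optimization algorithm is used to optimize the hash code matrix column by column, reducing the quantization error of hash code binarization. Meanwhile, to fully mine the semantic information of the data, a label layer is added to the neural network for label prediction, and semantic information embedding is used to embed the discriminative information of different categories into the hash codes through a linear mapping, which makes the hash codes more discriminative. Experimental results show that on the IAPR-TC12, MIRFLICKR-25K and NUS-WIDE datasets, with a hash code length of 64 bit, the mean Average Precision (mAP) of the proposed method on image-to-text retrieval is about 11.6, 5.2 and 14.7 percentage points higher, respectively, than that of Self-Supervised Adversarial Hashing (SSAH), an advanced deep cross-modal retrieval method proposed in recent years.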

Keywords:cross-modal retrieval  deep neural network  asymmetric hashing  semantic information embedding  discrete optimization
Received:2021-06-15
Revised:2021-09-15
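
The results above are reported as mean Average Precision (mAP). The abstract does not specify the evaluation protocol, so the following is only a minimal sketch of one common way to compute mAP for hash-based retrieval: rank the database by Hamming distance to each query code and average the precision at every relevant position. The function names, the ±1 code convention and the boolean relevance matrix are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def hamming_distance(query_code, db_codes):
    """Hamming distance between one ±1 code (c,) and database codes (n, c)."""
    c = db_codes.shape[1]
    # For ±1 codes, the inner product equals c - 2 * (number of differing bits).
    return 0.5 * (c - db_codes @ query_code)


def mean_average_precision(query_codes, db_codes, relevant):
    """mAP over all queries.

    relevant[i, j] is True when database item j is a correct result for
    query i (assumed here to mean that the two share at least one label).
    """
    average_precisions = []
    for code, rel in zip(query_codes, relevant):
        # Stable sort keeps the database order for items at equal distance.
        order = np.argsort(hamming_distance(code, db_codes), kind="stable")
        rel_sorted = rel[order].astype(float)
        n_relevant = rel_sorted.sum()
        if n_relevant == 0:
            continue  # queries with no relevant item are skipped
        ranks = np.arange(1, len(rel_sorted) + 1)
        precision_at_k = np.cumsum(rel_sorted) / ranks
        average_precisions.append((precision_at_k * rel_sorted).sum() / n_relevant)
    return float(np.mean(average_precisions)) if average_precisions else 0.0
```

For image-to-text retrieval, `query_codes` would hold the image hash codes and `db_codes` the text hash codes; swapping the two gives the text-to-image direction.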

Deep asymmetric discrete cross-modal hashing method
Xiaoyu WANG,Zhanqing WANG,Wei XIONG.Deep asymmetric discrete cross-modal hashing method[J].Journal of Computer Applications,2022,42(8):2461-2470.
Authors:Xiaoyu WANG  Zhanqing WANG  Wei XIONG
Affiliation:School of Science,Wuhan University of Technology,Wuhan Hubei 430070,China
Abstract:Most deep supervised cross-modal hashing methods learn hash codes symmetrically, so they cannot effectively exploit the supervision information in large-scale datasets. In addition, to handle the discrete constraints on hash codes, they typically adopt relaxation-based strategies, which introduce large quantization errors and yield sub-optimal hash codes. To address these problems, a Deep Asymmetric Discrete Cross-modal Hashing (DADCH) method was proposed. Firstly, an asymmetric learning framework combining deep neural networks and dictionary learning was constructed to learn the hash codes of query instances and database instances, thereby mining the supervision information of the data more effectively and reducing the training time of the model. Then, a discrete optimization algorithm was used to optimize the hash code matrix column by column, reducing the quantization error of hash code binarization. Meanwhile, to fully mine the semantic information of the data, a label layer was added to the neural network for label prediction, and semantic information embedding was used to embed the discriminative information of different categories into the hash codes through linear mapping, making the hash codes more discriminative. Experimental results show that on the IAPR-TC12, MIRFLICKR-25K and NUS-WIDE datasets, with a hash code length of 64 bit, the mean Average Precision (mAP) of the proposed method on image-to-text retrieval is about 11.6, 5.2 and 14.7 percentage points higher, respectively, than that of Self-Supervised Adversarial Hashing (SSAH), an advanced deep cross-modal retrieval method proposed in recent years.
Keywords:cross-modal retrieval  deep neural network  asymmetric hashing  semantic information embedding  discrete optimization  
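
The abstract states that the hash code matrix is optimized column by column but does not give the objective function. As a hedged illustration of the general idea only, the sketch below applies a column-wise sign update to a generic asymmetric inner-product loss, ||c·S − U·Vᵀ||², where `U` holds real-valued query outputs, `V` holds binary database codes and `S` is a ±1 similarity matrix. This loss, the variable names and the update rule are assumptions chosen for illustration; the DADCH objective contains further terms (dictionary learning, label prediction, semantic embedding) that are not reproduced here.

```python
import numpy as np


def objective(S, U, V):
    """Assumed loss: squared Frobenius norm of c*S - U V^T (c = code length)."""
    c = U.shape[1]
    return float(np.linalg.norm(c * S - U @ V.T) ** 2)


def update_codes_columnwise(S, U, V, sweeps=1):
    """Update binary codes V (n, c) one column (bit) at a time.

    S: (m, n) similarity matrix with entries in {-1, +1}
    U: (m, c) real-valued outputs for the m query instances
    V: (n, c) current database codes with entries in {-1, +1}
    """
    V = V.astype(float).copy()
    c = U.shape[1]
    Q = -2.0 * c * (S.T @ U)  # linear term of the loss, shape (n, c)
    for _ in range(sweeps):
        for k in range(c):
            rest = [j for j in range(c) if j != k]
            # With the other columns fixed, the loss is linear in column k:
            # loss = V[:, k] . coeff + const, so the minimizer is -sign(coeff).
            coeff = 2.0 * V[:, rest] @ (U[:, rest].T @ U[:, k]) + Q[:, k]
            V[:, k] = np.where(coeff > 0, -1.0, 1.0)
    return V
```

Because each column update is the exact minimizer over {-1, +1} for that column, the assumed loss never increases and the codes stay binary throughout, with no relaxation followed by rounding.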