20 similar articles found (search time: 15 ms)
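In recent years, with the development of information technology, multimedia data such as images, text, video, and audio have grown rapidly. When processing large amounts of data, the efficiency of some traditional retrieval methods may suffer, and they cannot reach satisfactory accuracy within an acceptable time. In addition, massive data leads to enormous storage consumption. Hash learning was proposed to address these problems. Existing hash learning methods first generate binary hash codes for the data, and in learning...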

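Cross-modal hashing maps heterogeneous multimodal data to compact binary codes that preserve semantic similarity, which greatly facilitates cross-modal retrieval. When exploiting class labels, existing cross-modal hashing methods usually use two different mappings to represent the relationship between hash codes and class labels. To better capture the relationship between hash codes and semantic labels, a supervised discrete cross-modal hashing method based on bidirectional linear regression is proposed. The method uses only a single stable mapping matrix to describe the linear regression relationship between hash codes and their labels, improving the accuracy and stability of cross-modal hash learning. In addition, when learning the modality-specific mappings used to generate hash codes for new samples, the method fully accounts for the feature distributions of the heterogeneous modalities and for the preservation of semantic similarity. Experimental comparisons with existing methods on two public datasets confirm the method's superiority in various cross-modal retrieval scenarios.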

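With the rapid growth of data of different modalities on the Internet, cross-modal retrieval has become a popular research problem. Because it is fast and effective, hashing-based retrieval has become one of the main approaches to large-scale cross-modal retrieval. Among the many deep image-text cross-modal retrieval algorithms, the usual design criterion is to make the deep features of an image as similar as possible to those of the corresponding text. However, such methods blend the background information of the image into feature learning, which degrades retrieval performance...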

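Cross-modal hashing retrieval has attracted wide attention in the cross-modal retrieval field because of its high retrieval efficiency and low storage cost. Most existing cross-modal hashing methods learn hash codes directly from multimodal data and cannot make full use of the semantic information of the data, so they cannot guarantee that the low-dimensional features are distributed consistently across modalities. One key to solving this problem is to measure the similarity between multimodal data accurately. To this end, a hashing method based on adversarial projection learning (adversa...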

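Existing unsupervised cross-modal hashing (UCMH) methods focus mainly on constructing similarity matrices and constraining the structure of the common representation space, and overlook two important problems. First, they extract independent representations for data of different modalities for retrieval, without considering the complementary information across modalities. Second, the structural information of pre-extracted features is not fully suited to the cross-modal retrieval task and may transfer erroneous information. For the first problem, a multimodal representation fusion structure is proposed, which fuses the embedded features of different modalities so as to integrate information from them effectively and improve the expressive power of the hash codes; a cross-modal generation mechanism is also introduced to handle missing modalities in the retrieval data. For the second problem, a dynamic similarity-matrix adjustment strategy is proposed, which uses the learned modality embeddings to refine the similarity matrix adaptively and progressively during training, reducing the bias of the pre-extracted features toward the original dataset, making the matrix better suited to cross-modal retrieval, and effectively avoiding overfitting. In experiments on the widely used Flickr25k and NUS-WIDE datasets, the model built with this method improves mean average precision over the DGCPN model by 1.43%, 1.82%, and 1.52% at three hash code lengths on Flickr25k, and by 3.72%, 3.77%, and 1.99% on NUS-WIDE, confirming the effectiveness of the proposed method.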

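To address the low retrieval accuracy and long training time of cross-modal retrieval algorithms, this paper proposes a cross-modal retrieval algorithm that jointly learns hash features and classifiers (HFCL). A unified hash code describes data of different modalities that share the same semantics. In the training stage, label information is used to learn discriminative hash codes. In the second stage, kernel logistic regression is used to learn the hash function of each modality from the generated discriminative hash codes. In the test stage, given a query sample of any modality, the learned hash function generates its hash features, and semantically related data of the other modality are retrieved from the database. Experiments on three public datasets confirm the effectiveness of HFCL.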

International Journal of Computer Vision - In this paper, we present a new hashing method to learn compact binary codes for highly efficient image retrieval on large-scale datasets. While the...

Owing to their storage efficiency and fast query speed, cross-media hashing methods have attracted much attention for retrieving semantically similar data over heterogeneous datasets. Supervised hashing methods, which use label information to improve the quality of the hashing functions, achieve promising performance. However, existing supervised methods generally focus on coarse semantic information between samples (e.g. similar or dissimilar) and ignore fine semantic information between samples, which may degrade the quality of the hashing functions. Accordingly, in this paper, we propose a supervised hashing method for cross-media retrieval that uses coarse-to-fine semantic similarity to learn a sharing space. Both inter-category and intra-category semantic similarity are effectively preserved in the sharing space. An iterative descent scheme is then proposed to achieve an optimal relaxed solution, and hashing codes are generated by quantizing the relaxed solution. Finally, to further improve the discrimination of the hashing codes, an orthogonal rotation matrix is learned by minimizing the quantization loss while preserving the optimality of the relaxed solution. Extensive experiments on the widely used Wiki and NUS-WIDE datasets demonstrate that the proposed method outperforms the existing methods.

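Deep cross-modal hashing (DCMH) combines the low storage cost and fast retrieval of hashing with the strong feature extraction ability of deep neural networks, and has attracted increasing attention. It can effectively integrate modality feature learning and hash representation learning into an end-to-end framework. However, in the feature extraction of existing DCMH methods, approaches based on global representation alignment cannot accurately locate the semantically meaningful parts of images and text, so retrieval accuracy cannot be guaranteed even though retrieval speed is. To address this problem, a cross-modal hashing network based on a multimodal attention mechanism (HX_MAN) is proposed, which introduces attention into DCMH to extract the key information of different modalities. Deep learning is used to extract global context features of the image and text modalities, a multimodal interaction gate is designed to let the image and text modalities interact at a fine-grained level, and a multimodal attention mechanism is introduced to capture local feature information within each modality more precisely; the attended features are fed into a hashing module to obtain binary hash codes. At retrieval time, data of either modality is fed into the training module to obtain its hash code, the Hamming distance between that hash code and the hash codes in the retrieval database is computed, and results from the other modality are output in order of Hamming distance. Experimental results show that HX_MAN achieves better retrieval performance than existing DCMH methods and, while maintaining retrieval speed, can more accurately...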
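
The retrieval step described in this abstract (compute the Hamming distance between a query code and every stored code, then return results in distance order) can be illustrated with a minimal sketch. It is not the HX_MAN implementation: the codes are stored as plain Python integers, and the names `hamming_distance`, `rank_by_hamming`, and the 8-bit example codes are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hash codes stored as integers."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, database: dict) -> list:
    """Return (item_id, distance) pairs sorted by increasing Hamming distance.

    Ties are broken by item id so the ordering is deterministic.
    """
    scored = [(item_id, hamming_distance(query_code, code))
              for item_id, code in database.items()]
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Hypothetical 8-bit codes: a text query searched against stored image codes.
image_codes = {"img_a": 0b10110010, "img_b": 0b10110011, "img_c": 0b01001101}
ranking = rank_by_hamming(0b10110011, image_codes)
# ranking == [("img_b", 0), ("img_a", 1), ("img_c", 7)]
```

Packing codes into integers keeps the distance computation to one XOR and one bit count per database item, which is the usual reason hashing-based retrieval is fast.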

In the era of big data rich in We Media, single-mode retrieval systems can no longer meet people's demand for information retrieval. This paper proposes a new solution to the problem of feature extraction and unified mapping of different modes: a Cross-Modal Hashing retrieval algorithm based on Deep Residual Network (CMHR-DRN). Model construction is divided into two stages. The first stage extracts features from the different modal data: a Deep Residual Network (DRN) extracts the image features, and TF-IDF combined with a fully connected network extracts the text features; the resulting image and text features serve as the input to the second stage. In the second stage, hash functions for the image and text features are obtained by supervised learning, and the features are mapped to a common binary Hamming space. During mapping, the distance relationships of the original space are kept as unchanged as possible in the common feature space to improve the accuracy of cross-modal retrieval. In training the model, adaptive moment estimation (Adam) is used to compute an adaptive learning rate for each parameter, and stochastic gradient descent (SGD) is used to minimize the loss function. The whole training process is carried out in the Caffe deep learning framework. Experiments show that the proposed CMHR-DRN algorithm achieves better retrieval performance than the other cross-modal algorithms CMFH, CMDN, and CMSSH.

Li Qi, Sun Zhenan, He Ran, Tan Tieniu. International Journal of Computer Vision, 2020, 128(8-9): 2204-2222
International Journal of Computer Vision - With the rapid growth of image and video data on the web, hashing has been extensively studied for image or video search in recent years. Benefiting from...

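The heterogeneity of data from different modalities and the semantic gap between them pose great challenges for cross-modal data analysis. This paper proposes a novel similarity-preserving cross-modal hashing retrieval algorithm. It uses the intra-modal similarity structure so that data that are similar within a modality have similar residuals, ensuring that the learned hash codes preserve the local structure of intra-modal data. It also uses the labels of inter-modal data so that the hash codes of data from different modalities that share the same label...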

In recent years, the development of deep learning has further improved hash retrieval technology. Most existing hashing methods use Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) to process image and text information, respectively. This subjects images and texts to local constraints, and inherent label matching cannot capture fine-grained information, often leading to suboptimal results. Driven by the development of the transformer model, we propose a framework called ViT2CMH, based mainly on the Vision Transformer rather than CNNs or RNNs, to handle deep cross-modal hashing tasks. Specifically, we use a BERT network to extract text features and use the Vision Transformer as the image network of the model. Finally, the features are transformed into hash codes for efficient and fast retrieval. We conduct extensive experiments on Microsoft COCO (MS-COCO) and Flickr30K, comparing against several hashing baselines and image-text matching methods, and show that our method performs better.

Many real-world problems require fast and efficient lexical comparison of large numbers of short text strings. Search personalization is one such domain. We introduce the use of feature bit vectors built with the hashing trick for improving relevance in personalized search and other personalization applications. We present results of several lexical hashing and comparison methods. These methods are applied to a user's historical behavior and are used to predict future behavior. Using a single bit per dimension instead of a floating-point value yields an order of magnitude decrease in data structure size, while preserving or even improving quality. We use real data to simulate a search personalization task. A simple method for combining bit vectors demonstrates an order of magnitude improvement in compute time on the task with only a small decrease in accuracy.
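
As a rough illustration of feature bit vectors built with the hashing trick, the sketch below hashes each token of a short string to one bit position and keeps the result in a single integer. Several choices are assumptions rather than details from the abstract: the 64-bit width, `zlib.crc32` as the hash function, bitwise OR as the way to combine vectors, and Jaccard overlap as the comparison.

```python
import zlib

NUM_BITS = 64  # assumed vector width; the abstract does not state one


def bit_vector(text: str, num_bits: int = NUM_BITS) -> int:
    """Hash each lowercase token of `text` to one bit position (the hashing trick)."""
    vector = 0
    for token in text.lower().split():
        position = zlib.crc32(token.encode("utf-8")) % num_bits
        vector |= 1 << position
    return vector


def combine(vectors) -> int:
    """Merge several bit vectors (e.g. a user's past queries) with bitwise OR."""
    merged = 0
    for v in vectors:
        merged |= v
    return merged


def overlap(a: int, b: int) -> float:
    """Jaccard similarity over set bits; 0.0 when both vectors are empty."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 0.0


# Hypothetical search history, compared against a candidate query.
history = combine(bit_vector(q) for q in ["cheap flights paris", "paris hotels"])
score_related = overlap(history, bit_vector("paris flights"))
```

Because distinct tokens can hash to the same bit, unrelated strings may show a small nonzero overlap; a wider vector lowers that collision rate at the cost of size.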

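To address the long generation time, high memory consumption, and low classification accuracy of the traditional bag-of-visual-words model on large-scale datasets, a bag-of-visual-words model based on Supervised Hashing with Kernels (KSH) is proposed. First, SIFT feature points are extracted from the images to construct a feature-point sample set. Then, a KSH function is learned that maps nearby feature points to the same hash code; each hash code represents a cluster center, and together they form the visual dictionary. Finally, the generated visual dictionary is used to represent each image as a histogram vector, which is applied to image classification. Experimental results on standard datasets show that the visual dictionary generated by the model is highly discriminative and effectively improves the accuracy and efficiency of image classification.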
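
The last two steps of this abstract (each distinct hash code acts as a visual word, and an image becomes a histogram over those words) can be sketched as follows. The sketch assumes the hash codes have already been produced; learning the KSH function and extracting SIFT descriptors are not shown, and the function names and 4-bit codes are hypothetical.

```python
from collections import Counter


def build_vocabulary(training_codes):
    """Map each distinct hash code to a visual-word index, in sorted order."""
    return {code: index for index, code in enumerate(sorted(set(training_codes)))}


def image_histogram(descriptor_codes, vocabulary):
    """Count how often each visual word occurs among one image's descriptor codes.

    Codes absent from the vocabulary are skipped.
    """
    counts = Counter(code for code in descriptor_codes if code in vocabulary)
    return [counts[code] for code in sorted(vocabulary, key=vocabulary.get)]


# Hypothetical 4-bit codes, standing in for the output of a learned KSH function.
vocab = build_vocabulary(["0101", "0101", "1100", "0011"])
hist = image_histogram(["0101", "0011", "0101", "1111"], vocab)
# hist == [1, 2, 0]
```

Assigning a descriptor to a word is a dictionary lookup on its code, with no nearest-centroid search.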

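Constructing a perceptual image hash from block similarity coefficients   Total citations: 1, self-citations: 0, citations by others: 1
A perceptually robust image hash based on block similarity coefficients is proposed. The image is first preprocessed and then divided into overlapping blocks. Under the control of a secret key, a Gaussian low-pass filter is used to generate a pseudo-random reference image block, and the correlation coefficient between each block and the reference block is computed to obtain the image feature sequence. The feature values of each pair of adjacent blocks are then merged to shorten the hash, and the compressed feature sequence is reordered to further improve the security of the image hash. Finally, the normalized feature values are quantized and encoded with the Huffman method to reduce the hash length further. Theoretical analysis and experimental results show that the hash is robust to common image processing operations such as JPEG compression, moderate noise, watermark embedding, image scaling, and Gaussian low-pass filtering, effectively distinguishes different images, has a low collision probability, and can be used for image tampering detection.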
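
A simplified sketch of the block-correlation pipeline in this abstract is given below. It departs from the described method in several labeled ways: preprocessing, Gaussian low-pass filtering of the reference block, and Huffman coding are omitted; adjacent features are merged by averaging, which is an assumption; and the block size, step, and number of quantization levels are hypothetical defaults.

```python
import math
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y) if var_x and var_y else 0.0


def blocks(image, size, step):
    """Yield flattened size x size blocks; step < size makes them overlap."""
    for top in range(0, len(image) - size + 1, step):
        for left in range(0, len(image[0]) - size + 1, step):
            yield [image[top + r][left + c] for r in range(size) for c in range(size)]


def block_hash(image, key, size=4, step=2, levels=8):
    """Key-dependent hash of a grayscale image given as a list of pixel rows."""
    rng = random.Random(key)  # the key controls all randomness
    reference = [rng.random() for _ in range(size * size)]
    features = [correlation(block, reference) for block in blocks(image, size, step)]
    # Merge neighbouring pairs (averaging is an assumption) to halve the length.
    merged = [(features[i] + features[i + 1]) / 2 for i in range(0, len(features) - 1, 2)]
    rng.shuffle(merged)  # key-dependent reordering
    # Map values in [-1, 1] to integer levels 0 .. levels-1.
    return [min(levels - 1, int((value + 1) / 2 * levels)) for value in merged]


# Hypothetical 8 x 8 grayscale image.
image = [[(r * 7 + c * 13) % 256 for c in range(8)] for r in range(8)]
digest = block_hash(image, key=42)
```

The same key reproduces the same reference block and ordering, so two parties sharing the key obtain comparable hashes.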

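An objective visual similarity measure for image hashing   Total citations: 2, self-citations: 0, citations by others: 2
Evaluating image hash performance requires judging whether two images are perceptually similar. To meet this need, a measure of perceptual similarity is proposed. The images are first low-pass filtered and then divided into overlapping blocks. Correlation coefficient detection is used to compute the similarity of each pair of blocks, and the similarity coefficients are normalized. The product of several of the smallest normalized similarity coefficients and the product of several of the largest are then computed. Finally, the ratio of the smallest-coefficient product to the largest-coefficient product is used as the measure of perceptual similarity. Experimental results show that the measure not only reflects changes in visual quality effectively but also distinguishes well whether two images differ in visually important ways, and that it evaluates perceptual similarity better than peak signal-to-noise ratio.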
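
The final ratio in this abstract can be written out as a short sketch that starts from per-block correlation coefficients that are assumed to be already computed. The normalization from [-1, 1] to [0, 1] and the choice of k = 3 are assumptions, since the abstract specifies neither; `similarity_measure` is a hypothetical name.

```python
import math


def similarity_measure(coefficients, k=3):
    """Ratio of the product of the k smallest to the product of the k largest
    normalized coefficients; values near 1 suggest uniformly similar blocks."""
    if len(coefficients) < k:
        raise ValueError("need at least k block coefficients")
    # Assumed normalization: map correlation values from [-1, 1] to [0, 1].
    normalized = sorted((c + 1) / 2 for c in coefficients)
    smallest = math.prod(normalized[:k])
    largest = math.prod(normalized[-k:])
    return smallest / largest if largest else 0.0


# Hypothetical block coefficients: one uniformly similar pair of images,
# and one pair where two blocks differ strongly.
uniform = similarity_measure([0.98, 0.97, 0.99, 0.98, 0.97, 0.99])
tampered = similarity_measure([0.98, 0.97, 0.99, 0.10, 0.05, 0.99])
```

Because the numerator uses only the worst-matching blocks, a few strongly altered regions pull the measure down even when most of the image is unchanged.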

Wang Qianying, Lu Ming. Neural Processing Letters, 2019, 50(2): 1303-1319
Neural Processing Letters - Similarity measurement is crucial for unsupervised learning and semi-supervised learning. Unsupervised methods need a similarity to do clustering. Semi-supervised...

Zhu Lei, Song Jiayu, Yang Zhan, Huang Wenti, Zhang Chengyuan, Yu Weiren. Neural Processing Letters, 2022, 54(4): 2549-2569
Neural Processing Letters - Privacy-preserving cross-modal retrieval is a significant problem in the area of multimedia analysis. As the amount of data is exploding, cross-modal data analysis and...

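Hash learning obtains hash code representations of samples by designing and optimizing an objective function in combination with the data distribution. Among existing hash learning models, linear models are widely used because they are efficient and convenient. To address parameter optimization of linear models in hash learning, a similarity-driven parameter re-optimization method for linear hashing models is proposed. The method re-optimizes model parameters without changing any component of the existing model, improving retrieval performance. It first runs an existing hashing algorithm multiple times to obtain several hash code matrices for the training set, then selects among these matrices according to a similarity-preservation metric and a fusion criterion to obtain an optimized hash matrix for the training set, and finally uses this optimized hash matrix to re-optimize the parameters of the original model, yielding a better hash learning algorithm. Experimental results show that the method significantly improves the performance of different hash learning algorithms.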
