Found 19 similar documents (search time: 218 ms)
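To address the ambiguous ranking problem of traditional hashing algorithms in image nearest-neighbor retrieval, ambiguous-order-aware hashing (模糊序列感知哈希) is proposed. It aims to learn hash functions that satisfy a first-position discrimination rule, so that ambiguous orderings can be resolved directly from the information in the binary codes themselves. Nearest-neighbor retrieval therefore needs no extra computation of bitwise weights or weighted Hamming distances, and the order among data points that share the same Hamming distance to the query sample can be determined at low cost. An objective function resembling average precision, the evaluation metric for nearest-neighbor retrieval, is established. It acts as an order-preserving constraint that guarantees data point pairs have the same relative similarity in Hamming space and in Euclidean space, which makes the algorithm well suited to nearest-neighbor retrieval. During training, the binary codes, the Hamming distance, and the decision function are relaxed to continuous forms, so the objective can be optimized directly with batch gradient descent, which reduces training complexity. Comparative experiments on three image datasets show that ambiguous-order-aware hashing achieves better nearest-neighbor retrieval performance.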

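Hash coding saves storage space and improves retrieval efficiency, and has attracted wide attention. A pairwise similarity transferring hash (PSTH) method is proposed for unsupervised cross-modal retrieval. For each modality, PSTH transfers reliable intra-modal pairwise similarities into Hamming space so that the hash codes inherit the pairwise similarities of the original space, and thereby learns the hash codes for each modality's data. In addition, PSTH reconstructs similarity values rather than similarity relations, which allows training to proceed in batches. To narrow the semantic gap between modalities, PSTH also maximizes the inter-modal pairwise similarity. Extensive comparative experiments on three public datasets show that PSTH achieves state-of-the-art results.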

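Deep hashing methods for cross-media retrieval can distribute the hash codes of semantically similar media objects unreasonably in Hamming space. To address this, a new deep hashing cross-media retrieval model is proposed. The model uses the Cauchy distribution in Hamming space to improve the existing deep hashing cross-media association loss, so that hash codes of semantically similar media objects are close together and hash codes of semantically dissimilar media objects are far apart, which improves retrieval quality. An efficient solution method is also given, which obtains an approximately optimal solution by alternating iteration. Experimental results on the Flickr-25k, IAPR TC-12, and MS COCO datasets show that the method effectively improves cross-media retrieval performance.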
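The abstract does not state the exact form of the Cauchy-based loss, so the following is only a sketch under assumptions. It assumes the similarity probability of a pair is modeled as `gamma / (gamma + d)`, where `d` is the Hamming distance, and that the loss is the cross-entropy of that probability. The function names and the `gamma` and `eps` parameters are hypothetical.

```python
import math


def cauchy_similarity_probability(distance, gamma=2.0):
    """Assumed Cauchy-shaped model: probability that a pair is similar.

    Equals 1.0 at distance 0 and decays slowly (heavy tail) as distance grows.
    """
    return gamma / (gamma + distance)


def pair_loss(distance, similar, gamma=2.0, eps=1e-12):
    """Cross-entropy of the assumed similarity probability for one pair."""
    p = cauchy_similarity_probability(distance, gamma)
    p = min(max(p, eps), 1.0 - eps)  # clamp to keep the logarithm finite
    return -math.log(p) if similar else -math.log(1.0 - p)


# Similar pairs are penalized more as their codes drift apart;
# dissimilar pairs are penalized more when their codes are close.
loss_similar_near = pair_loss(1, similar=True)
loss_similar_far = pair_loss(8, similar=True)
loss_dissimilar_near = pair_loss(1, similar=False)
loss_dissimilar_far = pair_loss(8, similar=False)
```

Under this assumed form, minimizing the loss pulls similar pairs toward small Hamming distances and pushes dissimilar pairs toward large ones, which is the behavior the abstract describes.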

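Objective: Hashing-based retrieval maps high-dimensional data from massive data spaces to compact binary hash codes and quickly computes the Hamming distance between any two binary hash codes using bitwise operations such as XOR, which enables efficient similarity-preserving retrieval over large data. Remote sensing images, however, carry rich semantic information in addition to image features. Traditional hashing methods, which extract image features and generate hash codes from them, cannot make effective use of this semantic information, and this limits the retrieval accuracy for remote sensing images. To exploit the semantic information in remote sensing images, a retrieval method based on deep semantic hashing is proposed. Method: Starting from a training set of remote sensing images with multiple semantic labels, two deep convolutional networks with different configuration parameters extract the image features and the semantic features, respectively. Backpropagation is then used to learn the network parameters from the two kinds of extracted features and to generate binary hash codes for the images. The generated binary hash codes effectively preserve the similarity of the original high-dimensional remote sensing images. Results: Experiments were run on a Gaofen-2 (高分二号) and Google Earth remote sensing image dataset, the CIFAR-10 dataset, and the FLICKR-25K dataset, with comparison against several methods. With 64-bit codes, relative to DPSH (deep supervised hashing with pairwise labels), the mAP (mean average precision) improved by about 2%, 6%7%, and 0.6% on the three datasets, respectively. Conclusion: For remote sensing images with one or more semantic labels, the proposed end-to-end deep learning framework uses semantic features to effectively improve retrieval performance.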
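The Hamming-distance computation mentioned in the objective can be sketched as follows. This is a minimal illustration, not the paper's implementation: the thresholding at zero in `binarize`, the 4-bit codes, and the sample feature values are all assumptions made for the example.

```python
def binarize(features):
    """Pack real-valued features into an int, one bit per feature (1 if value > 0)."""
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(code_a, code_b):
    """XOR marks the bit positions that differ; counting the set bits gives the distance."""
    return bin(code_a ^ code_b).count("1")


# Hypothetical 4-dimensional features for a query and three database images.
query = binarize([0.7, -1.2, 0.3, 0.9])         # bits 1011
database = {
    "img_a": binarize([0.5, -0.4, 0.1, 0.8]),   # bits 1011
    "img_b": binarize([-0.5, -0.4, 0.1, 0.8]),  # bits 0011
    "img_c": binarize([-0.5, 0.4, -0.1, 0.8]),  # bits 0101
}

# Rank database items by Hamming distance to the query.
ranked = sorted(database, key=lambda name: hamming_distance(query, database[name]))
```

Because the codes are plain integers, each comparison costs one XOR and one bit count regardless of the original feature dimension.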

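The image feature representations learned by deep convolutional neural networks have a clear hierarchical structure. As depth increases, the learned features become more abstract and more discriminative between classes. Based on this property, a deep Hamming embedding hash coding scheme for image retrieval is proposed. A hidden layer is inserted at the end of the deep convolutional neural network, and the hash code of an image is obtained from the activation of each unit. A Hamming embedding loss, designed around the characteristics of the hash codes themselves, better preserves the similarity between the original data. Experiments on the CIFAR-10 and NUS-WIDE benchmark image datasets show that the method improves image retrieval performance, particularly with short codes.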

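Thanks to its low storage cost, efficient retrieval, and low annotation cost, unsupervised hashing has attracted growing attention in academia and has been widely applied to large-scale database retrieval. Most previous unsupervised methods use the semantic structure of the dataset itself as guidance and require that the semantic information of the data be preserved in the hash space in order to learn the hash codes. How to represent the semantic structure and the hash codes accurately is therefore the key to a successful unsupervised hashing method. This paper proposes a new strategy based on self-supervised learning for unsupervised hash code learning. Specifically, the network is first trained on the target dataset with contrastive learning so that an accurate semantic similarity structure can be built. A new loss function is then proposed that preserves the local semantic similarity structure of the data in the hash space while improving the discriminability of the hash codes. The proposed network framework is end-to-end trainable. Finally, the algorithm is evaluated on two large-scale image retrieval datasets, and extensive experiments verify its effectiveness.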

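Deep hashing methods that use a relax-then-quantize strategy face a difficult discrete optimization problem for the binary codes. To address this, an end-to-end hashing method based on pairwise labels is proposed to learn more discriminative hash codes, and the loss function is optimized to counter the information loss caused by discrete optimization. The concept of anchor hash codes is introduced: anchors in Hamming space serve as supervision for training an AlexNet network, and the binary codes that represent images are fitted to the neighborhood of each anchor. The optimized loss function computes both the classification error and the anchor error, so that the hash function produces highly discriminative hash codes. In experiments on the CIFAR-10 and ImageNet-100 datasets, the retrieval accuracy exceeds that of current mainstream methods.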

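Wang Hailong (汪海龙), Yu Jing (禹晶), Xiao Chuangbai (肖创柏). Acta Automatica Sinica (《自动化学报》), 2021, 47(5): 1077-1086
Hash learning projects high-dimensional data into a low-dimensional binary space while preserving the semantic similarity between data items, which reduces dimensionality and enables fast retrieval. Traditional supervised hash learning algorithms mostly take hand-crafted features as model input and generate hash codes through classification and quantization. Hand-crafted features lack adaptivity and are independent of the quantization process, so retrieval accuracy is low. This paper proposes a deep non-relaxed hashing algorithm based on pairwise similarity. At the output of the convolutional neural network, a differentiable soft-threshold function replaces the usual sign function so that the quasi-hash codes approach -1 or 1 nonlinearly, and the network output is used directly to compute the training error. In the loss function, an $\ell_1$-norm term constrains each bit of the quasi-hash codes to be close to a binary value. After training, a sign function applied outside the network quantizes the outputs into low-dimensional binary hash codes, and data are stored and retrieved in the low-dimensional binary space. Experiments on public datasets show that the algorithm extracts image features effectively, generates binary hash codes accurately, and outperforms other algorithms in accuracy.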
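The train-time relaxation and test-time quantization described above can be sketched roughly as follows. The abstract does not give the soft-threshold function, so `tanh` with a hypothetical steepness `beta` is swapped in here as a stand-in, and the penalty `sum(| |h| - 1 |)` is an assumed form of the $\ell_1$ constraint.

```python
import math


def soft_code(outputs, beta=2.0):
    """Differentiable stand-in for sign(): tanh pushes each value toward -1 or 1."""
    return [math.tanh(beta * value) for value in outputs]


def quantization_penalty(quasi_code):
    """Assumed l1-style penalty: zero only when every entry is exactly -1 or 1."""
    return sum(abs(abs(h) - 1.0) for h in quasi_code)


def sign_code(quasi_code):
    """Applied outside the network after training to obtain the binary code."""
    return [1 if h >= 0 else -1 for h in quasi_code]


network_outputs = [1.8, -0.1, -2.5, 0.4]    # hypothetical last-layer activations
relaxed = soft_code(network_outputs)         # used to compute the training loss
penalty = quantization_penalty(relaxed)      # added to the loss during training
binary = sign_code(relaxed)                  # used for storage and retrieval
```

Entries near zero, such as the second activation, contribute most of the penalty, which is what drives the quasi-hash codes toward binary values during training.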

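Gu Linglan (古凌岚), Peng Limin (彭利民). Computer Science (《计算机科学》), 2016, 43(12): 213-217
Traditional similarity measures based on Euclidean distance cannot fully reflect the distribution of data with complex structure. To address this, a clustering algorithm based on relative density and k-nearest neighbors on a manifold is proposed. It builds on manifold distance, which captures global consistency, and on the k-nearest-neighbor concept, which reflects local similarity and compactness. The similarity between data objects is measured by k-nearest-neighbor similarity on the manifold, clusters of different densities are found through the relative compactness of the k-nearest neighbors, and a neighbor-pair constraint rule is designed to search for neighbor chains formed by k-nearest-neighbor pairs, which are used to group data objects and identify outliers. The algorithm is compared with standard k-means and with k-means improved by manifold distance. Simulation results on synthetic datasets and UCI datasets show that it handles clustering of data with complex structure effectively and produces better clusters.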

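Objective: Large-scale image retrieval is an active research topic in computer vision. A basic approach is to extract features from every image in the database, define a similarity measure on the features, and perform nearest-neighbor retrieval. The key is to design a nearest-neighbor retrieval algorithm that meets storage and efficiency requirements. To improve the approximation accuracy of image visual features and reduce their storage requirements, a multi-index additive quantization method is proposed. Method: Linear search has high complexity, and real-time retrieval requires the image descriptors to be held in memory, so it cannot meet the needs of large-scale retrieval systems. Given the advantages of non-linear search, this paper explores a non-exhaustive multi-index structure combined with quantization coding. The multi-index structure partitions the original data space into multiple subspaces and assigns the data items of each subspace to different inverted lists. Additive quantization, a compression coding method, then encodes the residual data items in the inverted lists, which further reduces the quantization loss with respect to the original space. Nearest-neighbor retrieval uses a non-exhaustive strategy that searches only a few inverted lists, which greatly reduces retrieval time. The original data need not be stored during retrieval. Only the codeword indices of each data item in the additive quantization codebooks are stored, which greatly reduces memory consumption. Results: The algorithm was tested on three datasets, SIFT, GIST, and MNIST. Recall improved by 4%~15% over algorithms from recent years, average precision improved by about 12%, and retrieval time was on par with the fastest algorithm. Conclusion: The proposed multi-index additive quantization coding algorithm effectively improves the approximation accuracy and storage requirements of image visual features and raises retrieval precision and recall on large-scale datasets. It targets nearest-neighbor retrieval over features and is applicable to large-scale image data and other multimedia data.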
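The storage idea behind additive quantization (keep only codeword indices, reconstruct a vector as the sum of one codeword per codebook) can be sketched as below. This is a simplified illustration: the greedy residual encoder is a stand-in chosen for brevity, not the encoder used in the paper, and the two tiny codebooks are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_greedy(vector, codebooks):
    """Pick one codeword per codebook, each time fitting the remaining residual."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        best = min(range(len(codebook)),
                   key=lambda k: squared_distance(residual, codebook[k]))
        indices.append(best)
        residual = [r - c for r, c in zip(residual, codebook[best])]
    return indices


def decode(indices, codebooks):
    """Reconstruct the vector as the sum of the selected codewords."""
    dim = len(codebooks[0][0])
    return [sum(codebooks[m][k][d] for m, k in enumerate(indices))
            for d in range(dim)]


# Two hypothetical codebooks with two 2-dimensional codewords each.
codebooks = [
    [[0, 0], [4, 4]],
    [[0, 1], [1, 0]],
]
indices = encode_greedy([5, 4], codebooks)   # only these indices are stored
reconstructed = decode(indices, codebooks)
```

With M codebooks of K codewords each, a vector costs M small integers to store, which is the memory saving the abstract refers to.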

Wang Zhen, Zhang Long-Bo, Sun Fu-Zhen, Wang Lei, Liu Shu-Shu. Multimedia Tools and Applications, 2019, 78(17): 24453-24472

Due to its high query speed and low storage cost, binary hashing has been widely used in approximate nearest neighbor (ANN) search. However, the binary bits are generally treated as equally important, which causes data points with different codes to share the same Hamming distance to the query sample. To resolve this distance measure ambiguity, bitwise weight methods have been proposed. Unfortunately, in most existing methods the bitwise weights and the binary codes are learnt separately in two stages, which limits further performance gains. In this paper, to address these issues, we propose an adaptive mechanism that jointly generates the bitwise weights and the binary codes by preserving different types of similarity relationships. The binary codes are used to obtain the initial retrieval results, which are then re-ranked by the weighted Hamming distance. This ANN search mechanism is termed AR-Rank in this paper. First, the joint mechanism allows the bitwise weights and the binary codes to serve as mutual feedback during training, so they are well adapted to one another when the algorithm converges. Furthermore, the bitwise weights are required to preserve relative similarity, which is consistent with the nature of the ANN search task. Thus, the data points can be accurately re-sorted based on the weighted Hamming distances. Evaluations on three datasets demonstrate that the proposed AR-Rank retrieval system outperforms nine state-of-the-art methods.
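The two-stage search described above (shortlist by plain Hamming distance, then re-rank by weighted Hamming distance) can be sketched as follows. The weight values, the 4-bit codes, and the `shortlist_size` parameter are hypothetical, and the weights are fixed by hand here rather than learned jointly as in the paper.

```python
def hamming(a, b):
    """Number of bit positions where the two codes differ."""
    return sum(x != y for x, y in zip(a, b))


def weighted_hamming(a, b, weights):
    """Sum of the weights of the bit positions where the two codes differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def search(query, database, weights, shortlist_size):
    """Stage 1: shortlist by Hamming distance. Stage 2: re-rank by weighted distance."""
    shortlist = sorted(database,
                       key=lambda key: hamming(query, database[key]))[:shortlist_size]
    return sorted(shortlist,
                  key=lambda key: weighted_hamming(query, database[key], weights))


query = (1, 0, 1, 1)
database = {
    "p1": (0, 0, 1, 1),  # differs from the query in bit 0
    "p2": (1, 0, 1, 0),  # differs from the query in bit 3
    "p3": (0, 1, 0, 0),  # differs from the query in every bit
}
weights = (0.9, 0.5, 0.4, 0.2)  # hypothetical per-bit importance

result = search(query, database, weights, shortlist_size=2)
```

Here `p1` and `p2` tie at Hamming distance 1, and the weights break the tie: a mismatch in the heavily weighted bit 0 costs more than a mismatch in bit 3, so `p2` is ranked first.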


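In graph similarity search, graph edit distance is a widely used measure, and its computational performance largely determines the performance of the search algorithm. Traditional graph edit distance algorithms perform poorly because of a large number of redundant mappings and a large search space, so an improved graph edit distance algorithm is proposed. The algorithm first partitions the vertices of a graph into equivalence classes and uses them to compute mapping codes that identify equivalent mappings. It then defines mapping completeness to update the priority of equivalent mappings and selects a primary mapping for expansion. Next, an efficient heuristic function is designed, with a lower-bound computation based on the mapping codes, to reach the optimal mapping quickly. Finally, the improved graph edit distance algorithm is extended to graph similarity search. Experimental results on different datasets show that the algorithm has better search performance, reducing the search space by up to 49% and improving speed by about 29%.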

Artificial neural network (ANN) training is one of the major challenges in using an ANN-based prediction model. Gradient-based algorithms are the most common training algorithms, but they have several drawbacks. The aim of this paper is to present a method for training ANNs. The strengths of metaheuristics and greedy gradient-based algorithms are combined into a hybrid of an improved opposition-based particle swarm optimization and a back-propagation algorithm with a momentum term. Opposition-based learning and random perturbation help diversify the population during iteration. A time-varying parameter improves the search ability of standard PSO, and a constriction factor guarantees particle convergence. Since several local minima may occur in the weight space, a new cross-validation method is proposed to prevent overfitting. The effectiveness and efficiency of the proposed method are compared with those of several other well-known ANN training algorithms on various benchmark problems.

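Teng Lei (滕磊), Li Yuan (李苑), Li Zhixing (李智星), Hu Feng (胡峰). Journal of Computer Applications (《计算机应用》), 2019, 39(11): 3198-3203
Current algorithms for aligning users across social networks suffer from poor network embedding quality and from negative sampling methods that cannot guarantee the quality of the generated negative examples. To address these problems, a knowledge graph embedding based cross-social-network user alignment (KGEUA) algorithm is proposed. In the embedding stage, a subset of known seed anchor user pairs is used to expand the positive examples, the Near_K negative sampling method is proposed to generate negative examples, and a knowledge graph embedding method then embeds the two social networks into a unified low-dimensional vector space. In the alignment stage, the existing user similarity measure is improved by combining the proposed structural similarity with the traditional cosine similarity, and a greedy matching method based on an adaptive threshold is proposed to align users. Newly aligned user pairs are then added to the training set to keep refining the vector space. Experimental results show that the algorithm reaches a hits@30 value of 67.7% on the Twitter-Foursquare dataset, 3.3 to 34.8 percentage points higher than the best existing user alignment algorithms, a significant improvement in alignment quality.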

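The large search space of big data recommender systems leads to long response times. To balance time efficiency against recommendation quality, a big data recommender system based on link prediction with a gravitational search variant (重引力搜索) and rating propagation is proposed. A relative similarity index measures the similarity between users, and a generalized Meta Path model builds the similarity graph. Community information is introduced to improve the accuracy of local link prediction: optimized subgraphs are extracted from strong communities for local link prediction, and the subgraphs are optimized by the gravitational search variant, which shrinks the search space. A network propagation strategy based on an epidemic model is designed to discover hidden patterns from existing ones. Experimental results on public datasets show that the algorithm effectively improves the accuracy and coverage of the recommender system while keeping the response time within an acceptable range.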

Document similarity search aims to find documents similar to a given query document and return a ranked list of them to users. It is widely used in text and web systems such as digital libraries and search engines. Traditional retrieval models, including the Okapi BM25 model and the SMART vector space model with length normalization, can handle this problem to some extent by treating the query document as a long query. In practice, the Cosine measure is considered the best model for document similarity search because it measures the similarity between two documents well. In this paper, the quantitative performance of these models is compared experimentally. Because the Cosine measure cannot reflect the structural similarity between documents, a new retrieval model based on TextTiling is proposed. The proposed model takes the subtopic structure of documents into account. It first splits the documents into text segments with TextTiling and calculates the similarities for different pairs of text segments in the documents. The overall similarity between the documents is then obtained by combining the segment-pair similarities with an optimal matching method. Experimental results show that: 1) the popular retrieval models (the Okapi BM25 model and the SMART vector space model with length normalization) do not perform well for document similarity search; 2) the proposed model based on TextTiling is effective and outperforms the other models, including the Cosine measure; 3) the methods chosen for the three components of the proposed model are appropriate.
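The segment-level comparison can be sketched as follows, under several assumptions: the documents are already split into segments (TextTiling itself is not implemented here), segment similarity is cosine over raw term counts, the optimal matching is found by brute force (practical only for a handful of segments), and the matched similarities are averaged over the longer document. None of these choices is confirmed by the abstract.

```python
import math
from collections import Counter
from itertools import permutations


def cosine(text_a, text_b):
    """Cosine similarity between term-count vectors of two texts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(count * b[term] for term, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def optimal_matching_similarity(segments_a, segments_b):
    """Best one-to-one pairing of segments, scored by total cosine similarity."""
    shorter, longer = sorted((segments_a, segments_b), key=len)
    if not shorter:
        return 0.0
    best_total = max(
        sum(cosine(segment, longer[j]) for segment, j in zip(shorter, assignment))
        for assignment in permutations(range(len(longer)), len(shorter))
    )
    return best_total / len(longer)  # unmatched segments contribute zero


# Hypothetical pre-segmented documents covering the same subtopics in a different order.
doc_a = ["hashing maps images to binary codes", "retrieval uses hamming distance"]
doc_b = ["retrieval uses hamming distance", "hashing maps images to binary codes"]
score = optimal_matching_similarity(doc_a, doc_b)
```

A polynomial-time assignment solver would replace the brute-force search for documents with many segments.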

The search space of artificial neural networks (ANNs) is high dimensional and multimodal and is usually polluted by noise and missing data, so weight training is a complex continuous optimization problem. This paper applies a recently invented metaheuristic optimization algorithm, the bird mating optimizer (BMO), to training feed-forward ANNs. BMO is a population-based search method that imitates the mating strategies of bird species to design optimum search techniques. To study the usefulness of the proposed algorithm, BMO is applied to ANN weight training for three real-world classification problems, namely Iris flower, Wisconsin breast cancer, and Pima Indian diabetes. The performance of BMO is compared with that of other classifiers. Simulation results indicate the superior capability of BMO in ANN weight training. BMO is also applied to modeling a fuel cell system, which has been described as an open and demanding problem in electrical engineering. The promising results verify the potential of the BMO algorithm.

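Dimensionality Reduction and Similarity Search in Large Time Series Databases   Total citations: 4, self-citations: 0, citations by others: 4
Li Aiguo (李爱国), Qin Zheng (覃征). Chinese Journal of Computers (《计算机学报》), 2005, 28(9): 1467-1475
A systematic method for similarity queries over time series databases, based on piecewise polynomial representation (PPR), is proposed. PPR is a class of orthogonal transforms based on linear polynomial regression. Indexing time series data with the PPR transform is theoretically free of false dismissals. The paper analyzes the computational complexity of PPR and the lower bound of the query threshold, and proposes a quantitative measure of the query efficiency of time series similarity query algorithms. Comparative experiments against similarity query algorithms based on the discrete Fourier transform (DFT) and the discrete wavelet transform (DWT) show that the proposed algorithm achieves high query efficiency with a low-dimensional index structure.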
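The dimensionality-reduction idea can be sketched with an ordinary least-squares line fitted to each segment. This is an illustration only: the equal-length segmentation, the `(slope, intercept)` representation, and the sample series are assumptions, and the sketch does not reproduce the orthogonal-transform formulation or the indexing scheme of the paper.

```python
def fit_line(values):
    """Least-squares line y = slope * t + intercept over t = 0, 1, ..., n - 1."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in range(n))
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    slope = cov / var_t if var_t else 0.0  # a single-point segment has no slope
    return slope, mean_y - slope * mean_t


def piecewise_linear(series, segment_length):
    """Represent each fixed-length segment by its fitted (slope, intercept) pair."""
    return [fit_line(series[start:start + segment_length])
            for start in range(0, len(series), segment_length)]


series = [0, 1, 2, 3, 10, 8, 6, 4]            # hypothetical 8-point time series
coefficients = piecewise_linear(series, 4)    # 8 values reduced to 2 line fits
```

An index would then be built over the coefficient pairs instead of the raw series, so longer segments give a lower-dimensional index at the cost of a coarser approximation.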

Discriminant information (DI) plays a critical role in face recognition. In this paper, we propose a second-order discriminant tensor subspace analysis (DTSA) algorithm to extract discriminant features from the intrinsic manifold structure of tensor data. DTSA combines the advantages of previous methods: methods with DI, tensor methods that preserve the spatial structure of the original image matrices, and manifold methods that preserve the local structure of the sample distribution. DTSA defines two similarity matrices, a within-class similarity matrix and a between-class similarity matrix. The within-class similarity matrix is determined by the distances between point pairs in the same class, while the between-class similarity matrix is determined by the distances between the means of each pair of classes. Using these two matrices, the proposed method preserves the local structure of the samples and fits the manifold structure of facial images in high-dimensional space better than other methods. Moreover, compared with 2D methods, the tensor-based method employs two-sided transformations rather than a single-sided one and yields a higher compression ratio. As a tensor method, DTSA uses an iterative procedure to compute the optimal solution for the two transformation matrices. We also analyze DTSA's connections to 2D-DLPP and TSA theoretically. Experiments on the ORL, Yale, and YaleB facial databases show the effectiveness of the proposed method.
