Similar documents
 Found 20 similar documents; search took 78 ms
1.
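Objective: Clothing retrieval is a research hotspot in computer vision and natural language processing, with two query modalities: content-based and text-based. Traditional retrieval methods, however, often suffer from low retrieval efficiency, and few studies consider the stylistic similarity of clothing. To address these problems, this paper proposes a clothing style retrieval method based on deep multimodal fusion. Methods: A hierarchical deep hashing retrieval model is proposed. It applies transfer learning to a pre-trained residual network (ResNet) and converts the classification layer into a hash coding layer; hash features are used for coarse retrieval and deep image features for fine retrieval. A text classification semantic retrieval model is also designed: a text classification network based on long short-term memory (LSTM) classifies the query in advance to narrow the search range, and retrieval is then performed on semantic text embeddings extracted with doc2vec. In addition, a similar-style context retrieval model is proposed, which measures clothing style similarity by analogy with word similarity. Finally, a probability-driven method quantifies style similarity, and the result fusion that maximizes this similarity is returned as the final output of the retrieval method. Results: On the Polyvore dataset, compared with the original ResNet model, the hierarchical deep hashing retrieval model improves top-5 mean retrieval precision by 11.6% and is 2.57 s faster per query. Compared with a traditional text classification embedding model, the proposed classification semantic retrieval model improves top-5 precision by 29.96% and is 16.53 s faster per query. Conclusion: The proposed deep multimodal fusion method improves both retrieval precision and retrieval speed, and retrieving similar-style clothing makes the results more diverse.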
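The coarse-to-fine retrieval described in this abstract can be illustrated with a minimal sketch. It assumes binary hash codes compared by Hamming distance and deep features compared by cosine similarity; the function names, the toy gallery, and the `n_coarse`/`top_k` parameters are hypothetical and not taken from the paper.

```python
import math

def hamming(a, b):
    """Number of differing bits between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def coarse_to_fine(query_code, query_feat, gallery, n_coarse=3, top_k=2):
    """gallery: list of (item_id, hash_code, deep_feature) tuples (assumed layout)."""
    # Coarse stage: keep the n_coarse items closest in Hamming distance.
    candidates = sorted(gallery, key=lambda g: hamming(query_code, g[1]))[:n_coarse]
    # Fine stage: re-rank only those candidates by cosine similarity.
    ranked = sorted(candidates, key=lambda g: cosine(query_feat, g[2]), reverse=True)
    return [item_id for item_id, _, _ in ranked[:top_k]]

# Toy gallery with made-up codes and features.
gallery = [
    ("a", [1, 0, 1, 1], [0.9, 0.1, 0.0]),
    ("b", [1, 0, 1, 0], [0.1, 0.9, 0.0]),
    ("c", [0, 1, 0, 0], [1.0, 0.0, 0.0]),
    ("d", [1, 0, 0, 1], [0.8, 0.2, 0.1]),
]
result = coarse_to_fine([1, 0, 1, 1], [1.0, 0.0, 0.0], gallery)
```

In this toy run item "c" has the best cosine similarity but is filtered out by the coarse stage, which shows the trade-off such a scheme makes: the expensive comparison runs on a short candidate list, at the cost of depending on the quality of the hash codes.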

2.
In recent years, with the development of 3D technologies, 3D model retrieval has become a hot topic. The key to 3D model retrieval is extracting robust features for 3D model representation. To improve the effectiveness of 3D model retrieval, this paper proposes a feature extraction model based on convolutional neural networks (CNN). First, we extract a set of 2D images from each 3D model to represent the 3D object. A SIFT detector is used to detect interest points in each 2D image and to extract patches around them that represent the local information of each 3D model. X-means is used to generate the CNN filters. Second, a single CNN layer learns low-level features, which are then given as inputs to multiple recursive neural networks (RNN) in order to compose higher-order features. The RNNs generate the final feature for the 2D image representation. Finally, nearest-neighbor search is used to compute the similarity between different 3D models and so handle the retrieval problem. Extensive comparison experiments were conducted on the popular ETH and MV-RED 3D model datasets. The results demonstrate the superiority of the proposed method.

3.
In this paper, we present a simple and effective topic correlation model (TCM) for cross-modal multimedia retrieval by jointly modeling the text and image components in multimedia documents. In this model, the image component is represented by the bag-of-features model based on local scale-invariant feature transform features, while the text component is described by a topic distribution learned from a latent topic model. Statistical correlations between these two mid-level features are investigated by mapping them into a semantic space. These cross-modality correlations are used to calculate the conditional probabilities of answers in one modality given a query in the other modality. The model is tested on three cross-modal retrieval benchmark problems, including Wikipedia documents in both English and Chinese. Experimental results demonstrate that the new TCM model achieves the best performance compared to recent state-of-the-art cross-modal retrieval models on the given benchmarks.

4.
An image retrieval method based on sparse canonical correlation analysis   Total citations: 1, self-citations: 0, citations by others: 1
庄凌  庄越挺  吴江琴  叶振超  吴飞 《软件学报》2012,23(5):1295-1304
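A key problem in semantic image retrieval is finding the association between low-level image features and semantics. Because text is an effective means of expressing semantics, this paper proposes studying the relationship between the text and image modalities in order to build an effective model that reflects the latent semantic association between them. With this model, a retrieval intent can be expressed in natural language (a text sentence) and the relevant images are retrieved. The model is based on sparse canonical correlation analysis (sparse CCA) and is trained in the following steps: first, a text semantic space is constructed by latent semantic analysis; next, the images corresponding to the text are represented as bags of visual words; finally, the sparse CCA algorithm finds a semantic correlation space that maps between text semantics and visual words. Using a sparse correlation analysis method improves model interpretability and keeps retrieval results stable. Experimental results verify the effectiveness of the sparse CCA method and confirm the feasibility of the proposed semantic image retrieval method.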

5.
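To address the poor retrieval performance of single features in 3D model retrieval, three feature-vector extraction algorithms for 3D models are first proposed: an extended Gaussian sphere feature vector that characterizes the model surface, a Radon-transform spherical distribution feature vector that reflects the model's internal structure, and a view-layered compressed sensing feature vector that represents the model's projection hierarchy. Second, taking the classification information entropy of the query results for sample models as an indicator, and combining it with a supervised learning process, a method for estimating the weighting coefficients of multi-feature fusion is given. Finally, a multi-feature similarity measure between models is designed to complete query-by-example model retrieval. Simulation experiments show that the three proposed feature vectors are well discriminative and that the multi-feature fusion retrieval algorithm clearly improves recall and precision.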
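The idea of deriving fusion weights from the classification entropy of query results can be sketched as follows. The abstract does not give the weighting formula, so the rule used here (score = 1 / (1 + entropy), normalized to sum to 1) is purely an illustrative assumption, as are the feature names and the toy result lists.

```python
import math
from collections import Counter

def label_entropy(labels):
    """Shannon entropy (bits) of the class labels in a ranked result list."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def fusion_weights(results_per_feature):
    """Give purer (lower-entropy) result lists a larger weight.

    Assumed rule, not from the paper: score = 1 / (1 + entropy),
    then normalize the scores so they sum to 1.
    """
    scores = {name: 1.0 / (1.0 + label_entropy(labels))
              for name, labels in results_per_feature.items()}
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}

def fused_similarity(similarities, weights):
    """Weighted sum of per-feature similarities between two models."""
    return sum(weights[name] * similarities[name] for name in weights)

# Class labels of the top-4 results returned by each (hypothetical) feature.
results = {
    "egi": ["chair", "chair", "chair", "chair"],    # entropy 0 bits
    "radon": ["chair", "chair", "table", "table"],  # entropy 1 bit
    "views": ["chair", "table", "lamp", "plane"],   # entropy 2 bits
}
weights = fusion_weights(results)
fused = fused_similarity({"egi": 0.9, "radon": 0.5, "views": 0.1}, weights)
```

A feature whose results all fall in one class gets the largest weight, which matches the intuition that low entropy signals a discriminative feature for that sample.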

6.
7.
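Recently, the new task of video corpus moment retrieval (VCMR) was proposed; its goal is to retrieve, from a corpus of unsegmented videos, a short video moment that corresponds to a query sentence. The key to existing cross-modal video-text retrieval work is aligning and fusing features of different modalities. However, simply performing cross-modal alignment and fusion neither ensures that semantically similar data from the same modality stay close in the joint feature space nor takes the semantics of the query sentence into account. To solve these problems, this paper proposes a query-aware cross-modal dual contrastive learning network (QACLN) for multimodal video moment retrieval, which obtains a unified semantic representation of data from different modalities by combining inter-modal and intra-modal contrastive learning. Specifically, a query-aware cross-modal semantic fusion strategy is proposed that adaptively fuses the multimodal features of a video, such as its visual features and subtitle features, according to the perceived query semantics, yielding a query-aware multimodal joint representation of the video. In addition, an inter-modal and intra-modal dual contrastive learning mechanism for videos and query sentences is proposed to strengthen semantic alignment and fusion across modalities, improving the discriminability and semantic consistency of the representations. Finally, 1D convolutional boundary regression and cross-modal semantic similarity computation are used for moment localization and video retrieval. Extensive experiments show that the proposed QACLN outperforms the baseline methods.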
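The combination of inter-modal and intra-modal contrastive terms can be illustrated with a small sketch. The abstract does not state the loss form, so this example swaps in the widely used InfoNCE loss as a stand-in; the function names, the temperature, and the mixing weight `lam` are all hypothetical.

```python
import math

def dot(u, v):
    """Inner product, used here as the similarity score."""
    return sum(x * y for x, y in zip(u, v))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor with one positive and several negatives."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denom - logits[0]

def dual_contrastive(query, video_pos, video_negs, query_pos, query_negs, lam=0.5):
    """Inter-modal term (query vs. video) plus a weighted intra-modal term
    (query vs. other queries). The weighting by lam is an assumption."""
    inter = info_nce(query, video_pos, video_negs)
    intra = info_nce(query, query_pos, query_negs)
    return inter + lam * intra

query = [1.0, 0.0]
# Positive aligned with the query: loss close to zero.
good = info_nce(query, [1.0, 0.0], [[0.0, 1.0]])
# Positive orthogonal to the query while a negative is aligned: large loss.
bad = info_nce(query, [0.0, 1.0], [[1.0, 0.0]])
total = dual_contrastive(query, [1.0, 0.0], [[0.0, 1.0]], [0.9, 0.1], [[0.0, 1.0]])
```

The intra-modal term is what pulls semantically similar samples of the same modality together, which is the gap in plain cross-modal alignment that the abstract points out.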

8.
Image retrieval is an important problem for researchers in the computer vision and content-based image retrieval (CBIR) fields. Over the last decades, many image retrieval systems have represented an image as a set of extracted low-level features such as color, texture and shape. Such systems then calculate similarity metrics between features in order to find images similar to a query image. The disadvantage of this approach is that images that are visually and semantically different may be similar in the low-level feature space. It is therefore necessary to develop tools that optimize information retrieval. Integrating vector space models is one way to improve the performance of image retrieval. In this paper, we present an efficient and effective retrieval framework that combines a vectorization technique with a pseudo relevance model. The idea is to transform any similarity matching model (between images) into a vector space model that provides a score. A study of several methodologies for obtaining the vectorization is presented. Experiments on the Wang, Oxford5k and Inria Holidays datasets show the performance of the proposed framework.

9.
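An image retrieval system based on deep learning   Total citations: 2, self-citations: 0, citations by others: 2
The key technologies of a content-based image retrieval (CBIR) system are the extraction of effective image features and the similarity matching strategy. In the past, CBIR systems mainly used low-level visual features and could not deliver satisfactory results, so despite considerable effort CBIR remains a challenge in computer vision. The biggest problem in CBIR systems is the "semantic gap": the difference between the similarity a machine derives from low-level visual features and the similarity a person derives from high-level semantic features. Traditional CBIR systems learn image features only from low-level visual features and cannot effectively bridge the semantic gap. The rapid development of deep learning in recent years offers hope. Deep learning originated in research on artificial neural networks; it combines low-level features into more abstract high-level representations of attribute categories or features in order to discover the distribution patterns of data, which the authors state other algorithms cannot achieve. Inspired by the great success of deep learning in computer vision, speech recognition, natural language processing, image and video analysis, multimedia, and many other fields, this paper applies deep learning to content-based image retrieval in order to address the semantic gap in CBIR systems.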

10.
With the development of 3D model processing technologies and the growing number of 3D models in different application fields, 3D model retrieval is attracting more and more attention. To handle this problem, most approaches focus on extracting features from different virtual views. Such approaches can hardly guarantee robustness, and they ignore the correlation between views. Thus, we propose an effective view-based 3D model retrieval method via supervised multi-view feature learning (SMFL). First, the subspace dimension of the visual feature is generated through the Singular Value Decomposition (SVD) algorithm. This step selects the main information from the multiple views in order to reduce the final amount of calculation. Second, we consider the relationship among views from the same class and the correlation between different classes to build a feature mapping that reduces the difference between views of the same class and increases the difference between views of different classes. Finally, the projection mapping corresponding to the inner product of each 3D model helps to calculate the similarities between two different 3D models. Extensive experiments are conducted on the popular ETH, NTU, MV-RED and PSB 3D model datasets with Zernike moments. Comparative results against existing 3D model retrieval methods show the superiority of the proposed method.

11.
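Sketch-based 3D model retrieval (SBSR) has become a research hotspot in 3D model retrieval, pattern recognition, and computer vision. Compared with traditional methods, 3D deep representations based on convolutional neural networks (CNN) show a clear performance advantage in 3D model retrieval tasks. This paper proposes a 3D model retrieval method based on hand-drawn images that combines information entropy with a CNN. First, the representative views of a model are obtained by computing the information entropy of its projection images, and the representative views are processed by edge detection and related steps to obtain contour images of the 3D model's projections. Then, the contour images and the hand-drawn sketch are fed into a CNN to extract feature descriptors, which are then matched. The method is evaluated on the Shape Retrieval Contest (SHREC) 2012 and SHREC 2013 databases. The experiments show that the method achieves higher retrieval accuracy than other traditional methods.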
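Selecting representative views by information entropy can be sketched in a few lines. The abstract does not say how the entropy is computed or which views are kept, so this example assumes the Shannon entropy of the gray-level histogram and keeps the highest-entropy views; the view names and pixel values are made up.

```python
import math
from collections import Counter

def image_entropy(pixels):
    """Shannon entropy (bits) of the gray-level histogram of a flat pixel list."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def representative_views(views, k=1):
    """Return the names of the k views with the highest entropy.

    Picking the maximum-entropy views is an assumption of this sketch.
    """
    ranked = sorted(views, key=lambda name: image_entropy(views[name]), reverse=True)
    return ranked[:k]

# Toy projection images, flattened to lists of gray levels.
views = {
    "front": [0, 0, 0, 0, 255, 255, 255, 255],   # two levels: 1 bit
    "top": [0, 0, 0, 0, 0, 0, 0, 0],             # one level: 0 bits
    "side": [0, 64, 128, 255, 0, 64, 128, 255],  # four levels: 2 bits
}
best = representative_views(views, k=2)
```

A uniform projection such as "top" carries no information and is dropped first, which is the behavior one would want from a view-selection step placed before edge detection.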

12.
3D shape models have become abundant in many application fields, including 3D CAD/CAM, augmented/mixed reality (AR/MR), and entertainment. Creating 3D shape models from scratch is still very expensive, so efficient and accurate shape retrieval methods are essential for reusing them. To retrieve similar 3D shape models, one must provide an arbitrary 3D shape as a query. Most research on 3D shape retrieval has used a “whole” shape as the query (aka whole-to-whole shape retrieval), while a “part” shape (aka part-to-whole shape retrieval) is more often what is needed in practice, especially in mechanical engineering with 3D CAD/CAM applications. A “part” shape is naturally produced when a 3D range scanner is used as the input device. In this paper, we focus on an efficient method for part-to-whole shape retrieval where the “part” shape is assumed to be given by a 3D range scanner. Specifically, we propose a Super-Vector coding feature with SURF local features extracted from the View-Normal-Angle image, that is, the image synthesized by taking account of the angle between the view vector and the surface normal vector, together with the depth-buffered image, for part-to-whole shape retrieval. In addition, we propose a weighted whole-to-whole re-ranking method that takes advantage of global information based on the result of part-to-whole shape retrieval. Through experiments we demonstrate that our proposed method outperforms the previous methods with or without re-ranking.

13.
Traditional content-based image retrieval (CBIR) schemes, which assume that the individual images in a large-scale collection are independent, suffer from poor retrieval performance. In medical applications, images usually exist in the form of image bags, and each bag includes multiple relevant images with the same perceptual meaning. In this paper, based on these natural image bags, we explore a new scheme to improve the performance of medical image retrieval. It is feasible and efficient to search a bag-based medical image collection by providing a query bag. However, noisy images may be present in image bags and severely affect the retrieval performance. A new three-stage solution is proposed to perform the retrieval and handle the noisy images. In stage 1, to alleviate the influence of noisy images, we associate each image in the image bags with a relevance degree. In stage 2, a novel similarity aggregation method is proposed to incorporate image relevance and feature importance into the similarity computation. In stage 3, we obtain the final image relevance adaptively, considering both image bag similarity and individual image similarity. The experiments demonstrate that the proposed approach improves image retrieval performance significantly.

14.
Ying  Dengsheng  Guojun   《Pattern recognition》2008,41(8):2554-2570
Semantic-based image retrieval has attracted great interest in recent years. This paper proposes a region-based image retrieval system with high-level semantic learning. The key features of the system are: (1) it supports both query by keyword and query by region of interest. The system segments an image into different regions and extracts low-level features of each region. From these features, high-level concepts are obtained using a proposed decision tree-based learning algorithm named DT-ST. During retrieval, a set of images whose semantic concept matches the query is returned. Experiments on a standard real-world image database confirm that the proposed system significantly improves the retrieval performance, compared with a conventional content-based image retrieval system. (2) The proposed decision tree induction method DT-ST for image semantic learning is different from other decision tree induction algorithms in that it makes use of the semantic templates to discretize continuous-valued region features and avoids the difficult image feature discretization problem. Furthermore, it introduces a hybrid tree simplification method to handle the noise and tree fragmentation problems, thereby improving the classification performance of the tree. Experimental results indicate that DT-ST outperforms two well-established decision tree induction algorithms ID3 and C4.5 in image semantic learning.

15.
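A feature point matching algorithm based on an affine iterative model   Total citations: 1, self-citations: 1, citations by others: 1
Feature point matching across an image sequence is a fundamental problem in computer vision and underlies object recognition, image retrieval, and 3D reconstruction. To improve image matching accuracy, a high-precision automatic feature point matching algorithm for two images is proposed. The algorithm first analyzes and shows that the homography between the neighborhood windows of corresponding feature points in the two images can be approximated by an affine transformation model. A fast iterative optimization based on the affine model then estimates and corrects the perspective distortion between corresponding neighborhood windows and also compensates for the localization error introduced at the feature detection stage, so that the matches reach sub-pixel accuracy. Finally, experiments on real images and comparisons with existing algorithms show that the algorithm both finds more matches and improves the accuracy of feature point matching.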

16.
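The move toward intelligent mechanical manufacturing inevitably produces massive amounts of part model data, which poses a great challenge for efficient retrieval. Because design sketches are user-friendly and lightweight, the method builds a deep cross-domain representation model for retrieving mechanical part models from design sketches. To associate cross-modal information between sketches and 3D models, a joint feature learning method is proposed: while controlling the intra-class and inter-class differences of the retrieved objects, the feature descriptors learn single-domain features and fuse cross-domain information, establishing a consistent associated representation of cross-modal data in a common embedding space. Finally, hash coding is used to build an index table for fast retrieval over massive data. Experiments on part data show that, among contemporaneous methods, the proposed sketch-based part retrieval method achieves the most accurate retrieval results while also retrieving efficiently. The method improves the accuracy of cross-modal part retrieval and the efficiency of data management, which indirectly improves the efficiency and convenience of product design; related systems have been deployed in some enterprises and have received good feedback.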

17.
18.
Boosting algorithms such as AdaBoost have been widely used in a variety of applications in multimedia and computer vision. Relevance feedback-based image retrieval has been formulated as a classification problem with a small number of training samples. Several machine learning techniques have been applied to this problem recently. In this paper, we propose a novel paired feature AdaBoost learning system for relevance feedback-based image retrieval. To facilitate density estimation in our feature learning method, we propose an ID3-like balance tree quantization method to preserve the most discriminative information. Using paired feature combination, we map all training samples obtained in the relevance feedback process onto paired feature spaces and employ the AdaBoost algorithm to select a few feature pairs with the best discrimination capabilities in the corresponding paired feature spaces. In the AdaBoost algorithm, we replace the traditional binary weak classifiers with Bayesian classifiers to enhance their classification power, thus producing a stronger classifier. Experimental results on content-based image retrieval (CBIR) show superior performance of the proposed system compared to some previous methods.
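For readers unfamiliar with the boosting step this abstract builds on, here is a minimal sketch of one round of standard discrete AdaBoost. It uses binary weak-classifier outputs in {-1, +1} and is not the paper's Bayesian paired-feature variant; the function name and toy data are hypothetical.

```python
import math

def adaboost_round(weights, predictions, labels):
    """One round of discrete AdaBoost for labels and predictions in {-1, +1}.

    Returns the weak classifier's vote weight (alpha) and the
    re-normalized sample weights for the next round.
    """
    # Weighted error of the weak classifier on the current distribution.
    err = sum(w for w, p, y in zip(weights, predictions, labels) if p != y)
    # Clamp to avoid division by zero or log of zero for perfect classifiers.
    err = min(max(err, 1e-10), 1 - 1e-10)
    alpha = 0.5 * math.log((1 - err) / err)
    # Increase the weight of misclassified samples, decrease the rest.
    updated = [w * math.exp(-alpha * y * p)
               for w, p, y in zip(weights, predictions, labels)]
    z = sum(updated)
    return alpha, [w / z for w in updated]

labels = [1, 1, -1, -1]
predictions = [1, -1, -1, -1]  # the second sample is misclassified
alpha, new_weights = adaboost_round([0.25] * 4, predictions, labels)
```

After the update the single misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right. In a relevance-feedback setting the labels would come from the user's relevant and non-relevant marks.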

19.
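In recent years, media data of all types, such as audio, text, images, and video, have grown explosively on the Internet, and different types of data are often used to describe the same event or topic. Cross-modal retrieval offers effective methods for finding semantically relevant results in other modalities for a query in any modality, letting users obtain more information about an event or topic and retrieve data of one modality with data of another. As retrieval needs and new technologies develop, single-modality retrieval can no longer meet user needs, and researchers have proposed many cross-modal retrieval techniques to address this. This survey reviews recent research results in cross-modal retrieval, briefly analyzes traditional cross-modal retrieval methods, focuses on the methods proposed in the last five years and compares their performance, summarizes the problems that current cross-modal retrieval research faces, and offers an outlook on future development.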

20.
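Research on active shape recovery with a single camera   Total citations: 2, self-citations: 1, citations by others: 1
Recovering the 3D shape of an object's surface is an important research topic in computer vision. This paper presents a method that recovers the 3D surface shape of an object with a single camera using grid structured light. A new calibration method and a projected-pattern detection algorithm are proposed, and an active 3D reconstruction experimental system is built. The results show that the method recovers object shape quickly and accurately.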
