Similar Documents
 17 similar documents found
1.
Research on Cross-Media Correlation Reasoning and Retrieval
To address the difficulty of measuring cross-media correlation between multimedia data of different modalities, a cross-media retrieval method based on correlation reasoning is proposed. First, similarity within the same modality (intra-media) and correlation between different modalities (cross-media) are analyzed and quantified. A cross-media correlation graph is then constructed to express the learned similarities and correlations in a unified way, and cross-media retrieval is performed on the basis of shortest paths in this graph. A relevance feedback algorithm is also proposed to incorporate prior knowledge from user interaction into the correlation graph, effectively improving cross-media retrieval performance. The method can be applied to cross-modal retrieval systems in which users submit query examples of different modalities.
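The shortest-path retrieval step described above can be sketched as follows. The graph layout, node names, and helper functions here are hypothetical; edge weights are treated as dissimilarities, so a shorter path means a stronger cross-media correlation:

```python
import heapq

def dijkstra(graph, source):
    """Single-source shortest paths over a weighted adjacency dict."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def cross_media_retrieve(graph, query, target_modality, modality_of, k=3):
    """Rank objects of the target modality by shortest-path distance to the query."""
    dist = dijkstra(graph, query)
    candidates = [(d, obj) for obj, d in dist.items()
                  if obj != query and modality_of[obj] == target_modality]
    return [obj for d, obj in sorted(candidates)[:k]]
```

In this toy setting, an image query can reach a text object either directly (a cross-media edge) or via a similar image (an intra-media edge followed by a cross-media edge), which is exactly the kind of transitive correlation the graph is meant to capture.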

2.
Dai Gang, Zhang Hong. Journal of Computer Applications, 2018, 38(9): 2529-2534
To mine the intrinsic correlations between feature data of different modalities that share the same semantics, a cross-media retrieval algorithm based on semantic correlation and topological relationships (SCTR) is proposed. On the one hand, the latent correlations among multimedia data with the same semantics are used to construct a multimedia semantic correlation hypergraph; on the other hand, the topological relationships of the multimedia data are mined to build a multimedia nearest-neighbor hypergraph. By combining semantic correlation with topological relationships, an optimal projection matrix is learned for each media type, and the feature vectors of the multimedia data are then projected into a common space to perform cross-media retrieval. On the XMedia dataset, the algorithm achieves a mean average precision of 51.73% over multiple cross-media retrieval tasks, which is 22.73, 15.23, 11.7, and 9.11 percentage points higher than joint graph regularized heterogeneous metric learning (JGRHML), cross-modality correlation propagation (CMCP), heterogeneous similarity measure with nearest neighbors (HSNN), and joint representation learning (JRL), respectively. The experimental results demonstrate from multiple perspectives that the algorithm effectively improves the mean average precision of cross-media retrieval.
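The final retrieval step — projecting each media type into the common space with its learned projection matrix and ranking by similarity — can be sketched as below. The projection matrices are assumed to be given (learning them via the hypergraph objective is out of scope), and cosine similarity is an assumed ranking choice:

```python
import numpy as np

def cross_media_rank(query_feat, W_query, gallery_feats, W_gallery):
    """Project a query and a gallery of another media type into the common
    space via their learned projection matrices, then rank by cosine similarity.
    Returns gallery indices, best match first."""
    q = query_feat @ W_query
    G = gallery_feats @ W_gallery
    q = q / np.linalg.norm(q)
    G = G / np.linalg.norm(G, axis=1, keepdims=True)
    return np.argsort(-(G @ q))
```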

3.
A Cross-Media Retrieval Method Based on Content Correlation
To overcome the single-modality limitation of traditional content-based multimedia retrieval, a new cross-media retrieval method is proposed. Canonical correlations, in the statistical sense, between the content features of different modalities are analyzed, and the heterogeneity of the feature vectors is resolved through subspace mapping. Prior knowledge from relevance feedback is further used to adjust the topological structure of the multi-modal multimedia datasets in the subspace, enabling accurate measurement of cross-media correlation. Experiments on image and audio data verify the effectiveness of the correlation-learning-based cross-media retrieval method.
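The canonical correlation analysis underlying this kind of subspace mapping can be sketched in a few lines of NumPy. This is generic textbook CCA, not the paper's exact formulation, and the regularization term is an added assumption for numerical stability:

```python
import numpy as np

def cca(X, Y, reg=1e-6):
    """Minimal CCA: returns projections Wx, Wy and canonical correlations s.
    X is n x dx, Y is n x dy; rows are paired samples of the two modalities."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # Whiten each view via Cholesky factors, then SVD the cross-covariance.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M)
    Wx = np.linalg.solve(Lx.T, U)    # maps X-features into the shared subspace
    Wy = np.linalg.solve(Ly.T, Vt.T)
    return Wx, Wy, s
```

Once both modalities are projected (`X @ Wx`, `Y @ Wy`), the heterogeneous feature vectors live in a common subspace where ordinary distance measures apply.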

4.
After decades of development, multimedia retrieval has made great progress, yet improvements in retrieval performance are still constrained by the "intention gap" and the "semantic gap". To address this, the research community has proposed a series of query techniques that help users express their retrieval intent clearly, as well as feedback techniques that help the system accurately understand user intent and media data, effectively improving retrieval performance. This paper analyzes and discusses query and feedback techniques in multimedia retrieval: it examines the evolution of query modes and the development of feedback techniques, surveys query techniques for PCs, mobile smart terminals, and touchscreen devices, introduces feedback techniques of different periods, discusses interaction issues in exploratory search, and finally analyzes future research trends in this field.

5.
A multi-channel (i.e., multi-modal: text, image, video, etc.) retrieval system for multimedia documents is proposed and implemented. The system defines a new framework for describing the content of multimedia documents: it extracts the content-based low-level features of each channel and also records the link relationships among the different multimedia objects within a document. A graph-based cross-reference knowledge base is proposed to store the semantic relationships among multimedia objects mined from these links, and an effective semantic context analysis algorithm computes the semantic similarity between each object and the query during retrieval. The semantic context analysis algorithm not only brings semantic information into content-based multimedia retrieval, but also supports relevance feedback through channel switching, providing a flexible query mode. Experiments show that, on top of the cross-reference knowledge base, the system effectively improves content-based retrieval performance for multimedia objects (e.g., content coverage).

6.
With the rapid development of the Internet and multimedia technology, web data has expanded from plain text to multiple media types including images, video, text, audio, and 3D models, making cross-media retrieval a new trend in information retrieval. However, the "heterogeneity gap" causes inconsistent data representations across media, making direct similarity measurement difficult, so cross-retrieval among multiple media types faces great challenges. With the rise of deep learning, the nonlinear modeling capability of deep neural networks promises to break through the barrier of cross-media representation, but existing deep-learning-based cross-media retrieval methods generally consider only pairwise correlations between images and text and cannot support cross-retrieval over more media types. To address this, a cross-media deep fine-grained correlation learning method is proposed that supports cross-retrieval over up to five media types (image, video, text, audio, and 3D model). First, a cross-media recurrent neural network is proposed to jointly model the fine-grained information of up to five media types, fully exploiting the details within each media type as well as contextual correlations. Then, a cross-media joint correlation loss function is proposed: by combining distribution alignment and semantic alignment, it more accurately mines fine-grained correlations within and across media, while semantic category information enhances semantic discrimination during correlation learning and improves retrieval accuracy. Experimental comparisons with existing methods on two five-media cross-media datasets, PKU XMedia and PKU XMediaNet, demonstrate the effectiveness of the proposed method.

7.
To effectively associate similar media and describe cross-media similarity, a new cross-media retrieval method is proposed. Low-level features of multimedia objects are extracted and organized with an Ontology, achieving multimedia association at the semantic level. Experimental results show that the Ontology-based approach can effectively perform cross-media retrieval.

8.
Objective: Cross-media retrieval aims to use data of any media type to retrieve related data of other media types, enabling semantic interoperation and cross-retrieval among images, text, and other media. However, the "heterogeneity gap" makes the feature representations of different media inconsistent and semantic association difficult, posing a great challenge for cross-media retrieval. Different media data describing the same semantics exhibit semantic consistency, and the data contain rich fine-grained information, providing an important basis for cross-media correlation learning. Existing methods consider only pairwise correlations between different media and ignore the contextual information among fine-grained local parts within the data, so they cannot fully mine cross-media correlations. To address this, a cross-media retrieval method based on a hierarchical recurrent attention network is proposed. Method: First, an intra-media/inter-media two-level recurrent neural network is proposed: the lower-level networks model the fine-grained contextual information within each media type, while the top-level network mines contextual correlations across media through parameter sharing. Then, an attention-based cross-media joint loss function is proposed, which learns inter-media joint attention to mine more precise fine-grained cross-media correlations, and uses semantic category information to enhance semantic discrimination during correlation learning, improving retrieval accuracy. Results: On two widely used cross-media datasets, the method is compared with 10 existing methods using mean average precision (MAP) as the evaluation metric. The proposed method achieves MAP scores of 0.469 and 0.575 on the two datasets, exceeding all compared methods. Conclusion: By mining the fine-grained information in images and text, the proposed hierarchical recurrent attention network fully learns precise cross-media correlations between images and text and effectively improves cross-media retrieval accuracy.

9.
Feng Jiao, Lu Changyu. Computer Science, 2021, 48(S1): 122-126
With the rapid development of multimedia technology, cross-media retrieval is gradually replacing traditional single-media retrieval as the mainstream mode of information retrieval. Existing cross-media retrieval methods are highly complex and cannot fully mine the detailed features of the data; offsets arise during mapping, making it difficult to learn accurate data correlations. To address these problems, a cross-media retrieval method based on a residual attention network is proposed. First, to better extract the key features of different media data while simplifying the cross-media retrieval model...

10.
Xue Yan. Microcomputer Applications, 2022, (12): 177-179, 186
To improve the targeted retrieval of multimedia courseware resources, a targeted retrieval technique for college English multimedia courseware resources is proposed. The distribution characteristics of the high-dimensional feature vectors of the courseware resources are analyzed, and the LLE nonlinear dimensionality reduction method is applied to reduce their dimensionality. Semantic relationships among the college English multimedia courseware resources are then mined, and the semantic similarity of each object is computed, enabling targeted retrieval. Simulation results show that the proposed method not only improves retrieval efficiency and the overall similarity of retrieved courseware, but also reduces the targeted-retrieval error rate.

11.
Content-based cross-media retrieval is a new type of multimedia retrieval in which the media types of the query examples and the returned results can be different. In order to learn the semantic correlations among multimedia objects of different modalities, the heterogeneous multimedia objects are analyzed in the form of multimedia documents (MMDs), where an MMD is a set of multimedia objects that are of different media types but carry the same semantics. We first construct an MMD semi-semantic graph (MMDSSG) by jointly analyzing the heterogeneous multimedia data. After that, a cross-media indexing space (CMIS) is constructed. For each query, the optimal dimension of the CMIS is automatically determined and cross-media retrieval is performed on a per-query basis. In this way, the most appropriate retrieval approach is selected for each query, i.e., different search methods are used for different queries. These query-dependent search methods make cross-media retrieval performance not only accurate but also stable. We also propose different learning methods for relevance feedback (RF) to improve performance. Experiments are encouraging and validate the proposed methods.

12.
A Novel Approach Towards Large Scale Cross-Media Retrieval
With the rapid development of the Internet and multimedia technology, cross-media retrieval is concerned with retrieving all the related media objects of multiple modalities by submitting a query media object. Unfortunately, the complexity and heterogeneity of multi-modality pose two major challenges for cross-media retrieval: 1) how to construct a unified and compact model for media objects with multi-modality; 2) how to improve retrieval performance for a large-scale cross-media database. In this paper, we propose a novel method dedicated to solving these issues to achieve effective and accurate cross-media retrieval. Firstly, a multi-modality semantic relationship graph (MSRG) is constructed using the semantic correlation among the media objects. Secondly, all the media objects in the MSRG are mapped onto an isomorphic semantic space. Further, an efficient index, the MK-tree, based on heterogeneous data distribution is proposed to manage the media objects within the semantic space and improve retrieval performance. Extensive experiments on real large-scale cross-media datasets indicate that our proposal dramatically improves the accuracy and efficiency of cross-media retrieval, significantly outperforming existing methods.

13.
Zhang Hong, Huang Yu, Xu Xin, Zhu Ziqi, Deng Chunhua. Multimedia Tools and Applications, 2018, 77(3): 3353-3368

Due to the rapid development of multimedia applications, cross-media semantics learning is becoming increasingly important. One of the most challenging issues for cross-media semantics understanding is how to mine the semantic correlation between different modalities. Most traditional multimedia semantics analysis approaches are based on unimodal data and neglect the semantic consistency between modalities. In this paper, we propose a novel multimedia representation learning framework via latent semantic factorization (LSF). First, the posterior probability under the learned classifiers serves as the latent semantic representation for each modality. We then explore the semantic representation of a multimedia document, consisting of an image and text, by latent semantic factorization. In addition, two projection matrices are learned to project images and text into the same semantic space, closer to that of the multimedia document. Experiments conducted on three real-world datasets for cross-media retrieval demonstrate the effectiveness of the proposed approach compared with state-of-the-art methods.

14.
In this paper, we consider the problem of multimedia document (MMD) semantics understanding and content-based cross-media retrieval. An MMD is a set of media objects of different modalities that carry the same semantics, and content-based cross-media retrieval is a new kind of retrieval method in which the query examples and search results can be of different modalities. Two levels of manifolds are learned to explore the relationships among all the data, at the MMD level and at the media object level respectively. We first construct a Laplacian media object space for media object representation of each modality and an MMD semantic graph to learn the MMD semantic correlations. The characteristics of media objects propagate along the MMD semantic graph, and an MMD semantic space is constructed to perform cross-media retrieval. Different methods are proposed to utilize relevance feedback, and experiments show that the proposed approaches are effective.

15.
Although multimedia objects such as images, audio, and texts are of different modalities, there are a great number of semantic correlations among them. In this paper, we propose a transductive learning method to mine the semantic correlations among media objects of different modalities so as to achieve cross-media retrieval. Cross-media retrieval is a new kind of search technology in which the query examples and returned results can be of different modalities, e.g., querying images with an audio example. First, according to the media objects' features and their co-existence information, we construct a uniform cross-media correlation graph, in which media objects of different modalities are represented uniformly. To perform cross-media retrieval, a positive score is assigned to the query example; the score spreads along the graph, and the media objects of the target modality or MMDs with the highest scores are returned. To boost retrieval performance, we also propose long-term and short-term relevance feedback approaches to mine the information contained in positive and negative examples.

16.
Qi Jinwei, Huang Xin, Peng Yuxin. Multimedia Tools and Applications, 2017, 76(23): 25109-25127

As a prominent research topic in the multimedia area, cross-media retrieval aims to capture the complex correlations among multiple media types. Learning a better shared representation and distance metric for multimedia data is important for boosting cross-media retrieval. Motivated by the strong ability of deep neural networks in feature representation and comparison-function learning, we propose the Unified Network for Cross-media Similarity Metric (UNCSM) to associate cross-media shared representation learning with distance metric learning in a unified framework. First, we design a two-pathway deep network pretrained with a contrastive loss, and employ a double triplet similarity loss for fine-tuning to learn the shared representation for each media type by modeling relative semantic similarity. Second, a metric network is designed to effectively calculate the cross-media similarity of the shared representation by modeling pairwise similar and dissimilar constraints. Compared with existing methods, which mostly ignore dissimilar constraints and simply use a fixed sample distance metric such as Euclidean distance, our UNCSM approach unifies representation learning and distance metric learning to preserve relative similarity and to embrace more complex similarity functions, further improving cross-media retrieval accuracy. Experimental results show that our UNCSM approach outperforms 8 state-of-the-art methods on 4 widely used cross-media datasets.
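A minimal sketch of the triplet-style objective described above — pulling semantically matched cross-media pairs together while pushing mismatched pairs at least a margin apart — might look like the following. This is a generic triplet hinge loss on embedded features, not UNCSM's exact double triplet formulation:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on batches of embeddings (rows are samples):
    penalizes cases where a mismatched pair is not at least `margin` farther
    apart (in squared Euclidean distance) than the matched pair."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))
```

In a full pipeline this loss would be back-propagated through the two-pathway network; here it only illustrates the constraint being optimized.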

17.
With the explosion of multimedia data, different types of multimedia data often coexist in web repositories. Accordingly, it is increasingly important to explore the intricate underlying cross-media correlations, rather than rely on single-modality distance measures, so as to improve multimedia semantics understanding. Cross-media distance metric learning focuses on correlation measures between multimedia data of different modalities. However, content heterogeneity and the semantic gap make cross-media distance very challenging to measure. In this paper, we propose a novel cross-media distance metric learning framework based on sparse feature selection and multi-view matching. First, we employ sparse feature selection to select a subset of relevant features and remove redundant features from high-dimensional image and audio features. Secondly, we maximize the canonical coefficient during image-audio feature dimension reduction for cross-media correlation mining. Thirdly, we construct a Multi-modal Semantic Graph to find embedded manifold cross-media correlations. Moreover, we fuse the canonical correlations and the manifold information into multi-view matching, which harmonizes the different correlations through an iterative process, and build a Cross-media Semantic Space for cross-media distance measurement. Experiments are conducted on an image-audio dataset for cross-media retrieval; the results are encouraging and show that our approach is effective.
