Similar literature
 Found 20 similar documents.
1.
Exploring context information for visual recognition has recently received significant research attention. This paper proposes a novel and highly efficient approach, named semantic diffusion, that uses semantic context for large-scale image and video annotation. Starting from the initial annotation of a large number of semantic concepts (categories), obtained by either machine learning or manual tagging, the proposed approach refines the results using a graph diffusion technique, which recovers the consistency and smoothness of the annotations over a semantic graph. Unlike existing graph-based learning methods that model relations among data samples, the semantic graph captures context by treating the concepts as nodes and the concept affinities as the weights of edges. In particular, the approach can simultaneously improve annotation accuracy and adapt the concept affinities to new test data. The adaptation provides a means to handle domain change between training and test data, which often occurs in practice. Extensive experiments are conducted to improve concept annotation results using Flickr images and TV program videos. Results show a consistent and significant performance gain (10+% on both image and video data sets). Source code for the proposed algorithms is available online.

2.
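Liu Jie, Du Junping. Acta Electronica Sinica (《电子学报》), 2014, 42(5): 987-991
Semantic image annotation is an important problem in image semantic analysis. Building on topic models, this paper proposes a novel cross-media image annotation method that propagates semantics between images. First, a topic model is applied to the training images to extract latent semantic topics from the visual and textual modalities. Next, a weight parameter is used to fuse the topic distributions of the two modalities, yielding a fused topic distribution. Finally, an annotation model is trained on the fused topic distribution to assign appropriate semantic information to target images. The proposed method is compared experimentally with recent well-known annotation methods on the standard MSRC and Corel5K datasets. A detailed evaluation of annotation performance demonstrates the effectiveness of the proposed method.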
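The fusion step in the abstract above (a single weight parameter combining the visual and textual topic distributions) can be illustrated with a minimal sketch. The abstract does not state the exact fusion rule, so the convex combination below is an assumption, and the function name `fuse_topic_distributions` and the example values are hypothetical.

```python
def fuse_topic_distributions(visual, textual, weight):
    """Fuse two topic distributions over the same topics.

    Assumption (not stated in the abstract): the fusion is a convex
    combination, weight * visual + (1 - weight) * textual.
    """
    if len(visual) != len(textual):
        raise ValueError("distributions must cover the same number of topics")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    fused = [weight * v + (1.0 - weight) * t for v, t in zip(visual, textual)]
    total = sum(fused)
    if total <= 0.0:
        raise ValueError("fused distribution has no probability mass")
    # Renormalize so the result is a proper distribution even if the
    # inputs were only approximately normalized.
    return [p / total for p in fused]


# Hypothetical topic distributions for one training image.
visual_topics = [0.7, 0.2, 0.1]
textual_topics = [0.1, 0.3, 0.6]
fused = fuse_topic_distributions(visual_topics, textual_topics, weight=0.5)
```

With `weight=1.0` the fused distribution reduces to the visual one, and with `weight=0.0` to the textual one, which makes the parameter easy to tune on held-out data.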

3.
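Image annotation aims to assign a set of semantic labels to an image to describe its content. To address the semantic gap between high-level semantics and low-level features, this paper proposes an image annotation method based on partial order structures. First, similarity scores between the training images and the test image are computed to obtain the test image's initial neighbor set and neighbor labels. Then, a constructed attribute partial order structure is used to obtain semantics related to the neighbor labels, which enriches the labels, and a constructed object partial order structure is used to obtain the final candidate set. To improve annotation accuracy, a frequency threshold is set to select the more frequent labels as the final keywords. Experiments show that the method effectively improves annotation precision and recall.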
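The final filtering step in the abstract above (keeping only labels whose frequency exceeds a threshold) can be sketched as follows. This is only an illustration: the abstract does not define how frequency is measured, so counting the fraction of neighbor images that carry each label is an assumption, and `select_keywords` and the sample labels are hypothetical.

```python
from collections import Counter


def select_keywords(neighbor_labels, min_frequency):
    """Return labels carried by at least `min_frequency` of the neighbors.

    Assumption: frequency is the fraction of neighbor images that carry
    the label; the paper may define it differently.
    """
    total_neighbors = len(neighbor_labels)
    if total_neighbors == 0:
        return []
    # Count each label at most once per neighbor image.
    counts = Counter(label for labels in neighbor_labels for label in set(labels))
    return sorted(
        label
        for label, count in counts.items()
        if count / total_neighbors >= min_frequency
    )


# Hypothetical label sets of the three nearest training images.
neighbors = [["sky", "sea"], ["sky", "beach"], ["sky", "sea", "boat"]]
keywords = select_keywords(neighbors, min_frequency=0.5)
```

Here "sky" appears in all three neighbors and "sea" in two of three, so both pass a threshold of 0.5, while "beach" and "boat" are dropped.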

4.
Automatic image annotation is a promising way to achieve more effective image retrieval and image analysis by using keywords associated with the image content. However, because of the semantic gap between the low-level visual features and the high-level semantic concepts of an image, the performance of many existing algorithms is unsatisfactory. In this paper, a novel image classification scheme, named high order statistics based maximum a posteriori (HOS-MAP), is proposed to deal with the issue of image annotation. To bridge the gap between human judgment and machine intelligence, the proposed scheme first constructs a dissimilarity representation for each image in a non-Euclidean space; then, the dissimilarity diffusion distribution of each image is obtained from the high-order statistics of a triplet of nearest-neighbor images; finally, a maximum a posteriori algorithm that uses a Gaussian Mixture Model and the dissimilarity diffusion distribution is adopted to estimate the relevance between each annotation and an input un-annotated image. Experimental results on a general-purpose image database demonstrate the effectiveness and efficiency of the proposed automatic image annotation scheme.

5.
In this work, we propose an efficient image annotation approach based on the visual content of regions. We assume that regions can be described using low-level features as well as high-level ones. Given a labeled dataset, we adopt a probabilistic semantic model to capture relationships between low-level features and semantic clusters of regions. Moreover, since most previous works on image annotation do not deal with the curse of dimensionality, we address this problem by introducing a fuzzy version of the Vector Approximation Files (VA-Files). The main contribution of this work is the association of the generative model with fuzzy VA-Files, which offer accurate multi-dimensional indexing, to estimate relationships between low-level features and semantic concepts. The proposed approach reduces the computational complexity while optimizing the annotation quality. Preliminary experiments indicate that the suggested approach outperforms other state-of-the-art approaches.

6.
Automatic image annotation (AIA) is very important to image retrieval and image understanding. Two key issues in AIA are explored in detail in this paper: structured visual feature selection, and the use of hierarchical correlated structures among multiple tags to boost the performance of image annotation. This paper simultaneously introduces input and output structural grouping sparsity into a regularized regression model for image annotation. For high-dimensional heterogeneous input features such as color, texture, and shape, different kinds (groups) of features have different intrinsic discriminative power for the recognition of certain concepts. The proposed structured feature selection by structural grouping sparsity can be used not only to select groups of features but also to conduct within-group selection. Hierarchical correlations among output labels are well represented by a tree structure, and therefore the proposed tree-structured grouping sparsity can be used to boost the performance of multitag image annotation. To solve the proposed regression model efficiently, we relax the solving process into a bilayer regression framework for multilabel boosting by the selection of heterogeneous features with structural grouping sparsity (Bi-MtBGS). The first-layer regression selects the discriminative features for each label. The second-layer regression refines the feature selection model learned from the first layer, which can be viewed as a multilabel boosting process. Extensive experiments on public benchmark image data sets and real-world image data sets demonstrate that the proposed approach achieves better multitag image annotation performance and leads to a quite interpretable model for image understanding.

7.
In this paper, we present an approach based on probabilistic latent semantic analysis (PLSA) for automatic image annotation and retrieval. To model the training data precisely, each image is represented as a bag of visual words. A probabilistic framework is then designed to capture semantic aspects from the visual and textual modalities, respectively. Furthermore, an adaptive asymmetric learning algorithm is proposed to fuse these aspects. For each image document, the aspect distributions of the different modalities are fused by multiplying them by different weights, which are determined by the visual representations of the images. Consequently, the probabilistic framework can predict semantic annotations precisely for unseen images because it associates the visual and textual modalities properly. We compare our approach with several state-of-the-art approaches on a standard Corel dataset. The experimental results show that our approach performs more effectively and accurately.

8.
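To address the "semantic gap" problem in traditional CBIR systems, an image retrieval method based on latent semantic indexing (LSI) and relevance feedback is proposed. During retrieval, color histograms are first extracted in HSV space as low-level visual features for image retrieval; LSI is then introduced in an attempt to give the low-level features higher-level semantic meaning; and relevance feedback is combined with these steps to further improve retrieval precision through interaction with the user. Experiments...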

9.
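To improve the poor noise robustness of the traditional FCM algorithm, an FCM algorithm based on an adaptive similarity distance is proposed. The algorithm describes each pixel by two features: the first describes the pixel's intrinsic property (gray-level feature), and the second describes the features of neighboring pixels (spatial feature). On this basis, using the adaptive similarity distance, the algorithm decides which feature takes priority according to the pixel's spatial position in the image, and clusters the pixels accordingly. Image segmentation results show that the algorithm clearly improves on the standard FCM algorithm, with good noise robustness and better segmentation results.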
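For context, the baseline that the abstract above improves on can be sketched briefly, assuming FCM here denotes the standard fuzzy c-means algorithm applied to gray levels only. The sketch below is that textbook baseline, not the adaptive-similarity-distance variant the paper proposes; the function names, the fuzzifier `m = 2`, the initial centers, and the sample gray levels are all illustrative assumptions.

```python
def fcm_memberships(values, centers, m=2.0):
    """Standard FCM membership update for 1-D gray levels."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in values:
        distances = [abs(x - c) for c in centers]
        if 0.0 in distances:
            # The pixel coincides with one or more centers: split full
            # membership among them to avoid division by zero.
            zeros = distances.count(0.0)
            memberships.append([1.0 / zeros if d == 0.0 else 0.0 for d in distances])
            continue
        memberships.append(
            [1.0 / sum((d_i / d_j) ** exponent for d_j in distances) for d_i in distances]
        )
    return memberships


def fcm_centers(values, memberships, m=2.0):
    """Standard FCM center update: membership-weighted mean of the values."""
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        centers.append(sum(w * x for w, x in zip(weights, values)) / sum(weights))
    return centers


# Hypothetical gray levels: three dark pixels and three bright pixels.
gray_levels = [10.0, 12.0, 11.0, 200.0, 205.0, 198.0]
centers = [0.0, 255.0]  # illustrative initial cluster centers
for _ in range(20):
    memberships = fcm_memberships(gray_levels, centers)
    centers = fcm_centers(gray_levels, memberships)
```

Because this baseline looks only at gray levels, a single noisy pixel is assigned by intensity alone, which is the weakness the abstract's spatial feature is meant to address.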

10.
With the proliferation of applications that demand content-based image retrieval, two merits are becoming more desirable: a reduced search space and a reduced "semantic gap." This paper proposes a semantic clustering scheme to achieve these two goals. By performing clustering before image retrieval, the search space can be significantly reduced. The proposed method differs from existing image clustering methods in two ways: (1) it is region based, meaning that image sub-regions, instead of whole images, are grouped, and the semantic similarities among image regions are collected from the user query and feedback history; (2) the clustering scheme is dynamic in the sense that it can evolve to include new semantic categories. Ideally, one cluster approximates one semantic concept or a small set of closely related semantic concepts, which reduces the "semantic gap" in retrieval.

11.
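To improve image annotation performance, an automatic image annotation method based on visual semantic topics and feedback logs is proposed. First, the foreground and background regions of an image are extracted and processed separately. Second, a semantic relationship model among annotation words is built on WordNet, and probabilistic latent semantic analysis (PLSA) is combined with a Gaussian mixture model (GMM) to model low-level image features, visual semantic topics, and annotation...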

12.
We present a relevance feedback approach based on multi‐class support vector machine (SVM) learning and cluster‐merging which can significantly improve the retrieval performance in region‐based image retrieval. Semantically relevant images may exhibit various visual characteristics and may be scattered across several classes in the feature space because of the semantic gap between low‐level features and the high‐level semantics in the user's mind. To find the semantic classes through relevance feedback, the proposed method reduces the burden of completely re‐clustering the classes at each iteration and classifies multiple classes. Experimental results show that the proposed method is more effective and efficient than the two‐class SVM and multi‐class relevance feedback methods.

13.
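Ke Xiao, Zou Jiawei, Du Mingzhi, Zhou Mingke. Acta Electronica Sinica (《电子学报》), 2017, 45(12): 2925-2935
To address the long training time and sensitivity to low-frequency words of traditional image annotation models, this paper proposes an automatic image annotation model based on Monte Carlo dataset balancing and a robust incremental extreme learning machine. The model first automatically segments the training images of a public image library, selects the corresponding seed annotation words after segmentation, and matches them automatically using the proposed comprehensive-distance-based image feature matching algorithm to form training sets of different classes. Because the amount of data differs greatly across annotation words in public databases, a Monte Carlo dataset balancing algorithm is proposed so that the data sizes of the annotation words are roughly equal. Then, to overcome the shortcomings of single-feature descriptions, a multi-scale feature fusion algorithm is proposed to extract effective features from the images of different annotation words. Finally, to address the randomness of hidden-layer nodes and the uniform weighting of input vectors in the traditional extreme learning machine, a robust incremental extreme learning machine is proposed, which improves the accuracy of the discriminative model. Experimental results on public datasets show that the model can annotate images automatically in a very short time, is robust to low-frequency words, and outperforms most currently popular automatic image annotation models on several metrics, including average recall, average precision, and the combined measure.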

14.
User interaction is an effective way to handle the semantic gap problem in image annotation. To minimize user effort in the interactions, many active learning methods have been proposed. These methods treat the semantic concepts individually or correlatively. However, they still neglect the key motivation of user feedback: to tackle the semantic gap. The size of the semantic gap of each concept is an important factor that affects the performance of user feedback. Users should devote more effort to the concepts with large semantic gaps and less to those with small ones. In this paper, we propose a semantic-gap-oriented active learning method, which incorporates the semantic gap measure into the information-minimization-based sample selection strategy. The basic learning model used in the active learning framework is an extended multilabel version of the sparse-graph-based semisupervised learning method that incorporates the semantic correlation. Extensive experiments conducted on two benchmark image data sets demonstrate the importance of bringing the semantic gap measure into the active learning process.

15.
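An important step in designing a robust automatic image annotation system is extracting visual features that effectively describe image semantics. Because heterogeneous visual features such as color, texture, and shape differ in their importance for representing particular image semantics, and features of the same type are correlated to some degree, this paper proposes a Graph Regularized Non-negative Group Sparsity (GRNGS) model for image annotation and computes its model parameters with a non-negative matrix factorization method. The model combines graph regularization with an l2,1-norm constraint, so that the group features selected during annotation reflect a degree of visual similarity and semantic correlation. Experimental results on image datasets such as Corel5K and ESP Game show that, compared with some recent image annotation models, the GRNGS model is more robust and produces more accurate annotations.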

16.
Finding an image in a large set of images is an extremely difficult problem. One solution is to label images manually, but this is very expensive, time consuming, and infeasible for many applications. Furthermore, the labeling process depends on the semantic accuracy of the image description. Therefore, many Content based Image Retrieval (CBIR) systems have been developed to extract low-level features for describing the image content. However, this approach decreases human interaction with the system because of the semantic gap between low-level features and high-level concepts. In this study, we use fuzzy logic to improve CBIR by allowing users to express their requirements in words, the natural way of human communication. In our system, the image is represented by a Fuzzy Attributed Relational Graph (FARG) that describes each object in the image, its attributes, and its spatial relations. The texture and color attributes are computed in a way that models the Human Vision System (HVS). We propose a new approach to graph matching that resembles the human thinking process. The proposed system is evaluated by different users with different perspectives and is found to match users' satisfaction to a high degree.

17.
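Hua Wenqiang, Wang Shuang, Guo Yanhe, Xie Wen. Journal of Radars (《雷达学报》), 2019, 8(4): 458-470
To address the problem of having only a small number of labeled samples in polarimetric SAR image classification, this paper proposes a semi-supervised polarimetric SAR image classification method based on a neighborhood minimum spanning tree. Because polarimetric SAR image classification operates on individual pixels, the method combines the idea of self-training with the spatial information of the pixels and proposes a sample selection strategy assisted by neighborhood minimum spanning tree learning. This increases the reliability of the unlabeled samples selected during self-training, enlarges the labeled sample set, and trains a better classifier. The trained classifier is finally tested on polarimetric SAR images. Tests on three sets of real polarimetric SAR images show that the method obtains satisfactory classification results with only a few labeled samples, and that its classification accuracy is clearly higher than that of traditional classification algorithms.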

18.
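To address the "semantic gap" problem in traditional object-region-based image retrieval algorithms, and the inability of global-feature-based retrieval algorithms to handle multi-object retrieval well, an image retrieval model based on multiple object regions is proposed and an efficient retrieval algorithm is implemented. First, an object detection algorithm locates the objects in the image; then a convolutional neural network (CNN) extracts the features of each object; finally, a newly proposed multi-object-region similarity measure computes the similarity to the database images and returns the retrieval results. Experiments show that the proposed algorithm outperforms other existing retrieval algorithms on multi-object image retrieval tasks.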

19.
A Multi-Directional Search technique for image annotation propagation   Cited by: 1 (self-citations: 0, citations by others: 1)
Image annotation has attracted considerable attention because of its importance in image understanding and search. In this paper, we propose a novel Multi-Directional Search framework for semi-automatic annotation propagation. In this system, the user interacts with the system to provide example images and the corresponding annotations during the annotation propagation process. In each iteration, the example images are clustered and the corresponding annotations are propagated separately to each cluster: images in the local neighborhood are annotated. Furthermore, some of those images are returned to the user for further annotation. As the user marks more images, the annotation process proceeds in multiple directions in the feature space. The query movements can be treated as multiple-path navigation, and each path can be further split based on the user's input. In this manner, the system provides accurate annotation assistance to the user: images with the same semantic meaning but different visual characteristics can be handled effectively. In comprehensive experiments on the Corel and U. of Washington image databases, the proposed technique shows accuracy and efficiency in annotating image databases.

20.
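Extracting the main objects in an image plays an important role in extracting image semantics and in automatically annotating image content. This paper proposes a semantics-oriented method for extracting the main objects of an image. For the segmented image, the salient correlated colors of the different regions are computed to obtain the core object; a training sample set is then extracted from it, and an artificial neural network is trained to classify the blocks in the image, yielding the main objects of each region. Experiments show that the method can extract objects in the central region and in the surrounding regions fairly accurately while retaining the main information of the image.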
