Similar Documents
Found 16 similar documents (search time: 250 ms)
1.
Automatic image annotation is a fundamental yet challenging task in image retrieval. Since their introduction, deep learning algorithms have achieved great success in image and text recognition, and offer an effective way to bridge the "semantic gap". The annotation problem can be decomposed into two stages: basic annotation based on the correlation between images and labels, and annotation refinement based on co-occurrence relations among annotation keywords. This paper treats basic annotation as a multi-label learning problem, using the prior label knowledge of images as the supervision signal for a deep neural network. Starting from the basic annotation keywords, the annotation results are then refined using the dependency relations and prior distribution of the original label vocabulary. Finally, the improved deep learning model is applied to the Corel and ESP image datasets, validating the effectiveness of the proposed framework and solution.
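The two-stage idea in this abstract can be illustrated with a minimal sketch (not the authors' code): a base multi-label model produces per-label scores, which are then refined with label co-occurrence statistics from the training vocabulary. The labels, scores, co-occurrence counts, and the mixing weight `alpha` below are all made-up assumptions.

```python
import numpy as np

labels = ["sky", "cloud", "sea", "car"]

# Hypothetical per-label scores from a base multi-label network for one image.
base = np.array([0.9, 0.4, 0.3, 0.1])

# Hypothetical co-occurrence counts between labels in the training captions.
cooc = np.array([
    [0, 8, 3, 1],
    [8, 0, 2, 0],
    [3, 2, 0, 0],
    [1, 0, 0, 0],
], dtype=float)

# Row-normalize so each row approximates P(other label | label).
P = cooc / cooc.sum(axis=1, keepdims=True)

alpha = 0.7  # weight on the base network scores (assumed hyperparameter)
refined = alpha * base + (1 - alpha) * P.T @ base

top = [labels[i] for i in np.argsort(-refined)[:2]]
print(top)  # co-occurrence boosts "cloud", which often appears with "sky"
```

The refinement pulls up labels that frequently co-occur with strongly scored ones, which is the general shape of the "annotation improvement" stage described above.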

2.
Automatic image annotation is an important problem in pattern recognition and computer vision. Since existing annotation models generally suffer from the semantic gap, an annotation refinement method based on keyword co-occurrence is proposed, which exploits the correlations among annotation words in the dataset to improve annotation results. Because that method cannot reflect broader human knowledge and is sensitive to dataset size, a second refinement method based on semantic similarity is further proposed: the structured electronic dictionary WordNet, which has a large vocabulary and encodes human knowledge, is introduced to compute relations between words and refine the annotation results. Experimental results show that both refinement methods outperform previous models on all evaluation metrics.
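The WordNet-based refinement above rests on scoring how semantically related two annotation words are. The toy below reproduces that idea with a tiny hand-made hypernym tree and a Wu-Palmer-style measure; the paper uses the real WordNet dictionary, not this stand-in taxonomy.

```python
# Hand-coded hypernym links ("cat is-a mammal"); purely illustrative.
parent = {
    "cat": "mammal", "dog": "mammal", "mammal": "animal",
    "bird": "animal", "animal": "entity",
    "car": "artifact", "artifact": "entity",
}

def path_to_root(word):
    """Hypernym chain from a word up to the root 'entity'."""
    path = [word]
    while word in parent:
        word = parent[word]
        path.append(word)
    return path

def depth(word):
    return len(path_to_root(word))

def lcs(a, b):
    """Lowest common subsumer: first ancestor of a that also subsumes b."""
    ancestors_b = set(path_to_root(b))
    for w in path_to_root(a):
        if w in ancestors_b:
            return w

def wup(a, b):
    """Wu-Palmer-style similarity: 2*depth(lcs) / (depth(a) + depth(b))."""
    return 2 * depth(lcs(a, b)) / (depth(a) + depth(b))

print(wup("cat", "dog"), wup("cat", "car"))  # related pair scores higher
```

An annotation refinement step can then down-weight candidate words that are semantically distant from the rest of an image's annotation set.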

3.
A New Model for Automatic Image Semantic Annotation (cited by 1)
Automatically annotating images according to the correspondence between low-level image features and high-level semantics is a hot topic in image retrieval research. This paper briefly introduces an image retrieval framework based on an image semantic link network and proposes an automatic annotation model built on that framework. The model accumulates user feedback to learn image semantics and then annotates images automatically; the learned semantics and annotations are updated in real time during interaction with users. A word-sense relatedness analysis method is also proposed to remove redundant annotation words, preventing the propagation of annotation errors. Comparative experiments on the Corel image set verify the effectiveness of the method.

4.
Automatic image annotation has become a hot research topic in recent years because of its importance for image understanding and web image retrieval. Building on the CMRM annotation model, this paper proposes a CMRM-based annotation method that exploits inter-word correlation. The method extracts the correlations among annotation words and, using a graph learning algorithm, improves the annotation results by superimposing the word-correlation matrix onto the initial annotation matrix. Experiments on natural scene images from the Corel5k annotated image set show that the method annotates the test images well and improves both recall and precision over the original CMRM model.

5.
When constructing similarity graphs for automatic image annotation, traditional methods rely on visual similarity between images and ignore the structural information within subsets of the dataset. To address this, a semi-supervised annotation method based on the Voronoi k-order neighbor graph is proposed. Because the Voronoi k-order neighbor graph expresses the influence region of spatial objects well and supports convenient description of, and reasoning about, spatial proximity, the method fuses the distribution of image data points in feature space into the pairwise similarity measure, exploits unlabeled samples to mine the intrinsic regularities of image features, and then combines semi-supervised learning with multi-label learning to annotate images automatically. Experimental results show that the proposed method is feasible and that its annotations clearly improve on those of traditional methods.
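The graph-based semi-supervised step above can be illustrated with generic label propagation. The paper builds its graph from the Voronoi k-order neighbor structure; in this sketch a plain Gaussian-kernel similarity graph stands in, and the data, seed labels, and `alpha` are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated 2-D clusters of "image features"; one labeled point each.
X = np.vstack([rng.normal(0, 0.3, (10, 2)),
               rng.normal(3, 0.3, (10, 2))])
Y = np.zeros((20, 2))
Y[0, 0] = 1    # labeled seed for class 0
Y[10, 1] = 1   # labeled seed for class 1

# Gaussian similarity matrix with symmetric normalization.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2)
np.fill_diagonal(W, 0)
Dinv = np.diag(1 / np.sqrt(W.sum(1)))
S = Dinv @ W @ Dinv

# Iterate F <- alpha*S@F + (1-alpha)*Y (Zhou et al.-style propagation).
alpha, F = 0.9, Y.copy()
for _ in range(50):
    F = alpha * S @ F + (1 - alpha) * Y

pred = F.argmax(1)
print(pred)  # unlabeled points inherit the label of their cluster's seed
```

Exploiting the unlabeled points' distribution is exactly what lets two seeds label all twenty points, which is the appeal of the semi-supervised formulation described above.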

6.
周铭柯, 柯逍, 杜明智. Journal of Software (《软件学报》), 2017, 28(7): 1862-1880
Automatic image annotation, with its many labels and diverse features, is a challenging research problem and a key step in next-generation image retrieval and image understanding. Traditional annotation algorithms based on shallow machine learning are inefficient and struggle with complex classification tasks; this paper therefore proposes an annotation algorithm based on stacked autoencoders (SAE), improving both annotation efficiency and quality. Focusing on the imbalance of annotation data, two ideas are proposed. For the annotation model, a balanced stacked autoencoder (B-SAE) that strengthens training on medium- and low-frequency labels is proposed, noticeably improving their annotation quality; on this basis, a robust balanced stacked autoencoder algorithm (RB-SAE) that reinforces training of B-SAE sub-models in groups is proposed, raising annotation stability and ensuring the model itself handles imbalanced data well. For the annotation process, starting from the unknown image, a locally balanced dataset is first constructed for it, and its high- or low-frequency attribute determines the annotation path: a local semantic propagation algorithm (SP) annotates medium- and low-frequency images, while RB-SAE annotates high-frequency images, forming an attribute-discrimination annotation framework (ADA) that keeps the annotation process robust to imbalanced data and improves overall annotation quality. Experiments on three public datasets show that the proposed method improves considerably over previous methods on many metrics.
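A minimal single autoencoder layer, the building block that SAE-style models such as those above stack, looks as follows. The data, layer sizes, and learning rate are toy assumptions; the point is only to show the encode-decode-reconstruct training loop, not the B-SAE/RB-SAE balancing scheme itself.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 8))          # 64 hypothetical feature vectors, dim 8

n_in, n_hid, lr = 8, 4, 0.05
W1 = rng.normal(0, 0.1, (n_in, n_hid))   # encoder weights
W2 = rng.normal(0, 0.1, (n_hid, n_in))   # decoder weights

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(X):
    H = sigmoid(X @ W1)               # encode to the hidden representation
    return H, H @ W2                  # decode with a linear output layer

_, X0 = forward(X)
loss0 = ((X0 - X) ** 2).mean()        # reconstruction error before training

for _ in range(500):
    H, Xr = forward(X)
    G = 2 * (Xr - X) / X.size         # dMSE/dXr
    GH = G @ W2.T * H * (1 - H)       # backprop through the sigmoid
    W2 -= lr * H.T @ G
    W1 -= lr * X.T @ GH

_, Xr = forward(X)
loss1 = ((Xr - X) ** 2).mean()        # reconstruction error after training
print(loss0 > loss1)
```

Stacking means training one such layer, then training the next layer on the hidden codes `H`, and so on; the balanced variants in the paper additionally re-weight training toward low-frequency labels.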

7.
The 莲花山 (Lotus Hill) dataset adopts and-or graphs as its visual knowledge model, providing a multi-level representation of visual patterns in the real world and unifying multiple annotation tasks within an image grammar framework. Its companion database manages the visual model and the annotation data in two separate layers, offering flexible and convenient facilities for data import, management, browsing, and export that other datasets lack. Finally, a content-based retrieval experiment over the dataset's annotations is presented; the algorithm has been integrated into the annotation tool as an automatic aid for speeding up manual annotation.

8.
李东艳, 李绍滋, 柯逍. Journal of Computer Applications (《计算机应用》), 2010, 30(10): 2610-2613
To address the data imbalance in the datasets used for image annotation, a new automatic balancing model based on an external database is proposed. The model first locates low-frequency words from the word-frequency distribution of the original database; then, following the automatic balancing scheme, it adds images from the external database for each low-frequency word. Features are extracted from the images, and the 47,065 visual words of the Corel 5k dataset are clustered together with the 996 visual words extracted from the images added from the external database. Finally, the annotation refinement model based on the external database is used to annotate the images. This approach overcomes the imbalance in annotation datasets, clearly improving the number of words correctly annotated at least once, as well as precision and recall.
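The first step described above, finding the low-frequency words to augment, is easy to sketch with the standard library. The annotation lists and the frequency threshold below are hypothetical.

```python
from collections import Counter

# Hypothetical training-set annotations (one keyword list per image).
annotations = [
    ["sky", "sea", "boat"],
    ["sky", "cloud"],
    ["sky", "sea"],
    ["tiger", "grass"],
]

freq = Counter(w for tags in annotations for w in tags)
threshold = 2  # assumed cut-off below which a word counts as low-frequency
low_freq = sorted(w for w, c in freq.items() if c < threshold)
print(low_freq)  # these are the words the model would augment
```

Each word in `low_freq` would then receive additional example images drawn from the external database before the annotation model is retrained.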

9.
The essence of automatic image annotation is to analyze the visual features of an image and extract high-level semantic keywords that represent its meaning, turning the image retrieval problem into the already mature problem of text retrieval and, to some extent, bridging the semantic gap in content-based image retrieval. A t-mixture model is used to compute the joint probability distribution of image region classes and keywords on an annotated training image set; on this basis, for previously unseen test images, the learned model annotates them automatically according to the Bayesian minimum-error-probability criterion. Experimental results show that the method effectively improves annotation results.
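The Bayesian minimum-error rule used above amounts to picking the keyword w that maximizes P(w) · p(x | w). In this sketch, isotropic Gaussians stand in for the paper's t-mixture components, and all keywords, priors, and means are made-up values.

```python
import numpy as np

keywords = ["sky", "grass"]
priors = np.array([0.5, 0.5])          # assumed keyword priors P(w)
means = np.array([[0.2, 0.8],          # assumed mean feature vector for "sky"
                  [0.7, 0.3]])         # assumed mean feature vector for "grass"
var = 0.05

def likelihood(x, mu):
    # Isotropic Gaussian density; the normalization constant is omitted
    # because it is shared across classes and does not change the argmax.
    return np.exp(-((x - mu) ** 2).sum() / (2 * var))

def annotate(x):
    """Bayes minimum-error decision: argmax_w P(w) * p(x | w)."""
    post = priors * np.array([likelihood(x, m) for m in means])
    return keywords[int(post.argmax())]

print(annotate(np.array([0.25, 0.75])))
print(annotate(np.array([0.65, 0.35])))
```

Replacing each Gaussian with a mixture of t-distributions, as the paper does, changes only the `likelihood` function; the decision rule stays the same.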

10.
Automatic image annotation is an important and challenging problem in pattern recognition and computer vision. Existing models make poor use of the data and are easily affected by the imbalance between positive and negative samples; this paper therefore proposes a new two-layer stacked annotation model combining a discriminative and a generative model. In the first layer, a discriminative model assigns topic annotations to the unlabeled image, yielding a corresponding set of relevant images. In the second layer, a proposed keyword-oriented method establishes the links between images and keywords, and a proposed iterative algorithm expands both the semantic keywords and the relevant image set. Finally, a generative model and the expanded relevant image set produce the detailed annotation of the unlabeled image. The model combines the strengths of discriminative and generative models, obtaining better annotation results from relatively few relevant training images. Experiments on the Corel 5K image library verify the model's effectiveness.

11.
Image annotation has been an active research topic in recent years due to its potential impact on both image understanding and web image search. In this paper, we propose a graph learning framework for image annotation. First, image-based graph learning is performed to obtain the candidate annotations for each image. In order to capture the complex distribution of image data, we propose a Nearest Spanning Chain (NSC) method to construct the image-based graph, whose edge weights are derived from chain-wise statistical information instead of the traditional pairwise similarities. Second, word-based graph learning is developed to refine the relationships between images and words and obtain the final annotations for each image. To enrich the representation of the word-based graph, we design two types of word correlations based on web search results, in addition to the word co-occurrence in the training set. The effectiveness of the proposed solution is demonstrated by experiments on the Corel dataset and a web image dataset.

12.
The vast number of images available on the Web calls for an effective and efficient search service to help users find relevant images. The prevalent approach is to provide a keyword interface for users to submit queries. However, the number of images without any tags or annotations is beyond the reach of manual effort. To overcome this, automatic image annotation techniques have emerged; these generally select a suitable set of tags for a given image without user intervention. There are three main challenges in Web-scale image annotation: scalability, noise-resistance and diversity. Scalability has a twofold meaning: first, an automatic image annotation system should be scalable with respect to billions of images on the Web; second, it should be able to automatically identify several relevant tags among a huge tag set for a given image within seconds or even faster. Noise-resistance means that the system should be robust against typos and ambiguous terms used in tags. Diversity reflects that image content may include both scenes and objects, which are further described by multiple different image features constituting different facets in annotation. In this paper, we propose a unified framework to tackle the above three challenges for automatic Web image annotation. It mainly involves two components: tag candidate retrieval and multi-facet annotation. In the former, content-based indexing and a concept-based codebook are leveraged to solve the scalability and noise-resistance issues. In the latter, a joint feature map is designed to describe the different facets of tags in annotations and the relations between these facets. A tag graph is adopted to represent the tags in the entire annotation, and a structured learning technique is employed to construct a learning model on top of the tag graph based on the generated joint feature map. Millions of images from Flickr are used in our evaluation. Experimental results show that we achieve 33% performance improvements over single-facet approaches in terms of three metrics: precision, recall and F1 score.

13.
To bridge the gap between low-level features and high-level semantics in automatic image annotation, an annotation refinement algorithm based on random dot product graphs is proposed. The algorithm first builds a semantic relation graph over an image's candidate annotation words using low-level image features, then randomly reconstructs the graph with a random dot product graph to recover semantic relations missing from the training image set, and finally applies a random walk with restart to refine the annotations. By combining low-level features with high-level semantics, the algorithm effectively reduces the impact of shrinking dataset size on annotation. Experiments on three common image libraries show that the algorithm refines annotations effectively, with macro-F and micro-average F values reaching up to 0.784 and 0.743 respectively.
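The random-walk-with-restart refinement step above can be sketched as follows: candidate words are graph nodes, edges carry semantic relatedness, and the walk repeatedly restarts at the initial annotation scores. The words, edge weights, and restart probability `c` are illustrative assumptions, not values from the paper.

```python
import numpy as np

words = ["beach", "sand", "sea", "engine"]

# Hypothetical symmetric relatedness between candidate annotation words.
W = np.array([[0.0, 0.8, 0.7, 0.0],
              [0.8, 0.0, 0.5, 0.0],
              [0.7, 0.5, 0.0, 0.1],
              [0.0, 0.0, 0.1, 0.0]])
P = W / W.sum(axis=0, keepdims=True)  # column-stochastic transition matrix

# Initial (restart) scores from a base annotator, normalized to sum to 1.
e = np.array([0.5, 0.1, 0.1, 0.3])

c = 0.5  # restart probability (assumed)
r = e.copy()
for _ in range(100):
    r = (1 - c) * P @ r + c * e       # random walk with restart

ranked = [words[i] for i in np.argsort(-r)]
print(ranked, r.round(3))
```

The walk redistributes score along semantic edges, so the weakly connected outlier "engine" falls behind "sand" and "sea" even though its initial score was higher, which is the refinement effect described above.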

14.
郭海凤. Computer Engineering (《计算机工程》), 2012, 38(12): 211-213
In automatic annotation systems, the conversion of low-level features into high-level annotations has low accuracy. This paper therefore combines the low-level visual features of automatic annotation systems with the high-level semantics of social tagging systems, proposing a new image semantic annotation algorithm, FAC. Candidate annotations are obtained from an automatic annotation system and from flickr users; an annotation recommendation strategy selects the recommended annotations, and the final annotation set is distilled according to the semantic relations in the WordNet semantic dictionary. Experimental results show that FAC achieves higher accuracy than traditional automatic annotation algorithms.

15.
Scene image understanding has drawn much attention for its intriguing applications in recent years. In this paper, we propose a unified probabilistic graphical model called Topic-based Coherent Region Annotation (TCRA) for weakly-supervised scene region annotation. The multiscale over-segmented regions within a scene image are treated as the "words" of our topic model, which imposes neighborhood contextual constraints at the topic level through spatial MRF modeling and incorporates an annotation reasoning mechanism for learning and inferring region labels automatically. Mean-field variational inference is provided for model learning. The proposed TCRA has two main advantages for understanding natural scene images. First, the spatial information of multiscale over-segmented regions is explicitly modeled to obtain coherent region annotations. Second, only image-level labels are needed to automatically infer the label of every region within the scene, which is particularly helpful in reducing the human burden of manually labeling pixel-level semantics in scene understanding research. Thus, given a scene image with no textual prior, its regions can be automatically labeled using the learned TCRA model. Experimental results on three benchmarks, the MSRCORID image dataset, the UIUC Events image dataset and the SIFT FLOW dataset, show that the proposed model outperforms recent state-of-the-art methods.

16.
In most learning-based image annotation approaches, images are represented using multiple-instance (local) or single-instance (global) features. Their performance, however, is mixed: for certain concepts the single-instance representations of images are more suitable, while for others the multiple-instance representations are better. This paper therefore explores a unified learning framework that combines the multiple-instance and single-instance representations for image annotation. More specifically, we propose an integrated graph-based semi-supervised learning framework that utilizes these two types of representations simultaneously. We further explore three strategies for converting a multiple-instance representation into a single-instance one. Experiments conducted on the COREL image dataset demonstrate the effectiveness and efficiency of the proposed integrated framework and the conversion strategies.

