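To improve the accuracy of automatic image annotation, an automatic annotation method based on salient image regions is proposed. First, the salient region of the image is extracted; then SIFT features are extracted and clustered with K-means to obtain a visual vocabulary. The bag-of-words representation over this vocabulary is computed by weighting each SIFT feature of a training image differently depending on whether it lies inside the salient region. Finally, a support vector machine is trained as the classification model to classify and annotate images. In experiments on a database of 1,255 Corel images, the annotation accuracy of the proposed method is considerably higher than that obtained by treating the features of the whole image uniformly, indicating that the proposed algorithm outperforms the traditional approach.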
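
The saliency-dependent weighting described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `weighted_bow`, the weights `w_salient` and `w_background`, and the L1 normalization are all assumptions, and the descriptors are assumed to be already quantized to visual-word indices (for example by K-means).

```python
from typing import List, Sequence


def weighted_bow(word_ids: Sequence[int],
                 in_salient: Sequence[bool],
                 vocab_size: int,
                 w_salient: float = 2.0,      # assumed weight, not from the paper
                 w_background: float = 1.0    # assumed weight, not from the paper
                 ) -> List[float]:
    """Build an L1-normalized bag-of-visual-words histogram in which
    descriptors inside the salient region count more than the others."""
    if len(word_ids) != len(in_salient):
        raise ValueError("word_ids and in_salient must have the same length")
    hist = [0.0] * vocab_size
    for word, salient in zip(word_ids, in_salient):
        hist[word] += w_salient if salient else w_background
    total = sum(hist)
    # Leave an all-zero histogram untouched to avoid dividing by zero.
    return [h / total for h in hist] if total > 0 else hist


# Four descriptors quantized to a 3-word vocabulary; two lie in the salient region.
hist = weighted_bow([0, 1, 1, 2], [True, False, True, False], vocab_size=3)
```

The resulting histogram would then serve as the feature vector passed to the classifier.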

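In deep-learning-based image annotation models, the number of output-layer neurons is proportional to the size of the annotation vocabulary, so the model structure changes whenever the vocabulary changes. To address this problem, a new annotation model combining a generative adversarial network (GAN) with Word2vec is proposed. First, Word2vec maps the annotation vocabulary to fixed-dimensional word vectors. Second, a GAN is used to build the neural network model, called GAN-W, whose number of output-layer neurons equals the dimension of the word vectors and no longer depends on the vocabulary size. Finally, the final annotation is determined by ranking the results of multiple outputs of the model. GAN-W was evaluated on the Corel 5K and IAPRTC-12 image annotation datasets. On Corel 5K, its precision, recall, and F1 score are 5, 14, and 9 percentage points higher, respectively, than those of the convolutional neural network regression (CNN-R) method; on IAPRTC-12, its precision, recall, and F1 score are 2, 6, and 3 percentage points higher, respectively, than those of the two-pass K-nearest neighbor (2PKNN) model. The experimental results show that GAN-W removes the dependence of the number of output neurons on the vocabulary size and adapts the number of labels assigned to each image, so that its annotations better match real annotation practice.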
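
One way to turn fixed-dimensional output vectors into labels is sketched below. The abstract does not describe the ranking step in detail, so everything here is an assumption made for illustration: each output vector is mapped to its nearest word by cosine similarity, the words are ranked by how often they are chosen, and a hypothetical `min_votes` cutoff decides which labels are kept. The toy two-dimensional embeddings stand in for real Word2vec vectors.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def nearest_word(vec: Sequence[float],
                 embeddings: Dict[str, Sequence[float]]) -> str:
    """Return the vocabulary word whose embedding is closest to vec."""
    return max(embeddings, key=lambda w: cosine(vec, embeddings[w]))


def annotate(outputs: Sequence[Sequence[float]],
             embeddings: Dict[str, Sequence[float]],
             min_votes: int) -> List[str]:
    """Rank words by how many output vectors map to them and keep those
    chosen at least min_votes times (an assumed selection rule)."""
    votes = Counter(nearest_word(v, embeddings) for v in outputs)
    return [w for w, count in votes.most_common() if count >= min_votes]


# Toy embeddings and five sampled output vectors for one image.
embeddings = {"sky": [1.0, 0.0], "sea": [0.0, 1.0], "tree": [-1.0, 0.0]}
outputs = [[0.9, 0.1], [0.8, 0.3], [0.1, 0.9], [0.2, 1.0], [-0.7, 0.2]]
labels = annotate(outputs, embeddings, min_votes=2)
```

Because the output layer only has to match the embedding dimension, adding a word to `embeddings` changes nothing in the decoding code, and the number of returned labels varies from image to image.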

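Automatic image annotation is a challenging task that is important for image analysis and understanding as well as for image retrieval. In automatic image annotation, a model of the relationship between the semantic concept space and the visual feature space is learned from a set of annotated images and then used to annotate unannotated images. Because the correspondence between low-level features and high-level semantics is intricate, the accuracy of automatic image annotation remains low. Under scene constraints, however, the mapping between annotations and visual features can be simplified, which improves the reliability of automatic annotation. An image annotation method based on scene semantic trees is therefore proposed. First, the annotated training images are automatically clustered into semantic scenes, and a visual scene space is generated for each semantic scene category; then a semantic tree is built for each scene space. For an image to be annotated, its semantic category is determined first, and the final annotation is obtained through the corresponding scene semantic tree. On the Corel5K image set, the method obtains better annotation results than models such as TM (translation model), CMRM (cross-media relevance model), CRM (continuous-space relevance model), and PLSA-GMM (probabilistic latent semantic analysis with Gaussian mixture model).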

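Automatic image annotation method based on Gaussian mixture model   Cited by: 1 (self-citations: 0, citations by others: 1)
Chen Na (陈娜). Journal of Computer Applications (《计算机应用》), 2010, 30(11): 2986-2987
To further improve automatic image annotation, a method based on Gaussian mixture models is proposed. The method builds a separate Gaussian mixture model (GMM) for each keyword to describe the keyword's semantic content accurately, thereby improving annotation precision. Comparisons with other methods on the COREL image dataset, in terms of mean precision and mean recall, verify the effectiveness of the method.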

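To address the semantic gap between low-level visual features and high-level semantics in automatic image annotation, an annotation method based on multi-label discriminative dictionary learning is proposed, building on traditional dictionary learning. First, multiple types of features are extracted from each image and combined as the input to the feature space for dictionary learning. Then, a label-consistency regularization term is designed to incorporate the label information of the original samples into the initial input feature data, and dictionary learning is carried out by combining the label-consistent discriminative dictionary with the label-consistency regularization term. Finally, the label sparse coding vector is solved from the learned dictionary and sparse coding matrix to annotate unknown images. On the Corel 5K dataset, the proposed method achieves a mean precision of 35% and a mean recall of 48%. Compared with the traditional sparse coding method (MSC), these are improvements of 10 and 16 percentage points, respectively; compared with the distance-constrained sparse/group sparse coding methods (DCSC/DCGSC), they are improvements of 3 and 14 percentage points, respectively. The experimental results show that the proposed method predicts the semantic information of unknown images well and performs favorably against several currently popular image annotation methods.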
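
For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute mean precision and mean recall for image annotation: both are computed per word and then averaged over the vocabulary. The abstract does not state its exact protocol, so this convention, the function name `mean_precision_recall`, and the choice to score a never-predicted word with precision 0 are assumptions.

```python
from typing import List, Sequence, Set, Tuple


def mean_precision_recall(truth: List[Set[str]],
                          predicted: List[Set[str]],
                          vocabulary: Sequence[str]) -> Tuple[float, float]:
    """Average per-word precision and recall over the vocabulary.

    truth[i] and predicted[i] are the ground-truth and predicted label
    sets of image i.
    """
    if not vocabulary:
        raise ValueError("vocabulary must not be empty")
    precisions, recalls = [], []
    for word in vocabulary:
        n_truth = sum(word in t for t in truth)        # images that have the word
        n_pred = sum(word in p for p in predicted)     # images given the word
        n_correct = sum(word in t and word in p
                        for t, p in zip(truth, predicted))
        # Assumed convention: undefined ratios are scored as 0.
        precisions.append(n_correct / n_pred if n_pred else 0.0)
        recalls.append(n_correct / n_truth if n_truth else 0.0)
    n = len(vocabulary)
    return sum(precisions) / n, sum(recalls) / n


truth = [{"sky", "sea"}, {"sky"}, {"tree"}]
predicted = [{"sky", "sea"}, {"sky", "tree"}, {"tree"}]
mean_p, mean_r = mean_precision_recall(truth, predicted, ["sky", "sea", "tree"])
```

In this toy case "tree" is predicted for two images but is correct for only one, which lowers the mean precision while the mean recall stays at 1.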

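Automatic image annotation is an important problem in pattern recognition, computer vision, and related fields. Because existing automatic annotation models are generally affected by the semantic gap, an annotation refinement method based on keyword co-occurrence is proposed, which uses the correlations among annotation words in the dataset to improve annotation results. In addition, because that method cannot reflect broader human knowledge and is sensitive to the size of the database, an annotation refinement method based on semantic similarity is proposed. It uses WordNet, a structured electronic dictionary with a large vocabulary that encodes human knowledge, to compute relationships between words and improve the annotation results. Experimental results show that both refinement methods improve on previous models across all evaluation metrics.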

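Semantic annotation method based on image segmentation   Cited by: 1 (self-citations: 0, citations by others: 1)
Peng Yanfei (彭晏飞), Sun Lu (孙鲁). Journal of Computer Applications (《计算机应用》), 2012, 32(6): 1548-1551
To effectively address the "semantic gap" in image retrieval, a new semantic annotation method is proposed. Based on image segmentation, the method builds an image dictionary in the training stage and, by analyzing and describing the color, texture, and wavelet contour of image units, forms a two-stage annotation model that combines wavelet contour matching with probabilistic statistics. The model applies the appropriate annotation method at each stage according to the image category. In experiments, image retrieval with this model shows clear gains in both recall and precision, with precision improving by up to 23.6%, which indicates that the method is closer to human understanding of image content and offers good annotation and retrieval performance.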

Jin Cong, Sun Qing-Mei, Jin Shu-Wei. Multimedia Tools and Applications, 2019, 78(9): 11815-11834
Multimedia Tools and Applications - Automated image annotation (AIA) is an important issue in computer vision and pattern recognition, and plays an extremely important role in retrieving...

Supervised learning of semantic classes for image annotation and retrieval   Cited by: 9 (self-citations: 0, citations by others: 9)
A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, a minimum probability of error annotation and retrieval are feasible with algorithms that are 1) conceptually simple, 2) computationally efficient, and 3) do not require prior semantic segmentation of training images. In particular, images are represented as bags of localized feature vectors, a mixture density estimated for each image, and the mixtures associated with all images annotated with a common semantic label pooled into a density estimate for the corresponding semantic class. This pooling is justified by a multiple instance learning argument and performed efficiently with a hierarchical extension of expectation-maximization. The benefits of the supervised formulation over the more complex, and currently popular, joint modeling of semantic label and visual feature distributions are illustrated through theoretical arguments and extensive experiments. The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost. Finally, the proposed method is shown to be fairly robust to parameter tuning.

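Image annotation has attracted wide attention in recent years. A graph-learning-based automatic image annotation method is proposed that treats annotation as semi-supervised learning within a multiple-instance learning framework. By defining an effective metric for images in bag space, the method exploits unlabeled samples to uncover the inherent regularities of image features and effectively combines semi-supervised learning with multiple-instance learning, yielding more accurate annotations. Experimental results show that the proposed method is feasible and that its annotation results are clearly better than those of traditional annotation methods.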

Automatic image annotation using visual content and folksonomies   Cited by: 2 (self-citations: 4, citations by others: 2)
Automatic image annotation is an important and challenging task, and becomes increasingly necessary when managing large image collections. This paper describes techniques for automatic image annotation that take advantage of collaboratively annotated image databases, so-called visual folksonomies. Our approach applies two techniques based on image analysis: first, classification, which annotates images with a controlled vocabulary, and second, tag propagation along visually similar images. The latter propagates user-generated, folksonomic annotations and is therefore capable of dealing with an unlimited vocabulary. Experiments with a pool of Flickr images demonstrate the high accuracy and efficiency of the proposed methods in the task of automatic image annotation. Both techniques were applied in the prototypical tag recommender “tagr”.
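
Tag propagation along visually similar images can be illustrated with a small nearest-neighbour sketch. The abstract does not specify the similarity measure or the weighting scheme, so the Euclidean distance, the `1 / (1 + distance)` vote weight, the value of `k`, and the function name `propagate_tags` are assumptions chosen for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Set, Tuple


def propagate_tags(query: Sequence[float],
                   labelled: List[Tuple[Sequence[float], Set[str]]],
                   k: int = 3) -> List[Tuple[str, float]]:
    """Score tags for a query image from its k visually nearest neighbours.

    labelled holds (feature_vector, tags) pairs for already tagged images.
    Closer neighbours contribute larger votes.
    """
    neighbours = sorted(labelled,
                        key=lambda item: math.dist(query, item[0]))[:k]
    scores: Dict[str, float] = defaultdict(float)
    for features, tags in neighbours:
        weight = 1.0 / (1.0 + math.dist(query, features))  # assumed weighting
        for tag in tags:
            scores[tag] += weight
    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


labelled = [
    ([0.0, 0.0], {"beach", "sea"}),
    ([0.0, 1.0], {"sea"}),
    ([5.0, 5.0], {"city"}),
    ([0.5, 0.0], {"beach"}),
]
ranked = propagate_tags([0.0, 0.0], labelled, k=3)
```

Because the candidate tags come from whatever the neighbours carry, the vocabulary is not fixed in advance, which matches the open-vocabulary property the abstract attributes to propagation.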

Automatically assigning relevant text keywords to images is an important problem. Many algorithms have been proposed in the past decade and achieved good performance. Efforts have focused upon model representations of keywords, whereas properties of features have not been well investigated. In most cases, a group of features is preselected, yet important feature properties are not well used to select features. In this paper, we introduce a regularization-based feature selection algorithm to leverage both the sparsity and clustering properties of features, and incorporate it into the image annotation task. Using this group-sparsity-based method, the whole group of features [e.g., red green blue (RGB) or hue, saturation, and value (HSV)] is either selected or removed. Thus, we do not need to extract this group of features when new data comes. A novel approach is also proposed to iteratively obtain similar and dissimilar pairs from both the keyword similarity and the relevance feedback. Thus, keyword similarity is modeled in the annotation framework. We also show that our framework can be employed in image retrieval tasks by selecting different image pairs. Extensive experiments are designed to compare the performance between features, feature combinations, and regularization-based feature selection methods applied on the image annotation task, which gives insight into the properties of features in the image annotation task. The experimental results demonstrate that the group-sparsity-based method is more accurate and stable than others.

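The goal of image annotation is to describe each image with corresponding text so that massive image collections can be managed and retrieved effectively. Although image annotation has been studied for a number of years, it remains a very challenging problem in machine vision and machine learning, and a wide variety of algorithms have been applied to it. This paper surveys common algorithms and models for keyword-based image annotation, including traditional classification-based methods, relevance models, topic models, random-field-based processing of contextual information, and the use of massive Internet data to assist annotation. Several challenging open problems in current image annotation research are also discussed.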

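To better reflect the relationships between annotations in image semantic annotation, relationships between annotations are established by analyzing the annotations of already annotated images. On this basis, the concept of thesaurus-based query is introduced into image semantic annotation, and an image semantic annotation method based on thesaurus query is proposed, which places the annotation problem in a unified framework combining thesaurus queries with the semantic relationships of images. Validation on the Corel image database shows that the proposed method is effective and that the annotation rate is clearly improved.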

Automatic thresholding has been widely used in machine vision for automatic image segmentation. Otsu’s method selects an optimum threshold by maximizing the between-class variance in a grayscale image. However, the method becomes time-consuming when extended to multi-level threshold problems, because excessive iterations are required in order to compute the cumulative probability and the mean of each class. In this paper, we focus on the issue of automatic selection for multi-level thresholding, and we greatly improve the efficiency of Otsu’s method for image segmentation based on evolutionary approaches. We have investigated and evaluated the performance of the Otsu and valley-emphasis thresholding methods. Based on our evaluation results, we have developed many different algorithms for automatic threshold selection based on the evolutionary method using the Modified Adaptive Genetic Algorithm and the Hill Climbing Algorithm. The experimental results show that the evolutionary approach achieves a satisfactory segmentation effect and that the processing time can be greatly reduced when the number of thresholds increases.
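
The single-threshold form of Otsu's method, which the abstract takes as its starting point, can be sketched as an exhaustive search over a gray-level histogram. This is only the basic bi-level case; the multi-level extension and the evolutionary search used in the paper are not shown. The function name `otsu_threshold` and the convention that levels up to and including the threshold form the first class are assumptions.

```python
from typing import Sequence


def otsu_threshold(hist: Sequence[int]) -> int:
    """Return the gray level t that maximizes the between-class variance,
    where class 0 holds levels <= t and class 1 holds levels > t."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels in class 0
    sum0 = 0    # sum of gray levels in class 0
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # w0 * w1 * (mu0 - mu1)^2 is proportional to the between-class variance.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# A bimodal 8-level histogram: dark pixels at levels 0-1, bright at 6-7.
threshold = otsu_threshold([10, 10, 0, 0, 0, 0, 10, 10])
```

The running sums make one threshold search linear in the number of gray levels; with several thresholds the number of candidate combinations grows quickly, which is the cost the abstract says the evolutionary search is meant to reduce.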

The development of technology generates huge amounts of non-textual information, such as images. An efficient image annotation and retrieval system is highly desired. Clustering algorithms make it possible to represent visual features of images with finite symbols. Based on this, many statistical models, which analyze correspondence between visual features and words and discover hidden semantics, have been published. These models improve the annotation and retrieval of large image databases. However, image data usually have a large number of dimensions. Traditional clustering algorithms assign equal weights to these dimensions, and become confounded in the process of dealing with these dimensions. In this paper, we propose a weighted feature selection algorithm as a solution to this problem. For a given cluster, we determine relevant features based on histogram analysis and assign greater weight to relevant features as compared to less relevant features. We have implemented several models to link visual tokens with keywords based on the clustering results of the K-means algorithm with weighted feature selection and without feature selection, and evaluated performance using precision, recall and correspondence accuracy on a benchmark dataset. The results show that weighted feature selection is better than traditional ones for automatic image annotation and retrieval.

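An automatic mosaicking method for aerial image sequences based on SIFT features   Cited by: 4 (self-citations: 0, citations by others: 4)
Gao Chao (高超), Zhang Xin (张鑫), Wang Yunli (王云丽), Wang Hui (王晖). Journal of Computer Applications (《计算机应用》), 2007, 27(11): 2789-2792
To address the stitching of aerial image sequences, an automatic stitching method based on scale-invariant feature transform (SIFT) features is proposed, consisting of two main steps: image registration and mosaicking. Because the frames of an aerial sequence differ considerably from one another, commonly used feature-based registration methods are poorly suited to them; SIFT features are therefore used to achieve accurate and robust registration of aerial images. In addition, a color blending method based on visual characteristics is proposed to obtain a smooth mosaic. Stitching experiments on real aerial sequences verify the effectiveness of the proposed method.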

Image annotation is the process of assigning metadata, in the form of captions or keywords, to digital images; it is regarded as a part of image management and as one of the most crucial processes of image retrieval. Many automatic methods have been proposed, but each still has its own problems. Fractals are fragmented geometries that can be divided into separate parts, each of which is similar to a contracted copy of the overall shape. Fractal features provide geometric information about an image that is independent of the shape and size of the objects in it; fractal features are therefore more robust than color and texture features. This study proposes a fractal-driven image annotation (FIA) schema that extracts fractal features through fractal image coding and integrates them with color and texture as new visual features to conduct image-based annotation. Experimental results indicate that the effect of thresholds on annotation accuracy is insignificant. This finding supports the application of FIA in complex practical environments, reduces the time needed to identify the optimal thresholds, and improves the practicality of using FIA in real environments.

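By combining the Multimedia Content Description Interface (MPEG-7) with a mixture model (MM), automatic image annotation based on decision fusion is implemented. During annotation, color descriptors and texture descriptors are each used to build a mixture model for the images under each topic, mapping low-level visual features to the high-level semantic space; the annotation results obtained under the color and texture mixture models are then combined by local decision fusion to produce the final annotation. Experiments on the Corel image dataset show that the proposed local decision fusion makes fuller use of the color and texture information of images and improves annotation performance.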

