Similar Articles
20 similar articles were retrieved.
1.
柯逍  邹嘉伟  杜明智  周铭柯 《电子学报》2017,45(12):2925-2935
To address the long training time and the sensitivity to low-frequency words of traditional image annotation models, this paper proposes an automatic image annotation model based on Monte Carlo dataset balancing and a robust incremental extreme learning machine. The model first performs automatic segmentation on the training images of public image datasets, selects the corresponding seed annotation words for the segmented regions, and forms training sets of different categories through a proposed image feature matching algorithm based on a composite distance. Because the amounts of data for different annotation words in public datasets differ greatly, a Monte Carlo dataset balancing algorithm is proposed to make the data scale of each annotation word roughly equal. To overcome the limitations of a single feature descriptor, a multi-scale feature fusion algorithm is then proposed to extract effective features from the images of different annotation words. Finally, to address the randomness of hidden-layer nodes and the uniform weighting of input vectors in the traditional extreme learning machine, a robust incremental extreme learning machine is proposed to improve the accuracy of the discriminative model. Experimental results on public datasets show that the model can annotate images automatically in a very short time, is robust to low-frequency words, and outperforms most popular automatic image annotation models on metrics such as average recall, average precision, and the combined measure.
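The dataset-balancing step described above can be illustrated with a generic resampling sketch in Python; the sampling-with-replacement strategy and the target size below are assumptions, not the paper's exact Monte Carlo procedure, and the extreme learning machine is not shown.

```python
import random
from collections import defaultdict

def monte_carlo_balance(samples, labels, target_size=None, seed=0):
    """Randomly over-/under-sample each annotation word's pool to a common size.

    samples: list of feature vectors (any objects); labels: parallel list of
    annotation words.  Generic resampling sketch, not the paper's algorithm.
    """
    rng = random.Random(seed)
    pools = defaultdict(list)
    for x, y in zip(samples, labels):
        pools[y].append(x)
    if target_size is None:
        # aim at the mean pool size so rare words are up-sampled
        target_size = int(sum(len(p) for p in pools.values()) / len(pools))
    balanced_x, balanced_y = [], []
    for word, pool in pools.items():
        # sampling with replacement handles both rare and frequent words
        drawn = [rng.choice(pool) for _ in range(target_size)]
        balanced_x.extend(drawn)
        balanced_y.extend([word] * target_size)
    return balanced_x, balanced_y
```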

2.
Image retrieval has lagged far behind text retrieval despite more than two decades of intensive research effort. Most of the research on image retrieval in the last two decades has been on content-based image retrieval, i.e., image retrieval based on low-level features. Recent research in this area focuses on semantic image retrieval using automatic image annotation. Most semantic image retrieval techniques in the literature, however, treat an image as a bag of features/words while ignoring the structural or spatial information in the image. In this paper, we propose a structural image retrieval method based on automatic image annotation and a region-based inverted file. In the proposed system, regions in an image are treated the same way as keywords in a structural text document: semantic concepts are learnt from image data to label image regions as keywords, and a weight is assigned to each keyword according to its spatial position and relationships. As a result, images are indexed and retrieved in the same way as structural documents. Specifically, images are broken down into regions which are represented using colour, texture and shape features. Region features are then quantized to create visual dictionaries, which are analogous to monolingual dictionaries such as English or Chinese dictionaries. In the next step, a semantic dictionary, similar to a bilingual dictionary such as an English–Chinese dictionary, is learnt to map image regions to semantic concepts. Finally, images are indexed and retrieved using a novel region-based inverted file data structure. Results show the proposed method has a significant advantage over the widely used Bayesian annotation models.
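The region-based inverted file above is essentially a keyword-to-posting-list index. A minimal Python sketch follows, with a plain additive weight standing in for the paper's spatial weighting scheme.

```python
from collections import defaultdict

class RegionInvertedFile:
    """Minimal sketch of a region-based inverted file: each semantic keyword
    maps to a posting list of (image_id, weight).  The additive weighting is a
    stand-in; the paper's spatial-position weighting is assumed, not shown."""

    def __init__(self):
        self.postings = defaultdict(dict)   # keyword -> {image_id: weight}

    def index_image(self, image_id, region_keywords, region_weights):
        for kw, w in zip(region_keywords, region_weights):
            self.postings[kw][image_id] = self.postings[kw].get(image_id, 0.0) + w

    def query(self, keywords, top_k=10):
        scores = defaultdict(float)
        for kw in keywords:
            for image_id, w in self.postings.get(kw, {}).items():
                scores[image_id] += w
        return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]

# usage: index regions labelled "sky" and "sea", then query by keywords
index = RegionInvertedFile()
index.index_image("img001", ["sky", "sea"], [0.7, 0.3])
index.index_image("img002", ["sky", "grass"], [0.5, 0.5])
print(index.query(["sky", "sea"]))
```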

3.
Automatic image annotation is a promising way to achieve more effective image retrieval and image analysis by using keywords associated with the image content. Due to the semantic gap between low-level visual features and high-level semantic concepts of an image, however, the performance of many existing algorithms is not satisfactory. In this paper, a novel image classification scheme, named high-order-statistics-based maximum a posteriori (HOS-MAP), is proposed to deal with the issue of image annotation. To bridge the gap between human judgment and machine intelligence, the proposed scheme first constructs a dissimilarity representation for each image in a non-Euclidean space; then, the dissimilarity diffusion distribution of each image is obtained from the high-order statistics of a triplet of nearest-neighbor images; finally, a maximum a posteriori algorithm combining a Gaussian mixture model with the dissimilarity diffusion distribution is adopted to estimate the relevance between each annotation and an un-annotated input image. Experimental results on a general-purpose image database demonstrate the effectiveness and efficiency of the proposed automatic image annotation scheme.

4.
Automatic image annotation plays a key role in retrieving large collections of digital images: it converts the visual features of an image into annotation keywords, which greatly facilitates use and retrieval. This paper studies automatic semantic annotation of images, and designs and implements a Matlab-based automatic image annotation system that extracts colour and texture features, measures similarity against already annotated images, and assigns semantic keywords to the image.

5.
Due to the enormous quantity of radar images acquired by satellites and through shuttle missions, there is an evident need for efficient automatic analysis tools. This paper describes unsupervised classification of radar images in the framework of hidden Markov models and generalized mixture estimation. Hidden Markov chain models, applied to a Hilbert-Peano scan of the image, constitute a fast and robust alternative to hidden Markov random field models for spatial regularization of image analysis problems, even though the latter provide a finer and more intuitive modeling of spatial relationships. We here compare the two approaches and show that they can be combined in a way that conserves their respective advantages. We also describe how the distribution families and parameters of classes with constant or textured radar reflectivity can be determined through generalized mixture estimation. Sample results obtained on real and simulated radar images are presented.
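The method above relies on a Hilbert-Peano scan to turn a 2-D image into a 1-D observation chain for the hidden Markov chain model. A sketch of that scan using the standard iterative Hilbert-curve index-to-coordinate conversion is given below; the HMM and mixture estimation themselves are not shown.

```python
import numpy as np

def hilbert_d2xy(order, d):
    """Convert a 1-D Hilbert index d into (x, y) on a 2**order x 2**order grid
    (classic iterative formulation)."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # rotate the quadrant
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_scan(image):
    """Unfold a square image (side = power of two) into a 1-D sequence along a
    Hilbert-Peano curve, suitable as the observation chain of a hidden Markov
    chain model."""
    n = image.shape[0]
    order = int(np.log2(n))
    assert image.shape[0] == image.shape[1] == 1 << order
    return np.array([image[hilbert_d2xy(order, d)] for d in range(n * n)])

# usage: scan a toy 8x8 "image" into a 64-element chain
chain = hilbert_scan(np.arange(64).reshape(8, 8))
```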

6.
In this paper, we propose a probabilistic graphical model to represent weakly annotated images. We consider an image as weakly annotated if the number of keywords defined for it is less than the maximum number defined in the ground truth. This model is used to classify images and automatically extend existing annotations to new images by taking into account semantic relations between keywords. The proposed method has been evaluated in visual-textual classification and automatic annotation of images. The visual-textual classification is performed by using both visual and textual information. The experimental results, obtained from a database of more than 30,000 images, show an improvement of 50.5% in recognition rate over classification using visual information alone. Taking into account semantic relations between keywords improves the recognition rate by a further 10.5%. Moreover, the proposed model can be used to extend existing annotations to weakly annotated images by computing distributions of missing keywords; semantic relations improve the mean rate of good annotations by 6.9%. Finally, the proposed method is competitive with a state-of-the-art model.

7.
陈晓 《电视技术》2012,36(23):35-38
To address the problem of providing concrete semantic descriptions for image semantic concepts, a GMM-based image semantic annotation method is proposed. For each semantic concept, the method builds separate GMMs on colour features and on texture features, uses the EM algorithm to obtain the keyword content, and finally fuses the probability rankings produced by the two GMMs to annotate unknown images. Experimental results show that the proposed method can accurately predict several textual keywords for an image to be annotated and effectively improves the precision and recall of image annotation.
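The per-concept GMM annotation above can be sketched with scikit-learn's EM-based GaussianMixture. The fusion weight alpha below is an assumption: the paper fuses the two probability rankings, whereas this sketch simply combines the two log-likelihood scores.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_concept_gmms(features_by_concept, n_components=3, seed=0):
    """Fit one GMM per semantic concept (EM runs inside GaussianMixture.fit).
    features_by_concept: {concept: array of shape (n_samples, n_dims)}."""
    return {c: GaussianMixture(n_components, covariance_type="diag",
                               random_state=seed).fit(X)
            for c, X in features_by_concept.items()}

def annotate(colour_gmms, texture_gmms, colour_feat, texture_feat,
             alpha=0.5, top_k=5):
    """Score every concept under both GMMs and fuse the two log-likelihoods
    with a weighted sum (alpha is an assumed fusion weight)."""
    scores = {}
    for c in colour_gmms:
        s_colour = colour_gmms[c].score(colour_feat.reshape(1, -1))
        s_texture = texture_gmms[c].score(texture_feat.reshape(1, -1))
        scores[c] = alpha * s_colour + (1 - alpha) * s_texture
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```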

8.
宋婉莹  李明  张鹏  吴艳  贾璐  刘高峰 《电子学报》2016,44(3):520-526
Markov random fields (MRFs) are widely used for classifying remote sensing images, but MRF models of polarimetric synthetic aperture radar (SAR) images do not account for their non-stationary characteristics and are sensitive to the initial classification. This paper therefore proposes a polarimetric SAR image classification method based on a weighted composite kernel and the triplet Markov field (TMF). Based on the distances of the training samples in feature space, an adaptive method is proposed for determining the weight coefficients of the composite kernel function, improving the accuracy and generality of the initial classification. To fully account for the non-stationary statistics of polarimetric SAR images, the TMF is used to model the image statistically and to perform Bayesian classification. Experimental results show that, compared with MRF-based polarimetric SAR classification methods, the proposed method achieves higher classification accuracy, smoother classification of homogeneous regions, and better preservation of image edges.
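The weighted composite kernel above can be illustrated as a weighted sum of per-feature-group kernels fed to a precomputed-kernel SVM. The fixed weight here replaces the paper's adaptive, distance-based weighting, the RBF kernels and toy data are assumptions, and the TMF stage is omitted.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

def composite_kernel(X1_a, X2_a, X1_b, X2_b, w=0.5, gamma_a=0.1, gamma_b=0.1):
    """Weighted sum of two RBF kernels computed on two feature groups
    (e.g. polarimetric and texture features)."""
    return w * rbf_kernel(X1_a, X2_a, gamma=gamma_a) + \
           (1 - w) * rbf_kernel(X1_b, X2_b, gamma=gamma_b)

# usage with a precomputed-kernel SVM on toy random data
rng = np.random.default_rng(0)
Xa_train, Xb_train = rng.normal(size=(40, 6)), rng.normal(size=(40, 8))
y_train = np.array([0] * 20 + [1] * 20)
K_train = composite_kernel(Xa_train, Xa_train, Xb_train, Xb_train)
clf = SVC(kernel="precomputed").fit(K_train, y_train)

Xa_test, Xb_test = rng.normal(size=(5, 6)), rng.normal(size=(5, 8))
K_test = composite_kernel(Xa_test, Xa_train, Xb_test, Xb_train)
print(clf.predict(K_test))
```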

9.
A unified framework for image retrieval using keyword and visual features.
In this paper, a unified image retrieval framework based on both keyword annotations and visual features is proposed. In this framework, a set of statistical models is built from the visual features of a small set of manually labeled images to represent semantic concepts, and is used to propagate keywords to other unlabeled images. These models are updated periodically as more images implicitly labeled by users become available through relevance feedback. In this sense, the keyword models serve to accumulate and memorize the knowledge learned from user-provided relevance feedback. Furthermore, two sets of effective and efficient similarity measures and relevance feedback schemes are proposed for the query-by-keyword and query-by-image-example scenarios, respectively, with keyword models combined with visual features in both schemes. In particular, a new entropy-based active learning strategy is introduced to improve the efficiency of relevance feedback for query by keyword. In addition, a new algorithm is proposed to estimate the keyword features of the search concept for query by image example; it is shown to be more appropriate than two existing relevance feedback algorithms. Experimental results demonstrate the effectiveness of the proposed framework.
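The entropy-based active learning mentioned above amounts to asking the user about the most uncertain images. A small sketch follows; the keyword posterior matrix is assumed to come from the keyword models, which are not reproduced here.

```python
import numpy as np

def entropy_based_selection(posteriors, n_select=5):
    """Pick the unlabeled images whose predicted keyword posteriors have the
    highest entropy; these are shown to the user for relevance feedback.
    posteriors: array (n_images, n_keywords) with rows summing to 1."""
    p = np.clip(posteriors, 1e-12, 1.0)
    entropy = -(p * np.log(p)).sum(axis=1)
    return np.argsort(entropy)[::-1][:n_select]

# usage: three images, the second is the most ambiguous
post = np.array([[0.90, 0.05, 0.05],
                 [0.34, 0.33, 0.33],
                 [0.60, 0.30, 0.10]])
print(entropy_based_selection(post, n_select=1))   # -> [1]
```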

10.
The bag of visual words (BOW) model is an efficient image representation technique for image categorization and annotation tasks. Building good visual vocabularies, from automatically extracted image feature vectors, produces discriminative visual words, which can improve the accuracy of image categorization tasks. Most approaches that use the BOW model in categorizing images ignore useful information that can be obtained from image classes to build visual vocabularies. Moreover, most BOW models use intensity features extracted from local regions and disregard colour information, which is an important characteristic of any natural scene image. In this paper, we show that integrating visual vocabularies generated from each image category improves the BOW image representation and improves accuracy in natural scene image classification. We use a keypoint density-based weighting method to combine the BOW representation with image colour information on a spatial pyramid layout. In addition, we show that visual vocabularies generated from training images of one scene image dataset can plausibly represent another scene image dataset on the same domain. This helps in reducing time and effort needed to build new visual vocabularies. The proposed approach is evaluated over three well-known scene classification datasets with 6, 8 and 15 scene categories, respectively, using 10-fold cross-validation. The experimental results, using support vector machines with histogram intersection kernel, show that the proposed approach outperforms baseline methods such as Gist features, rgbSIFT features and different configurations of the BOW model.
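The evaluation above uses an SVM with the histogram intersection kernel on BOW histograms. A compact sketch with a precomputed kernel matrix follows, on toy histograms rather than the paper's pipeline.

```python
import numpy as np
from sklearn.svm import SVC

def histogram_intersection_kernel(X, Y):
    """Histogram intersection kernel K(x, y) = sum_i min(x_i, y_i), commonly
    used with bag-of-visual-words histograms."""
    return np.array([[np.minimum(x, y).sum() for y in Y] for x in X])

# usage with a precomputed-kernel SVM on toy normalized BOW histograms
rng = np.random.default_rng(0)
X_train = rng.random((30, 50)); X_train /= X_train.sum(axis=1, keepdims=True)
y_train = np.repeat([0, 1, 2], 10)
clf = SVC(kernel="precomputed").fit(
    histogram_intersection_kernel(X_train, X_train), y_train)

X_test = rng.random((4, 50)); X_test /= X_test.sum(axis=1, keepdims=True)
print(clf.predict(histogram_intersection_kernel(X_test, X_train)))
```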

11.
The number of digital images rapidly increases, and it becomes an important challenge to organize these resources effectively. As a way to facilitate image categorization and retrieval, automatic image annotation has received much research attention. Considering that there are a great number of unlabeled images available, it is beneficial to develop an effective mechanism to leverage unlabeled images for large-scale image annotation. Meanwhile, a single image is usually associated with multiple labels, which are inherently correlated to each other. A straightforward method of image annotation is to decompose the problem into multiple independent single-label problems, but this ignores the underlying correlations among different labels. In this paper, we propose a new inductive algorithm for image annotation by integrating label correlation mining and visual similarity mining into a joint framework. We first construct a graph model according to image visual features. A multilabel classifier is then trained by simultaneously uncovering the shared structure common to different labels and the visual graph embedded label prediction matrix for image annotation. We show that the globally optimal solution of the proposed framework can be obtained by performing generalized eigen-decomposition. We apply the proposed framework to both web image annotation and personal album labeling using the NUS-WIDE, MSRA MM 2.0, and Kodak image data sets, and the AUC evaluation metric. Extensive experiments on large-scale image databases collected from the web and personal album show that the proposed algorithm is capable of utilizing both labeled and unlabeled data for image annotation and outperforms other algorithms.

12.
The automatic segmentation of nuclei in confocal reflectance images of cervical tissue is an important goal toward developing less expensive cervical precancer detection methods. Since in vivo confocal reflectance microscopy is an emerging technology for cancer detection, no prior work has been reported on the automatic segmentation of in vivo confocal reflectance images. However, prior work has shown that nuclear size and nuclear-to-cytoplasmic ratio can determine the presence or extent of cervical precancer. Thus, segmenting nuclei in confocal images will aid in cervical precancer detection. Successful segmentation of images of any type can be significantly enhanced by the introduction of accurate image models. To enable a deeper understanding of confocal reflectance microscopy images of cervical tissue, and to supply a basis for parameter selection in a classification algorithm, we have developed a model that accounts for the properties of the imaging system and of the tissues. Using our model in conjunction with a powerful image enhancement tool (anisotropic median-diffusion), appropriate statistical image modeling of spatial interactions (Gaussian Markov random fields), and a Bayesian framework for classification-segmentation, we have developed an effective algorithm for automatically segmenting nuclei in confocal images of cervical tissue. We have applied our algorithm to an extensive set of cervical images and have found that it detects 90% of hand-segmented nuclei with an average of 6 false positives per frame.

13.
Recent studies have shown that sparse representation (SR) can deal well with many computer vision problems, and its kernel version has powerful classification capability. In this paper, we address the application of a cooperative SR to semi-supervised image annotation, which can increase the amount of labeled images available for training image classifiers. Given a set of labeled (training) images and a set of unlabeled (test) images, the usual SR method, which we call forward SR, represents each unlabeled image with several labeled ones and then annotates the unlabeled image according to the annotations of those labeled images. To the best of our knowledge, however, the SR method in the opposite direction, which we call backward SR, has not been addressed so far: it represents each labeled image with several unlabeled images, and an unlabeled image is then annotated according to the annotations of the labeled images that select it in their backward representations. In this paper, we explore how much the backward SR can contribute to image annotation and how it complements the forward SR. Co-training, which has been proved to improve two classifiers only if they are relatively independent, is adopted to verify this complementarity between the two SRs in opposite directions. Finally, the co-training of two SRs in kernel space yields a cooperative kernel sparse representation (Co-KSR) method for image annotation. Experimental results and analyses show that the two KSRs in opposite directions are complementary, and that Co-KSR improves considerably over either of them, with an image annotation performance better than other state-of-the-art semi-supervised classifiers such as the transductive support vector machine, local and global consistency, and Gaussian fields and harmonic functions. Comparative experiments with a non-sparse solution also show that sparsity plays an important role in the cooperation of image representations in the two opposite directions. This paper extends the application of SR to image annotation and retrieval.
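The forward sparse representation above can be sketched with orthogonal matching pursuit as a generic sparse coder. The kernel mapping and the backward/co-training stages are omitted, and the coefficient-weighted label-scoring rule below is an assumption rather than the paper's exact rule.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def forward_sr_annotate(X_labeled, Y_labeled, x_test, n_nonzero=5, top_k=3):
    """Forward sparse representation: express a test image as a sparse
    combination of labeled images, then score each keyword by the absolute
    coefficients of the labeled images carrying it.

    X_labeled: (n_labeled, n_dims) features; Y_labeled: (n_labeled, n_keywords)
    binary annotation matrix; n_nonzero must not exceed n_labeled."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero)
    omp.fit(X_labeled.T, x_test)        # dictionary columns = labeled images
    coef = np.abs(omp.coef_)            # one weight per labeled image
    keyword_scores = coef @ Y_labeled   # accumulate weights onto keywords
    return np.argsort(keyword_scores)[::-1][:top_k]
```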

14.
Since there is a semantic gap between low-level visual features and high-level image semantics, the performance of many existing content-based image annotation algorithms is not satisfactory. In order to bridge the gap and improve annotation performance, a novel automatic image annotation (AIA) approach using a neighborhood set (NS) based on an image distance metric learning (IDML) algorithm is proposed in this paper. With IDML, the neighborhood set of each image is easily obtained, since the learned image distance effectively measures the distance between images for the AIA task. By introducing the NS, the proposed AIA approach can predict all possible labels of an uncaptioned image. The experimental results confirm that introducing an NS based on IDML improves the efficiency of AIA approaches and achieves better annotation performance than existing AIA approaches.
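A neighborhood-set label transfer in the spirit of the approach above is sketched below, with plain Euclidean distance standing in for the learned IDML metric (for a learned linear metric L, the features could be transformed as X @ L.T before indexing).

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def neighborhood_set_annotate(X_train, Y_train, x_query, k=10, top_k=5):
    """Annotate a query image from the labels of its k nearest training images,
    weighting each neighbor's labels by inverse distance.

    X_train: (n, d) features; Y_train: (n, n_keywords) binary labels."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    dist, idx = nn.kneighbors(x_query.reshape(1, -1))
    weights = 1.0 / (dist[0] + 1e-8)
    keyword_scores = weights @ Y_train[idx[0]]
    return np.argsort(keyword_scores)[::-1][:top_k]
```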

15.
徐侃  杨丽春  刘钢  杨文 《现代雷达》2012,34(9):59-62
The Dirichlet process mixture (DPM), a nonparametric probabilistic model, can be effectively applied to unsupervised classification of SAR images. This paper presents a fully automatic segmentation method for MSTAR tank SAR images. The method first uses a DPM to determine the number of classes in the image, then describes the spatial neighborhood relationships of the resulting class probabilities with a Markov random field (MRF), and finally obtains the segmentation through a label-cost energy optimization algorithm. The method does not require the number of classes to be specified manually, while still producing reasonable and coherent segmentation results. Experiments on MSTAR SAR data demonstrate its effectiveness.
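The class-number estimation step above can be approximated with scikit-learn's truncated Dirichlet-process Gaussian mixture. The weight threshold below is an assumed heuristic, and the MRF/label-cost refinement is not shown.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

def estimate_class_count(pixel_features, max_components=10, weight_threshold=0.02):
    """Fit a (truncated) Dirichlet-process Gaussian mixture to pixel features
    and count the components that keep non-negligible weight.

    pixel_features: (n_pixels, n_dims) array, e.g. log-intensity and texture."""
    dpgmm = BayesianGaussianMixture(
        n_components=max_components,
        weight_concentration_prior_type="dirichlet_process",
        covariance_type="full",
        max_iter=500,
        random_state=0,
    ).fit(pixel_features)
    active = dpgmm.weights_ > weight_threshold
    return int(active.sum()), dpgmm

# usage on toy two-cluster data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (300, 2)), rng.normal(5, 1, (300, 2))])
n_classes, model = estimate_class_count(X)
print(n_classes)   # typically 2
```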

16.
An image clustering algorithm based on correlated visual and annotation information
于林森  张田文 《电子学报》2006,34(7):1265-1269
The algorithm first scores annotation words according to their degree of visual correlation; a word's score reflects how visually coherent the semantically consistent images sharing it are. Exploiting the inherent linguistic describability of image semantic categories, annotation words with clear visual coherence are extracted from the image annotations to serve as the semantic categories of the images, reducing the tedious manual cataloguing work of database designers. Classifying images semantically according to their annotation words improves the semantic consistency of image clustering. Clustering results on 4500 annotated Corel images confirm the effectiveness of the algorithm.
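The word scoring above rates each annotation word by the visual coherence of the images that share it. A generic proxy (negative mean pairwise feature distance) is sketched below as an assumption; it is not the paper's exact scoring function.

```python
import numpy as np

def keyword_visual_coherence(features_by_keyword):
    """Score each annotation word by how visually compact its images are:
    the negative mean pairwise Euclidean distance (higher = more coherent).
    features_by_keyword: {word: (n_images, n_dims) array}."""
    scores = {}
    for word, X in features_by_keyword.items():
        n = len(X)
        if n < 2:
            scores[word] = 0.0
            continue
        diffs = X[:, None, :] - X[None, :, :]
        pairwise = np.sqrt((diffs ** 2).sum(-1))
        scores[word] = -pairwise.sum() / (n * (n - 1))
    return scores
```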

17.
In this paper, we present an approach to segmenting the brain vasculature in phase contrast magnetic resonance angiography (PC-MRA). According to our prior work, we can describe the overall probability density function of a PC-MRA speed image as either a Maxwell-uniform (MU) or Maxwell-Gaussian-uniform (MGU) mixture model. An automatic mechanism based on Kullback-Leibler divergence is proposed for selecting between the MGU and MU models given a speed image volume. A coherence measure, namely local phase coherence (LPC), which incorporates information about the spatial relationships between neighboring flow vectors, is defined and shown to be more robust to noise than previously described coherence measures. A statistical measure from the speed images and the LPC measure from the phase images are combined in a probabilistic framework, based on the maximum a posteriori method and Markov random fields, to estimate the posterior probabilities of vessel and background for classification. It is shown that segmentation based on both measures gives a more accurate segmentation than using either speed or flow coherence information alone. The proposed method is tested on synthetic, flow phantom and clinical datasets. The results show that the method can segment normal vessels and vascular regions with relatively low flow rate and low signal-to-noise ratio, e.g., aneurysms and veins.
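The Kullback-Leibler model selection above can be sketched as comparing the empirical speed histogram against each candidate density and keeping the closer one. Fitting the MU/MGU mixtures themselves is not shown; the toy Maxwell and Gaussian models below are stand-ins.

```python
import numpy as np
from scipy.stats import entropy, maxwell, norm

def kl_to_model(samples, model_pdf, bins=100):
    """KL divergence D(empirical || model) computed on a shared histogram grid."""
    hist, edges = np.histogram(samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]
    p = hist * width                 # empirical bin probabilities
    q = model_pdf(centers) * width   # model bin probabilities
    return entropy(p + 1e-12, q + 1e-12)

# usage: toy speed data compared against two candidate models
speeds = maxwell.rvs(scale=1.0, size=5000, random_state=0)
d_maxwell = kl_to_model(speeds, lambda x: maxwell.pdf(x, scale=1.0))
d_gauss = kl_to_model(speeds, lambda x: norm.pdf(x, loc=speeds.mean(), scale=speeds.std()))
print("pick Maxwell" if d_maxwell < d_gauss else "pick Gaussian")
```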

18.
The paper proposes a novel probabilistic generative model for simultaneous image classification and annotation. The model exploits the fact that category information can provide valuable cues for image annotation: once the category of an image is ascertained, the scope of annotation words can be narrowed and the probability of generating irrelevant annotation words reduced. To this end, the idea of annotating images according to their class is introduced into the model. Using variational methods, approximate inference and parameter estimation algorithms for the model are derived, and efficient approximations for classifying and annotating new images are also given. The power of the model is demonstrated on two real-world datasets: a 1,600-image LabelMe dataset and a 1,791-image UIUC-Sport dataset. The experimental results show that the classification performance is on par with several state-of-the-art classification models, while the annotation performance is better than that of several state-of-the-art annotation models.

19.
In this paper, we present an approach based on probabilistic latent semantic analysis (PLSA) to achieve the task of automatic image annotation and retrieval. In order to model training data precisely, each image is represented as a bag of visual words. Then a probabilistic framework is designed to capture semantic aspects from visual and textual modalities, respectively. Furthermore, an adaptive asymmetric learning algorithm is proposed to fuse these aspects. For each image document, the aspect distributions of different modalities are fused by multiplying different weights, which are determined by the visual representations of images. Consequently, the probabilistic framework can predict semantic annotation precisely for unseen images because it associates visual and textual modalities properly. We compare our approach with several state-of-the-art approaches on a standard Corel dataset. The experimental results show that our approach performs more effectively and accurately.
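The adaptive asymmetric fusion above combines the visual and textual aspect (topic) distributions with an image-dependent weight. A minimal sketch follows, with the weight passed in as a parameter; its estimation from the visual representation is not reproduced, and the simple convex combination is an assumption.

```python
import numpy as np

def fuse_aspect_distributions(p_visual, p_textual, weight):
    """Fuse per-image aspect distributions from the visual and textual
    modalities by a weighted combination, then renormalize."""
    fused = weight * np.asarray(p_visual) + (1.0 - weight) * np.asarray(p_textual)
    return fused / fused.sum()

# usage: an image whose visual evidence is trusted more (weight = 0.7)
p_v = np.array([0.6, 0.3, 0.1])
p_t = np.array([0.2, 0.5, 0.3])
print(fuse_aspect_distributions(p_v, p_t, 0.7))
```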

20.
Nowadays, image annotation is a hot topic in the semantic retrieval field due to the rapid growth of digital images. The purpose of such methods is to understand the content of images and assign appropriate keywords to them. Extensive efforts have been devoted to this field, but their effectiveness is limited by the gap between low-level image features and high-level semantic concepts. In this paper, we propose a Multi-View Robust Spectral Clustering (MVRSC) method, which models the relationship between the semantics and the multiple features of training images based on the Maximum Correntropy Criterion. A half-quadratic optimization framework is used to solve the objective function. According to the constructed model, a few tags are suggested based on a novel decision-level fusion distance. The stability condition and bound calculation of MVRSC are analyzed as well. Experimental results on real-world Flickr and 500PX datasets, as well as Corel5K, confirm the superiority of the proposed method over other competing models.
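The Maximum Correntropy Criterion above builds on the correntropy measure, whose sample estimate is straightforward to write down. The MVRSC objective and the half-quadratic solver are not reproduced here; this only illustrates why correntropy is robust to outliers.

```python
import numpy as np

def correntropy(x, y, sigma=1.0):
    """Sample estimate of correntropy between two vectors: the mean of a
    Gaussian kernel applied to their element-wise differences.  Large errors
    are suppressed, which is the source of robustness to outliers."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.mean(np.exp(-diff ** 2 / (2.0 * sigma ** 2)))

# a single gross outlier barely lowers the score, unlike a squared-error loss
clean = correntropy(np.zeros(10), np.zeros(10))
outlier = correntropy(np.zeros(10), np.r_[np.zeros(9), 100.0])
print(clean, outlier)
```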
