Similar Documents
20 similar documents found.
1.
Jin Cong, Sun Qing-Mei, Jin Shu-Wei. Multimedia Tools and Applications, 2019, 78(9): 11815-11834.
Automated image annotation (AIA) is an important problem in computer vision and pattern recognition, and plays an extremely important role in retrieving...

2.
Zhang Weifeng, Hu Hua, Hu Haiyang. Multimedia Tools and Applications, 2018, 77(17): 22385-22406.

Automatic image annotation aims to predict labels for images according to their semantic content and has become a research focus in computer vision, as it helps people to edit, retrieve and understand large image collections. In recent decades, researchers have proposed many approaches to this task and achieved remarkable performance on several standard image datasets. In this paper, we propose a novel learning-to-rank approach to the image auto-annotation problem. Unlike typical learning-to-rank algorithms for image auto-annotation, which directly rank annotations for an image, our approach consists of two phases. In the first phase, neural ranking models are trained to rank an image's semantic neighbors. Then nearest-neighbor-based models propagate annotations from these semantic neighbors to the image. Our approach thus integrates learning-to-rank algorithms with nearest-neighbor-based models, including TagProp and 2PKNN, and inherits their advantages. Experimental results show that our method achieves better or comparable performance compared with state-of-the-art methods on four challenging benchmarks: Corel5K, ESP Games, IAPR TC-12 and NUS-WIDE.
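For illustration, a minimal sketch of the nearest-neighbor label-propagation step in the spirit of TagProp/2PKNN, with plain Euclidean distance standing in for the learned neural ranking; the data, weighting rule, and parameter values are illustrative, not the authors' implementation.

```python
import numpy as np

def propagate_labels(query_feat, neighbor_feats, neighbor_labels, k=5, bandwidth=1.0):
    """Score labels for a query image from its K nearest semantic neighbors.

    neighbor_feats: (N, D) feature vectors of candidate neighbors
    neighbor_labels: (N, L) binary label matrix of those neighbors
    Returns an (L,) vector of label scores in [0, 1].
    """
    # Rank neighbors by Euclidean distance (the paper learns this ranking
    # with a neural model; plain distance stands in for it here).
    dists = np.linalg.norm(neighbor_feats - query_feat, axis=1)
    idx = np.argsort(dists)[:k]
    # Distance-decaying weights, as in TagProp-style weighting.
    w = np.exp(-dists[idx] / bandwidth)
    w /= w.sum()
    # Weighted vote: each neighbor propagates its labels to the query.
    return w @ neighbor_labels[idx]

# Toy usage: 10 neighbors, 4-D features, 6 candidate labels.
rng = np.random.default_rng(0)
scores = propagate_labels(rng.normal(size=4),
                          rng.normal(size=(10, 4)),
                          rng.integers(0, 2, size=(10, 6)))
print(scores)  # the highest-scoring labels become the predicted annotations
```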


3.
Nowadays more and more images are available, yet finding a required image remains a challenging task for an ordinary user. A large amount of research on image retrieval has been carried out over the past two decades. Traditionally, research in this area focused on content-based image retrieval. However, recent research shows that there is a semantic gap between content-based image retrieval and the image semantics understandable by humans. As a result, research in this area has shifted toward bridging the semantic gap between low-level image features and high-level semantics. The typical method of bridging the semantic gap is automatic image annotation (AIA), which extracts semantic features using machine learning techniques. In this paper, we focus on this latest development in image retrieval and provide a comprehensive survey of automatic image annotation. We analyse key aspects of the various AIA methods, including both feature extraction and semantic learning methods. Major methods are discussed and illustrated in detail. We report our findings and provide future research directions in the AIA area in the conclusions.

4.
Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, Support Vector Machines (SVMs) have been used extensively due to their generalization properties. However, SVM training is a notably computationally intensive process, especially when the training dataset is large. This paper presents MRSMO, a MapReduce-based distributed SVM algorithm for automatic image annotation. The performance of the MRSMO algorithm is evaluated in an experimental environment. By partitioning the training dataset into smaller subsets and optimizing the partitioned subsets across a cluster of computers, the MRSMO algorithm reduces the training time significantly while maintaining a high level of accuracy in both binary and multiclass classification.
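As a rough single-machine sketch of the partition-train-combine idea (not the MRSMO/MapReduce implementation, which exchanges optimization results across a cluster), assuming scikit-learn is available:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

def train_partitioned_svms(X, y, n_partitions=4):
    """Train one SVM per data partition (the map step, in spirit)."""
    models = []
    for Xp, yp in zip(np.array_split(X, n_partitions),
                      np.array_split(y, n_partitions)):
        models.append(SVC(kernel="rbf").fit(Xp, yp))
    return models

def predict_by_vote(models, X):
    """Combine partition models by majority vote (the reduce step, in spirit)."""
    votes = np.stack([m.predict(X) for m in models])
    return (votes.mean(axis=0) >= 0.5).astype(int)

# Toy usage on synthetic binary data.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
models = train_partitioned_svms(X[:1600], y[:1600])
acc = (predict_by_vote(models, X[1600:]) == y[1600:]).mean()
print(f"held-out accuracy: {acc:.3f}")
```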

5.
In this paper, a novel automatic image annotation system is proposed, which integrates two sets of support vector machines (SVMs), namely the multiple-instance learning (MIL)-based and global-feature-based SVMs, for annotation. The MIL-based bag features are obtained by applying MIL on the image blocks, where the enhanced diversity density (DD) algorithm and a faster searching algorithm are applied to improve the efficiency and accuracy. They are further input to a set of SVMs for finding the optimum hyperplanes to annotate training images. Similarly, global color and texture features, including a color histogram and a modified edge histogram, are fed into another set of SVMs for categorizing training images. Consequently, two sets of image features are constructed for each test image and are, respectively, sent to the two sets of SVMs, whose outputs are combined by an automatic weight estimation method to obtain the final annotation results. Our proposed annotation approach demonstrates promising performance on an image database of 12,000 general-purpose images from COREL, as compared with some current peer systems in the literature.
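A toy sketch of the final fusion step, assuming the two SVM sets output per-label scores; the accuracy-based weight rule below is a stand-in assumption, not the paper's automatic weight estimation method:

```python
import numpy as np

def estimate_weights(acc_mil, acc_global):
    """Weight each SVM set by its validation accuracy (illustrative rule;
    the paper estimates weights automatically by its own criterion)."""
    w = np.array([acc_mil, acc_global], dtype=float)
    return w / w.sum()

def fuse_scores(mil_scores, global_scores, weights):
    """Combine per-label scores from the MIL-based and global-feature SVMs."""
    return weights[0] * mil_scores + weights[1] * global_scores

w = estimate_weights(0.72, 0.65)
fused = fuse_scores(np.array([0.9, 0.2, 0.4]),   # MIL-based SVM scores
                    np.array([0.6, 0.3, 0.8]),   # global-feature SVM scores
                    w)
print(fused)  # labels with the highest fused scores are assigned
```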

6.
Personal memory collections composed of digital pictures are very popular at the moment. Retrieving these media items requires annotation. In recent years, several approaches have been proposed to overcome the image annotation problem. This paper presents our proposals to address it: automatic and semi-automatic learning methods for semantic concepts. The automatic method estimates semantic concepts from visual content, context metadata and audio information. The semi-automatic method is based on the results provided by a computer game. The paper describes both proposals and presents their evaluations.

7.
Multimedia Tools and Applications - In automatic image annotation (AIA), different features describe images from different aspects or views. Part of the information embedded in some views is common for...

8.
Automatic image annotation (AIA) is an effective technology for improving the performance of image retrieval. In this paper, we propose a novel AIA scheme based on a hidden Markov model (HMM). Compared with previous HMM-based annotation methods, SVM-based semi-supervised learning, i.e. the transductive SVM (TSVM), is employed to remarkably boost the reliability of the HMM with less user labeling effort (the combination is denoted TSVM-HMM). This guarantees that the proposed TSVM-HMM-based annotation scheme integrates discriminative classification with a generative model so that their advantages complement each other. In addition, not only the relevance model between the visual content of images and the textual keywords but also keyword correlation is exploited in the proposed AIA scheme. In particular, to establish an enhanced correlation network among keywords, both co-occurrence-based and WordNet-based correlation techniques are fused so that each benefits from the other. The final experimental results reveal that better annotation performance can be achieved with fewer labeled training images.
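A small sketch of the keyword-correlation fusion idea, assuming both sources are given as keyword-by-keyword matrices; the linear fusion rule, normalization, and parameter values are illustrative assumptions:

```python
import numpy as np

def fuse_correlations(cooc, wordnet_sim, alpha=0.5):
    """Fuse a co-occurrence-based and a WordNet-based keyword correlation
    matrix into one network (linear fusion; alpha is an assumption)."""
    # Normalize each matrix to [0, 1] so the two sources are comparable.
    c = cooc / cooc.max()
    w = wordnet_sim / wordnet_sim.max()
    return alpha * c + (1 - alpha) * w

def refine_scores(scores, corr, beta=0.3):
    """Propagate annotation scores over the keyword correlation network
    so that correlated keywords reinforce each other."""
    return (1 - beta) * scores + beta * corr @ scores

cooc = np.array([[4., 2., 0.], [2., 5., 1.], [0., 1., 3.]])
wn   = np.array([[1., .6, .1], [.6, 1., .2], [.1, .2, 1.]])
corr = fuse_correlations(cooc, wn)
print(refine_scores(np.array([0.8, 0.3, 0.1]), corr))
```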

10.
This paper addresses the automatic image annotation problem and its application to multi-modal image retrieval. The contribution of our work is three-fold. (1) We propose a probabilistic semantic model in which the visual features and the textual words are connected via a hidden layer constituting the semantic concepts to be discovered, to explicitly exploit the synergy between the modalities. (2) The association of visual features and textual words is determined in a Bayesian framework, so that the confidence of the association can be provided. (3) Extensive evaluation on a large-scale, visually and semantically diverse image collection crawled from the Web is reported to evaluate the prototype system based on the model. In the proposed probabilistic model, a hidden concept layer connecting the visual-feature and word layers is discovered by fitting a generative model to the training images and annotation words through an Expectation-Maximization (EM) based iterative learning procedure. The evaluation of the prototype system on 17,000 images and 7,736 automatically extracted annotation words from crawled Web pages for multi-modal image retrieval indicates that the proposed semantic model and the developed Bayesian framework are superior to a state-of-the-art peer system in the literature.
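As a toy illustration of fitting a hidden concept layer by EM, here is a pLSA-style sketch that keeps only the discrete word modality (the paper's model additionally ties continuous visual features to the hidden concepts); the data and concept count are made up:

```python
import numpy as np

def plsa_em(counts, n_concepts=2, n_iter=50, seed=0):
    """Fit P(z|image) and P(word|z) to an image-by-word count matrix
    via a hidden concept layer z, using standard pLSA EM updates."""
    rng = np.random.default_rng(seed)
    n_img, n_word = counts.shape
    p_z_given_i = rng.random((n_img, n_concepts))
    p_z_given_i /= p_z_given_i.sum(1, keepdims=True)
    p_w_given_z = rng.random((n_concepts, n_word))
    p_w_given_z /= p_w_given_z.sum(1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibility of each concept for each (image, word) pair.
        joint = p_z_given_i[:, :, None] * p_w_given_z[None, :, :]  # (I,Z,W)
        resp = joint / joint.sum(1, keepdims=True).clip(1e-12)
        # M-step: re-estimate both conditionals from weighted counts.
        weighted = counts[:, None, :] * resp                       # (I,Z,W)
        p_z_given_i = weighted.sum(2)
        p_z_given_i /= p_z_given_i.sum(1, keepdims=True)
        p_w_given_z = weighted.sum(0)
        p_w_given_z /= p_w_given_z.sum(1, keepdims=True)
    return p_z_given_i, p_w_given_z

counts = np.array([[5, 3, 0, 0], [4, 4, 1, 0], [0, 1, 6, 4], [0, 0, 3, 5]])
p_zi, p_wz = plsa_em(counts)
print(np.round(p_wz, 2))  # each concept concentrates on a word cluster
```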

12.
Automatic image annotation using supervised learning is performed by concept classifiers trained on labelled example images. This work proposes the use of clickthrough data collected from search logs as a source for the automatic generation of concept training data, thus avoiding the expensive manual annotation effort. We investigate and evaluate this approach using a collection of 97,628 photographic images. The results indicate that the contribution of search-log-based training data is positive despite its inherent noise; in particular, the combination of manual and automatically generated training data outperforms the use of manual data alone. It is therefore possible to use clickthrough data to perform large-scale image annotation with little manual annotation effort or, depending on performance, using only the automatically generated training data. An extensive presentation of the experimental results and the accompanying data can be accessed at .

13.
One of the major problems in image auto-annotation is the difference between the expected word-counts vector and the resulted word-counts vector. This paper presents a new approach to automatic image annotation: an algorithm called the resulted word counts optimizer, which is an extension to existing methods. An ideal annotator is defined in terms of the recall quality measure. On the basis of the ideal annotator, an optimization criterion is defined, which makes it possible to reduce the difference between the resulted and expected word-counts vectors. The proposed algorithm can be used with various image auto-annotation algorithms because of its generic nature. Additionally, it does not increase the computational complexity of the original annotation method's processing phase: it merely changes the output word probabilities according to a pre-calculated vector of correction coefficients.
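A minimal sketch of that final rescaling step, assuming the correction coefficients are ratios of expected to resulted word counts (the ratio rule is an illustrative assumption; the paper derives its coefficients from its recall-based criterion):

```python
import numpy as np

def correction_coefficients(expected_counts, resulted_counts):
    """Pre-compute per-word correction coefficients from training data:
    how far the base annotator's output word counts deviate from the
    expected ones (ratio rule is an illustrative assumption)."""
    return expected_counts / np.clip(resulted_counts, 1e-9, None)

def optimize_probabilities(word_probs, coeffs):
    """Rescale the base annotator's output word probabilities and
    renormalize; the base method is untouched, so the extra cost is
    one vector multiply per image."""
    adjusted = word_probs * coeffs
    return adjusted / adjusted.sum()

coeffs = correction_coefficients(np.array([120., 40., 40.]),   # expected
                                 np.array([180., 15., 5.]))    # resulted
print(optimize_probabilities(np.array([0.6, 0.3, 0.1]), coeffs))
```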

14.
To bridge the "semantic gap" between low-level visual features and high-level image semantics and to improve the performance of automatic image annotation, an image annotation algorithm based on the Multimedia Content Description Interface (MPEG-7) and a mixture model (MM) is proposed. The algorithm extracts low-level visual features with the color and texture descriptors recommended by the MPEG-7 standard, and builds a mapping from the low-level features to the high-level semantic space through the MM mixture model, achieving multi-label automatic image annotation based on the global low-level features of an image. A series of experiments on the Corel image dataset verifies the feasibility and effectiveness of the method.
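As a hedged illustration of the mixture-model mapping, the sketch below fits one Gaussian mixture per label over global feature vectors (synthetic stand-ins for MPEG-7 color/texture descriptors) and ranks labels by log-likelihood; it is not the paper's exact model:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_label_mixtures(features_by_label, n_components=2):
    """Fit one Gaussian mixture per semantic label over global
    low-level feature vectors."""
    return {label: GaussianMixture(n_components=n_components,
                                   random_state=0).fit(X)
            for label, X in features_by_label.items()}

def annotate(models, feat, top_k=2):
    """Rank labels by mixture log-likelihood of the image's features."""
    scores = {label: m.score(feat[None, :]) for label, m in models.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

rng = np.random.default_rng(0)
train = {"sky":   rng.normal(0.0, 1.0, size=(60, 8)),
         "grass": rng.normal(3.0, 1.0, size=(60, 8))}
models = fit_label_mixtures(train)
print(annotate(models, rng.normal(3.0, 1.0, size=8)))  # -> ['grass', 'sky']
```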

15.
This paper presents a novel approach to automatic image annotation which combines global, regional, and contextual features by an extended cross-media relevance model. Unlike typical image annotation methods, which use either global or regional features exclusively and neglect the textual context among the annotated words, the proposed approach incorporates all three kinds of information, which are helpful for describing image semantics, and annotates images by estimating their joint probability. Specifically, we describe the global features as a distribution vector of visual topics and model the textual context as a multinomial distribution. The global features provide the distribution of visual topics over an image, while the textual context relaxes the assumption of mutual independence among annotated words that is commonly adopted in most existing methods. Both the global features and the textual context are learned by a probabilistic latent semantic analysis approach from the training data. Experiments over 5k Corel images have shown that combining these three kinds of information is beneficial for image annotation.
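A toy sketch of scoring one candidate word by combining the three cues; the product combination is an illustrative independence assumption standing in for the paper's joint-probability estimate, and all numbers are made up:

```python
import numpy as np

def word_score(topic_dist, p_w_given_topic, p_w_given_regions, p_w_given_context):
    # Global cue: marginalize the word over the image's visual-topic distribution.
    p_global = float(topic_dist @ p_w_given_topic)
    # Product combination of global, regional, and contextual cues.
    return p_global * p_w_given_regions * p_w_given_context

topic_dist = np.array([0.7, 0.2, 0.1])     # P(topic | image)
p_w_topic  = np.array([0.5, 0.1, 0.05])    # P("tiger" | topic)
score = word_score(topic_dist, p_w_topic,
                   p_w_given_regions=0.4,   # regional relevance cue
                   p_w_given_context=0.6)   # multinomial textual context
print(score)  # words are ranked by this combined score
```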

16.
A major drawback of statistical models of non-rigid, deformable objects, such as the active appearance model (AAM), is the required pseudo-dense annotation of landmark points for every training image. We propose a regression-based approach for the automatic annotation of face images at arbitrary pose and expression, and for deformable model building using only the annotated frontal images. We pose the problem of learning the pattern of manual annotation as a data-driven regression problem and explore several regression strategies to effectively predict the spatial arrangement of the landmark points for unseen face images with arbitrary expression, at arbitrary poses. We show that the proposed fully sparse non-linear regression approach outperforms other regression strategies by effectively modelling the changes in the shape of the face under varying pose, and is capable of capturing the subtleties of different facial expressions at the same time, thus ensuring the high quality of the generated synthetic images. We show the generalisability of the proposed approach by automatically annotating face images from four different databases and verifying the results by comparing them with a ground truth obtained from manual annotations.
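To make the regression formulation concrete, here is a toy sketch mapping appearance features to flattened landmark coordinates with a sparse linear regressor; the paper's regressor is non-linear and fully sparse, so the synthetic data and scikit-learn Lasso below are stand-ins:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic stand-in data: per-image appearance features X and flattened
# landmark coordinates Y (5 landmarks -> 10 coordinates per image).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
W = np.zeros((30, 10))
W[:5] = rng.normal(size=(5, 10))          # only a few features matter
Y = X @ W + 0.01 * rng.normal(size=(200, 10))

# Sparse regression from features to landmark arrangements.
model = Lasso(alpha=0.01).fit(X[:150], Y[:150])
pred = model.predict(X[150:])
print(f"mean landmark error: {np.abs(pred - Y[150:]).mean():.3f}")
```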

17.
Hypermedia composite templates define generic structures of nodes and links to be added to a document composition, providing spatio-temporal synchronization semantics. This paper presents EDITEC, a graphical editor for hypermedia composite templates. EDITEC templates are based on the XTemplate 3.0 language. The editor was designed to offer a user-friendly visual approach. It presents a new method that provides several options for representing iteration structures graphically, in order to specify a certain behavior to be applied to a set of generic document components. The editor provides a multi-view environment, giving the user complete control of the composite template during the authoring process. Composite templates can be used in NCL documents to embed spatio-temporal semantics into NCL contexts. NCL is the standard declarative language used for the production of interactive applications in the Brazilian digital TV system and ITU H.761 IPTV services. Hypermedia composite templates could also be used in other hypermedia authoring languages, offering new types of compositions with predefined semantics.

19.
Automatic image annotation has emerged as an important research topic due to its potential applications in both image understanding and web image search. Due to the inherent ambiguity of image-label mapping and the scarcity of training examples, it remains a challenge to systematically develop robust annotation models with good performance. From the perspective of machine learning, the annotation task fits both the multi-instance and multi-label learning frameworks, due to the fact that an image is usually described by multiple semantic labels (keywords) and these labels are often highly related to respective regions rather than to the entire image. In this paper, we propose an improved Transductive Multi-Instance Multi-Label (TMIML) learning framework, which aims at taking full advantage of both labeled and unlabeled data to address the annotation problem. Experiments over the well-known Corel 5000 data set demonstrate that the proposed method is beneficial for the image annotation task and outperforms most existing image annotation algorithms.
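A tiny sketch of the multi-instance multi-label view of an image: a bag of region instances scored against per-label prototypes, with a max over instances reflecting the region-level label assumption. The prototypes and cosine-similarity scoring are illustrative assumptions, not the TMIML model:

```python
import numpy as np

def miml_scores(bag, prototypes):
    """Score each label for an image represented as a bag of region
    instances: a label fires if SOME region matches its prototype."""
    # Cosine similarity of every instance to every label prototype.
    sim = bag @ prototypes.T / (
        np.linalg.norm(bag, axis=1, keepdims=True)
        * np.linalg.norm(prototypes, axis=1))
    return sim.max(axis=0)   # per-label score = best-matching region

rng = np.random.default_rng(0)
bag = rng.normal(size=(4, 16))          # 4 regions, 16-D features each
prototypes = rng.normal(size=(5, 16))   # 5 candidate labels
print(np.round(miml_scores(bag, prototypes), 2))
```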
