Similar Articles
12 similar articles found.
1.
Automatic image annotation has become an important and challenging problem due to the semantic gap. In this paper, we first extend probabilistic latent semantic analysis (PLSA) to model continuous quantities, and derive the corresponding Expectation-Maximization (EM) algorithm to determine the model parameters. Furthermore, to handle data of different modalities according to their characteristics, we present a semantic annotation model that employs continuous PLSA and standard PLSA to model visual features and textual words, respectively. The model learns the correlation between the two modalities through an asymmetric learning approach and can then predict semantic annotations precisely for unseen images. Finally, we compare our approach with several state-of-the-art approaches on the Corel5k and Corel30k datasets. The experimental results show that our approach is more effective and accurate.
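The EM fit this abstract refers to can be sketched for the standard (discrete-word) PLSA component. The following is a minimal numpy sketch, assuming a document-word count matrix as input; the continuous extension for visual features, which replaces the multinomial word model, is not included here:

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit standard PLSA by EM on a (docs x words) count matrix.

    Returns P(z|d) with shape (docs, topics) and P(w|z) with shape
    (topics, words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities P(z | d, w), shape (d, z, w)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate both distributions from expected counts
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z
```

The per-document topic weights P(z|d) are what an asymmetric scheme like the one described would transfer across modalities.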

2.
Objective: Because of the "semantic gap" between low-level features and high-level semantics in image retrieval, automatic image annotation has become a key problem. To narrow this gap, an automatic image annotation method combining a generative model with a discriminative model is proposed. Method: In the generative learning stage, a continuous probabilistic latent semantic analysis model is fitted to the images, yielding the model parameters and a topic distribution for each image. Taking this topic distribution as an intermediate representation vector for each image turns automatic annotation into a multi-label classification problem. In the discriminative learning stage, ensembles of classifier chains are constructed over the intermediate representation vectors; building the chains also integrates contextual information among the annotation keywords, which yields higher annotation precision and better retrieval results. Results: Experiments on two benchmark datasets show that the proposed method achieves an average precision of 0.28 and an average recall of 0.32 on the Corel5k dataset, and 0.29 and 0.18 respectively on IAPR-TC12, outperforming most state-of-the-art automatic annotation methods. The precision-recall curves also show that it outperforms several representative annotation methods. Conclusion: An automatic image annotation method based on a hybrid learning strategy is proposed that combines the strengths of generative and discriminative models and shows good effectiveness and robustness in semantic image retrieval. Beyond image retrieval and recognition, with suitable adaptation the method can also play a useful role in cross-media retrieval and data mining.
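The discriminative stage described here chains one binary classifier per keyword, each seeing the labels predicted earlier in the chain. A minimal sketch of the chain idea, using a toy gradient-descent logistic-regression base learner and illustrative data (the base learner, dimensions, and data are assumptions for demonstration, not those of the paper):

```python
import numpy as np

def fit_logreg(X, y, lr=0.5, n_iter=300):
    """Tiny logistic regression (gradient descent); returns weights incl. bias."""
    Xb = np.c_[X, np.ones(len(X))]
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_logreg(w, X):
    Xb = np.c_[X, np.ones(len(X))]
    return (1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5).astype(float)

def fit_chain(X, Y):
    """Fit one classifier per label; classifier k also sees labels 0..k-1."""
    chain = []
    for k in range(Y.shape[1]):
        feats = np.c_[X, Y[:, :k]]   # augment features with earlier labels
        chain.append(fit_logreg(feats, Y[:, k]))
    return chain

def predict_chain(chain, X):
    """Predict labels in order, feeding each prediction to later links."""
    preds = np.zeros((len(X), 0))
    for w in chain:
        yk = predict_logreg(w, np.c_[X, preds])
        preds = np.c_[preds, yk]
    return preds
```

In the paper's setting, `X` would be the PLSA topic vectors and each column of `Y` one annotation keyword; the chain ordering is what lets keyword context influence later predictions.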

3.
To better reflect the relations among annotations in semantic image annotation, relations between annotation keywords are established by analyzing the annotations of already-labeled images. On this basis, the notion of thesaurus-based querying is introduced into semantic image annotation, and a thesaurus-query-based annotation method is proposed that combines thesaurus queries and the semantic relations of images within a unified framework. Experiments on the Corel image database show that the proposed method is effective and noticeably improves the annotation rate.

4.
This paper reports on a study exploring how semantic relations can be used to expand a query for objects in an image. The study is part of a project whose overall objective is to provide semantic annotation and search facilities for a virtual collection of art resources. In this study we used semantic relations from WordNet for 15 image content queries. The results show that, next to the hyponym/hypernym relation, the meronym/holonym (part-of) relation is particularly useful in query expansion. We identified a number of relation patterns that improve recall without jeopardising precision.
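The expansion mechanism described can be sketched with a small hand-built relation table standing in for WordNet (the entries and function below are illustrative assumptions; a real system would look the relations up via a WordNet interface such as NLTK's):

```python
# Toy relation table standing in for WordNet (illustrative entries only).
RELATIONS = {
    "chair": {"hypernyms": ["furniture"],
              "meronyms": ["leg", "seat", "backrest"]},
    "church": {"hypernyms": ["building"],
               "meronyms": ["spire", "nave", "altar"]},
}

def expand_query(term, relations, use=("hypernyms", "meronyms")):
    """Return the original term plus related terms for the chosen relations."""
    expanded = [term]
    for rel in use:
        expanded.extend(relations.get(term, {}).get(rel, []))
    return expanded
```

Expanding "chair" with both relation types yields the hypernym plus the part terms, which is the pattern the study found helpful for recall.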

5.
This paper addresses the automatic image annotation problem and its application to multi-modal image retrieval. The contribution of our work is three-fold. (1) We propose a probabilistic semantic model in which the visual features and the textual words are connected via a hidden layer constituting the semantic concepts to be discovered, so as to explicitly exploit the synergy between the modalities. (2) The association of visual features and textual words is determined in a Bayesian framework, so that the confidence of the association can be provided. (3) Extensive evaluation on a large-scale, visually and semantically diverse image collection crawled from the Web is reported to evaluate the prototype system based on the model. In the proposed probabilistic model, a hidden concept layer connecting the visual feature layer and the word layer is discovered by fitting a generative model to the training images and annotation words through an Expectation-Maximization (EM) based iterative learning procedure. The evaluation of the prototype system on 17,000 images and 7736 annotation words automatically extracted from crawled Web pages for multi-modal image retrieval indicates that the proposed semantic model and the developed Bayesian framework are superior to a state-of-the-art peer system in the literature.

6.
This paper presents an emotion prediction system that can automatically predict certain human emotional concepts from a given textile image. The main application motivating this study is textile image annotation, which has recently expanded rapidly in relation to the Web. In the proposed method, color and pattern are used as cues to predict the emotional semantics associated with an image; these features are extracted using color quantization and a multi-level wavelet transform, respectively. The extracted features are then applied to three representative classifiers widely used in data mining: K-means clustering, Naïve Bayes, and a multi-layer perceptron (MLP). When the proposed emotion prediction method is evaluated on 3600 textile images, the MLP produces the best performance. The proposed MLP-based method is then compared with methods that use only color or only pattern, and it shows the best performance, with an accuracy above 92%. The results therefore confirm that the proposed method can be effectively applied to the commercial textile industry and to image retrieval.
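The color-quantization step can be sketched as a joint RGB histogram over equal-width bins; this is a minimal assumed version (bin count, layout, and normalization are illustrative, and the paper's wavelet pattern features are not shown):

```python
import numpy as np

def color_histogram(image, bins_per_channel=4):
    """Quantize each RGB channel into equal-width bins and return a
    normalized joint histogram with bins_per_channel**3 entries.

    `image` is an (H, W, 3) array with values in [0, 255].
    """
    pixels = image.reshape(-1, 3)
    q = np.clip((pixels / 256.0 * bins_per_channel).astype(int),
                0, bins_per_channel - 1)
    # Flatten the (r, g, b) bin triple into a single histogram index
    idx = (q[:, 0] * bins_per_channel + q[:, 1]) * bins_per_channel + q[:, 2]
    hist = np.bincount(idx, minlength=bins_per_channel ** 3).astype(float)
    return hist / hist.sum()
```

The resulting fixed-length vector is the kind of color feature that could be concatenated with pattern features and fed to the classifiers mentioned above.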

7.
8.
In this paper, we present the results of a project that seeks to transform low-level features into a higher level of meaning. The project concerns latent semantic indexing (LSI), a technique which, in conjunction with normalization and term weighting, has been used for full-text retrieval for many years. In this setting, LSI determines clusters of co-occurring keywords, sometimes called concepts, so that a query using a particular keyword can retrieve documents that do not contain this keyword but do contain other keywords from the same cluster. In this paper, we examine the use of this technique for content-based image retrieval, using two different approaches to image feature representation. We also study the integration of visual features and textual keywords, and the results show that it can significantly improve retrieval performance.
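The core of LSI is a truncated SVD of the term-document matrix; a minimal sketch of that projection, assuming raw counts without the normalization and term weighting the paper also applies:

```python
import numpy as np

def lsi(term_doc, k):
    """Project a (terms x docs) matrix into a k-dimensional latent space
    via truncated SVD; returns (term_vectors, doc_vectors)."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    term_vectors = U[:, :k] * s[:k]
    doc_vectors = Vt[:k].T * s[:k]
    return term_vectors, doc_vectors

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

Queries and images alike can be folded into the same latent space, which is what allows a keyword query to retrieve items described by co-occurring keywords it does not itself contain.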

9.
In order to improve the retrieval accuracy of content-based image retrieval systems, the research focus has shifted from designing sophisticated low-level feature extraction algorithms to reducing the ‘semantic gap’ between visual features and the richness of human semantics. This paper attempts to provide a comprehensive survey of recent technical achievements in high-level semantic-based image retrieval. Major recent publications are included, covering different aspects of research in this area, including low-level image feature extraction, similarity measurement, and the derivation of high-level semantic features. We identify five major categories of state-of-the-art techniques for narrowing the ‘semantic gap’: (1) using object ontologies to define high-level concepts; (2) using machine learning methods to associate low-level features with query concepts; (3) using relevance feedback to learn users’ intentions; (4) generating semantic templates to support high-level image retrieval; (5) fusing evidence from HTML text and the visual content of images for WWW image retrieval. In addition, related issues such as image test beds and retrieval performance evaluation are also discussed. Finally, based on existing technology and the demands of real-world applications, a few promising future research directions are suggested.

10.
This paper focuses on a practical design for an efficient, scalable image database and retrieval system over broadband networks. It describes a concrete solution for implementing HD/SHD (high-definition/super-high-definition) still image retrieval services that can be used in different applications. The structure of the complete system, consisting of a directory server, an image server, and MMI (man-machine interface) devices, is presented, detailing each element and its corresponding functions. The desired HD/SHD image is displayed on an HD-PDP (plasma display panel) with the aid of image matching. The proposed system generates image indices automatically, eliminating the need for special skills in preparing index images and dramatically reducing the processing time (from 35 min to 110 s), and it does not use keywords. It has also been shown that these indices can be used for quite accurate image retrieval, i.e., the system provides high precision rates (currently up to 98%) to the user, eliminating difficulties encountered in image retrieval operations due to limitations of the user’s age, culture, knowledge, and language. Because broadband IP networks currently present a number of issues for practical system operation, the requirements and issues for such networks are also discussed from the viewpoints of in-service performance, differentiation among different types of services, secure connections, and so on, focusing on the handling of HD/SHD still images.

11.
12.
With the proliferation of digital devices with advanced visual capabilities in Internet of Things (IoT) environments, the task of Image Source Identification (ISI) has become increasingly vital for legal purposes, ensuring the verification of image authenticity and integrity as well as identifying the device that captured the original scene. Over the past few decades, researchers have employed both traditional and machine-learning methods to classify image sources. In the current landscape, data-driven approaches leveraging deep learning models have emerged as powerful tools for achieving higher accuracy and precision in source prediction. The primary focus of this research is to address the complexities arising from diverse image sources and variable quality in IoT-generated multimedia data. To achieve this, a Hybrid Data Fusion Approach is introduced that leverages multiple sources of information to bolster the accuracy and robustness of ISI. This fusion methodology integrates diverse data streams from IoT devices, including metadata, sensor information, and contextual data, amalgamating them into a comprehensive dataset for analysis. The study introduces an innovative approach to ISI through a Twin Convolutional Neural Network Architecture (TCA) aimed at enhancing the efficacy of source classification. In the TCA, the first CNN, referred to as DnCNN, is employed to eliminate noise from the original dataset, generating 256 × 256 patches for both training and testing. Subsequently, the second CNN classifies images based on features extracted from various convolutional layers using a 3 × 3 filter, thereby enhancing prediction efficiency. The proposed model demonstrates exceptional accuracy in classifying image sources, showcasing its potential as a robust solution in the realm of ISI.
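The 256 × 256 patch preparation mentioned above can be sketched as simple non-overlapping tiling; this is an assumed minimal version (the abstract does not specify stride, overlap, or border handling):

```python
import numpy as np

def extract_patches(image, patch=256):
    """Tile an image into non-overlapping patch x patch blocks, dropping
    incomplete blocks at the right and bottom borders."""
    h, w = image.shape[:2]
    return [image[r:r + patch, c:c + patch]
            for r in range(0, h - patch + 1, patch)
            for c in range(0, w - patch + 1, patch)]
```

Each resulting block would be one training or test sample for the classification CNN.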
