Similar literature
1.
The semantic gap has become a bottleneck of content-based image retrieval in recent years. To bridge the gap and improve retrieval performance, automatic image annotation has emerged as a crucial problem. In this paper, a hybrid approach is proposed to learn the semantic concepts of images automatically. First, we present continuous probabilistic latent semantic analysis (PLSA) and derive its corresponding Expectation–Maximization (EM) algorithm. Continuous PLSA assumes that elements are sampled from a multivariate Gaussian distribution given a latent aspect, instead of the multinomial distribution used in traditional PLSA. Furthermore, we propose a hybrid framework that employs continuous PLSA to model the visual features of images in the generative learning stage and uses ensembles of classifier chains to classify the multi-label data in the discriminative learning stage. The framework can therefore learn the correlations between features as well as the correlations between words. Because the hybrid approach combines the advantages of generative and discriminative learning, it can predict semantic annotations precisely for unseen images. Finally, we conduct experiments on three baseline datasets, and the results show that our approach outperforms many state-of-the-art approaches.

2.
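Automatic image annotation by fusing semantic topics   Total citations: 7, self-citations: 0, citations by others: 7
Because of the semantic gap, automatic image annotation has become an important research topic. Building on probabilistic latent semantic analysis, this paper proposes a method that fuses semantic topics for image annotation and retrieval. First, to model the training data more accurately, the visual features of each image are represented as a visual "bag of words". A probabilistic model is then designed to capture latent semantic topics from the visual and textual modalities separately, and an adaptive asymmetric learning method is proposed to fuse the two kinds of topics. For each image document, its topic distributions in the two modalities are fused by weighting, with the weights determined by the entropy of the document's visual-word distribution. The fused probabilistic model thus appropriately associates the information in the visual and textual modalities and can predict semantic annotations for unseen images well. The proposed method is compared with several state-of-the-art image annotation methods on a standard Corel image dataset. Experimental results show that it achieves better annotation and retrieval performance.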

3.
In this paper, we consider the problem of clustering and re-ranking web image search results so as to improve diversity at high ranks. We propose a novel ranking framework, cluster-constrained conditional Markov random walk (CCCMRW), which has two key steps: first, cluster images into topics; then, perform a Markov random walk in an image graph conditioned on the image cluster information. To cluster the retrieval results of web images, a novel graph clustering model is proposed. We explore the surrounding text to mine the correlations between words and images and use these correlations to improve the clustering results. Two kinds of correlations are considered: word-to-image and word-to-word. As a standard text processing technique, the tf-idf method cannot measure the word-to-image correlation directly. We therefore propose to combine the tf-idf method with a novel word feature, visibility, to infer the word-to-image correlation. Using the latent Dirichlet allocation model, we define a topic relevance function to compute the weights of the word-to-word correlations. Taking word-to-image correlations as heterogeneous links and word-to-word correlations as homogeneous links, graph clustering algorithms such as complex graph clustering and spectral co-clustering are used to cluster images into topics. To perform CCCMRW, a two-layer image graph is constructed, with image cluster nodes added as an upper layer to a base image graph. Conditioned on the image cluster information from the upper layer, the Markov random walk is constrained to favor walking across different image clusters, which gives high rank scores to images of different topics and therefore increases diversity. Encouraging clustering and re-ranking results on Google image search results are reported.
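As a rough illustration of the random-walk scoring step described above, the following is a minimal sketch of a plain Markov random walk with uniform restart on a weighted image graph. It is not the cluster-constrained CCCMRW itself: the function name `random_walk_scores`, the `damping` parameter, and the uniform restart are assumptions introduced here for illustration.

```python
def random_walk_scores(weights, damping=0.85, n_iter=100):
    """Score graph nodes by a random walk with uniform restart.

    weights[i][j] is a non-negative edge weight (e.g. image similarity)
    from node i to node j. Hypothetical helper, not the paper's CCCMRW.
    """
    n = len(weights)
    # Row-normalize the weights into transition probabilities.
    # A node without outgoing edges jumps uniformly to any node.
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([v / total for v in row] if total > 0 else [1.0 / n] * n)

    scores = [1.0 / n] * n
    for _ in range(n_iter):
        # One power-iteration step: restart mass plus mass flowing along edges.
        scores = [
            (1.0 - damping) / n
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


if __name__ == "__main__":
    # Star graph: node 0 is linked to every other node.
    star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
    print(random_walk_scores(star))
```

A cluster-constrained variant would modify the transition probabilities so that steps between different clusters are favored; how exactly that is done is not stated in the abstract.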

4.
Automatic image annotation has become an important and challenging problem because of the semantic gap. In this paper, we first extend probabilistic latent semantic analysis (PLSA) to model continuous quantities, and derive the corresponding Expectation-Maximization (EM) algorithm to determine the model parameters. Furthermore, to handle data of different modalities according to their characteristics, we present a semantic annotation model that employs continuous PLSA and standard PLSA to model visual features and textual words, respectively. The model learns the correlation between these two modalities by an asymmetric learning approach and can then predict semantic annotations precisely for unseen images. Finally, we compare our approach with several state-of-the-art approaches on the Corel5k and Corel30k datasets. The experimental results show that our approach performs more effectively and accurately.

5.
Probabilistic latent semantic analysis (PLSA) is a method for computing term and document relationships from a document set. The probabilistic latent semantic index (PLSI) has been used to store PLSA information, but the PLSI uses excessive storage space relative to a simple term frequency index, which causes lengthy query times. To overcome the storage and speed problems of PLSI, we introduce the probabilistic latent semantic thesaurus (PLST), an efficient and effective method of storing the PLSA information. We show that through methods such as document thresholding and term pruning, we are able to maintain the high precision found using PLSA while using a very small fraction (0.15%) of the storage space of PLSI.
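Since several of the abstracts in this list build on PLSA, a minimal sketch of standard (multinomial) PLSA fitted with EM may help. It assumes the usual factorization P(w|d) = Σ_z P(z|d) P(w|z); the function name `plsa`, its parameters, and the random initialization are illustrative choices, not taken from any of the listed papers, and none of the storage techniques (PLSI, PLST) are modeled.

```python
import random


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit multinomial PLSA to a document-term count matrix with EM.

    counts[d][w] is the count of word w in document d.
    Returns (p_z_d, p_w_z): P(z|d) per document and P(w|z) per topic.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        return [v / total for v in row]

    # Random positive initialization of both conditional distributions.
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        acc_zd = [[0.0] * n_topics for _ in range(n_docs)]
        acc_wz = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z | d, w) up to normalization.
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                if total == 0:
                    continue
                for z in range(n_topics):
                    resp = n * joint[z] / total
                    acc_zd[d][z] += resp
                    acc_wz[z][w] += resp
        # M-step: renormalize the expected counts (keep old values if empty).
        p_z_d = [normalize(r) if sum(r) > 0 else p_z_d[d]
                 for d, r in enumerate(acc_zd)]
        p_w_z = [normalize(r) if sum(r) > 0 else p_w_z[z]
                 for z, r in enumerate(acc_wz)]
    return p_z_d, p_w_z


if __name__ == "__main__":
    toy = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 0, 2, 5]]
    doc_topics, topic_words = plsa(toy, n_topics=2)
    print(doc_topics)
```

EM converges only to a local optimum, so results depend on the seed; in practice several restarts are usually compared by likelihood.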

6.
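To mine users' points of interest accurately, the probabilistic latent semantic analysis (PLSA) model is first used to project the "web page-term" matrix vectors into a probabilistic latent semantic vector space, and an "automatic similarity threshold selection" method is proposed to obtain the similarity threshold between web pages. Finally, a hierarchical agglomerative k-medoids algorithm (HAK medoids), which combines flat partitioning with agglomerative hierarchical clustering, is proposed to cluster users' points of interest. Experimental results show that HAK medoids clusters better than traditional partition-based algorithms. The proposed clustering technique can also improve the efficiency of personalized recommendation and search in personalized services.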

7.
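Traditional latent semantic analysis (LSA) cannot capture the spatial distribution of objects in a scene or the discriminative information of latent topics. To address this problem, a scene classification method based on multi-scale spatial discriminative probabilistic latent semantic analysis (PLSA) is proposed. First, the image is partitioned at multiple spatial scales with the spatial pyramid method to obtain spatial information, and a PLSA model is applied to obtain the latent semantic information of each local block. The semantic information of the individual local blocks is then concatenated to give the multi-scale spatial latent semantic information of the image. Finally, a proposed weight learning method learns the discriminative information among different image topics, yielding the multi-scale spatial discriminative latent semantic information of the image, and the learned weights are embedded in a support vector machine (SVM) classifier to perform scene classification. Experiments on three commonly used scene image datasets (Scene-13, Scene-15, and Caltech-101) show that the method's average classification accuracy exceeds that of many existing state-of-the-art methods, confirming its effectiveness and robustness.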

8.
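To reduce the loss of image information and the effect of the semantic gap in image retrieval, an automatic image annotation method based on multi-feature fusion and PLSA-GMM is proposed. First, color, shape, and texture features are extracted from the image and fused to form its low-level features. Then, probabilistic latent semantic analysis (PLSA) and a Gaussian mixture model (GMM) are used to link the low-level image features, visual semantic topics, and annotation keywords, and images are annotated automatically with the resulting model. The method is validated on the Corel 5k database, and the experimental results demonstrate its effectiveness.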

9.
10.
In this paper, a statistical model called statistical local spatial relations (SLSR) is presented as a novel learning model that uses spatial and statistical information for semantic image classification. The model is inspired by probabilistic latent semantic analysis (PLSA) for text mining. In text analysis, PLSA is used to discover topics in a corpus using the bag-of-words document representation. In SLSR, we treat image categories as topics, so an image containing instances of multiple categories can be modeled as a mixture of topics. More significantly, SLSR introduces spatial relation information as a factor that is not present in PLSA. SLSR is invariant to rotation, scale, translation, and affine transformations and can handle partial occlusion. Using the Dirichlet process and a variational Expectation-Maximization learning algorithm, SLSR is developed into an image classification algorithm. SLSR uses an unsupervised process that captures spatial relations and statistical information simultaneously. Experiments on some standard data sets show that SLSR is a promising model for semantic image classification problems.
Wenhui Li (corresponding author)

Dongfeng Han received the B.Sc. in 2002 and the M.S. in 2005 in computer science and technology from Jilin University, Changchun, P. R. China. Since 2005, he has been pursuing the PhD degree in computer science and technology at Jilin University. His research interests include computer vision, image processing, machine learning and pattern recognition. Wenhui Li received the PhD degree in computer science from Jilin University in 1996. He is now a professor at Jilin University. His research interests include computer vision, computer graphics and virtual reality. Zongcheng Li is an undergraduate student at Shandong University of Technology, P. R. China. His research interests include computer vision and image processing.

11.
A thousand words in a scene   Total citations: 2, self-citations: 0, citations by others: 2
This paper presents a novel approach for visual scene modeling and classification, investigating the combined use of text modeling methods and local invariant features. Our work attempts to elucidate (1) whether a textlike bag-of-visterms (BOV) representation (histogram of quantized local visual features) is suitable for scene (rather than object) classification, (2) whether some analogies between discrete scene representations and text documents exist, and (3) whether unsupervised, latent space models can be used both as feature extractors for the classification task and to discover patterns of visual co-occurrence. Using several data sets, we validate our approach, presenting and discussing experiments on each of these issues. We first show, with extensive experiments on binary and multiclass scene classification tasks using a 9,500-image data set, that the BOV representation consistently outperforms classical scene classification approaches. In other data sets, we show that our approach competes with or outperforms other recent, more complex methods. We also show that probabilistic latent semantic analysis (PLSA) generates a compact scene representation, is discriminative for accurate classification, and is more robust than the BOV representation when less labeled training data is available. Finally, through aspect-based image ranking experiments, we show the ability of PLSA to automatically extract visually meaningful scene patterns, making such a representation useful for browsing image collections.
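The bag-of-visterms representation described above (a histogram of quantized local visual features) can be sketched in a few lines. The sketch assumes a codebook of visterm centroids has already been learned, for example by k-means, and uses squared Euclidean distance for quantization; the name `bov_histogram` and the toy 2-D descriptors are hypothetical.

```python
def bov_histogram(descriptors, codebook):
    """Build a normalized bag-of-visterms histogram for one image.

    descriptors: local feature vectors extracted from the image.
    codebook: visterm centroids (assumed to be learned beforehand).
    """
    hist = [0] * len(codebook)
    for desc in descriptors:
        # Quantize: assign the descriptor to its nearest centroid.
        nearest = min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(desc, codebook[k])),
        )
        hist[nearest] += 1
    total = sum(hist)
    # Normalize so that images with different numbers of features compare.
    return [h / total for h in hist] if total else [0.0] * len(codebook)


if __name__ == "__main__":
    codebook = [(0.0, 0.0), (10.0, 10.0)]
    descriptors = [(1.0, 0.0), (0.0, 2.0), (9.0, 9.0), (11.0, 10.0)]
    print(bov_histogram(descriptors, codebook))
```

The resulting histogram plays the role of a document's word counts, which is what lets text models such as PLSA be applied to images.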

12.
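To reduce the effect of the semantic gap in image retrieval, an automatic image annotation method based on visual semantic topics is proposed. First, the foreground and background regions of the image are extracted and preprocessed separately. Then, probabilistic latent semantic analysis and a Gaussian mixture model are used to link the low-level image features, visual semantic topics, and annotation keywords, and images are annotated automatically with the resulting model. The method is validated on the Corel 5 database, and the experimental results demonstrate its effectiveness.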

13.
Two novel word clustering techniques are proposed which employ long distance bigram language models. The first technique is built on a hierarchical clustering algorithm and minimizes the sum of Mahalanobis distances of all words after a cluster merger from the centroid of the class created by merging. The second technique resorts to probabilistic latent semantic analysis (PLSA). Next, interpolated long distance bigrams are considered in the context of the aforementioned clustering techniques. Experiments conducted on the English Gigaword corpus (second edition) demonstrate that: (1) the long distance bigrams, when employed in the two clustering techniques under study, yield word clusters of better quality than the baseline bigrams; (2) the interpolated long distance bigrams outperform the long distance bigrams in the same respect; (3) the long distance bigrams perform better than the bigrams that incorporate trigger-pairs selected at various distances; and (4) the best word clustering is achieved by the PLSA that employs interpolated long distance bigrams. Both proposed techniques outperform spectral clustering based on k-means. To assess the quality of the created clusters objectively, relative cluster validity indices are estimated, and the average cluster sense precision, the average cluster sense recall, and the F-measure are computed using ground truth extracted from WordNet.

14.
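Traditional image retrieval systems return results that lack diversity of semantic topics when images are searched by keyword. To address this, a representative image selection algorithm based on mutual nearest-neighbor consistency and affinity propagation is proposed, which selects for each query a set of relevant images covering different semantic topics. The algorithm uses mutual nearest-neighbor consistency to adjust the similarity between images, applies affinity propagation (AP) clustering to divide the image set into clusters, and finally ranks the clusters to select representative clusters, taking the center image of each as a representative image. Experiments show that the method outperforms methods based on K-means and on Greedy K-means, and that the selected images summarize the content of the source image set intuitively and effectively while being semantically diverse.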

15.
Many organizations rely on relational database platforms for OLAP-style querying (aggregation and filtering) for small to medium size applications. We investigate the impact of scaling up the data sizes for such queries, with the aim of illustrating what kind of performance an organization could expect if it migrated current applications to big data environments. This paper benchmarks the performance of Hive (Thusoo et al., 2009) [9], a parallel data warehouse platform that is part of the Hadoop software stack. We set up a 4-node Hadoop cluster using Hortonworks HDP 1.3.2. We use the data generator provided by the TPC-DS benchmark (DSGen v1.1.0) to generate data at different scales. We compare the performance of loading data and querying for SQL and Hive Query Language (HiveQL) on a relational database installation (MySQL) and on a Hive cluster, respectively. We measure the speedup for query execution for three dataset sizes resulting from the scale up. Hive loads the large datasets faster than MySQL, while it is marginally slower than MySQL when loading the smaller datasets. Query execution in Hive is also faster. We also investigate executing Hive queries concurrently in workloads and conclude that serial execution of queries is a much better practice for clusters with limited resources.

16.
Yang CC, Chang HC. 《Applied Ergonomics》, 2012, 43(6): 1072-1080
Collecting affective responses (ARs) from consumers is crucial to designers aspiring to produce an appealing product. Researchers frequently use adjectives as a means by which consumers can describe their subjective feelings about a specific product design. This study proposes a Kansei engineering (KE) approach for selecting representative affective dimensions using factor analysis (FA) and Procrustes analysis (PA). A semantic differential (SD) experiment is used to examine consumers' ARs toward a set of representative product samples. FA is employed to extract the underlying latent factors using an initial set of affective dimensions. A backward elimination process based on PA is used to determine the relative significance of adjectives in each step according to the calculated residual sum of squared differences (RSSDs), which finally yields a ranking of the initial set of adjectives. Additionally, the results of the proposed approach are compared with those of a method that combines FA and two-stage cluster analysis (CA). A case study of mobile phone design is provided to demonstrate the analysis results.

17.
《Computer Networks》2007,51(11):3252-3264
We consider the problem of maintaining the prescribed event sensing reliability while maximizing cluster and network lifetime in a multi-cluster 802.15.4 sensor network. Clusters are connected through bridges which also act as cluster coordinators; both ordinary nodes and bridges resolve contention using the CSMA-CA algorithm. Cluster lifetime is maximized through the use of redundant sensors which are periodically sent to sleep using a simple distributed activity management algorithm. Network lifetime is maximized by equalizing energy consumption per backoff period in all clusters through the adjustment of the number of nodes. We model this problem analytically using the datasheet for the tmote_sky ultra low power IEEE 802.15.4 compliant wireless sensor module [tmote sky lowpower wireless sensor module, Moteiv Corporation, San Francisco, CA, www.moteiv.com, tmote datasheet 802.15.4, 2006] and derive the probability distribution of the network lifetime. We also derive the expression for the node count that compensates for the increased load due to contention caused by the bridge. Numerical results show that this technique easily equalizes cluster lifetimes.

18.
A unified approach to ranking in probabilistic databases   Total citations: 1, self-citations: 0, citations by others: 1
Ranking is a fundamental operation in data analysis and decision support and plays an even more crucial role if the dataset being explored exhibits uncertainty. This has led to much work in recent years on understanding how to rank the tuples in a probabilistic dataset. In this article, we present a unified approach to ranking and top-k query processing in probabilistic databases by viewing it as a multi-criterion optimization problem and by deriving a set of features that capture the key properties of a probabilistic dataset that dictate the ranked result. We contend that a single, specific ranking function may not suffice for probabilistic databases, and we instead propose two parameterized ranking functions, called PRF^ω and PRF^e, that generalize or can approximate many of the previously proposed ranking functions. We present novel generating functions-based algorithms for efficiently ranking large datasets according to these ranking functions, even if the datasets exhibit complex correlations modeled using probabilistic and/xor trees or Markov networks. We further propose that the parameters of the ranking function be learned from user preferences, and we develop an approach to learn those parameters. Finally, we present a comprehensive experimental study that illustrates the effectiveness of our parameterized ranking functions, especially PRF^e, at approximating other ranking functions and the scalability of our proposed algorithms for exact or approximate ranking.
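To make the generating-function idea above concrete, here is a minimal sketch for the simplest case only: independent tuples, each present with its own probability, sorted by descending score. Expanding the product of (1 - p_k + p_k x) over higher-scored tuples gives the distribution of how many of them exist, and hence each tuple's rank distribution. The exponential weight α^j on rank j used in `exp_weighted_score` is an assumption about the general shape of such a parameterized function; the abstract does not give the definitions, and correlated data (and/xor trees, Markov networks) is not handled here.

```python
def rank_distribution(probs):
    """Rank distributions for independent tuples sorted by descending score.

    probs[i] is the existence probability of the i-th highest-scored tuple.
    Returns dist where dist[i][j] = P(tuple i exists and has rank j + 1).
    """
    dist = []
    # poly[c] = P(exactly c of the higher-scored tuples exist), i.e. the
    # coefficients of the product of (1 - p_k + p_k * x) seen so far.
    poly = [1.0]
    for p in probs:
        dist.append([p * coeff for coeff in poly])
        # Multiply the polynomial by (1 - p + p * x) for the next tuple.
        nxt = [0.0] * (len(poly) + 1)
        for c, coeff in enumerate(poly):
            nxt[c] += coeff * (1.0 - p)
            nxt[c + 1] += coeff * p
        poly = nxt
    return dist


def exp_weighted_score(probs, alpha):
    """Weight rank j by alpha**j and sum over each tuple's rank distribution.

    The exponential weighting is an illustrative assumption.
    """
    return [
        sum(v * alpha ** (j + 1) for j, v in enumerate(row))
        for row in rank_distribution(probs)
    ]


if __name__ == "__main__":
    print(rank_distribution([0.5, 1.0]))
    print(exp_weighted_score([0.5, 1.0], alpha=0.5))
```

With `alpha = 1` the score reduces to each tuple's existence probability, while smaller values of `alpha` increasingly reward tuples that are likely to appear near the top.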

19.
Online photo collections have become truly gigantic. Photo sharing sites such as Flickr (http://www.flickr.com/) host billions of photographs, a large portion of which are contributed by tourists. In this paper, we leverage online photo collections to automatically rank canonical views for tourist attractions. Ideal canonical views for a tourist attraction should both be representative of the site and exhibit a diverse set of views (Kennedy and Naaman, International Conference on World Wide Web 297–306, 2008). In order to meet both goals, we rank canonical views in two stages. During the first stage, we use visual features to encode the content of photographs and infer the popularity of each photograph. During the second stage, we rank photographs using a suppression scheme to keep popular views top-ranked while demoting duplicate views. After a ranking is generated, canonical views at various granularities can be retrieved in real-time, which advances over previous work and is a promising feature for real applications. In order to scale canonical view ranking to gigantic online photo collections, we propose to leverage geo-tags (latitudes/longitudes of the location of the scene in the photographs) to speed up the basic algorithm. We preprocess the photo collection to extract subsets of photographs that are geographically clustered (or geo-clusters), and constrain the expensive visual processing within each geo-cluster. We test the algorithm on two large Flickr data sets of Rome and the Yosemite national park, and show promising results on canonical view ranking. For quantitative analysis, we adopt two medium data sets and conduct a subjective comparison with previous work. It shows that while both algorithms are able to produce canonical views of high quality, our algorithm has the advantage of responding in real-time to canonical view retrieval at various granularities.
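The second-stage idea of keeping popular views on top while demoting duplicates can be illustrated with a simple greedy loop. This is only a sketch of one possible suppression rule, in which each remaining score is multiplied by one minus its similarity to the photo just selected; the abstract does not specify the actual scheme, and the name `rank_with_suppression` and the toy numbers are hypothetical.

```python
def rank_with_suppression(popularity, similarity):
    """Greedily rank items, demoting near-duplicates of already chosen ones.

    popularity[i]: initial score of photo i.
    similarity[i][j]: visual similarity in [0, 1] between photos i and j.
    Returns the photo indices in ranked order.
    """
    remaining = dict(enumerate(popularity))
    order = []
    while remaining:
        # Pick the highest current score; ties go to the lower index.
        best = max(remaining, key=lambda i: (remaining[i], -i))
        order.append(best)
        del remaining[best]
        # Suppress photos that look like the one just selected.
        for i in remaining:
            remaining[i] *= 1.0 - similarity[best][i]
    return order


if __name__ == "__main__":
    popularity = [0.9, 0.85, 0.5]
    similarity = [
        [1.0, 0.95, 0.1],
        [0.95, 1.0, 0.1],
        [0.1, 0.1, 1.0],
    ]
    # Photo 1 is a near-duplicate of photo 0, so photo 2 overtakes it.
    print(rank_with_suppression(popularity, similarity))
```

Because the ranking is a single ordered list, taking its first k entries gives a view set at any chosen granularity without recomputation.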

20.
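Determining the sentiment orientation of words based on probabilistic latent semantic analysis   Total citations: 2, self-citations: 0, citations by others: 2
This paper uses probabilistic latent semantic analysis to give two methods for determining the sentiment orientation of words. The first uses probabilistic latent semantic analysis to obtain a similarity matrix between target words and benchmark words and then decides the sentiment orientation by voting. The second uses probabilistic latent semantic analysis to obtain semantic clusters of the target words and then draws on a synonym-based method for determining word sentiment orientation to classify the target words. An advantage of both methods is that they determine the sentiment orientation of words without any external resources.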
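The first method (similarity to benchmark words followed by voting) might look roughly like the sketch below. It assumes the similarities between a target word and each benchmark word have already been computed, for instance from PLSA topic distributions, and lets the k most similar benchmark words cast one vote each; the value of k, the tie handling, and the name `vote_polarity` are assumptions, since the abstract does not describe the voting rule.

```python
def vote_polarity(similarities, seed_polarity, k=3):
    """Decide a target word's orientation by voting among benchmark words.

    similarities: benchmark word -> similarity to the target word.
    seed_polarity: benchmark word -> +1 (positive) or -1 (negative).
    """
    # The k benchmark words most similar to the target each cast one vote.
    voters = sorted(similarities, key=similarities.get, reverse=True)[:k]
    score = sum(seed_polarity[word] for word in voters)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    seeds = {"good": 1, "excellent": 1, "bad": -1, "terrible": -1}
    sims = {"good": 0.8, "excellent": 0.7, "bad": 0.2, "terrible": 0.1}
    print(vote_polarity(sims, seeds))
```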
