Similar literature
1.
In this paper we explore the benefits of latent variable modelling of clickthrough data in the domain of image retrieval. Clicks in image search logs are regarded as implicit relevance judgements that express both user intent and important relations between selected documents. We posit that clickthrough data contains hidden topics and can be used to infer a lower dimensional latent space that can be subsequently employed to improve various aspects of the retrieval system. We use a subset of a clickthrough corpus from the image search portal of a news agency to evaluate several popular latent variable models in terms of their ability to model topics underlying queries. We demonstrate that latent variable modelling reveals underlying structure in clickthrough data and our results show that computing document similarities in the latent space improves retrieval effectiveness compared to computing similarities in the original query space. These results are compared with baselines using visual and textual features. We show performance substantially better than the visual baseline, which indicates that content-based image retrieval systems that do not exploit query logs could improve recall and precision by taking this historical data into account.

2.
Search engines are useful because they allow the user to find information of interest from the World Wide Web (WWW). However, most of the popular search engines today are textual; they do not allow the user to find images from the web. For effective retrieval, determining the semantics of the images is essential. In this paper, we describe the problems in determining the semantics of images on the WWW and the approach of AMORE, a WWW search engine that we have developed. AMORE's techniques can be extended to other media like audio and video. We explain how we assign keywords to the images based on HTML pages and the method to determine similar images based on the assigned text. We also discuss some statistics showing the effectiveness of our technique. Finally, we present the visual interface of AMORE with the help of several retrieval scenarios.

3.
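Research on a Content-Based Video Query System   Cited by: 1 (self-citations: 0, citations by others: 1)
Because the efficiency of managing and retrieving multimedia databases directly determines how efficiently people can use multimedia information, content-based image/video storage and retrieval has become a research hotspot since the MPEG-7 standard was proposed. To support fast video browsing and retrieval, and building on a study of key problems in content-based video database management and retrieval, we first use MPEG-7 visual content descriptors and semantic descriptors to construct the semantic structure of the video database, and we combine low-level visual features with high-level semantic features, using a relevance feedback mechanism and a semi-automatic weight update scheme to manage and retrieve the video database. We then use a parser to support natural-language queries. Finally, on this basis we implement a content-based video database management and query system. Experiments show that the system manages and retrieves video data effectively and exhibits a degree of intelligence and adaptability.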

4.
We present a new text-to-image re-ranking approach for improving the relevancy rate in searches. In particular, we focus on the fundamental semantic gap that exists between the low-level visual features of the image and high-level textual queries by dynamically maintaining a connected hierarchy in the form of a concept database. For each textual query, we take the results from popular search engines as an initial retrieval, followed by a semantic analysis to map the textual query to higher level concepts. In order to do this, we design a two-layer scoring system which can identify the relationship between the query and the concepts automatically. We then calculate the image feature vectors and compare them with the classifier for each related concept. An image is relevant only when it is related to the query both semantically and content-wise. The second feature of this work is that we loosen the requirement for query accuracy from the user, which makes it possible to perform well on users’ queries containing less relevant information. Thirdly, the concept database can be dynamically maintained to satisfy the variations in user queries, which eliminates the need for human labor in building a sophisticated initial concept database. We designed our experiment using complex queries (based on five scenarios) to demonstrate how our retrieval results are a significant improvement over those obtained from current state-of-the-art image search engines.

5.
Web image retrieval using majority-based ranking approach   Cited by: 1 (self-citations: 0, citations by others: 1)
Web image retrieval has characteristics different from typical content-based image retrieval; web images have associated textual cues. However, a web image retrieval system often yields undesirable results, because it uses limited text information such as surrounding text, URLs, and image filenames. In this paper, we propose a new approach to retrieval, which uses the image content of retrieved results without relying on assistance from the user. Our basic hypothesis is that more popular images have a higher probability of being the ones that the user wishes to retrieve. According to this hypothesis, we propose a retrieval approach that is based on the majority of the images under consideration. We define four methods for finding the visual features of the majority of images: (1) the majority-first method, (2) the centroid-of-all method, (3) the centroid-of-top-K method, and (4) the centroid-of-largest-cluster method. In addition, we implement a graph/picture classifier for improving the effectiveness of web image retrieval. We evaluate the retrieval effectiveness of both our methods and conventional ones by using precision and recall graphs. Experimental results show that the proposed methods are more effective than conventional keyword-based retrieval methods.
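The abstract names a centroid-of-all method without giving its details. The sketch below shows one plausible reading, under assumptions that are not from the paper: each retrieved image is represented by a numeric feature vector, and images are re-ranked by Euclidean distance to the mean of all vectors. The function names and the toy features are hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def rank_by_centroid(features):
    """Return image ids ordered by Euclidean distance to the centroid, closest first.

    Assumption: closeness to the centroid of all retrieved images stands in
    for "belonging to the majority"; the paper's exact criterion may differ.
    """
    c = centroid(list(features.values()))
    return sorted(features, key=lambda image_id: math.dist(features[image_id], c))

# Hypothetical 2-D visual features for four retrieved images; "d" is an outlier.
features = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [0.0, 1.0],
    "d": [10.0, 10.0],
}
ranking = rank_by_centroid(features)
```

With these toy values the outlier "d" lands last in `ranking`, which matches the intuition that images far from the bulk of the result set are less likely to be what the user wants.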

6.
7.
8.
9.
System performance assessment and comparison are fundamental for large-scale image search engine development. This article documents a set of comprehensive empirical studies that explore the effects of multiple sources of query evidence on large-scale social image search. Search performance based on social tags, different kinds of visual features, and their combinations is systematically studied and analyzed. To quantify visual query complexity, a novel quantitative metric is proposed and applied to assess the influence of different visual queries based on their complexity levels. We also study the effects on retrieval performance of automatic text query expansion with social tags using a pseudo relevance feedback method. Our analysis of the experimental results yields a few key findings: (1) social tag-based retrieval methods can achieve much better results than content-based retrieval methods; (2) a combination of textual and visual features can significantly and consistently improve search performance; (3) the complexity of image queries correlates strongly with the quality of retrieval results, with more complex queries leading to poorer search effectiveness; and (4) query expansion based on social tags frequently causes search topic drift and consequently degrades performance.

10.
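To address the "semantic gap" problem in traditional CBIR systems, an image retrieval method that combines semantic features and visual features is proposed. The semantic and visual feature data of an image are combined into a single index vector for content-based image retrieval. The system uses latent semantic indexing (LSI) to extract the semantic features of images and extracts color histograms as their visual features. By combining the low-level visual features of images with their semantic statistical features in the vector space...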
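The abstract says that semantic and visual features are placed in a single index vector but is cut off before describing how. The sketch below is only an illustration under stated assumptions: the semantic part is taken as an already computed low-dimensional vector (the LSI step itself is not shown), the visual part is a color histogram, both are L2-normalized, and a hypothetical weight `alpha` balances them before concatenation. None of these choices is confirmed by the source.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def index_vector(semantic, histogram, alpha=0.5):
    """Concatenate weighted semantic and visual parts into one index vector.

    alpha is a hypothetical weight: 1.0 uses only semantics, 0.0 only color.
    """
    s = [alpha * x for x in l2_normalize(semantic)]
    h = [(1.0 - alpha) * x for x in l2_normalize(histogram)]
    return s + h

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

# Toy data: 2-D semantic vectors and 3-bin color histograms (all made up).
query = index_vector([0.9, 0.1], [8, 1, 1])
sunset = index_vector([0.8, 0.2], [7, 2, 1])
forest = index_vector([0.1, 0.9], [1, 8, 1])
```

Here `cosine(query, sunset)` is larger than `cosine(query, forest)`, because the first pair agrees in both the semantic and the color part. Normalizing each part first keeps one feature type from dominating only because of its scale.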

11.
This paper investigates the problem of modeling Internet images and associated text or tags for tasks such as image-to-image search, tag-to-image search, and image-to-tag search (image annotation). We start with canonical correlation analysis (CCA), a popular and successful approach for mapping visual and textual features to the same latent space, and incorporate a third view capturing high-level image semantics, represented either by a single category or multiple non-mutually-exclusive concepts. We present two ways to train the three-view embedding: supervised, with the third view coming from ground-truth labels or search keywords; and unsupervised, with semantic themes automatically obtained by clustering the tags. To ensure high accuracy for retrieval tasks while keeping the learning process scalable, we combine multiple strong visual features and use explicit nonlinear kernel mappings to efficiently approximate kernel CCA. To perform retrieval, we use a specially designed similarity function in the embedded space, which substantially outperforms the Euclidean distance. The resulting system produces compelling qualitative results and outperforms a number of two-view baselines on retrieval tasks on three large-scale Internet image datasets.

12.
Ontologies have been intensively applied for improving multimedia search and retrieval by providing explicit meaning to visual content. Several multimedia ontologies have been recently proposed as knowledge models suitable for narrowing the well known semantic gap and for enabling the semantic interpretation of images. Since these ontologies have been created in different application contexts, establishing links between them, a task known as ontology matching, promises to fully unlock their potential in support of multimedia search and retrieval. This paper proposes and compares empirically two extensional ontology matching techniques applied to an important semantic image retrieval issue: automatically associating common-sense knowledge to multimedia concepts. First, we extend a previously introduced textual concept matching approach to use both textual and visual representation of images. In addition, a novel matching technique based on a multi-modal graph is proposed. We argue that the textual and visual modalities have to be seen as complementary rather than as exclusive sources of extensional information in order to improve the efficiency of the application of an ontology matching approach in the multimedia domain. An experimental evaluation is included in the paper.

13.
This paper presents a unified annotation and retrieval framework, which integrates region annotation with image retrieval for performance reinforcement. To integrate semantic annotation with region-based image retrieval, visual and textual fusion is proposed for both soft matching and Bayesian probabilistic formulations. To address sample insufficiency and sample asymmetry in the annotation classifier training phase, we present a region-level multi-label image annotation scheme based on pair-wise coupling support vector machine (SVM) learning. In the retrieval phase, to achieve semantic-level region matching we present a novel retrieval scheme which differs from former work: the query example uploaded by users is automatically annotated online, and the user can judge its annotation quality. Based on the user’s judgment, two novel schemes are deployed for semantic retrieval: (1) if the user judges the photo to be well annotated, Semantically supervised Integrated Region Matching is adopted, which is a keyword-integrated soft region matching method; (2) if the user judges the photo to be poorly annotated, Keyword Integrated Bayesian Reasoning is adopted, which is a natural integration of a Visual Dictionary in online content-based search. In the relevance feedback phase, we conduct both visual and textual learning to capture the user’s retrieval target. Better annotation and retrieval performance than current methods was reported on both the COREL 10,000 and Flickr web image (25,000 images) databases, which demonstrates the effectiveness of the proposed framework.

14.
15.
We discuss an adaptive approach towards Content-Based Image Retrieval. It is based on the Ostensive Model of developing information needs, a special kind of relevance feedback model that learns from implicit user feedback and adds a temporal notion to relevance. The ostensive approach supports content-assisted browsing through visualising the interaction by adding user-selected images to a browsing path, which ends with a set of system recommendations. The suggestions are based on an adaptive query learning scheme, in which the query is learnt from previously selected images. Our approach is an adaptation of the original Ostensive Model based on textual features only, to include content-based features to characterise images. In the proposed scheme textual and colour features are combined using the Dempster-Shafer theory of evidence combination. Results from a user-centred, work-task oriented evaluation show that the ostensive interface is preferred over a traditional interface with manual query facilities. This is due to its ability to adapt to the user's need, its intuitiveness and the fluid way in which it operates. Studying and comparing the nature of the underlying information need, it emerges that our approach elicits changes in the user's need based on the interaction, and is successful in adapting the retrieval to match the changes. In addition, a preliminary study of the retrieval performance of the ostensive relevance feedback scheme shows that it can outperform a standard relevance feedback strategy in terms of image recall in category search.
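The abstract states that textual and colour features are combined with the Dempster-Shafer theory of evidence, without giving the mass functions used. As a minimal sketch, the code below applies Dempster's standard rule of combination to two made-up mass functions over a hypothetical two-element frame (relevant / not relevant). The frame, the mass values, and the variable names are assumptions for illustration, not the paper's setup.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass.
    Mass assigned to empty intersections is treated as conflict and
    normalized away.
    """
    raw = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            raw[intersection] = raw.get(intersection, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: w / (1.0 - conflict) for s, w in raw.items()}

# Hypothetical frame of discernment for a single image.
REL = frozenset({"relevant"})
NOT = frozenset({"not_relevant"})
ANY = REL | NOT  # mass on the whole frame expresses ignorance

# Made-up evidence from a textual and a colour feature.
text_evidence = {REL: 0.6, ANY: 0.4}
colour_evidence = {REL: 0.5, NOT: 0.2, ANY: 0.3}
combined = combine(text_evidence, colour_evidence)
```

In this toy case the conflicting mass is 0.6 × 0.2 = 0.12, so every surviving product is divided by 0.88. The combined belief in "relevant" is 0.68 / 0.88, higher than either source alone, because the two sources mostly agree.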

16.
17.
Video visualization is a computation process that extracts meaningful information from original video data sets and conveys the extracted information to users in appropriate visual representations. This paper presents a broad treatment of the subject, following a typical research pipeline involving concept formulation, system development, a path-finding user study, and a field trial with real application data. In particular, we have conducted a fundamental study on the visualization of motion events in videos. We have, for the first time, deployed flow visualization techniques in video visualization. We have compared the effectiveness of different abstract visual representations of videos. We have conducted a user study to examine whether users are able to learn to recognize visual signatures of motions, and to assist in the evaluation of different visualization techniques. We have applied our understanding and the developed techniques to a set of application video clips. Our study has demonstrated that video visualization is both technically feasible and cost-effective. It has provided the first set of evidence confirming that ordinary users can become accustomed to the visual features depicted in video visualizations, and can learn to recognize visual signatures of a variety of motion events.

18.
The result of a typical web search is often overwhelming. It is very difficult to explore the textual listing of the resulting documents, which may be in the thousands. In order to improve the utility of the search experience, we explore presenting search results through clustering and a zoomable two-dimensional map (zoomable treemap). Furthermore, we apply the fisheye view technique to this map of web search clusters to provide details in context. In this study, we report on our evaluation of these presentation features. The particular interfaces evaluated were: (1) a textual list, (2) a zoomable two-dimensional map of the clustered results, and (3) a fisheye version of the zoomable two-dimensional map where the results were clustered. We found that subjects completed search tasks faster with the visual interfaces than with the textual interface, and faster with the fisheye interface than just the zoomable interface. Based on the findings, we conclude that there is promise in the use of clustering and visualization with a fisheye zooming capability in the exploration of web search results.

19.
Hierarchical video browsing and feature-based video retrieval are two standard methods for accessing video content. Very little research, however, has addressed the benefits of integrating these two methods for more effective and efficient video content access. In this paper, we introduce InsightVideo, a video analysis and retrieval system, which joins video content hierarchy, hierarchical browsing and retrieval for efficient video access. We propose several video processing techniques to organize the content hierarchy of the video. We first apply a camera motion classification and key-frame extraction strategy that operates in the compressed domain to extract video features. Then, shot grouping, scene detection and pairwise scene clustering strategies are applied to construct the video content hierarchy. We introduce a video similarity evaluation scheme at different levels (key-frame, shot, group, scene, and video). By integrating the video content hierarchy and the video similarity evaluation scheme, hierarchical video browsing and retrieval are seamlessly integrated for efficient content access. We construct a progressive video retrieval scheme to refine user queries through the interactions of browsing and retrieval. Experimental results and comparisons of camera motion classification, key-frame extraction, scene detection, and video retrieval are presented to validate the effectiveness and efficiency of the proposed algorithms and the performance of the system.

20.
Today it is quite common for people to exchange hundreds of comments in online conversations (e.g., blogs). Often, it can be very difficult to analyze and gain insights from such long conversations. To address this problem, we present a visual text analytic system that tightly integrates interactive visualization with novel text mining and summarization techniques to fulfill the information needs of users exploring conversations. First, we perform a user requirement analysis for the domain of blog conversations to derive a set of design principles. Following these principles, we present an interface that visualizes a combination of various metadata and textual analysis results, helping the user interactively explore blog conversations. We conclude with an informal user evaluation, which provides anecdotal evidence about the effectiveness of our system and directions for further design.
