Similar literature
1.
When performing queries in web search engines, users often face difficulties choosing appropriate query terms. Search engines therefore usually suggest a list of expanded versions of the user query to disambiguate it or to resolve potential term mismatches. However, it has been shown that users find it difficult to choose an expanded query from such a list. In this paper, we describe the adoption of set-based text visualization techniques to visualize how query expansions enrich the result space of a given user query and how the result sets relate to each other. Our system uses a linguistic approach to expand queries and topic modeling to extract the most informative terms from the results of these queries. In a user study, we compare a common text list of query expansion suggestions to three set-based text visualization techniques adapted for visualizing expanded query results (Compact Euler Diagrams, Parallel Tag Clouds, and a List View) to resolve ambiguous queries using interactive query expansion. Our results show that text visualization techniques do not increase retrieval efficiency, precision, or recall. Overall, users rate Parallel Tag Clouds visualizing key terms of the expanded query space lowest. Based on the results, we derive recommendations for visualizations of query expansion results and for text visualization techniques in general, and discuss alternative use cases of set-based text visualization techniques in the context of web search.

2.
We present an annotation management system for relational databases. In this system, every piece of data in a relation is assumed to have zero or more annotations associated with it, and annotations are propagated from the source to the output as data is transformed through a query. Such an annotation management system could be used to understand the provenance (also known as lineage) of data, who has seen or edited a piece of data, or the quality of data, all of which are useful functionalities for applications that deal with the integration of scientific and biological data. We present an extension, pSQL, of a fragment of SQL that has three different types of annotation propagation schemes, each useful for different purposes. The default scheme propagates annotations according to where data is copied from. The default-all scheme propagates annotations according to where data is copied from among all equivalent formulations of a given query. The custom scheme allows a user to specify how annotations should propagate. We present a storage scheme for the annotations and describe algorithms for translating a pSQL query under each propagation scheme into one or more SQL queries that would correctly retrieve the relevant annotations according to the specified propagation scheme. For the default-all scheme, we also show how we generate finitely many queries that can simulate the annotation propagation behavior of the set of all equivalent queries, which is possibly infinite. The algorithms are implemented and the feasibility of the system is demonstrated by a set of experiments that we have conducted.
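
The idea behind the default scheme, annotations following the data they are attached to as it is copied into the output, can be illustrated with a small sketch. This is not pSQL and not the authors' implementation: the `Cell` class, the `select_project` helper, and the sample data are all hypothetical, chosen only to show that an output cell carries the annotations of the input cell it was copied from, and that columns used only for filtering contribute none.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A value plus the annotations attached to it (hypothetical representation)."""
    value: object
    annotations: frozenset = frozenset()


def select_project(rows, predicate, columns):
    """Keep rows satisfying `predicate` and copy only `columns` to the output.

    Each output cell is a copy of exactly one input cell, so it keeps that
    cell's annotations. Columns that appear only in the predicate are not
    copied, so their annotations do not reach the output.
    """
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]


# Hypothetical sample relation with annotated cells.
proteins = [
    {"id": Cell(1, frozenset({"curated by lab A"})),
     "name": Cell("p53", frozenset({"verified 2004"}))},
    {"id": Cell(2),
     "name": Cell("BRCA1", frozenset({"imported from source X"}))},
]

# Roughly: SELECT name FROM proteins WHERE id = 1
result = select_project(proteins, lambda r: r["id"].value == 1, ["name"])
```

Here `result` holds the single `name` cell for `p53` together with its own annotation, while the annotation on the `id` cell is left behind because `id` is only tested, never copied.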

3.
Internet users may encounter the empty-answer or too-few-answers problem when they post a strict query to a Web database. To address this problem, we develop a general framework that enables automatic query relaxation and top-k result ranking. Our framework consists of two processing steps. The first step is query relaxation. Based on the user's original query, we estimate how much the user cares about each specified attribute by measuring the distribution of its specified value in the database. If the specified value of an attribute is rare, the attribute may be important to the user. According to the attribute importance, the original query is then rewritten as a relaxed query by expanding the range of each query criterion. The degree of relaxation on each specified attribute varies adaptively with the attribute weight. The most important attribute is relaxed to the minimum degree so that the answers returned by the relaxed query are as relevant as possible to the user's original intention. The second step is top-k result ranking. In this step, we first generate user contextual preferences from the query history and then use them to create a priori orders of tuples during off-line pre-processing. Only a few representative orders are saved, each corresponding to a set of contexts. These orders and their associated contexts are then used at query time to provide top-k relevant answers quickly by means of the top-k evaluation algorithm. Results of a preliminary user study demonstrate that our query relaxation and top-k result ranking methods can capture the users' preferences effectively. The efficiency and effectiveness of our approach are also demonstrated.
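
The relaxation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the abstract does not give the weighting or relaxation formulas, so the rarity-based weight (one minus the value's relative frequency, normalized) and the range half-width (proportional to one minus the weight, scaled by the attribute's spread in the data) are assumptions, as are the function names and the sample data.

```python
def attribute_weights(rows, query):
    """Weight each queried attribute by how rare its requested value is.

    Assumption: rarity = 1 - relative frequency, normalized to sum to 1.
    """
    n = len(rows)
    rarity = {
        attr: 1.0 - sum(1 for r in rows if r[attr] == value) / n
        for attr, value in query.items()
    }
    total = sum(rarity.values())
    if total == 0:  # every requested value occurs in every row
        return {attr: 1.0 / len(query) for attr in query}
    return {attr: r / total for attr, r in rarity.items()}


def relax(rows, query, weights):
    """Turn each equality criterion into a range; important attributes relax least."""
    ranges = {}
    for attr, value in query.items():
        values = [r[attr] for r in rows]
        spread = max(values) - min(values)
        half_width = (1.0 - weights[attr]) * spread
        ranges[attr] = (value - half_width, value + half_width)
    return ranges


def run(rows, ranges):
    """Return the rows that fall inside every relaxed range."""
    return [r for r in rows
            if all(lo <= r[a] <= hi for a, (lo, hi) in ranges.items())]


# Hypothetical data: the strict query below has no exact match.
cars = [
    {"price": 10000, "year": 2010},
    {"price": 10000, "year": 2012},
    {"price": 10000, "year": 2015},
    {"price": 12000, "year": 2011},
]
query = {"price": 10000, "year": 2011}

strict = [r for r in cars if all(r[a] == v for a, v in query.items())]
weights = attribute_weights(cars, query)
relaxed = run(cars, relax(cars, query, weights))
```

In this example the requested year is rarer than the requested price, so `year` receives the larger weight and is relaxed less, and the relaxed query returns the two cars nearest to the requested year at the requested price.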

4.
The extraction of information from movie film and videotape has always been a very tedious process. Yet the usefulness of these media in biomedical research, behavioral science, industrial testing, etc., is apparent. We are developing a system, GALATEA, for the rapid extraction of data from film or video using interactive graphics. The key aspects of GALATEA are as follows. The user indicates what features are of interest using an x, y digitizing pen. These pen positions are the only data the system sees, so that full digital image encoding is avoided. The user can trace the features while the film runs (frame-by-frame analysis is not necessary). The user has constant feedback in the form of an animated, computer-generated movie (Kinegram), overlaid on the original film image and running synchronously with it. It is this kinetic feedback of the data entered that makes the system efficient. Structured programming, real-time programming techniques, data structures for time-dependent pictures and dynamical graphics 'tools' are covered. A detailed discussion is given of slippage, or soft degradation, in real-time systems under fluctuating load conditions.

5.
6.
Abstract. Though there has been extensive work on multimedia databases in the last few years, there is no prevailing notion of a multimedia view, nor are there techniques to create, manage, and maintain such views. Visualizing the results of a dynamic multimedia query or materializing a dynamic multimedia view corresponds to assembling and delivering an interactive multimedia presentation in accordance with the visualization specifications. In this paper, we suggest that a non-interactive multimedia presentation is a set of virtual objects with associated spatial and temporal presentation constraints. A virtual object is either an object or the result of a query. As queries may have different answers at different points in time, scheduling the presentation of such objects is nontrivial. We then develop a probabilistic model of interactive multimedia presentations, extending the non-interactive model described earlier. We also develop a probabilistic model of interactive visualization where the probabilities reflect the user profiles, or the likelihood of certain user interactions. Based on this probabilistic model, we develop three types of utility-theoretic prefetching algorithms that anticipate how users will interact with the presentation. These prefetching algorithms allow efficient visualization of the query results in accordance with the underlying specification. We have built a prototype system that incorporates these algorithms. We report on the results of experiments conducted on top of this implementation. Received June 10, 1998 / Accepted November 10, 1999

7.
In this article we present ConQueSt, a constraint-based querying system able to support the intrinsically exploratory (i.e., human-guided, interactive and iterative) nature of pattern discovery. Following the inductive database vision, our framework provides users with an expressive constraint-based query language, which allows the discovery process to be effectively driven toward potentially interesting patterns. Such constraints are also exploited to reduce the cost of pattern mining computation. ConQueSt is a comprehensive mining system that can access real-world relational databases from which to extract data. Through a friendly graphical user interface (GUI), the user can define complex mining queries with a few clicks. After a pre-processing step, mining queries are answered by an efficient and robust pattern mining engine that incorporates state-of-the-art data and search space reduction techniques. The resulting patterns are then presented to the user in a pattern browsing window, and possibly stored back in the underlying database as relations.

8.
Searching a digital library is typically a tedious task. A system can improve information access by building on knowledge about a user, captured in a user profile, in order to customize information access both in terms of the information returned in response to a query (query personalization) and in terms of the presentation of the results (presentation personalization). In this paper, we focus on query personalization in digital libraries; in particular, we address structured queries involving metadata stored in relational databases. We describe the specification of user preferences at the level of a user profile and the process of query personalization with the use of query-rewriting rules.

9.
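Relevance Computation for Text Object Queries   Cited by: 2 (self-citations: 0, citations by others: 2)
This paper introduces the concepts of signature files, the symbolic object model, and timestamp ordering into the design of a text object query system. It proposes a relevance computation between user queries and text objects based on index phrase sets, the use of phrase identifiers to decide the equivalence of synonymous phrases, and algorithms and methods that use timestamp ordering to make full use of query feedback and so improve the time and space efficiency of the system. It also discusses optimization strategies and the logical implementation of text object queries for text database management systems.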

10.
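1. Introduction. XML has attracted widespread attention and is increasingly widely used because it can be used directly on the Internet, supports a large number of different applications, is compatible with SGML, and produces documents that are easy to write and directly human-readable. However, because an XML document contains both content and internal structure (expressed through markup), describing Web data resources as XML documents may adversely affect application processing in the following respects: 1) when an XML document holds a great deal of content, the file is large, which hurts in-memory processing efficiency; 2) when it is necessary, within an XQuery query's …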

11.
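Domain Term Extraction Based on Web Resources and User Behavior Information   Cited by: 1 (self-citations: 0, citations by others: 1)
Domain terms are words that reflect the characteristics of a domain. Automatic domain term extraction is an important task in natural language processing, with applications in domain ontology extraction, specialized search, text classification, class-based language modeling, and many other research areas. Building domain dictionaries from large-scale domain-specific corpora on the Internet is therefore both challenging and practically valuable. At present, the web corpora used for domain term extraction consist mainly of the body text of web pages, but the difficulties of extracting body text from web pages degrade the quality of term extraction. Using the anchor text and query text of web pages in place of the body text avoids these difficulties. To address the shortcomings of anchor text and query text, namely that the texts are very short and carry little semantic information, the paper proposes a domain data extraction method that applies to various types of web data and web user behavior data. Using this method, it carries out domain term extraction on extracted web page body text, web page anchor text, user query data, and user browsing data, focusing on how the different types of web resources and user behavior information affect extraction quality. Experiments on massive real-world web data show that user query information and the anchor text that users browsed yield better domain term extraction than body text obtained with web page body extraction techniques.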

12.
13.
Although many existing movie recommender systems have investigated recommendation based on information such as clicks and tags, much less effort has been made to explore the multimedia content of movies, which holds potential information for eliciting the user's visual and musical preferences. In this paper, we explore the content from three media types (image, text, audio) and propose a novel multi-view semi-supervised movie recommendation method, which represents each media type as a view space for movies. The three views of movies are integrated to predict the rating values under the multi-view framework. Furthermore, our method considers casual users who rate only a few movies. The algorithm enriches the user profile in a semi-supervised way when only a short rating history is available. Experiments indicate that multimedia content analysis reveals the user's profile in a more comprehensive way, and that different media types can complement each other for movie recommendation. The experimental results also validate that our semi-supervised method can effectively enrich the user profile for recommendation with limited rating history.

14.
Atlas: a nested relational database system for text applications   Cited by: 1 (self-citations: 0, citations by others: 1)
Advanced database applications require facilities such as text indexing, image storage, and the ability to store data with a complex structure. However, these facilities are not usually included in traditional database systems. In this paper we describe Atlas, a nested relational database system that has been designed for text-based applications. The Atlas query language is TQL, an SQL-like query language with text operators. The query language is supported by signature file text indexing techniques, and by a parser that can be configured for different text formats and even some foreign languages. Atlas can also be used to store images and audio

15.
Multimodal retrieval is a well-established approach for image retrieval. Usually, images are accompanied by a text caption along with associated documents describing the image. Textual query expansion as a way of enhancing image retrieval is a relatively unexplored area. In this paper, we first study the effect of expanding the textual query on the retrieval of both images and their associated text. Our study reveals that judicious expansion of the textual query through keyphrase extraction can lead to better results, either in terms of text retrieval or in terms of both image and text retrieval. To establish this, we use two well-known keyphrase extraction techniques based on tf-idf and KEA. While query expansion results in increased retrieval efficiency, it is imperative that the expansion be semantically justified. So, we propose a graph-based keyphrase extraction model that captures the relatedness between words in terms of both mutual information and relevance feedback. Most existing works have stressed bridging the semantic gap by using textual and visual features, either in combination or individually. The way these text and image features are combined determines the efficacy of any retrieval. For this purpose, we adopt Fisher-LDA to determine the appropriate weights for each modality. This provides us with an intelligent decision-making process favoring the feature set to be infused into the final query. Our proposed algorithm is shown to significantly outperform the previously mentioned keyphrase extraction algorithms for query expansion. A rigorous set of experiments performed on the ImageCLEF-2011 Wikipedia Retrieval task dataset validates our claim that capturing the semantic relation between words through mutual information, followed by expansion of a textual query using relevance feedback, can simultaneously enhance both text and image retrieval.

16.
Spatio-temporal querying and retrieval is a challenging task because, despite the availability of powerful indexing structures and querying languages, simple user interfaces for building queries are lacking. In this paper, we propose a Query-by-Gaming scheme for spatio-temporal querying that uses a gaming controller for building queries. Using Query-by-Gaming, we introduce our spatio-temporal querying and retrieval system, GStar, which interactively builds subsequent spatio-temporal queries, to determine whether a state is directly reachable from the current state, and eventual spatio-temporal queries, to determine whether a spatial state is reachable from the current state. Queries are built with the features of a gaming controller on top of the original video frames, rather than on a graphical interface using a mouse or a keyboard. GStar has three main components: building the query, searching and retrieval of clips, and displaying query results. The queries are applied to an indexing structure called the semantic sequence state graph (S3G), and the results of the query are displayed dynamically to provide timely feedback to the user. Experimental results and the user interface are provided for a tennis video database. Users define the desired game states (player and ball position) using an interactive interface at multiple points in time, and GStar automatically retrieves all rallies that contain both states. Finally, a user interface evaluation compares the gamepad-based interface with a mouse interface for spatio-temporal querying.

17.
The text searching paradigm still prevails even when users are looking for image data, for example on the Internet. Searching for images mostly means searching on the basis of annotations that have been made manually. When annotations are left empty, which is usually the case, searches on image file names are performed. This may lead to surprising retrieval results. The graphical search paradigm, searching image data by querying graphically, either with an image or with a sketch, currently seems not to be the preferred method, partly because of the complexity of designing the query. In this paper we present our PictureFinder system, which currently supports "full image retrieval" in analogy to full text retrieval. PictureFinder allows graphical queries for the image the user has in mind by sketching colored and/or textured regions or by whole images (query by example). By adjusting the search tolerances for each region and image feature (i.e. hue, saturation, lightness, texture pattern and coverage), the user can tune the query either to find images matching the sketch or images which differ from the specified colors and/or textures to a certain degree. To compare colors, we propose a color distance measure that takes into account the fact that different colors spread differently in the color space, and that also takes into account that the position of a region in an image may be important. Furthermore, we show our query by example approach. Based on the example image chosen by the user, a graphical query is generated automatically and presented to the user. One major advantage of this approach is the possibility to change and adjust a query by example in the same way as a query which was sketched by the user. By deleting unimportant regions and by adjusting the tolerances of the remaining regions, the user may focus on the image details that are important to them.

18.
This paper structures a novel vision for OLAP by fundamentally redefining several of the pillars on which OLAP has been based for the last 20 years. We redefine OLAP queries, in order to move to higher degrees of abstraction from roll-ups and drill-downs, and we propose a set of novel intentional OLAP operators, namely, describe, assess, explain, predict, and suggest, which express the user's need for results. We fundamentally redefine what a query answer is, and escape from the constraint that the answer is a set of tuples; on the contrary, we complement the set of tuples with models (typically, but not exclusively, results of data mining algorithms over the involved data) that concisely represent the internal structure or correlations of the data. Due to the diverse nature of the involved models, we come up (for the first time ever, to the best of our knowledge) with a unifying framework for them, which rests on extending each data cell of a cube with information about the models that pertain to it, in effect converting the small parts that build up the models into data that annotate each cell. We exploit this data-to-model mapping to provide highlights of the data, by isolating data and models that maximize the delivery of new information to the user. We introduce a novel method, based on a new interestingness measure, for assessing the surprise that a new query result brings to the user with respect to the information contained in previous results the user has seen. The individual parts of our proposal are integrated in a new data model for OLAP, which we call the Intentional Analytics Model. We complement our contribution with a list of significant open problems for the community to address.

19.
Microblogging is a modern communication paradigm in which users post bits of information, or "memes" as we call them, that are brief text updates or micromedia such as photos, video or audio clips. Once a user posts a meme, it becomes visible to the user community. When a user finds a meme of another user interesting, she can repost it, thus allowing memes to propagate virally through the social network. In this paper we introduce the meme ranking problem, as the problem of selecting which k memes (among the ones posted by their contacts) to show to users when they log into the system. The objective is to maximize the overall activity of the network, that is, the total number of reposts that occur. We characterize the problem in depth, showing not only that exact solutions are infeasible, but also that approximate solutions are prohibitively expensive to adopt in an on-line setting. Therefore we devise a set of heuristics and compare them through an extensive simulation based on the real-world Yahoo! Meme social graph, using parameters learnt from real logs of meme propagations. Our experimentation demonstrates the effectiveness and feasibility of these methods.
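
The selection step at the heart of the meme ranking problem, choosing k memes from those posted by a user's contacts, can be sketched with a simple greedy baseline. This is not one of the paper's heuristics, which the abstract does not describe: the per-meme repost probability used as the score, the function name `top_k_memes`, and the sample values are all assumptions made for illustration.

```python
import heapq


def top_k_memes(candidates, repost_prob, k):
    """Pick the k candidate memes with the highest estimated repost probability.

    `candidates` are memes posted by the user's contacts; `repost_prob` maps a
    meme to a hypothetical estimate of how likely this user is to repost it.
    Memes without an estimate are scored 0.0.
    """
    return heapq.nlargest(k, candidates, key=lambda m: repost_prob.get(m, 0.0))


# Hypothetical candidates and estimates for one user at login time.
candidates = ["meme_a", "meme_b", "meme_c", "meme_d"]
repost_prob = {"meme_a": 0.1, "meme_b": 0.7, "meme_c": 0.4}

shown = top_k_memes(candidates, repost_prob, k=2)
```

A purely local score like this ignores how a repost spreads further through the network, which is part of what makes the global objective in the abstract (total number of reposts) hard to optimize.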

20.