1.

How do users describe their information need: Query recommendation based on snippet click model

Yiqun Liu Junwei Miao Min Zhang Shaoping Ma Liyun Ru 《Expert systems with applications》2011,38(11):13847-13856

Query recommendation helps users describe their information needs more clearly so that search engines can return appropriate answers. State-of-the-art research shows that using users’ behavior information improves query recommendation performance. Instead of finding the terms most similar to those queried by previous users, we focus on detecting users’ actual information needs from their search behavior. The key idea of this paper is that although clicked documents are not always relevant to users’ queries, the snippets that led to the click most probably meet their information needs. Based on an analysis of large-scale search behavior logs, two snippet click behavior models are constructed and corresponding query recommendation algorithms are proposed. Experimental results on click-through data from two widely used commercial search engines show that the proposed algorithms outperform the recommendation methods deployed by these two search engines. To the best of our knowledge, this is the first time that snippet click models have been proposed for the query recommendation task.

2.

A knowledge infrastructure for intelligent query answering in location-based services

Shijun Yu Stefano Spaccapietra 《GeoInformatica》2010,14(3):379-404

Intelligent query answering in Location-based Services refers to their capability to provide mobile users with personalized and contextualized answers. Personalization is expected to lead to answers that better match the user’s interests, as inferable from the user’s profile. Contextualization aims to exclude answers that for some reason would not be appropriate at the time and place of the user query. These goals are beyond the current state of the art in LBS, or are met by ad hoc solutions specific to the application at hand. This paper reports on the results of an investigation aiming at defining the knowledge infrastructure that should be developed within the LBS to make it capable of returning intelligent answers. We first discuss the data management features that make LBS different from other query answering systems. Next we propose a data infrastructure that builds on the idea of modular ontologies. We explain how the relevant knowledge may be incrementally set up and dynamically maintained based on an application-independent approach. Last we show how this knowledge is used to reformulate users’ queries via personalized and contextualized rewriting.

3.

Approximating query answering on RDF databases (Total citations: 1, self-citations: 0, citations by others: 1)

Hai Huang Chengfei Liu Xiaofang Zhou 《World Wide Web》2012,15(1):89-114

Database users may be frustrated when a query they pose on the database returns no answers. In this paper, we study the problem of relaxing queries on RDF databases in order to acquire approximate answers. We address two problems in efficient query relaxation. First, to ensure the quality of answers, we compute the similarity of each relaxed query to the user query and use these similarities to score the potentially relevant answers. Second, for obtaining top-k answers, we develop two algorithms. One is based on the best-first strategy and executes the relaxed queries in ranking order. The other, a batch-based algorithm, executes the relaxed queries as a batch and avoids unnecessary execution cost. Finally, we implement and experimentally evaluate our approaches.

4.

Approximation and relaxation of semantic web path queries

《Journal of Web Semantics》2016

Given the heterogeneity of complex graph data on the web, such as RDF linked data, it is likely that a user wishing to query such data will lack full knowledge of the structure of the data and of its irregularities. Hence, providing flexible querying capabilities that assist users in formulating their information seeking requirements is highly desirable. In this paper we undertake a detailed theoretical investigation of query approximation, query relaxation, and their combination, for this purpose. The query language we adopt comprises conjunctions of regular path queries, thus encompassing recent extensions to SPARQL to allow for querying paths in graphs using regular expressions (SPARQL 1.1). To this language we add standard notions of query approximation based on edit distance, as well as query relaxation based on RDFS inference rules. We show how both of these notions can be integrated into a single theoretical framework and we provide incremental evaluation algorithms that run in polynomial time in the size of the query and the data, returning answers in ranked order of their ‘distance’ from the original query. We also combine for the first time these two disparate notions into a single ‘flex’ operation that simultaneously applies both approximation and relaxation to a query conjunct, providing even greater flexibility for users, but still retaining polynomial time evaluation complexity and the ability to return query answers in ranked order.

5.

Efficient Monte Carlo clustering in subspaces

Clark F. Olson David C. Hunn Henry J. Lyons 《Knowledge and Information Systems》2017,50(3):751-762

One of the key difficulties for users in information retrieval is formulating appropriate queries to submit to the search engine. In this paper, we propose an approach that enriches the user’s queries with additional context. We used the Language Model to build the query context, which is composed of the queries most similar to the query to be expanded and their top-ranked documents. Then, we applied a query expansion approach based on the query context and the Latent Semantic Analysis method. Using a web test collection, we tested our approach on short and long queries. We varied the number of recommended queries and the number of expansion terms to determine appropriate parameters for the proposed approach. Experimental results show that the proposed approach improves the effectiveness of the information retrieval system by 19.23% for short queries and 52.94% for long queries compared with the retrieval results obtained using the original user queries.

6.

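A two-stage utility-based query recommendation method based on absorbing random walks

Zhu Xiaofei Guo Jiafeng Cheng Xueqi Lan Yanyan 《Journal of Computer Research and Development》2013,50(12):2603-2611

Search engines have become an important way for people to obtain information, yet constructing a suitable query remains a difficult task for users. To reduce the burden of searching, query recommendation techniques have emerged and have become an indispensable component of today's search engines. Traditional query recommendation methods focus mainly on recommending relevant queries, that is, other queries whose search intent is close to that of the source query. However, the fundamental goal of query recommendation is to help users complete their search tasks successfully, not merely to find relevant queries, although relevant queries can sometimes also lead to useful search results. To better serve users' search goals, a more direct form of query recommendation is to recommend high-utility queries, that is, queries that better satisfy the user's information need. This paper proposes a two-stage utility-based query recommendation method based on absorbing random walks. The method models users' query reformulation behavior and click behavior simultaneously and derives the utility of queries from them. Experimental results on real query logs show that the new method significantly outperforms five baseline methods on the evaluation metrics query relevant ratio (QRR) and mean relevant document (MRD).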

7.

Progressive evaluation of nested aggregate queries

Kian-Lee Tan Cheng Hian Goh Beng Chin Ooi 《The VLDB Journal The International Journal on Very Large Data Bases》2000,9(3):261-278

In many decision-making scenarios, decision makers require rapid feedback to their queries, which typically involve aggregates. The traditional blocking execution model can no longer meet the demands of these users. One promising approach in the literature, called online aggregation, evaluates an aggregation query progressively as follows: as soon as certain data have been evaluated, approximate answers are produced with their respective running confidence intervals; as more data are examined, the answers and their corresponding running confidence intervals are refined. In this paper, we extend this approach to handle nested queries with aggregates (i.e., at least one inner query block is an aggregate query) by providing users with (approximate) answers progressively as the inner aggregation query blocks are evaluated. We address the new issues posed by nested queries. In particular, the answer space begins with a superset of the final answers and is refined as the aggregates from the inner query blocks are refined. For the intermediary answers to be meaningful, they have to be interpreted with the aggregates from the inner queries. We also propose a multi-threaded model for evaluating such queries: each query block is assigned to a thread, and the threads can be evaluated concurrently and independently. The time slice across the threads is nondeterministic in the sense that the user controls the relative rate at which these subqueries are being evaluated. For enumerative nested queries, we propose a priority-based evaluation strategy to present answers that are certainly in the final answer space first, before presenting those whose validity may be affected as the inner query aggregates are refined. We implemented a prototype system using Java and evaluated our system. Results for nested queries with a single level and with multiple levels of nesting are reported. Our results show the effectiveness of the proposed mechanisms in providing progressive feedback that reduces the initial waiting time of users significantly without sacrificing the quality of the answers. Received April 25, 2000 / Accepted June 27, 2000
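
The online aggregation idea that this abstract builds on can be illustrated with a minimal sketch. It is not the authors' system (which handles nested queries and was written in Java); it only shows a single AVG aggregate being reported progressively with a running confidence interval. The function name `progressive_mean`, the normal-approximation interval with z = 1.96, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def progressive_mean(values, z=1.96, report_every=1000):
    """Yield (rows_seen, running_mean, half_width) as more rows are examined.

    Hypothetical helper: uses Welford's update for mean and variance and a
    normal-approximation confidence interval (an assumption, not the paper's
    estimator).
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for v in values:
        n += 1
        delta = v - mean
        mean += delta / n
        m2 += delta * (v - mean)
        if n > 1 and n % report_every == 0:
            variance = m2 / (n - 1)
            half_width = z * math.sqrt(variance / n)
            yield n, mean, half_width


random.seed(0)
# Synthetic column values; rows are assumed to arrive in random order.
data = [random.gauss(100.0, 15.0) for _ in range(10_000)]
estimates = list(progressive_mean(data))
for rows, mean, half_width in estimates:
    print(f"{rows:>6} rows: AVG ~= {mean:.2f} +/- {half_width:.2f}")
```

Each reported interval narrows as more rows are seen, which is the property that lets a user stop early once the estimate is precise enough.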

8.

A Hierarchical Grid Index (HGI), spatial queries in wireless data broadcasting

Kwangjin Park Patrick Valduriez 《Distributed and Parallel Databases》2013,31(3):413-446

The main requirements for spatial query processing via mobile terminals include rapid and accurate searching and low energy consumption. Most location-based services (LBSs) are provided using an on-demand method, which is suitable for lightly loaded systems where contention for wireless channels and server processing is not severe. However, as the number of users of LBSs increases, performance deteriorates rapidly since the servers’ capability to process queries is limited. Furthermore, the response time of a query may increase significantly when many users’ queries arrive at a server at the same time. That is because the server has to check the locations of users and potential objects for the final result and then individually send answers to clients via a point-to-point channel. In this situation, an inefficient spatial index structure and search algorithm may incur an extremely large access latency. To address this problem, we propose the Hierarchical Grid Index (HGI), which provides a lightweight sequential location-based index structure for efficient LBSs. We minimize the index size through the use of hierarchical location-based identifications. We also support efficient query processing in broadcasting environments through sequential data transfer and search based on the object locations, and we propose Top-Down Search and Reduction-Counter Search algorithms for efficient searching and query processing. HGI has a simple structure through elimination of replication pointers and is therefore suitable for broadcasting environments with one-dimensional characteristics, thus enabling rapid and accurate spatial search by reducing redundant data. Our performance evaluation shows that our proposed index and algorithms are accurate and fast and support efficient spatial query processing.

9.

Blind evaluation of location based queries using space transformation to preserve location privacy (Total citations: 1, self-citations: 0, citations by others: 1)

Ali Khoshgozaran Houtan Shirani-Mehr Cyrus Shahabi 《GeoInformatica》2013,17(4):599-634

In this paper we propose a fundamental approach to perform the class of Range and Nearest Neighbor (NN) queries, the core class of spatial queries used in location-based services, without revealing any location information about the query in order to preserve users’ private location information. The idea behind our approach is to utilize the power of one-way transformations to map the space of all objects and queries to another space and resolve spatial queries blindly in the transformed space. Traditional encryption based techniques, solutions based on the theory of private information retrieval, or the recently proposed anonymity and cloaking based approaches cannot provide stringent privacy guarantees without incurring costly computation and/or communication overhead. In contrast, we propose efficient algorithms to evaluate KNN and range queries privately in the Hilbert transformed space. We also propose a dual curve query resolution technique which further reduces the costs of performing range and KNN queries using a single Hilbert curve. We experimentally evaluate the performance of our proposed range and KNN query processing techniques and verify the strong level of privacy achieved with acceptable computation and communication overhead.
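
As a rough illustration of what a "Hilbert transformed space" looks like, the sketch below maps 2-D grid cells to 1-D positions along a Hilbert curve using the standard textbook conversion. It shows only the plain mapping; the keyed, one-way construction and the query algorithms described in the abstract are not reproduced here. The function name `hilbert_index`, the grid order, and the sample points of interest are assumptions for illustration.

```python
def hilbert_index(order, x, y):
    """Map cell (x, y) of a 2**order x 2**order grid to its Hilbert curve position.

    Standard iterative conversion; `order` is the number of bits per coordinate.
    """
    n = 1 << order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented consistently.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Hypothetical points of interest on an 8 x 8 grid (order 3).
pois = {"cafe": (2, 3), "bank": (5, 1), "park": (6, 6)}
# A lookup table keyed by curve position: a server holding only this table
# sees 1-D values rather than coordinates.
table = sorted((hilbert_index(3, x, y), name) for name, (x, y) in pois.items())
print(table)
```

Because cells that are consecutive on the curve are always adjacent in the grid, a search around a query's curve position tends to find spatially nearby objects, which is what makes range and KNN evaluation in the transformed space plausible.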

10.

Braunmuller B. Ester M. Kriegel H.-P. Sander J. 《Knowledge and Data Engineering, IEEE Transactions on》2001,13(1):79-95

Metric databases are databases where a metric distance function is defined for pairs of database objects. In such databases, similarity queries in the form of range queries or k-nearest-neighbor queries are the most important query types. In traditional query processing, single queries are issued independently by different users. In many data mining applications, however, the database is typically explored by iteratively asking similarity queries for answers of previous similarity queries. We introduce a generic scheme for such data mining algorithms and we investigate two orthogonal approaches, reducing I/O cost as well as CPU cost, to speed up the processing of multiple similarity queries. The proposed techniques apply to any type of similarity query and to an implementation based on an index or using a sequential scan. Parallelization yields an additional impressive speed-up. An extensive performance evaluation confirms the efficiency of our approach.

11.

Evolutionary Fuzzy‐based gravitational search algorithm for query optimization in crowdsourcing system to minimize cost and latency

N. Bhaskar P. Mohan Kumar J. Arokia Renjit 《Computational Intelligence》2021,37(1):2-20

Crowdsourcing is an environment where a group of users collaborates to exchange information and to find answers to complex problems (queries). Query optimization is the task of selecting the best query strategy, that is, the one with the lowest associated cost. The crowdsourcing cost can be determined by selecting the best plan from the set of available options, and the best plan considerably reduces the cost of the query configuration. As one of the core tasks in information retrieval, the study of top‐k queries with crowdsourcing, specifically crowd‐enabled top‐k queries, is described. This problem is defined by three key variables: latency, monetary cost, and quality of answers. The main aim is to design a novel framework that minimizes monetary cost when latency is constrained. In this article, we use a heuristic search algorithm named the Evolutionary Fuzzy‐based Gravitational Search algorithm (EFGSA), which produces optimal query feature selection results while minimizing cost and latency. The EFGSA‐based crowdsourcing framework gives a better balance between latency and cost while generating query plans. The performance of the proposed EFGSA for optimal query planning is evaluated in terms of running time, accuracy, monetary cost, and so on. In the experiments, the proposed method achieved better results than other methods under our cost and latency model.

12.

Answering form-based web queries using the data-mining approach

Xiaochun Yang Yiu-Kai Ng 《Journal of Intelligent Information Systems》2008,30(1):1-32

Web users often post queries through form-based interfaces on the Web to retrieve data from the Web; however, answers to these queries are mostly computed according to keywords entered into different fields specified in a query interface, and their precision and recall could be low. The precision and recall ratios in answering this type of query can be improved by considering closely related previous queries submitted through the same interface, along with their answers. In this paper, we present an approach for enhancing the retrieval of relevant answers to a form-based Web query by adopting the data-mining approach using previous, relevant queries and their answers. Experimental results on a randomly selected set of 3,800 documents retrieved from various Web sites show that our data-mining, query-rewriting approach achieves average precision and true positive ratios on rewritten queries in the upper 80% range, whereas the average false positive ratio is less than 2.0%. Work partially done during a visit to BYU and partially supported by National Natural Science Foundation of China No. 60503036 and Fok YingTong Education Foundation No. 104027.

13.

Top-k answers for XML keyword queries

Khanh Nguyen Jinli Cao 《World Wide Web》2012,15(5-6):485-515

Searching XML data using keyword queries has attracted much attention because it enables Web users to easily access XML data without having to learn a structured query language or study possibly complex data schemas. Most of the current approaches identify the meaningful results of a given keyword query based on the semantics of lowest common ancestor (LCA) and its variants. However, because LCA candidates are usually numerous and of low relevance to the users’ information need, effectively and efficiently identifying the most relevant results from a large number of LCA candidates is still a challenging and unresolved issue. In this article, we introduce a novel semantics of relevant results based on mutual information between the query keywords. Then, we introduce a novel approach for identifying the relevant answers of a given query by adopting skyline semantics. We also recommend three different ranking criteria for selecting the top-k relevant results of the query. Efficient algorithms are proposed which rely on some provable properties of the dominance relationship between result candidates to rapidly identify the top-k dominant results. Extensive experiments were conducted to evaluate our approach, and the results show that the proposed approach performs well compared with other existing approaches across different data sets and evaluation metrics.
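
The skyline semantics and dominance relationship mentioned in this abstract can be sketched generically as follows. This is not the authors' algorithm: the two score dimensions, the candidate names, and the convention that larger scores are better are hypothetical choices made for illustration, and the naive pairwise comparison stands in for the efficient algorithms the paper proposes.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(candidates):
    """Return the candidates that no other candidate dominates (naive O(n^2) check)."""
    return {
        name: score
        for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    }


# Hypothetical result candidates, each scored on two made-up relevance criteria.
candidates = {
    "r1": (0.9, 0.2),
    "r2": (0.6, 0.6),
    "r3": (0.5, 0.5),  # dominated by r2 on both criteria
    "r4": (0.1, 0.9),
}
result = skyline(candidates)
print(sorted(result))
```

A skyline keeps every candidate that is best under some trade-off between the criteria, so a separate ranking step is still needed to pick the top-k among them.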

14.

Predicting user click behaviour in search engine advertisements

Mohammad Daryaie Zanjani Shahram Khadivi 《New Review of Hypermedia and Multimedia》2015,21(3-4):301-319

According to the specific requirements and interests of users, search engines select and display advertisements that match user needs and have a higher probability of attracting users’ attention based on their previous search history. New objects such as a user, advertisement or query cause a deterioration of precision in targeted advertising due to their lack of history. This article examines this challenge. In the case of new objects, we first extract observed objects similar to the new object and then use their history as the history of the new object. Similarity between objects is measured based on correlation, which is a relation between user and advertisement when the advertisement is displayed to the user. This method is used for all objects, so it has helped us to accurately select relevant advertisements for users’ queries. In our proposed model, we assume that similar users behave in a similar manner. We find that users with few queries are similar to new users. We show that the correlation between users and advertisements’ keywords is high. Thus, users who pay attention to advertisements’ keywords click similar advertisements. In addition, users who pay attention to specific brand names might behave similarly too.

15.

Keyword Query over Error-Tolerant Knowledge Bases

Yu-Rong Cheng Ye Yuan Jia-Yu Li Lei Chen Guo-Ren Wang 《Journal of Computer Science and Technology》2016,31(4):702-719

With more and more knowledge provided by the WWW, querying and mining knowledge bases have attracted much research attention. Among all the queries over knowledge bases, which are usually modelled as graphs, the keyword query is the most widely used. Although the problem of keyword query over graphs has been studied in depth for years, knowledge bases, as special error-tolerant graphs, cause the results of traditionally defined keyword queries to fall short of users’ expectations. Thus, in this paper, we define a new keyword query, called confident r-clique, specific to knowledge bases and based on the r-clique definition for keyword query on general graphs, which has been proved to be the best one. However, as we prove in the paper, finding the confident r-cliques is #P-hard. We propose a filtering-and-verification framework to improve the search efficiency. In the filtering phase, we develop the tightest upper bound of the confident r-clique, and design an index together with its search algorithm, which suits the large scale of knowledge bases well. In the verification phase, we develop an efficient sampling method to verify the final answers from the candidates remaining after the filtering phase. Extensive experiments demonstrate that the results derived from our new definition satisfy users’ requirements better than those of the traditional r-clique definition, and that our algorithms are efficient.

16.

Comparing data summaries for processing live queries over Linked Data

Jürgen Umbrich Katja Hose Marcel Karnstedt Andreas Harth Axel Polleres 《World Wide Web》2011,14(5-6):495-544

A growing amount of Linked Data (graph-structured data accessible at sources distributed across the Web) enables advanced data integration and decision-making applications. Typical systems operating on Linked Data collect (crawl) and pre-process (index) large amounts of data, and evaluate queries against a centralised repository. Given that crawling and indexing are time-consuming operations, the data in the centralised index may be out of date at query execution time. An ideal query answering system for querying Linked Data live should return current answers in a reasonable amount of time, even on corpora as large as the Web. In such a live query system, source selection (determining which sources contribute answers to a query) is a crucial step. In this article we propose to use lightweight data summaries for determining relevant sources during query evaluation. We compare several data structures and hash functions with respect to their suitability for building such summaries, stressing benefits for queries that contain joins and require ranking of results and sources. We elaborate on join variants, join ordering and ranking. We analyse the different approaches theoretically and provide results of an extensive experimental evaluation.

17.

Privacy-Conscious Location-Based Queries in Mobile Environments (Total citations: 1, self-citations: 0, citations by others: 1)

Xu Jianliang Tang Xueyan Hu Haibo Du Jing 《Parallel and Distributed Systems, IEEE Transactions on》2010,21(3):313-326

In location-based services, users with location-aware mobile devices are able to make queries about their surroundings anywhere and at any time. While this ubiquitous computing paradigm brings great convenience for information access, it also raises concerns over potential intrusion into user location privacy. To protect location privacy, one typical approach is to cloak user locations into spatial regions based on user-specified privacy requirements, and to transform location-based queries into region-based queries. In this paper, we identify and address three new issues concerning this location cloaking approach. First, we study the representation of cloaking regions and show that a circular region generally leads to a small result size for region-based queries. Second, we develop a mobility-aware location cloaking technique to resist trace analysis attacks. Two cloaking algorithms, namely MaxAccu_Cloak and MinComm_Cloak, are designed based on different performance objectives. Finally, we develop an efficient polynomial algorithm for evaluating circular-region-based kNN queries. Two query processing modes, namely bulk and progressive, are presented to return query results either all at once or in an incremental manner. Experimental results show that our proposed mobility-aware cloaking algorithms significantly improve the quality of location cloaking in terms of an entropy measure without compromising much on query latency or communication cost. Moreover, the progressive query processing mode achieves a shorter response time than the bulk mode by parallelizing the query evaluation and result transmission.

18.

Enriching the conceptual basis for query formulation through relationship semantics in databases

《Information Systems》2001,26(6):445-475

The rapid increase in end-user computing calls into question the suitability of existing database query languages (DBQLs). Because the typical DB end-user is not a DB specialist, it is essential that DBQLs use concepts that are as close as possible to those in the end-users’ cognitive mental model and adopt interface techniques that are suited to end-users’ abilities. Concept-based query languages are well suited for this. This realization has motivated further research in conceptual, or semantic, query approaches. However, the primary focus in this field has been on semantic query optimization, not on query formulation. In this study, we address the problem of formulating queries using concepts. We propose a concept-based query language, called the conceptual query language (CQL), which allows for the conceptual abstraction of database queries and exploits the rich semantics of data models to ease and facilitate query formulation. The CQL approach uses the relationship semantics of semantic data models to render transparent the technical complexities of existing DB query languages. Association semantics are also used to automatically construct query graphs and pseudo-natural language explanations of queries, and to generate SQL code. A set theoretic formalism for conceptual queries is developed and used. This paper discusses the design of CQL, its expressive power, its implementation, and the strategies for CQL query processing. The implementation of a CQL prototype is briefly discussed in this paper. Extensive user experiments were carried out and showed the advantage of CQL over alternative languages such as SQL.

19.

Answering reachability queries with ordered label constraints over labeled graphs

Daoliang HE Pingpeng YUAN Hai JIN 《Frontiers of Computer Science》2024,18(1):181601

Reachability queries play a vital role in many graph analysis tasks. Previous research has proposed many methods to efficiently answer reachability queries between vertex pairs. Since many real graphs are labeled graphs, there is strong demand for Label-Constrained Reachability (LCR) queries, in which the constraint includes a set of labels in addition to the vertex pair. Recent research has proposed several methods for answering LCR queries that require some of the labels specified in the constraint to appear on the path. Besides a label set, the query constraint may be an ordered sequence of labels; such OLCR (Ordered-Label-Constrained Reachability) queries retrieve paths matching a sequence of labels. Currently, no solutions are available for OLCR. Here, we propose DHL, a novel Bloom-filter-based indexing technique for answering OLCR queries. DHL can be used to check reachability between vertex pairs. If the answer is not negative, a constrained DFS is then performed. We thus employ DHL followed by a constrained DFS to answer OLCR queries. We show that DHL has a bounded false positive rate and that it is effective in saving indexing time and space. Extensive experiments on 10 real-life graphs and 12 synthetic graphs demonstrate that DHL achieves about 4.8–22.5 times smaller index space and 4.6–114 times less index construction time than two state-of-the-art techniques for LCR queries, while achieving comparable query response time. The results also show that our algorithm can answer OLCR queries effectively.

20.

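A multimedia object query language and its query processing (Total citations: 4, self-citations: 0, citations by others: 4)

Tian Zengping Dang Huarui Zhou Aoying Shi Baile 《Journal of Software》1999,10(7):694-701

This paper studies the query requirements of multimedia databases and proposes a structured multimedia object query language, MOQL (multi-media object query language). MOQL supports multimedia queries based on type, structural features, synchronization relationships, temporal relationships, and content information. With a DB2 database as the storage mechanism, a set of algebraic operators and transformation rules is defined. Using these, a user-defined MOQL query can be transformed into an algebraic expression and optimized algebraically, and the algebraic query expression can then be converted into DB2 SQL and C++ query procedures that run on the DB2 database.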