20 similar documents found (search time: 0 ms)
1.
Users are rarely familiar with the content of a data source they are querying, and therefore cannot avoid using keywords that do not exist in the data source. Traditional systems may respond with an empty result, causing dissatisfaction, while the data source in effect holds semantically related content. In this paper we study this no-but-semantic-match problem on XML keyword search and propose a solution which enables us to present the top-k semantically related results to the user. Our solution involves two steps: (a) extracting semantically related candidate queries from the original query and (b) processing candidate queries and retrieving the top-k semantically related results. Candidate queries are generated by replacement of non-mapped keywords with candidate keywords obtained from an ontological knowledge base. Candidate results are scored using their cohesiveness and their similarity to the original query. Since the number of queries to process can be large, with each result having to be analyzed, we propose pruning techniques to retrieve the top-k results efficiently. We develop two query processing algorithms based on our pruning techniques. Further, we exploit a property of the candidate queries to propose a technique for processing multiple queries in batch, which improves the performance substantially. Extensive experiments on two real datasets verify the effectiveness and efficiency of the proposed approaches.
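A minimal Python sketch of the candidate-query step described above: non-mapped keywords are replaced with related terms drawn from an ontological source, and the resulting queries are ranked to keep the top-k. The get_synonyms lookup and the product-of-similarities score are illustrative assumptions, not the paper's actual knowledge base or result-scoring formula.

```python
from itertools import product

def candidate_queries(query_terms, data_source_terms, get_synonyms, top_k=3):
    """Replace keywords that do not occur in the data source with
    ontology-derived alternatives and rank the resulting queries.

    get_synonyms(term) -> list of (related_term, similarity) stands in
    for a lookup in an ontological knowledge base.
    """
    # Keep mapped keywords as-is; expand non-mapped ones with candidates.
    options = []
    for term in query_terms:
        if term in data_source_terms:
            options.append([(term, 1.0)])        # mapped keyword, full similarity
        else:
            options.append(get_synonyms(term))   # non-mapped keyword, candidates
    # Score each candidate query by the product of per-keyword similarities
    # (an assumed stand-in for the similarity component of the result score).
    scored = []
    for combo in product(*options):
        terms = [t for t, _ in combo]
        score = 1.0
        for _, sim in combo:
            score *= sim
        scored.append((score, terms))
    scored.sort(reverse=True)
    return scored[:top_k]

# Toy usage: "laptop" does not occur in the data source, so it is replaced.
source_terms = {"notebook", "price", "review"}
synonyms = {"laptop": [("notebook", 0.9), ("computer", 0.6)]}
print(candidate_queries(["laptop", "price"], source_terms,
                        lambda t: synonyms.get(t, [(t, 0.0)])))
```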
2.
3.
Query learning is to learn a concept (i.e., a representation of some language) through communication with a teacher (i.e., someone who knows the concept). The purpose of this paper is to prepare a formal framework for studying polynomial-time query learnability. We introduce necessary notation and, by using several examples, clarify notions that are necessary for discussing polynomial-time query learning. This is an extended version of a part of the paper A Formal Study of Learning via Queries, which was presented at the 17th International Colloquium on Automata, Languages, and Programming. The preparation of this paper was done while the author was visiting the Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, and was supported in part by the ESPRIT II Basic Research Actions program of the EC under Contract No. 3075 (project ALCOM) and by a Grant-in-Aid for Scientific Research of the Ministry of Education, Science, and Culture of Japan under Grant-in-Aid for Co-operative Research (A) 02302047 (1990).
4.
Xudong Lin Ning Wang Xiaoning Zeng 《Journal of Systems and Software》2010,83(6):990-1003
Keyword query is an important means of finding object information in XML documents. Most existing keyword query approaches adopt the subtrees rooted at the smallest lowest common ancestors (SLCAs) of the keyword matching nodes as the basic result units. The structural relationships among XML nodes are heavily emphasized, but the semantic relevance is not fully exploited. To change this situation, we propose the concept of entity subtree and emphasize the semantic relevance among different nodes when querying information from XML. In our approach, keyword queries are enhanced into a new keyword-based query language, Grouping and Categorization Keyword Expression (GCKE), and the core query algorithm, finding entity subtrees (FEST), is proposed to return high-quality results by fully using the keyword semantic meanings exposed by GCKE. We demonstrate the effectiveness and the efficiency of our approach through extensive experiments.
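As background for the SLCA result units mentioned above, the brute-force Python sketch below computes smallest lowest common ancestors from Dewey-labelled keyword match nodes. It illustrates the baseline semantics only, not the paper's FEST algorithm, and the labels are made-up examples.

```python
from itertools import product

def lca(a, b):
    """Longest common prefix of two Dewey labels (lists of ints)."""
    i = 0
    while i < len(a) and i < len(b) and a[i] == b[i]:
        i += 1
    return a[:i]

def slca(keyword_match_lists):
    """Smallest LCAs: LCAs of one match per keyword that have no
    descendant which is also such an LCA. Brute-force sketch."""
    lcas = set()
    for combo in product(*keyword_match_lists):
        node = combo[0]
        for other in combo[1:]:
            node = lca(node, other)
        lcas.add(tuple(node))
    # Drop any LCA that is a proper ancestor of another LCA.
    return [list(x) for x in lcas
            if not any(y != x and y[:len(x)] == x for y in lcas)]

# Two keywords; Dewey labels identify the matching nodes in the XML tree.
k1 = [[1, 2, 1], [1, 3]]
k2 = [[1, 2, 2]]
print(slca([k1, k2]))   # [[1, 2]] -- root of the smallest common subtree
```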
5.
Soumyadeb Mitra Marianne Winslett Windsor W. Hsu Kevin Chen-Chuan Chang 《The VLDB Journal: The International Journal on Very Large Data Bases》2008,17(2):225-242
Intense regulatory focus on secure retention of electronic records has led to a need to ensure that records are trustworthy, i.e., able to provide irrefutable proof and accurate details of past events. In this paper, we analyze the requirements for a trustworthy index to support keyword-based search queries. We argue that trustworthy index entries must be durable: the index must be updated when new documents arrive, and not periodically deleted and rebuilt. To this end, we propose a scheme for efficiently updating an inverted index, based on judicious merging of the posting lists of terms. Through extensive simulations and experiments with two real-world data sets and workloads, we demonstrate that the scheme achieves online update speed while maintaining good query performance. We also present and evaluate jump indexes, a novel trustworthy and efficient index for join operations on posting lists for multi-keyword queries. Jump indexes support insert, lookup and range queries in time logarithmic in the number of indexed documents.
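The join of posting lists that jump indexes accelerate can be written, in its simplest form, as a k-way intersection of sorted document-id lists. The Python sketch below shows that plain baseline, which advances one entry at a time; the paper's jump index instead skips ahead in logarithmic time.

```python
def intersect_postings(lists):
    """Join sorted posting lists (document ids) for a multi-keyword query
    by a plain k-way merge. A jump index, as described in the paper,
    would skip ahead rather than advance one entry at a time."""
    if not lists:
        return []
    pointers = [0] * len(lists)
    result = []
    while all(p < len(l) for p, l in zip(pointers, lists)):
        current = [l[p] for p, l in zip(pointers, lists)]
        lo, hi = min(current), max(current)
        if lo == hi:
            result.append(lo)                 # this doc contains every keyword
            pointers = [p + 1 for p in pointers]
        else:
            # Advance every list that is behind the largest current doc id.
            pointers = [p + (l[p] < hi) for p, l in zip(pointers, lists)]
    return result

print(intersect_postings([[1, 4, 7, 9], [2, 4, 9], [4, 8, 9]]))  # [4, 9]
```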
6.
To address result ranking in keyword search systems over relational databases, this paper proposes a new ranking method that combines query relevance with structural weight. Each tuple is treated as a virtual document and scored in an information retrieval (IR) style: normalized term frequency and normalized inverse document frequency capture how relevant a tuple is to the query condition, while a structural weight over the whole result reflects its semantic strength. Compared with earlier ranking methods that consider only structural weight, this method more effectively places results that are highly relevant to the query at the top. Experimental results show that ranking which incorporates query relevance orders the results effectively.
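A small Python sketch of the scoring idea just described: a tuple is treated as a virtual document, scored with normalized term frequency and inverse document frequency over the query terms, and combined with a structural weight. The exact combination used here is an assumption for illustration, not the paper's formula.

```python
import math

def tuple_score(tuple_text, query_terms, doc_freq, num_tuples, structural_weight):
    """Score one tuple viewed as a virtual document: normalized TF times
    a normalized IDF for each query term, multiplied by a structural
    weight for the result that produced the tuple. The formula is an
    illustrative assumption, not the paper's exact definition."""
    words = tuple_text.lower().split()
    ir_score = 0.0
    for term in query_terms:
        tf = words.count(term) / max(len(words), 1)                    # normalized TF
        idf = math.log(1 + num_tuples / (1 + doc_freq.get(term, 0)))   # normalized IDF
        ir_score += tf * idf
    return ir_score * structural_weight

# Toy usage: two candidate result tuples for the query "database systems".
df = {"database": 120, "systems": 300}
print(tuple_score("principles of database systems", ["database", "systems"], df, 10000, 0.8))
print(tuple_score("operating systems overview", ["database", "systems"], df, 10000, 1.0))
```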
7.
Searching textual databases can be confusing for users. Popular search systems for the World Wide Web and stand-alone systems typically provide a simple interface: users type in keywords and receive a relevance-ranked list of 10 results. This is appealing in its simplicity, but users are often frustrated because search results are confusing or aspects of the search are out of their control. If we are to improve user performance, reduce mistaken assumptions, and increase successful searches, we need more predictable design. To coordinate design practice, we suggest a four-phase framework that would satisfy first-time, intermittent, and frequent users accessing a variety of textual and multimedia libraries.
8.
Precision-oriented search results such as those typically returned by the major search engines are vulnerable to issues of polysemy. When the same term refers to different things, the dominant sense is preferred in the rankings of search results. In this paper, we propose a novel two-box technique in the context of Web search that utilizes contextual terms provided by users for query disambiguation, making it possible to prefer other senses without altering the original query. A prototype system, Bobo, has been implemented. In Bobo, contextual terms are used to capture domain knowledge from users, help estimate the relevance of search results, and route them towards a user-intended domain. A major advantage of Bobo is that a wide range of domain knowledge can be effectively utilized, where helpful contextual terms do not even need to co-occur with query terms on any page. We have extensively evaluated the performance of Bobo on benchmark datasets; the results demonstrate the utility and effectiveness of our approach.
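A toy Python sketch of the two-box idea: contextual terms from the second box only re-rank the results toward the intended sense, while the original query stays untouched. The overlap-based relevance measure is a simplification assumed for illustration, not Bobo's actual estimator.

```python
def rerank_with_context(results, context_terms):
    """Re-rank search results using a second box of contextual terms.
    Relevance here is just the fraction of contextual terms appearing
    in a result's snippet, a simplification of Bobo's relevance
    estimation; the original query itself is never modified."""
    context = {t.lower() for t in context_terms}

    def relevance(snippet):
        words = set(snippet.lower().split())
        return len(context & words) / len(context) if context else 0.0

    return sorted(results, key=lambda r: relevance(r["snippet"]), reverse=True)

# Query "jaguar" with contextual terms hinting at the animal, not the car.
results = [
    {"title": "Jaguar XF review", "snippet": "luxury car performance engine"},
    {"title": "Jaguar habitat",   "snippet": "rainforest predator wildlife conservation"},
]
print(rerank_with_context(results, ["wildlife", "rainforest"])[0]["title"])
```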
9.
Dewey codes are an important labeling scheme in XML keyword search. In existing work, Dewey codes are usually stored as character strings, which makes the storage cost of a Dewey code set excessive; moreover, during LCA computation the numeric value at each level of a Dewey code must be recovered through character comparisons, which hurts the efficiency of LCA computation. This paper proposes PSVL, a storage scheme based on prefix sharing and variable-length integer encoding, which eliminates the character comparisons while reducing the storage cost of the Dewey code set. Experiments show that storing Dewey code sets with this scheme effectively lowers their storage cost and reduces the time spent recovering the per-level values of Dewey codes, indirectly improving the efficiency of LCA computation.
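A Python sketch of the two ingredients named above, prefix sharing and variable-length integer encoding, applied to a sorted list of Dewey codes. The byte layout is an assumption chosen for illustration; the actual PSVL format may differ.

```python
def encode_varint(n):
    """Variable-length integer: 7 value bits per byte, high bit = continue."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_dewey_list(dewey_codes):
    """Store a sorted list of Dewey codes with prefix sharing: each code
    records how many leading components it shares with the previous one,
    then only the remaining components as varints. Numeric values are
    stored directly, so no character comparisons are needed later."""
    out = bytearray()
    prev = []
    for code in dewey_codes:
        shared = 0
        while shared < min(len(prev), len(code)) and prev[shared] == code[shared]:
            shared += 1
        out += encode_varint(shared)              # shared-prefix length
        out += encode_varint(len(code) - shared)  # number of new components
        for component in code[shared:]:
            out += encode_varint(component)
        prev = code
    return bytes(out)

codes = [[1, 2, 1], [1, 2, 2], [1, 2, 130], [1, 3]]
print(len(encode_dewey_list(codes)), "bytes instead of",
      sum(len(".".join(map(str, c))) for c in codes), "characters")
```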
10.
Journal of Computer Virology and Hacking Techniques - m-Health stands for mobile health, where mobile devices are used for collecting and distributing health-related data. As the information...
11.
Andreas Griesmayer Zhiming Liu Charles Morisset Shuling Wang 《Innovations in Systems and Software Engineering》2013,9(1):3-16
The refinement calculus provides a methodology for transforming an abstract specification into a concrete implementation by following a succession of refinement rules. These rules have been mechanized in theorem provers, thus providing a formal and rigorous way to prove that a given program refines another. In previous work, we extended this mechanization to object-oriented programs, where the memory is represented as a graph, and we integrated our approach within the rCOS tool, a model-driven software development tool providing a refinement language. Hence, for any refinement step, the tool automatically generates the corresponding proof obligations and the user can manually discharge them, using a provided library of refinement lemmas. In this work, we propose an approach to automate the search for applicable refinement rules from one program to another, using the rewriting tool Maude. Each refinement rule in Maude is associated with the corresponding lemma in Isabelle, allowing the tool to automatically generate the Isabelle proof whenever a refinement rule can be found automatically. The user can add a new refinement rule by providing the corresponding Maude rule and Isabelle lemma.
12.
Jorge-Arnulfo Quiané-Ruiz Philippe Lamarre Patrick Valduriez 《The VLDB Journal: The International Journal on Very Large Data Bases》2009,18(3):649-674
In large-scale distributed information systems, where participants are autonomous and have special interests for some queries, query allocation is a challenge. Much work in this context has focused on distributing queries among providers in a way that maximizes overall performance (typically throughput and response time). However, preserving the participants’ interests is also important. In this paper, we make the following contributions. First, we provide a model to define the participants’ perception of the system regarding their interests and propose measures to evaluate the quality of query allocation methods. Then, we propose a framework for query allocation called Satisfaction-based Query Load Balancing (SQLB, for short), which dynamically trades consumers’ interests for providers’ interests based on their satisfaction. Finally, we compare SQLB, through experimentation, with two important baseline query allocation methods, namely Capacity-based and Mariposa-like. The results demonstrate that SQLB yields high efficiency while satisfying the participants’ interests and significantly outperforms the baseline methods.
Work partially funded by the ARA “Massive Data” programme of the French ministry of research (Respire project) and the European STREP Grid4All project.
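As a rough illustration of the trade-off described in this abstract, the Python sketch below allocates a query to the provider that maximizes a mix of the consumer's preference and the provider's interest, weighted by how dissatisfied that provider currently is. The weighting rule and the sample data are assumptions, not SQLB's actual formulas.

```python
def allocate_query(query, providers, consumer_pref, satisfaction):
    """Pick a provider for a query by trading the consumer's preference
    against each provider's interest, giving more weight to whichever
    provider is currently less satisfied. The scoring rule below is an
    illustrative assumption, not SQLB's actual definition.

    providers[p](query) -- provider p's declared interest in this query (0..1)
    consumer_pref[p]    -- how much the consumer wants provider p (0..1)
    satisfaction[p]     -- provider p's current satisfaction (0..1)
    """
    def score(p):
        provider_interest = providers[p](query)
        # A dissatisfied provider's interest counts more, which over time
        # balances allocations across participants.
        weight = 1.0 - satisfaction[p]
        return weight * provider_interest + (1.0 - weight) * consumer_pref[p]
    return max(providers, key=score)

providers = {"p1": lambda q: 0.9, "p2": lambda q: 0.4}
print(allocate_query("SELECT ...", providers,
                     consumer_pref={"p1": 0.2, "p2": 0.8},
                     satisfaction={"p1": 0.9, "p2": 0.3}))   # -> "p2"
```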
13.
We introduce a new cryptographic primitive, called proxy re-encryption with keyword search, which is motivated by the following scenario in email systems: Charlie sends an encrypted email, which contains some keywords such as “urgent”, to Alice under Alice’s public key, and Alice delegates her decryption rights to Bob via her mail server. The desired situations are: (1) Bob can decrypt mails delegated from Alice by using only his private key, (2) Bob’s mail gateway, with a trapdoor from Bob, can test whether the email delegated from Alice contains some keywords such as “urgent”, and (3) Alice and Bob do not wish to give the mail server or mail gateway access to the content of emails. The function of proxy re-encryption with keyword search (PRES) is the combination of proxy re-encryption (PRE) and public key encryption with keyword search (PEKS). However, a PRES scheme cannot be obtained by directly combining those two schemes, since the resulting scheme can no longer be proven secure in our security model. In this paper, a concrete construction is proposed, which is proven secure in the random oracle model based on the modified Decisional Bilinear Diffie-Hellman assumption.
14.
Yingwu Zhu 《Peer-to-Peer Networking and Applications》2016,9(1):142-158
In this paper, we focus on a distributed keyword continuous query processing system built on distributed hash tables. Treating bandwidth as a first-class resource, we propose novel query indexing algorithms, including MHI and SAP-MHI, multicast-based document announcement, and adaptive query resolution to reduce bandwidth cost. Our detailed simulations show that the proposed techniques, combined, substantially cut down bandwidth consumption.
15.
This paper presents a simple and intuitive method for mining search engine query logs for fast social filtering, in which searchers are provided with dynamic query recommendations on a large-scale, industrial-strength search engine. We adopt a dynamic approach that absorbs new and recent trends in web usage on search engines while forgetting outdated ones, thus adapting to dynamic changes in web users’ interests. In order to get well-rounded recommendations, we combine two methods: first, we model search engine users’ sequential search behavior and interpret this consecutive search behavior as client-side query refinement, which should form the basis for the search engine’s own query refinement process. This query refinement process is exploited to learn useful information that helps generate related queries. Second, we combine this method with a traditional text- or content-based similarity method to compensate for the shortness of query sessions and the sparsity of real query log data.
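A compact Python sketch of the combination described above: consecutive queries within a session are counted as client-side refinements, and a plain word-overlap similarity backs them up where the session data is sparse. The mixing weight and the overlap measure are assumptions for illustration, not the paper's exact formulation.

```python
from collections import Counter, defaultdict

def build_recommender(sessions, alpha=0.7):
    """Combine the two signals: (1) consecutive queries inside sessions,
    read as client-side query refinements, and (2) word-overlap
    similarity to compensate for sparse logs. alpha is an assumed weight."""
    follows = defaultdict(Counter)
    for session in sessions:
        for q, next_q in zip(session, session[1:]):
            follows[q][next_q] += 1          # refinement: q was followed by next_q

    all_queries = {q for s in sessions for q in s}

    def similarity(a, b):
        wa, wb = set(a.split()), set(b.split())
        return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

    def recommend(query, k=3):
        total = sum(follows[query].values()) or 1
        scored = {}
        for cand in all_queries:
            if cand == query:
                continue
            refine = follows[query][cand] / total
            scored[cand] = alpha * refine + (1 - alpha) * similarity(query, cand)
        return sorted(scored, key=scored.get, reverse=True)[:k]

    return recommend

recommend = build_recommender([
    ["python", "python tutorial"],
    ["python", "python pandas"],
    ["java tutorial", "java streams tutorial"],
])
print(recommend("python"))
```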
16.
Similarity search aims to find all objects similar to a query object. Typically, some base similarity measures for the different properties of the objects are defined, and light-weight similarity indexes for these measures are built. A query plan specifies which similarity indexes to use with which similarity thresholds and how to combine the results. Previous work creates only a single, static query plan to be used by all queries. In contrast, our approach creates a new plan for each query.
17.
This research investigates an approach to query processing in a multidatabase system that uses an object-oriented model to capture the semantics of other data models. The object-oriented model is used to construct a global schema, defining an integrated view of the different schemas in the environment. The model is also used as a self-describing model to build a meta-database for storing information about the global schema. A unique aspect of this work is that the object-oriented model is used to describe the different data models of the multidatabase environment, thereby extending the meta-database with semantic information about the local schemas. With the global and local schemas all represented in an object-oriented form, structural mappings between the global schema and each local schema are then easily supported. An object algebra then provides a query language for expressing global queries, using the structural mappings to translate object algebra queries into SQL queries over the local relational schemas. The advantage of using an object algebra is that the object-oriented database can be viewed as a blackboard for temporary storage of local data and for establishing relationships between different databases. The object algebra can be used to directly retrieve temporarily stored data from the object-oriented database or to transparently retrieve data from local sources using the translation process described in this paper.
18.
19.
Domain-specific Web search with keyword spices
Domain-specific Web search engines are effective tools for reducing the difficulty experienced when acquiring information from the Web. Existing methods for building domain-specific Web search engines require human expertise or specific facilities. However, we can build a domain-specific search engine simply by adding domain-specific keywords, called "keyword spices," to the user's input query and forwarding it to a general-purpose Web search engine. Keyword spices can be effectively discovered from Web documents using machine learning technologies. The paper describes domain-specific Web search engines that use keyword spices for locating recipes, restaurants, and used cars.
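The mechanism is simple enough to sketch directly: the learned keyword spice, a Boolean expression of domain-characteristic terms, is appended to the user's query before it is forwarded to a general-purpose engine. The spice shown below is a made-up example, not one learned in the paper.

```python
def add_keyword_spices(user_query, spice):
    """Turn a general-purpose search engine into a domain-specific one by
    appending a 'keyword spice', a Boolean expression of domain-specific
    keywords, to the user's input query."""
    return f"({user_query}) AND ({spice})"

# A hypothetical spice for the recipe domain (not taken from the paper).
recipe_spice = '(ingredients OR "preheat oven") AND NOT restaurant'
print(add_keyword_spices("chicken teriyaki", recipe_spice))
# The spiced query is then forwarded to a general-purpose Web search engine.
```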
20.
Existing XML keyword query methods involve two steps: identifying the nodes that satisfy the required semantics, and then constructing the subtrees that satisfy the given conditions. This way of processing requires scanning the keyword inverted lists several times and is therefore inefficient. To address this problem, this paper proposes a fast grouping method that reduces the number of inverted-list scans, and, building on it, the FastMatch algorithm, which constructs the qualifying subtrees with only a single scan of the keyword inverted lists, thereby improving query efficiency. Experiments verify the efficiency of the proposed method.
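To illustrate the single-scan idea (though not FastMatch itself), the Python sketch below merges several Dewey-sorted keyword inverted lists in one pass with a heap and groups the matches under a common ancestor at a fixed depth. The grouping depth and the sample labels are assumptions for illustration.

```python
import heapq

def single_scan_groups(inverted_lists, group_depth=1):
    """One-pass grouping sketch: merge the keyword inverted lists (each
    sorted by Dewey label) with a heap and group matches that fall under
    the same ancestor at a fixed depth. This only illustrates scanning
    every list exactly once; it is not the FastMatch algorithm."""
    merged = heapq.merge(*[[(label, kw) for label in lst]
                           for kw, lst in inverted_lists.items()])
    groups = {}
    for label, kw in merged:                 # each list is scanned exactly once
        key = label[:group_depth]            # ancestor used as the grouping key
        groups.setdefault(key, {}).setdefault(kw, []).append(label)
    # Keep only groups whose subtree contains every query keyword.
    return {k: v for k, v in groups.items() if len(v) == len(inverted_lists)}

lists = {
    "xml":   [(1, 1, 2), (1, 2, 1), (2, 1)],
    "query": [(1, 2, 3), (3, 1)],
}
print(single_scan_groups(lists))   # only the group under ancestor (1,) survives
```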