Similar literature
Found 20 similar documents; search took 15 ms
1.
Users are rarely familiar with the content of a data source they are querying, and therefore cannot avoid using keywords that do not exist in the data source. Traditional systems may respond with an empty result, causing dissatisfaction, while the data source in effect holds semantically related content. In this paper we study this no-but-semantic-match problem on XML keyword search and propose a solution which enables us to present the top-k semantically related results to the user. Our solution involves two steps: (a) extracting semantically related candidate queries from the original query and (b) processing candidate queries and retrieving the top-k semantically related results. Candidate queries are generated by replacement of non-mapped keywords with candidate keywords obtained from an ontological knowledge base. Candidate results are scored using their cohesiveness and their similarity to the original query. Since the number of queries to process can be large, with each result having to be analyzed, we propose pruning techniques to retrieve the top-k results efficiently. We develop two query processing algorithms based on our pruning techniques. Further, we exploit a property of the candidate queries to propose a technique for processing multiple queries in batch, which improves the performance substantially. Extensive experiments on two real datasets verify the effectiveness and efficiency of the proposed approaches.
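
The candidate-query step in abstract 1 (replacing non-mapped keywords with related keywords) can be illustrated with a minimal Python sketch. The names `candidate_queries`, `vocabulary`, and `related` are hypothetical, and a plain dictionary stands in for the ontological knowledge base; scoring and pruning are not shown.

```python
from itertools import product


def candidate_queries(query, vocabulary, related):
    """Build candidate queries by swapping each non-mapped keyword
    for related keywords that do occur in the data source.

    query      -- list of keywords typed by the user
    vocabulary -- set of keywords present in the data source (assumed)
    related    -- dict: keyword -> related keywords (stand-in for an ontology)
    """
    options = []
    for kw in query:
        if kw in vocabulary:
            # Mapped keyword: keep it unchanged.
            options.append([kw])
        else:
            # Non-mapped keyword: keep only candidates the source contains.
            options.append([c for c in related.get(kw, []) if c in vocabulary])
    # One candidate query per combination of choices.
    return [list(combo) for combo in product(*options)]


vocab = {"professor", "teacher", "database"}
onto = {"lecturer": ["professor", "teacher", "tutor"]}
queries = candidate_queries(["lecturer", "database"], vocab, onto)
```

The number of candidate queries grows multiplicatively with the number of non-mapped keywords, which is consistent with the abstract's motivation for pruning and batch processing.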

2.
3.
4.
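To support optimal ordered-route keyword queries, an iterative OSRK algorithm based on a dynamic threshold is proposed. By continually shrinking the threshold, it filters out spatial objects that cannot appear in the optimal ordered route; while iteratively extending routes, it also removes spatial objects that do not contain the given keywords. This effectively reduces the size of the candidate spatial dataset and improves query response performance. Experiments verify the effectiveness of the algorithm.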

5.
Query learning is to learn a concept (i.e., a representation of some language) through communication with a teacher (i.e., someone who knows the concept). The purpose of this paper is to prepare a formal framework for studying polynomial-time query learnability. We introduce necessary notation and, by using several examples, clarify notions that are necessary for discussing polynomial-time query learning. This is an extended version of a part of the paper A Formal Study of Learning via Queries, which was presented at the 17th International Colloquium on Automata, Languages, and Programming. The preparation of this paper was done while the author was visiting the Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, and was supported in part by ESPRIT II Basic Research Actions program of the EC under Contract No. 3075 (project ALCOM) and by Grant in Aid for Scientific Research of the Ministry of Education, Science, and Culture of Japan under Grant-in-Aid for Co-operative Research (A) 02302047 (1990).

6.
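To address the shortcomings of traditional relational databases in handling massive spatio-textual data, a spatio-textual indexing scheme that combines Geohash encoding with word segmentation is proposed on top of the HBase database, together with a spatial keyword query algorithm for polygonal regions built on this index. Experimental comparison with a traditional latitude-longitude indexing scheme verifies the efficiency and scalability of the algorithm.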
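
As a rough illustration of the Geohash encoding mentioned in abstract 6, the Python sketch below implements the standard Geohash algorithm (interleaved longitude/latitude bisection, base-32 output). The `row_key` helper, which joins a Geohash prefix with a segmented term, is a hypothetical key layout chosen for illustration; the abstract does not state the actual HBase row-key design.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet


def geohash_encode(lat, lon, precision=8):
    """Encode a latitude/longitude pair as a Geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    even = True  # even-numbered bits refine longitude, odd ones latitude
    while len(chars) < precision:
        if even:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        even = not even
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def row_key(lat, lon, term, precision=6):
    """Hypothetical row key: Geohash cell first, then the keyword."""
    return f"{geohash_encode(lat, lon, precision)}|{term}"


code = geohash_encode(57.64911, 10.40744, 5)
key = row_key(57.64911, 10.40744, "cafe", 5)
```

Because nearby points usually share a Geohash prefix, a key that starts with the Geohash lets a range scan over one cell retrieve co-located objects, which is one plausible reason to pair it with a sorted key-value store.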

7.
8.
Keyword query is an important means to find object information in XML documents. Most of the existing keyword query approaches adopt the subtrees rooted at the smallest lowest common ancestors of the keyword matching nodes as the basic result units. The structural relationships among XML nodes are excessively emphasized but the semantic relevance is not fully exploited. To change this situation, we propose the concept of entity subtree and emphasize the semantic relevance among different nodes when querying information from XML. In our approach, keyword queries are extended into a new keyword-based query language, Grouping and Categorization Keyword Expression (GCKE), and the core query algorithm, finding entity subtrees (FEST), is proposed to return high-quality results by fully using the keyword semantic meanings exposed by GCKE. We demonstrate the effectiveness and the efficiency of our approach through extensive experiments.

9.
Intense regulatory focus on secure retention of electronic records has led to a need to ensure that records are trustworthy, i.e., able to provide irrefutable proof and accurate details of past events. In this paper, we analyze the requirements for a trustworthy index to support keyword-based search queries. We argue that trustworthy index entries must be durable: the index must be updated when new documents arrive, and not periodically deleted and rebuilt. To this end, we propose a scheme for efficiently updating an inverted index, based on judicious merging of the posting lists of terms. Through extensive simulations and experiments with two real-world data sets and workloads, we demonstrate that the scheme achieves online update speed while maintaining good query performance. We also present and evaluate jump indexes, a novel trustworthy and efficient index for join operations on posting lists for multi-keyword queries. Jump indexes support insert, lookup and range queries in time logarithmic in the number of indexed documents.

10.
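To address the result-ranking problem in keyword search systems over relational databases, a new ranking method is proposed. The method combines query relevance with structural weight: each tuple is treated as a virtual document and scored in an information retrieval (IR) style, with normalized term frequency and normalized inverse document frequency measuring how relevant the tuple is to the query, while a structural weight over the whole result reflects its semantic strength. Compared with earlier ranking methods that consider only structural weight, this method is more effective at placing results highly relevant to the query at the top. Experimental results show that the ranking method incorporating query relevance ranks results effectively.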
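
The idea in abstract 10 of scoring each tuple as a virtual document and then weighting the whole result by its structure can be sketched as follows. The abstract does not give the formulas, so everything here is an assumption made for illustration: the normalization of term frequency by tuple length, the log-based normalized IDF, and the structural weight `1 / number of joined tuples`.

```python
import math
from collections import Counter


def tuple_relevance(tuple_text, query_terms, doc_freq, n_tuples):
    """IR-style relevance of one tuple treated as a virtual document."""
    words = tuple_text.lower().split()
    if not words:
        return 0.0
    tf = Counter(words)
    score = 0.0
    for term in query_terms:
        if tf[term] == 0:
            continue
        # Assumed normalization: term frequency relative to tuple length.
        ntf = tf[term] / len(words)
        # Assumed normalization: IDF scaled into (0, 1].
        df = doc_freq.get(term, 1)
        nidf = math.log(1 + n_tuples / df) / math.log(1 + n_tuples)
        score += ntf * nidf
    return score


def result_score(tuples, query_terms, doc_freq, n_tuples):
    """Combine tuple relevance with an assumed structural weight
    that favors results joining fewer tuples."""
    relevance = sum(
        tuple_relevance(t, query_terms, doc_freq, n_tuples) for t in tuples
    )
    structural_weight = 1.0 / len(tuples)
    return relevance * structural_weight


doc_freq = {"xml": 1, "search": 2}
compact = result_score(["xml keyword search"], ["xml", "search"], doc_freq, 4)
loose = result_score(
    ["xml keyword search", "unrelated tuple here"], ["xml", "search"], doc_freq, 4
)
```

Under these assumptions the compact result outranks the looser one that carries the same matching tuple plus an irrelevant join partner.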

11.
Precision-oriented search results such as those typically returned by the major search engines are vulnerable to issues of polysemy. When the same term refers to different things, the dominant sense is preferred in the rankings of search results. In this paper, we propose a novel two-box technique in the context of Web search that utilizes contextual terms provided by users for query disambiguation, making it possible to prefer other senses without altering the original query. A prototype system, Bobo, has been implemented. In Bobo, contextual terms are used to capture domain knowledge from users, help estimate relevance of search results, and route them towards a user-intended domain. A major advantage of Bobo is that a wide range of domain knowledge can be effectively utilized, where helpful contextual terms do not even need to co-occur with query terms on any page. We have extensively evaluated the performance of Bobo on benchmark datasets, and the results demonstrate the utility and effectiveness of our approach.

12.
Shneiderman  B. 《Software, IEEE》1997,14(2):18-20
Searching textual databases can be confusing for users. Popular search systems for the World Wide Web and standalone systems typically provide a simple interface: users type in keywords and receive a relevance-ranked list of 10 results. This is appealing in its simplicity, but users are often frustrated because search results are confusing or aspects of the search are out of their control. If we are to improve user performance, reduce mistaken assumptions, and increase successful searches, we need more predictable design. To coordinate design practice, we suggest a four-phase framework that would satisfy first-time, intermittent, and frequent users accessing a variety of textual and multimedia libraries.

13.
杨宁  陈群 《计算机工程与应用》2013,49(1):137-140,151
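Dewey codes are an important encoding scheme in XML keyword search. In current research, Dewey codes are usually stored as character strings, which makes their storage cost excessive; moreover, during LCA computation the value of each level of a Dewey code can only be obtained through character comparison, which hurts the efficiency of LCA computation. A PSVL storage scheme based on prefix sharing and variable-length integer encoding is proposed, which eliminates character comparisons and reduces the storage cost of Dewey code sets. Experiments show that storing Dewey code sets with this scheme effectively lowers their storage cost and shortens the time needed to obtain the value of each level of a Dewey code, indirectly improving the efficiency of LCA computation.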
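
The two ideas named in abstract 13, prefix sharing and variable-length integer encoding, can be illustrated with the Python sketch below. It is not the paper's PSVL layout, which the abstract does not specify: here each Dewey code is a tuple of integers, a code is stored as (length of prefix shared with the previous code, number of new components, new components), and integers use a common 7-bit varint format. All function names are hypothetical.

```python
def lca(a, b):
    """LCA of two Dewey codes = their longest common prefix.
    Components are integers, so no character comparison is needed."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def varint(n):
    """Encode a non-negative integer in 7-bit groups, low group first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_sorted(codes):
    """Store each code as its delta against the previous code."""
    out = bytearray()
    prev = ()
    for code in codes:
        shared = len(lca(prev, code))
        suffix = code[shared:]
        out += varint(shared) + varint(len(suffix))
        for comp in suffix:
            out += varint(comp)
        prev = code
    return bytes(out)


def decode(data):
    """Inverse of encode_sorted."""
    codes, prev, i = [], (), 0

    def read():
        nonlocal i
        shift = value = 0
        while True:
            byte = data[i]
            i += 1
            value |= (byte & 0x7F) << shift
            if not byte & 0x80:
                return value
            shift += 7

    while i < len(data):
        shared, count = read(), read()
        code = prev[:shared] + tuple(read() for _ in range(count))
        codes.append(code)
        prev = code
    return codes


dewey = [(1,), (1, 2), (1, 2, 300), (1, 3, 1)]
packed = encode_sorted(dewey)
```

In document order, consecutive Dewey codes tend to share long prefixes, so the delta form stores only the few components that change.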

14.
梁银  董永权 《计算机应用》2014,34(7):1992-1996
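In spatial keyword queries, it is sometimes necessary to find a group of objects that is compact and close to the query point, covers the query keywords, and contains few objects, whereas existing query methods usually return only a single spatial object containing all query keywords. To this end, an approximate query algorithm and an exact query algorithm are proposed for this type of query. First, a formal definition of the query problem is given, along with a cost function describing the quality of an object set, and the cost function is normalized. Then, the approximate algorithm prunes with an IR-tree-based best-first search strategy, effectively shrinking the candidate search space; the exact algorithm uses an IR-tree-based breadth-first search strategy to find objects containing the query keywords, so as to reduce query processing cost. Experimental results show that the approximate algorithm is markedly more efficient than the exact algorithm while still producing highly accurate query results.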

15.
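To keep sensitive information secure, users usually encrypt it before storing it in a cloud database, which makes database management and subsequent use more difficult. A secure query scheme is proposed that obtains the result set matching the query conditions without exposing sensitive information. Using a pseudorandom function and a Bloom filter, the keyword set of the sensitive information is preprocessed and a corresponding index data structure is generated in the database, supporting queries with a variable number of keywords and efficient data updates. At query time, the client computes the trapdoor for each keyword and sends it to the server, and the server executes the query using the trapdoor; by concatenating the trapdoors computed from multiple keywords, a multi-keyword query can be converted into a single-keyword query without increasing time complexity. In addition, valid trapdoors can only be generated by users who hold the key, and trapdoors leak no sensitive information, so the scheme does not rely on a fully trusted database service provider. Compared with existing encryption approaches that use a special two-layer structure, the scheme improves query efficiency, solves the leakage of sensitive information when an encrypted database processes user queries, and lets users apply different encryption methods to sensitive information, giving it strong compatibility. Experiments using the TPC-H database benchmark scheme and test data show that the algorithm has high execution efficiency.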
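
The combination of a pseudorandom function and a Bloom filter described in abstract 15 can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the paper's scheme and not a vetted security design: HMAC-SHA256 stands in for the pseudorandom function, the filter size `M_BITS` and hash count `K_HASHES` are arbitrary, and multiple trapdoors are passed as a list rather than concatenated as the abstract describes.

```python
import hashlib
import hmac

M_BITS = 256   # Bloom filter size in bits (arbitrary for this sketch)
K_HASHES = 4   # bit positions set per keyword (arbitrary)


def trapdoor(key: bytes, keyword: str) -> bytes:
    """Client side: keyed PRF of the keyword (HMAC-SHA256 as a stand-in)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()


def positions(td: bytes):
    """Derive Bloom filter bit positions from a trapdoor."""
    return [
        int.from_bytes(hashlib.sha256(td + bytes([i])).digest()[:4], "big") % M_BITS
        for i in range(K_HASHES)
    ]


def build_index(key: bytes, keywords) -> int:
    """Client side: Bloom filter (as an int bitmask) over one record's keywords."""
    bloom = 0
    for kw in keywords:
        for p in positions(trapdoor(key, kw)):
            bloom |= 1 << p
    return bloom


def server_match(bloom: int, trapdoors) -> bool:
    """Server side: test membership using trapdoors only, never plaintext.
    Bloom filters can return false positives but not false negatives."""
    return all((bloom >> p) & 1 for td in trapdoors for p in positions(td))


secret = b"k" * 32
index = build_index(secret, ["diabetes", "insulin"])
hit_one = server_match(index, [trapdoor(secret, "diabetes")])
hit_both = server_match(
    index, [trapdoor(secret, "diabetes"), trapdoor(secret, "insulin")]
)
```

The server sees only the bitmask and the trapdoors; without the key it cannot compute a trapdoor for a keyword of its own choosing, which mirrors the abstract's point that valid trapdoors can only be produced by key holders.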

16.
Journal of Computer Virology and Hacking Techniques - m-Health stands for mobile health, where mobile devices are used for collecting and distributing health-related data. As the information...

17.
The refinement calculus provides a methodology for transforming an abstract specification into a concrete implementation, by following a succession of refinement rules. These rules have been mechanized in theorem provers, thus providing a formal and rigorous way to prove that a given program refines another one. In a previous work, we have extended this mechanization for object-oriented programs, where the memory is represented as a graph, and we have integrated our approach within the rCOS tool, a model-driven software development tool providing a refinement language. Hence, for any refinement step, the tool automatically generates the corresponding proof obligations and the user can manually discharge them, using a provided library of refinement lemmas. In this work, we propose an approach to automate the search of possible refinement rules from a program to another, using the rewriting tool Maude. Each refinement rule in Maude is associated with the corresponding lemma in Isabelle, thus allowing the tool to automatically generate the Isabelle proof when a refinement rule can be automatically found. The user can add a new refinement rule by providing the corresponding Maude rule and Isabelle lemma.

18.
In this paper, we focus our studies on a distributed keyword continuous query processing system that is built on distributed hash tables. Treating bandwidth as a first-class resource, we propose novel query indexing algorithms including MHI and SAP-MHI, multicast-based document announcement, and adaptive query resolution to reduce bandwidth cost. Our detailed simulations show that our proposed techniques, combined together, greatly cut down bandwidth consumption.

19.
We introduce a new cryptographic primitive, called proxy re-encryption with keyword search, which is motivated by the following scenario in email systems: Charlie sends an encrypted email, which contains some keywords, such as “urgent”, to Alice under Alice’s public key, and Alice delegates her decryption rights to Bob via her mail server. The desired situations are: (1) Bob can decrypt mails delegated from Alice by using only his private key, (2) Bob’s mail gateway, with a trapdoor from Bob, can test whether the email delegated from Alice contains some keywords, such as “urgent”, (3) Alice and Bob do not wish to give the mail server or mail gateway the access to the content of emails. The function of proxy re-encryption with keyword search (PRES) is the combination of proxy re-encryption (PRE) and public key encryption with keyword search (PEKS). However, a PRES scheme cannot be obtained by directly combining those two schemes, since the resulting scheme is no longer proven secure in our security model. In this paper, a concrete construction is proposed, which is proven secure in the random oracle model, based on the modified Decisional Bilinear Diffie-Hellman assumption.

20.
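Most existing spatial keyword query processing models support only matching by location proximity and textual similarity, and cannot return objects that are semantically close to the query but do not match it in form; moreover, current spatio-textual index structures cannot handle the numerical attributes of spatial objects. To address these problems, this paper proposes a spatial keyword query method that supports semantically approximate queries. First, word embedding is used to expand the user's original query, generating a series of query keywords semantically related to the original ones. Next, a hybrid index structure, AIR-Tree, is proposed that supports both textual and semantic matching and handles numerical attributes using the Skyline method. Finally, AIR-Tree is used for query matching, returning the top-k ordered spatial objects most relevant to the query conditions. Experimental analysis and results show that, compared with existing methods of the same kind, the proposed method achieves higher execution efficiency and better user satisfaction; queries based on the AIR-Tree index are 3.6% more efficient than those based on the IRS-Tree index, and query result accuracy is 10.14% and 16.15% higher than with the IR-Tree and IRS-Tree indexes, respectively.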
