Similar Literature
1.
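Existing XML keyword query methods consist of two steps: identifying the nodes that satisfy a given semantics, and building the subtrees that satisfy given conditions. This approach requires multiple scans of the keyword inverted lists and is therefore inefficient. To address this problem, a fast grouping method is proposed to reduce the number of inverted-list scans, and the FastMatch algorithm is then built on top of it. The algorithm needs to scan the keyword inverted lists only once to build the subtrees that satisfy the given conditions, which improves query efficiency. Experiments verify the efficiency of the method.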

2.
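When XML documents containing sensitive information are transmitted or exchanged over a network, users must be restricted to limited queries. How to improve query efficiency while keeping sensitive information secure has long been a research focus in the security field. Taking an instance information tree with access permissions as the main structure, the approach first extracts the backbone information and then applies it back to the instance information tree as a compression method that stores special nodes, laying the foundation for secure and efficient XML keyword queries. It also adopts an extended Dewey encoding, which facilitates secure querying. Experimental results show that this compression-based secure query approach reduces the storage burden and improves query efficiency.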

3.
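To address the loss of semantics in XML keyword queries based on LCA (Lowest Common Ancestor), an XML keyword query technique based on Natural Language Generation (NLG) is proposed. NLG content planning is applied to XML documents to produce a set of message sentences for the user's query; filtering this set enables semantics-based XML keyword queries and can also greatly improve query efficiency.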

4.
Emerging applications such as personalized portals, enterprise search, and web integration systems often require keyword search over semi-structured views. However, traditional information retrieval techniques are likely to be expensive in this context because they rely on the assumption that the set of documents being searched is materialized. In this paper, we present a system architecture and algorithm that can efficiently evaluate keyword search queries over virtual (unmaterialized) XML views. An interesting aspect of our approach is that it exploits indices present on the base data and thereby avoids materializing large parts of the view that are not relevant to the query results. Another feature of the algorithm is that, by solely using indices, we can still score the results of queries over the virtual view, and the resulting scores are the same as if the view were materialized. Our performance evaluation using the INEX data set in the Quark (Bhaskar et al. in Quark: an efficient XQuery full-text implementation. In: SIGMOD, 2006) open-source XML database system indicates that the proposed approach is scalable and efficient.

5.
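A keyword query algorithm for XML (Extensible Markup Language) based on result-type grouping is proposed. The entropy weighting method is used to determine result types; XML document nodes are then virtually grouped, and a corresponding query algorithm is given on the basis of the virtual groups. This both ensures that result information is complete and avoids losing meaningful results or returning meaningless ones. Experimental results show that, compared with SLCA and MLCEA, the proposed algorithm achieves some improvement in query quality, efficiency, and stability.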

6.
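An XML Keyword Query Algorithm Based on Effective Lowest Common Ancestors   Total citations: 1, self-citations: 0, citations by others: 1
Zheng Honghui (郑弘晖), Guo Hong (郭红). Journal of Computer Applications (《计算机应用》), 2010, 30(3): 825-830
For the problem of keyword search over XML documents, invalid query results are considered from two aspects: equivalence of element tag content and equivalence by element structural similarity. The concept of the effective lowest common ancestor (FLCA) is introduced, and on this basis the concept of the compact effective lowest common ancestor (CFLCA) is proposed. Based on the query result set so defined, a query algorithm based on an equivalent pattern value index (BEPVA) is proposed. Finally, a comparison with CVLCA and SLCA shows that the proposed method substantially improves query quality and query efficiency.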

7.
As a large number of corpora are represented, stored, and published in XML format, finding useful information in XML databases has become an increasingly important issue. Keyword search enables web users to easily access XML data without the need to learn a structured query language or to study complex data schemas. Most existing indexing strategies for XML keyword search are based upon Dewey encoding. In this paper, we propose a new encoding method called Level Order and Father (LAF) for XML documents. With LAF encoding, we devise a new index structure, called the two-layer LAF inverted index, which can greatly decrease the space complexity compared with a Dewey encoding-based inverted index. Furthermore, with the two-layer LAF inverted index, we propose a new keyword query algorithm called Algorithm based on Binary Search (ABS) that can quickly find all Smallest Lowest Common Ancestor (SLCA) nodes. We experimentally evaluate the two-layer LAF inverted index and the ABS algorithm on four real XML data sets selected from Wikipedia. The experimental results demonstrate the advantages of our index method and querying algorithm. The space consumed by the two-layer LAF index is less than half of that consumed by the Dewey inverted index. Moreover, ABS is about one to two orders of magnitude faster than the classic Stack algorithm. Concurrency and Computation: Practice and Experience, 2012. © 2012 Wiley Periodicals, Inc.
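
For readers unfamiliar with the Dewey-based baseline this abstract compares against, the following is a minimal sketch of SLCA semantics over Dewey labels. It is a brute-force illustration only, not the LAF encoding, the ABS algorithm, or the Stack algorithm from the paper. The function names (`lca`, `is_ancestor`, `slca`) and the sample labels are hypothetical, chosen for this example.

```python
from itertools import product

def lca(*labels):
    """Lowest common ancestor of Dewey labels = their longest common prefix.

    Each label is a tuple of ints, e.g. (1, 2, 1) for node 1.2.1.
    """
    prefix = []
    for parts in zip(*labels):
        if all(p == parts[0] for p in parts):
            prefix.append(parts[0])
        else:
            break
    return tuple(prefix)

def is_ancestor(a, b):
    """True if node a is a proper ancestor of node b."""
    return len(a) < len(b) and b[:len(a)] == a

def slca(inverted_lists):
    """Brute-force SLCA (illustration only, exponential in keyword count).

    inverted_lists holds one list of Dewey labels per keyword. A node is an
    SLCA if its subtree contains every keyword and no descendant does.
    """
    # Every LCA of one match per keyword covers all keywords.
    candidates = {lca(*combo) for combo in product(*inverted_lists)}
    # Keep only the candidates that have no descendant candidate.
    return sorted(c for c in candidates
                  if not any(is_ancestor(c, d) for d in candidates))

# Hypothetical inverted lists: keyword A matches nodes 1.1.1 and 1.2.1,
# keyword B matches nodes 1.1.2 and 1.3.
hits_a = [(1, 1, 1), (1, 2, 1)]
hits_b = [(1, 1, 2), (1, 3)]
print(slca([hits_a, hits_b]))  # [(1, 1)]
```

Here node 1.1 is the only SLCA: the root 1 also covers both keywords, but it is excluded because its descendant 1.1 already does.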

8.
In this paper, we study the problem of keyword proximity search in XML documents. We take the disjunctive semantics among the keywords into consideration and find top-k relevant compact connected trees (CCTrees) as the answers of keyword proximity queries. We first introduce the notions of compact lowest common ancestor (CLCA) and maximal CLCA (MCLCA), and then propose compact connected trees and maximal CCTrees (MCCTrees) to efficiently and effectively answer keyword proximity queries. We give the theoretical upper bounds of the numbers of CLCAs, MCLCAs, CCTrees and MCCTrees, respectively. We devise an efficient algorithm to generate all MCCTrees, and propose a ranking mechanism to rank MCCTrees. Our extensive experimental study shows that our method achieves both high efficiency and effectiveness, and outperforms existing state-of-the-art approaches significantly.

9.
With a significant advance in ciphertext searchability, public-key encryption with keyword search (PEKS) guarantees both security and convenience for outsourced keyword search over ciphertexts. In this paper, we establish a static index (SI) and a dynamic index (DI) for PEKS to make search both efficient and secure. Suppose there are u senders who generate n searchable ciphertexts for w keywords. The search complexity of PEKS is always O(n) per query, even if the keyword has been searched multiple times, which is clearly inefficient for massive sets of searchable ciphertexts. SI and DI help PEKS lower this burden in two phases: if the queried keyword is being searched for the first time, SI reduces the complexity from O(n) to O(uw); otherwise, DI reduces it from O(n) to O(w). Because DI is invalid for the first search on any keyword, SI and DI are applied together with PEKS to form the secure hybrid indexed search (SHIS) scheme. Since uw ≪ n in practice, our SHIS scheme is significantly more efficient than PEKS, as our analysis demonstrates. Finally, we show how SHIS extends to multi-receiver applications, which pure PEKS does not support.
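
To give a feel for the size of the gap between the stated bounds, here is a rough worked example. The values of u, w, and n below are purely hypothetical and are not taken from the paper, and asymptotic bounds hide constant factors, so the ratios indicate scale only.

```latex
u = 10,\quad w = 10^{3},\quad n = 10^{7}
\;\Longrightarrow\;
uw = 10^{4} \ll n, \qquad
\frac{n}{uw} = 10^{3}, \qquad
\frac{n}{w} = 10^{4}
```

Under these assumed values, a first-time search bounded by O(uw) works on a quantity about a thousand times smaller than n, and a repeated search bounded by O(w) on one about ten thousand times smaller.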

10.
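Keyword Query Algorithm on XML Streams   Total citations: 2, self-citations: 1, citations by others: 1
To address problems in current research on XML stream filtering, keyword querying is adopted as the solution. The concept of the rightmost containing boundary is proposed and combined with a virtual stack to implement the XVirtual Stack algorithm for keyword queries over XML data streams. Theoretical analysis and experimental results show that the algorithm is efficient.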
