Similar literature
 Found 15 similar documents; search time 203 ms
1.
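Existing twig pattern matching algorithms for uncertain XML are all merge-based, which wastes considerable space and time. To address this, the paper proposes a non-merging twig pattern matching algorithm for continuous uncertain XML based on the P-document model. The algorithm filters and prunes nodes both when they are enqueued and when they are dequeued, reducing the number of nodes to be processed, and it stores intermediate results in interlinked lists during matching, so no merging is required. Theoretical analysis and experimental results show that the algorithm is an efficient query algorithm for continuous uncertain XML.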

2.
缪丰羽  王宏志 《计算机科学》2016,43(11):284-290
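Fuzzy XML documents are XML documents that contain uncertain information. Little research exists on querying fuzzy XML documents, and all of it assumes tree-structured XML documents. Targeting the characteristics of graph-structured fuzzy XML documents, this paper designs a set of efficient pattern matching algorithms over such documents. The algorithms rely on an index suited to graph-structured documents and match nodes bottom-up, which greatly reduces repeated node checks and requires neither merging of partial matching results nor extra filter functions for parent-child (PC) relationships. Theoretical analysis and experimental results show that the proposed algorithms not only outperform existing related algorithms on twig queries but also handle DAG pattern matching queries well.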

3.
张晓琳  王鹏 《计算机工程与设计》2014,(5):1674-1677,1704
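To preserve equivalence when sequence matching is applied to twig pattern matching over uncertain XML, the problems of false alarms and false dismissals must be reconsidered. To address this, the paper analyzes the equivalence of pattern-tree serialization, subsequence matching, and structural filtering in sequence matching over uncertain XML, which makes the theoretical basis for applying sequence matching to uncertain XML twig pattern matching more complete. Experiments verify the equivalence and efficiency of sequence matching over uncertain XML. Theoretical analysis and experimental results show that sequence matching applied to uncertain XML is equivalent to sequence matching on ordinary XML and is efficient.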

4.
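The real world contains large amounts of imprecise and uncertain information, so research on representing and processing fuzzy data has been widely pursued. As a next-generation Web language, XML has become the standard for representing and exchanging Web data. The emergence of imprecise and uncertain data poses new challenges for XML, and existing results cannot meet the pressing need for intelligent data management in fuzzy XML environments. Building on a fuzzy XML data model and starting from encoding techniques, this paper discusses node encoding in fuzzy XML and then studies twig queries in fuzzy XML. It proposes a twig pattern matching algorithm for fuzzy XML, gives an indexing algorithm that accelerates twig matching, and demonstrates the advantages of the proposed methods experimentally.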

5.
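With the rapid development of the Internet, XML has become the common standard for data representation and exchange on the Web, so querying XML data efficiently has become an important research topic. In recent years the twig pattern matching problem has been widely studied and many twig pattern matching algorithms have been proposed. Drawing on the strengths of these algorithms, this paper proposes a new twig pattern matching algorithm, TwigEN. Based on the structure of the XML document, it can skip element nodes that are useless in structural joins, which reduces the number of nodes to process, shortens processing time, and saves memory.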

6.
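An XML pattern matching method based on a structural index   Total citations: 2, self-citations: 0, citations by others: 2  
XML documents use a tree data model, and queries over them are usually evaluated by matching a pattern tree with selection predicates against the XML data. Finding all sets of elements in an XML document that conform to the pattern tree structure is therefore the core operation of XML query processing. This paper proposes a structural index, JoinGuide, and on that basis a new XML pattern matching method. The method uses JoinGuide to pre-match the pattern tree, so that a query over the XML document can use the matching results on the index to skip some join predicates and unnecessary candidate XML element sequences. The paper also proposes three concrete algorithms that use the index matching results for further query processing. Experimental results show that this pattern tree matching method outperforms earlier matching methods and that the index requires very little space.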

7.
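Twig pattern matching is the core operation of XML queries, and a series of effective implementations has been proposed. After summarizing and analyzing earlier matching algorithms, this paper proposes a new path-index-based solution, TwigFilter, a single-phase algorithm that avoids path merging. Because usually only a few nodes in a query are required as output, the method distinguishes output nodes from other query nodes and ensures that the entire query processing is driven by the output nodes. Experimental results show that the algorithm outperforms previous algorithms, particularly for expressions that contain only ancestor-descendant relationships.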

8.
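A twig pattern matching algorithm with parent-child edges based on ordered pairs   Total citations: 1, self-citations: 0, citations by others: 1  
With the development of the Internet and the steadily growing volume of XML data online, querying XML data accurately and efficiently has become a hot research topic. Many twig pattern matching algorithms have been proposed, but they do not solve twig pattern queries that contain parent-child edges. To address this problem, the paper proposes a new ordered-pair-based algorithm, PCTwig, which evaluates queries by building ordered pairs of parent-child relationships on both the query tree and the document tree. Query evaluation avoids generating intermediate results and requires no merge operation. Experiments show that the algorithm is effective.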

9.
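In recent years, querying XML data has become an important research topic. Processing twig queries is the core operation of XML query evaluation, and this paper proposes an improved twig pattern matching algorithm for such queries. The algorithm prunes useless data streams to reduce the number of nodes to be processed, which saves processing time and improves query accuracy. Experimental results show that the algorithm effectively improves query efficiency.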

10.
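XML document querying is currently a hot research topic, and twig pattern matching is an important direction within it, but most algorithms based on this idea can only handle queries that contain ancestor/descendant relationships. This paper therefore proposes a new twig pattern matching algorithm, TwigStackPC, which can effectively handle queries containing both ancestor/descendant and parent/child relationships.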
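The difference between the two relationship types can be illustrated with a small sketch. It assumes a region (start, end, level) encoding of element positions, which is a common choice in the twig matching literature but is not stated in the abstract above; the names `Region`, `is_ancestor`, and `is_parent` are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """Assumed region encoding of one XML element (not taken from the abstract)."""
    start: int  # position of the opening tag in document order
    end: int    # position of the closing tag in document order
    level: int  # depth in the document tree, root = 1


def is_ancestor(a: Region, d: Region) -> bool:
    # a is an ancestor of d when d's interval lies strictly inside a's interval
    return a.start < d.start and d.end < a.end


def is_parent(p: Region, c: Region) -> bool:
    # a parent is an ancestor exactly one level above the child
    return is_ancestor(p, c) and c.level == p.level + 1


# Encoding of the document <a><b><c/></b></a>
a = Region(1, 6, 1)
b = Region(2, 5, 2)
c = Region(3, 4, 3)
```

Under this encoding `a` is an ancestor of `c` but not its parent, which is the extra level test that a parent/child edge requires on top of interval containment.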

11.
Matching twigs in fuzzy XML   Total citations: 2, self-citations: 0, citations by others: 2  
A considerable number of twig pattern matching algorithms have been proposed to holistically process a twig query. Those algorithms mainly focus on twig pattern queries with AND-logic. However, there is often a need to process a twig query with OR-predicates. Furthermore, the existing algorithms fall short in their ability to support twig queries with OR-logic in fuzzy XML. To overcome this limitation, in this paper, we first introduce a novel encoding scheme to represent node information in fuzzy XML. Based on the encoding scheme, we then propose an effective algorithm for matching a twig pattern query with AND/OR-logic in fuzzy XML. Our approach adopts a compact stack technique to process complicated twig queries consisting of both AND-logic and OR-logic. More importantly, our method eliminates re-scanning of unnecessary portions of XML documents and redundant intermediate results. Finally, the experimental results demonstrate the performance advantages of our approach.

12.
Efficiently Querying Large XML Data Repositories: A Survey   Total citations: 1, self-citations: 0, citations by others: 1  
Extensible markup language (XML) is emerging as a de facto standard for information exchange among various applications on the World Wide Web. There has been a growing need for developing high-performance techniques to query large XML data repositories efficiently. One important problem in XML query processing is twig pattern matching, that is, finding in an XML data tree D all matches that satisfy a specified twig (or path) query pattern Q. In this survey, we review, classify, and compare major techniques for twig pattern matching. Specifically, we consider two classes of major XML query processing techniques: the relational approach and the native approach. The relational approach directly utilizes existing relational database systems to store and query XML data, which enables the use of all important techniques that have been developed for relational databases, whereas in the native approach, specialized storage and query processing systems tailored for XML data are developed from scratch to further improve XML query performance. As implied by existing work, XML data querying and management are developing in the direction of integrating the relational approach with the native approach, which could result in higher query processing performance and also significantly reduce system reengineering costs.
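The twig pattern matching problem defined above (find all matches of a query pattern Q in a data tree D) can be made concrete with a deliberately naive matcher. This is only a sketch of the problem statement, not any of the surveyed techniques; the classes `Node` and `QNode` and the functions `embed` and `twig_matches` are hypothetical names, and the `"/"` and `"//"` axis strings follow XPath convention.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Iterator


@dataclass
class Node:
    """An element of the data tree D."""
    tag: str
    children: list["Node"] = field(default_factory=list)


@dataclass
class QNode:
    """A node of the twig query Q."""
    tag: str
    # each branch is (axis, subquery); axis is "/" (child) or "//" (descendant)
    branches: list[tuple[str, "QNode"]] = field(default_factory=list)


def descendants(n: Node) -> Iterator[Node]:
    """Yield all proper descendants of n in document order."""
    for child in n.children:
        yield child
        yield from descendants(child)


def embed(q: QNode, n: Node) -> list[tuple[Node, ...]]:
    """All matches of q rooted at n, each a tuple of data nodes in query preorder."""
    if q.tag != n.tag:
        return []
    per_branch = []
    for axis, sub in q.branches:
        candidates = n.children if axis == "/" else descendants(n)
        found = [m for c in candidates for m in embed(sub, c)]
        if not found:
            return []  # one unsatisfied branch rules out this root
        per_branch.append(found)
    # combine one match per branch; a leaf query node yields the single match (n,)
    return [(n,) + tuple(x for part in combo for x in part)
            for combo in product(*per_branch)]


def twig_matches(q: QNode, root: Node) -> list[tuple[Node, ...]]:
    """Try every node of the data tree as a possible match for the query root."""
    return [m for n in [root, *descendants(root)] for m in embed(q, n)]


# D: <a><b><c/></b><b><d/></b><c/></a>      Q: a[b]//c
doc = Node("a", [Node("b", [Node("c")]), Node("b", [Node("d")]), Node("c")])
query = QNode("a", [("/", QNode("b")), ("//", QNode("c"))])
matches = twig_matches(query, doc)
```

This brute-force enumeration revisits subtrees repeatedly, which is the cost that holistic and index-based approaches aim to avoid.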

13.
Indexing and querying XML using extended Dewey labeling scheme   Total citations: 1, self-citations: 0, citations by others: 1  
Finding all the occurrences of a tree pattern in an XML database is a core operation for efficient evaluation of XML queries. The Dewey labeling scheme is commonly used to label an XML document to facilitate XML query processing by recording information on the path of an element. In order to improve the efficiency of XML tree pattern matching, we introduce a novel labeling scheme, called extended Dewey, which effectively extends the existing Dewey labeling scheme to combine the types and identifiers of elements in a label, and to avoid the scan of labels for internal query nodes to accelerate query processing (in I/O cost). Based on extended Dewey, we propose a series of holistic XML tree pattern matching algorithms. We first present TJFast to answer an XML twig pattern query. To efficiently answer a generalized XML tree pattern, we then propose GTJFast, an optimization that exploits the non-output nodes. In addition, we propose TJFastTL and GTJFastTL based on the tag + level data partition scheme to further reduce I/O costs by level pruning. Finally, we report our comprehensive experimental results to show that our set of XML tree pattern matching algorithms is superior to existing approaches in terms of the number of elements scanned, the size of intermediate results and query performance.
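As a rough illustration of why Dewey labels help, the sketch below uses the plain Dewey scheme, where a label is the sequence of child positions from the root. It does not implement extended Dewey, which additionally encodes element types; the function names are hypothetical.

```python
# A plain Dewey label: child ordinals on the path from the root, e.g. (1, 2, 3).
Dewey = tuple[int, ...]


def is_ancestor(a: Dewey, d: Dewey) -> bool:
    # a is an ancestor of d when a is a proper prefix of d
    return len(a) < len(d) and d[:len(a)] == a


def is_parent(p: Dewey, c: Dewey) -> bool:
    # the parent's label is the child's label without its last component
    return len(c) == len(p) + 1 and c[:-1] == p


def lowest_common_ancestor(x: Dewey, y: Dewey) -> Dewey:
    # the longest common prefix of the two labels
    common = []
    for i, j in zip(x, y):
        if i != j:
            break
        common.append(i)
    return tuple(common)
```

Structural relationships are decided from the labels alone, without visiting the document, which is what makes path-recording labels attractive for query processing.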

14.
Jian Liu  Z. M. Ma  Li Yan 《World Wide Web》2013,16(3):325-353
As the next generation language of the Internet, XML has been the de facto standard of information exchange over the web. A core operation for XML query processing is to find all the occurrences of a twig pattern in an XML database. In addition, the study of probabilistic data has become an emerging topic for various applications on the Web. Therefore, researching the combination of XML twig pattern and probabilistic data is quite significant. In prior work on probabilistic XML, the answers of a given twig query are always complete. However, complete answers with low probabilities may be deemed irrelevant while incomplete answers with high probabilities are of great significance because incomplete answers may be the potential answers that interest the users. Different from complete evaluation, evaluating incomplete twigs in probabilistic XML introduces some new challenges. On one hand, incomplete queries do not only obtain complete matches, but also return answers that contain considerable incomplete matches. On the other hand, the processing of incomplete evaluation is more complicated. It is obvious that a ranking approach should be adopted along with evaluating incomplete answers. In this paper, we propose an efficient algorithm to handle the problem of querying incomplete twigs over the probabilistic XML database. We also present a novel algorithm for ranking the incomplete answers. The experimental results show that our proposed algorithms can improve the performance of querying and ranking incomplete twigs significantly.

15.
The flexibility of the XML data model allows a more natural representation of uncertain data compared with the relational model. Matching a twig pattern against XML data is a fundamental problem in querying information from XML documents. For a probabilistic XML document, each twig answer has a probabilistic value because of the uncertainty of data. The twig answers that have small probabilistic values are useless to the users, and usually users only want to get the answers with the k largest probabilistic values. To this end, existing algorithms for ordinary XML documents are not directly applicable due to the need for handling probability distributional nodes and efficient calculation of top-k probabilities of answers in probabilistic XML. In this paper, we address the problem of finding twig answers with top-k probabilistic values against probabilistic XML documents directly. We propose a new encoding scheme called PEDewey for probabilistic XML in this paper. Based on this encoding scheme, we then design two algorithms for finding answers of top-k probabilities for twig queries. One is called ProTJFast, to process probabilistic XML data based on element streams in document order, and the other is called PTopKTwig, based on the element streams ordered by the path probability values. Experiments have been conducted to study the performance of these algorithms.
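The top-k idea can be sketched in a few lines under a strong simplifying assumption: each candidate answer is a single root-to-node path whose edges carry independent probabilities, so its probability is their product. This is not ProTJFast or PTopKTwig, and it ignores distributional nodes; the answer names and probabilities are made up for illustration.

```python
import heapq
from math import prod


def path_probability(edge_probs: list[float]) -> float:
    # Assumption: edges on the path are independent, so probabilities multiply.
    return prod(edge_probs)


def top_k(answers: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Return the k answers with the largest path probability, best first."""
    scored = [(name, path_probability(p)) for name, p in answers.items()]
    return heapq.nlargest(k, scored, key=lambda item: item[1])


# Hypothetical candidate answers with the edge probabilities along their paths.
candidates = {
    "t1": [1.0, 0.9, 0.8],  # probability about 0.72
    "t2": [0.5, 0.5],       # probability 0.25
    "t3": [1.0, 0.6],       # probability 0.6
}
best_two = top_k(candidates, 2)
```

Here every candidate is scored before selection; the point of ordering element streams by path probability, as the abstract describes for PTopKTwig, is presumably to avoid scoring low-probability answers at all.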
