首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到16条相似文献,搜索用时 140 毫秒
1.
缪丰羽  王宏志 《计算机科学》2016,43(11):284-290
模糊XML文档是指包含不确定信息的XML文档。在模糊XML文档查询方面,现有的研究成果较少,并且都是基于树型结构的XML文档进行的。针对图结构下模糊XML文档的特征,设计了一组高效的图结构模糊XML文档上的模式匹配算法。该算法基于一种适合于图结构文档的索引方式,采用自底向上的结点匹配顺序,大大减少了结点的重复判断操作,也不需要进行局部匹配结果的归并以及针对PC关系设计额外的过滤函数。理论分析以及实验结果证明,提出的模式匹配算法不仅在小枝查询性能上优于现有的相关算法,而且能够较好地实现DAG模式匹配查询。  相似文献   

2.
近年来, XML数据查询成为一个重要的研究课题。处理小枝查询是XML查询实现的核心操作,针对小枝模式查询,提出了一种改进的小枝模式匹配算法。该算法通过剪去无用的数据流以减少待处理结点的数目,从而节省处理时间,提高查询的准确率。实验结果表明,该算法能够有效提高查询效率。  相似文献   

3.
XML数据库的查询优化技术是当前数据库领域中的一个研究热点,而小枝模式匹配又是其中的一个研究重点.在总结分析各种小枝模式匹配算法的基础上,提出了一种新的基于Extended Dewey编码的小枝模式匹配方法.该方法首先使用TJFast算法在XML文档的JoinGuide索引上进行预匹配,然后再扫描预匹配结果中的叶子结点序列就可以找出所有的匹配结果.最后,用实验的方法同其它算法作了比较,并对实验结果进行了分析.  相似文献   

4.
现实世界中存在大量的不精确和不确定信息,因此,针对模糊数据的表示和处理的研究工作已经广泛展开.作为下一代Web语言,XML已经成为当前Web数据表示与交换的标准.不精确和不确定数据的出现对XML提出了新的挑战,现有的研究成果已不能满足模糊XML环境下智能化数据管理的迫切需求.为此,文中在模糊XML数据模型的基础上,从编码技术入手,讨论模糊XML环境下的结点编码问题,进而研究模糊XML环境下的小枝查询问题.文中提出了基于模糊XML的小枝模式匹配算法,给出了加速小枝匹配的索引算法,并最终通过实验证明了所提方法的优越性.  相似文献   

5.
当前针对小枝模式的XML查询是XML文档查询的研究热点。文章在分析XML数据小枝查询处理常用算法的基础上,提出了一种高灵活性的、易确定结点对之间结构关系的EDiezt-P编码,并基于EDiezt-P编码和层次栈结构提出了一种自底向上的小枝查询算法。实验表明,该算法在一定程度上减少了查询处理时间,提高了查询效率。  相似文献   

6.
小枝模式匹配作为XML查询的核心操作,目前在该方面已经提出了一系列有效的实现方法.在总结分析先前各种匹配算法的基础上,提出了一种新的基于路径索引的解决方法TwigFilter,该方法是一个单阶段算法,避免了路径归并.同时,考虑到通常查询中只有少数几个结点是所需的输出结果这一特点,该方法区别输出结点和其他查询结点,保证整个查询处理过程都是根据输出结点进行的.实验结果表明,该算法优于以前的算法,尤其是对查询中只有祖先-后裔关系的表达式更有效.  相似文献   

7.
针对目前不确定XML小枝模式匹配算法均基于归并,易造成很大的空间和时间浪费问题,提出基于P-文档模型的连续不确定XML的非归并的小枝模式匹配算法.算法在节点入队列和出队列时分别进行过滤剪枝操作,减少待处理节点的个数,匹配过程使用相互关联的链表存储中间结果,不需要归并.理论分析与实验结果表明,该算法是一种高效的连续不确定XML查询算法.  相似文献   

8.
一种基于有序对的含父子边的小枝模式匹配算法   总被引:1,自引:0,他引:1  
随着Internet的发展和网上XML数据规模的与日剧增,如何准确、高效地查询XML数据已经成为研究的热点问题.目前,已经提出了很多小枝模式匹配算法,但没有解决含有父子边的小枝模式查询.针对该问题,提出了一种基于有序对的新算法PCTwig,通过在查询树和文档树上分别建立父子关系的有序对来进行查询.查询过程中避免了产生中间结果,也不需要进行归并操作,实验证明该算法是有效的.  相似文献   

9.
目前,XML文档查询是研究的热点,其中小枝模式匹配方法是重要的研究方向,但是大多数基于这种思想的算法只能处理包含祖先/后代关系的查询。为此,提出了一种新的小枝模式匹配算法——TwigStackPC,它能够有效地处理包含祖先/后代和父/子关系的查询。  相似文献   

10.
姚全珠  郭祯  房美君 《计算机应用》2011,31(10):2782-2785
给定一个小枝模式查询,如何快速地在XML数据集中找到所有感兴趣的信息,已成为当前研究的热点。针对TwigStack算法在处理含有父子节点的情况下会产生大量的中间结果等问题,通过栈来对非叶子节点缓存和对叶子节点延迟输出的思想,提出了一种改进的小枝模式匹配算法--cTwigStack。采用Treebank数据集进行测验,结果表明该算法不仅仅在处理祖孙/后继节点时能使输出结果的准确性达到最优,而且在处理父子节点时,相对目前提出的算法,也是非常高效的。  相似文献   

11.
Matching twigs in fuzzy XML   总被引:2,自引:0,他引:2  
A considerable amount of twig pattern matching algorithms have been proposed to holistically process a twig query. Those algorithms mainly focus on twig pattern query with the AND-logic. However, there is often a need to process a twig query with the OR-predicates. Furthermore, the existing algorithms fall short in their ability to support twig query with OR-logic in fuzzy XML. To overcome this limitation, in this paper, we first introduce a novel encoding scheme to represent node information in fuzzy XML. Based on the encoding scheme, we then propose an effective algorithm for matching a twig pattern query with the AND/OR-logic in fuzzy XML. Our approach adopts a compact stack technique to process the complicated twig query consisting of both AND-logic and OR-logic. More importantly, our method eliminates re-scanning unnecessary portions of XML documents and redundant intermediate results. Finally, the experimental results demonstrate the performance advantages of our approach.  相似文献   

12.
Jian Liu  Z. M. Ma  Li Yan 《World Wide Web》2013,16(3):325-353
As the next generation language of the Internet, XML has been the de-facto standard of information exchange over the web. A core operation for XML query processing is to find all the occurrences of a twig pattern in an XML database. In addition, the study of probabilistic data has become an emerging topic for various applications on the Web. Therefore, researching the combination of XML twig pattern and probabilistic data is quite significant. In prior work of probabilistic XML, the answers of a given twig query are always complete. However, complete answers with low probabilities may be deemed irrelevant while incomplete answers with high probabilities are of great significance because incomplete answers may be the potential answers that interest the users. Different from complete evaluation, evaluating incomplete twigs in probabilistic XML introduces some new challenges. On one hand, incomplete queries do not only obtain complete matches, but also return answers that contain considerable incomplete matches. On the other hand, the processing of incomplete evaluation is more complicated. It is obvious that a ranking approach should be adopted along with evaluating incomplete answers. In this paper, we propose an efficient algorithm to handle the problem of querying incomplete twigs over the probabilistic XML database. We also present a novel algorithm for ranking the incomplete answers. The experimental results show that our proposed algorithms can improve the performance of querying and ranking incomplete twigs significantly.  相似文献   

13.
Efficiently Querying Large XML Data Repositories: A Survey   总被引:1,自引:0,他引:1  
Extensible markup language (XML) is emerging as a de facto standard for information exchange among various applications on the World Wide Web. There has been a growing need for developing high-performance techniques to query large XML data repositories efficiently. One important problem in XML query processing is twig pattern matching, that is, finding in an XML data tree D all matches that satisfy a specified twig (or path) query pattern Q. In this survey, we review, classify, and compare major techniques for twig pattern matching. Specifically, we consider two classes of major XML query processing techniques: the relational approach and the native approach. The relational approach directly utilizes existing relational database systems to store and query XML data, which enables the use of all important techniques that have been developed for relational databases, whereas in the native approach, specialized storage and query processing systems tailored for XML data are developed from scratch to further improve XML query performance. As implied by existing work, XML data querying and management are developing in the direction of integrating the relational approach with the native approach, which could result in higher query processing performance and also significantly reduce system reengineering costs.  相似文献   

14.
路径表达式查询是XML数据查询处理的核心研究问题之一,研究者开展了大量的研究工作.但这些研究更多关注XML数据上路径表达式的匹配,忽略了谓词"包含".研究XML查询处理中谓词"包含"的查询处理方法.采用了两种方法,第一种是采用跳跃表的方法,在XML分枝模式匹配时动态地对结点数据进行读取和关键字匹配.第二种是为XML文档中的词语建立倒排索引,来实现关键字的匹配.并从分枝模式路径长度、查询关键的数量和"包含"谓词判断结点的类型,对两种方法进行了分析和比较.  相似文献   

15.
16.
The flexibility of XML data model allows a more natural representation of uncertain data compared with the relational model. Matching twig pattern against XML data is a fundamental problem in querying information from XML documents. For a probabilistic XML document, each twig answer has a probabilistic value because of the uncertainty of data. The twig answers that have small probabilistic value are useless to the users, and usually users only want to get the answers with the k largest probabilistic values. To this end, existing algorithms for ordinary XML documents cannot be directly applicable due to the need for handling probability distributional nodes and efficient calculation of top-k probabilities of answers in probabilistic XML. In this paper, we address the problem of finding twig answers with top-k probabilistic values against probabilistic XML documents directly. We propose a new encoding scheme called PEDewey for probabilistic XML in this paper. Based on this encoding scheme, we then design two algorithms for finding answers of top-k probabilities for twig queries. One is called ProTJFast, to process probabilistic XML data based on element streams in document order, and the other is called PTopKTwig, based on the element streams ordered by the path probability values. Experiments have been conducted to study the performance of these algorithms.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号