首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到19条相似文献,搜索用时 250 毫秒
1.
汪万根 《计算机工程》2009,35(8):107-109
针对在XML文档树模型中进行后兄弟节点查询时内存消耗大、匹配效率低等缺陷,提出一种基于XML数据流与栈的后兄弟查询算法。采用SAX解析器与结构连接方法,对XML文档中所有已知节点与后兄弟节点进行精确匹配并输出。结果表明,该算法具有适用范围广、占用系统资源少、匹配效率高等优势。  相似文献   

2.
针对传统XML文档小枝模式查询算法中,与模式树中标签名相同的节点均入内存,易造成很大的空间浪费问题,提出了一种新的算法—StreamFWM(StreamFilter Without Merging)。StreamFWM采用区间编码方式,依据节点间的结构关系过滤标签流中无用的中间节点,且不用归并,只用简单的栈和列表实现。实验结果证明,算法StreamFWM相比TwigStack在查询处理的性能上有所提高。  相似文献   

3.
针对目前连续不确定XML数据同步多区间的查询处理算法易造成较大时间开销的问题,提出一种基于蒙特卡洛最小二乘思想的小枝模式查询处理算法QueryLSMC.算法根据查询请求依节点遍历序列顺序处理路径栈中节点,利用链表匹配并存储中间结果,通过构造随机样本集线性拟合目标节点中的连续分布函数,避免了对大量矩形分段的处理,有效地减少了计算量.实验结果表明,在取得理想精度的同时,该算法具有高效性.  相似文献   

4.
如何在XML数据流上高效地执行XPath查询,是XML数据流管理的关键问题。DTD结构信息对提高XML查询效率有很大帮助,已有的大部分算法没有利用这一资源。提出了一种使用DTD进行XML数据流查询处理的方法,具有以下特征:利用树自动机表示XPath;通过XPath树自动机与DTD树匹配,预先标识不匹配查询结构的DTD节点;给出一种利用DTD的XML流索引方法DBXSI;执行查询时,根据流索引信息直接跳过某些与查询不匹配的节点及子树。实验结果表明:该方法可有效支持Xpath查询,效率优于传统算法。  相似文献   

5.
《微型机与应用》2017,(15):16-21
作为网络数据交换和数据共享的标准,XML数据越来越多地用于表示应用系统的流数据。然而,受制于流数据处理有限空间开销等特征,如何高效地实现这种查询成为值得探讨的问题。与传统的基于自动机或层次栈方法不同,文中提出了一种基于图归约的XML查询自动机(GRAT),采用一种图结构来表示针对不同XML流元素的子查询任务之间的关系,通过图的归约变化来实现XPath查询。实验结果表明,基于GRAT的查询算法能够高效地完成复杂的XML查询,流数据处理的吞吐量达到了较高水平。  相似文献   

6.
XML流数据查询结果的缓存管理   总被引:2,自引:0,他引:2  
杨卫东  王清明  施伯乐 《软件学报》2008,19(8):2080-2088
提出一种系统地处理XML数据流的返回结果集的方法.在该方法中,用户对数据的兴趣用XQuery表示,能够处理递归文档以及同时处理多个查询;通过运行时栈驱动的基于二进制的前缀编码,在运行时确定结果集中节点之间的关系,避免了大量结果集之间的连接操作,能够有效减少内存耗费,提高处理性能.  相似文献   

7.
李素清  陶世群 《计算机应用》2007,27(12):3021-3025
XML已经成为Internet上一种普遍的数据交换标准,目前已经出现了多种对XML文档的查询方法。针对小枝模式的XML查询,提出了一种改进的小枝栈算法。该算法将路径栈算法的思想应用到它的主算法中实现了小枝模式查询。与仅使用路径栈算法相比,改进后的小枝栈算法在运行过程中不会产生中间结果,而且提高了找到小枝模式根元素后的查询效率。  相似文献   

8.
当前针对小枝模式的XML查询是XML文档查询的研究热点。文章在分析XML数据小枝查询处理常用算法的基础上,提出了一种高灵活性的、易确定结点对之间结构关系的EDiezt-P编码,并基于EDiezt-P编码和层次栈结构提出了一种自底向上的小枝查询算法。实验表明,该算法在一定程度上减少了查询处理时间,提高了查询效率。  相似文献   

9.
一种非归并不确定XML小枝模式查询算法   总被引:1,自引:1,他引:0  
针对目前不确定XML小枝模式查询需要存储大量中间结果和归并中间结果的情况,提出一种非归并不确定XML小枝模式查询算法ProTwigList。该算法查询之前通过Tag+Level流进行剪枝,以减少待处理节点的数目;并扩展了区间编码来对剪枝后剩余的普通节点进行编码,用一定规则对分布节点进行标识;查询时采用公共分布节点路径的方法处理分布结点,最后结合最低公共祖先节点的概率计算查询结果的概率值。理论分析和实验结果证明了ProTwigList算法的查询效率。  相似文献   

10.
一种逐层提升缓冲的XML流查询自动机   总被引:3,自引:1,他引:3  
如何在XML流上高效地执行大量XPath查询是当今研究的热点.特别在管道处理等应用中还希望在解析流的同时尽早地输出查询结果.定义了基本XSIEQ(XML Stream Query with Immediate Evaluation)机.它是一个XML流查询框架,是被索引化的、基于栈的自动机;在其上可以扩展应用多种XPath查询算法.在基本XSIEQ机上,提出一种逐层提升缓冲(promoting buffer,简称PBuf)的查询算法,形式地定义了基于PBuf的XSIEQ机并进行了实现和测试.实验结果表明,提出的方法能够支持复杂的XPath查询,在执行效率方面优于传统算法.  相似文献   

11.
Finding the occurrences of structural patterns in XML data is a key operation in XML query processing. Existing algorithms for this operation focus almost exclusively on path patterns or tree patterns. Current applications of XML require querying of data whose structure is complex or is not fully known to the user, or integrating XML data sources with different structures. These applications have motivated recently the introduction of query languages that allow a partial specification of path patterns in a query. In this paper, we consider partial path queries, a generalization of path pattern queries, and we focus on their efficient evaluation under the indexed streaming evaluation model. Our approach explicitly deals with repeated labels (that is, multiple occurrences of the same label in a query). We show that partial path queries can be represented as rooted dags for which a topological ordering of the nodes exists. We present three algorithms for the efficient evaluation of these queries. The first one exploits a structural summary of data to generate a set of path patterns that together are equivalent to a partial path query. To evaluate these path patterns, we extend a previous algorithm for path-pattern queries so that it can work on path patterns with repeated labels. The second one extracts a spanning tree from the query dag, uses a stack-based algorithm to find the matches of the root-to-leaf paths in the tree, and merge-joins the matches to compute the answer. Finally, the third one exploits multiple pointers of stack entries and a topological ordering of the query dag to apply a stack-based holistic technique. We analyze our algorithms and perform extensive experimental evaluations. Our experimental results show that the holistic algorithm outperforms the other ones. Our approaches are the first ones to efficiently evaluate this class of queries in the indexed streaming model.  相似文献   

12.
XML data broadcast is an efficient way to disseminate XML data to a large number of mobile clients in mobile wireless networks. Recently, several indexing methods have been proposed to improve the performance of XML query processing in terms of access time and tuning time over XML streams. However, existing indexing methods cannot process twig pattern XML queries. In this paper, we propose a novel structure for streaming XML data called PS+Pre/Post by integrating the path summary technique and the pre/post labeling scheme. Our proposed XML stream structure exploits the benefits of the path summary technique and the pre/post labeling scheme to efficiently process different types of XML queries over the broadcast stream. Experimental results show that our proposed XML stream structure improves the performance of access time and tuning time in processing different types of XML queries.  相似文献   

13.
Path queries have been extensively used to query semistructured data, such as the Web and XML documents. In this paper we introduce weighted path queries, an extension of path queries enabling several classes of optimization problems (such as the computation of shortest paths) to be easily expressed. Weighted path queries are based on the notion of weighted regular expression, i.e., a regular expression whose symbols are associated to a weight. We characterize the problem of answering weighted path queries and provide an algorithm for computing their answer. We also show how weighted path queries can be effectively embedded into query languages for XML data to express in a simple and compact form several meaningful research problems.  相似文献   

14.
Query matching on XML streams is challenging work for querying efficiency when the amount of queried stream data is huge and the data can be streamed in continuously. In this paper, the method Syntactic Twig-Query Matching (STQM) is proposed to process queries on an XML stream and return the query results continuously and immediately. STQM matches twig queries on the XML stream in a syntactic manner by using a lexical analyzer and a parser, both of which are built from our lexical-rules and grammar-rules generators according to the user's queries and document schema, respectively. For query matching, the lexical analyzer scans the incoming XML stream and the parser recognizes XML structures for retrieving every twig-query result from the XML stream. Moreover, STQM obtains query results without a post-phase for excluding false positives, which are common in many streaming query methods. Through the experimental results, we found that STQM matches the twig query efficiently and also has good scalability both in the queried data size and the branch degree of the twig query. The proposed method takes less execution time than that of a sequence-based approach, which is widely accepted as a proper solution to the XML stream query.  相似文献   

15.
孙东海  张昱  吴晓勇 《计算机科学》2007,34(10):137-142
如何在XML流上高效地执行大量复杂XQuery查询是当今研究的热点之一。在数据选择分发等应用中,还希望在解析流的同时尽早地输出查询结果。为此,本文将XQuery查询的路径导航和结果构造两个阶段分别运行于服务器、客户机两端。导航阶段针对XQuery查询定义了扩展的基本XSIEQ机E-XSIEQ(Extended XML Stream Quervwith Immediate Evaluation),它是一种被索引化、基于栈的自动机。在EXSIEQ机上设计应用了TreeBuf(TreeBuffer)算法,它是一种树型提升缓冲的查询算法,算法使用了前缀共享计算的技术,能高效处理XQuery查询,而且能优化XPath查询。实验证明了TreeBuf算法的高效性。  相似文献   

16.
李婷  程海涛 《计算机科学》2017,44(9):216-221, 226
在精确XML文档上的关键字查询方法的研究大多是基于LCA语义或者其变种语义(SLCA,ELCA等)开展的,将包含所有关键字的最紧致XML子树片段作为查询结果返回。但是这些基于LCA语义产生的查询结果中通常包含了大量的冗余信息,现实世界中存在着大量的不确定和模糊信息,因而如何从模糊XML文档中搜索到高质量的关键字查询结果是一个需要研究的问题。针对模糊XML文档上的关键字近似查询方法进行研究,通过引入最小连接树(MCT)的概念,提出在模糊XML文档上关键字查询的所有GDMCTs问题,并给出解决这一问题的基于栈的算法All fuzzy GDMCTs,该算法可以得到满足用户指定的子树大小阈值和可能性阈值条件的所有GDMCTs结果。实验表明,该算法在模糊XML文档上能够得到较高质量的关键字查询结果。  相似文献   

17.
XML关键字查询是一个用户比较方便的信息搜索方法,非常适用于用户在不熟悉XML查询语言和底层结构的情况下进行信息查询。现有的XML数据流上关键字查询多采用查找SLCA结果集的方式,为了解决基于SLCA结果集定义的不完备性,引入了基于XLCA的结果集定义,使其查询包含尽可能全的结果。文中对于XML数据流提出利用滑动窗口模型保存数据,基于XLCA的结果集定义,提出了一种TOP-K关键字查询算法,并从理论上证明了此算法的正确性和查询的完备性,分析了其时间复杂性和空间复杂性。  相似文献   

18.
Branch query processing is a core operation of XML query processing. In recent years, a number of stack based twig join algorithms have been proposed to process twig queries based on tag stream index. However, in tag stream index, each element is labeled separately without considering the similarity among elements. Besides, algorithms based on tag stream index perform inefficiently on large document. This paper proposes a novel index, named Clustered Chain Path Index, based on a novel labeling scheme. This index provides efficient support for processing branch queries. It also has the same cardinality as 1-index against tree structured XML document. Based on CCPI, efficient algorithms, KMP-Match-Path and Related-Path-Segment-Join, are proposed to process queries efficiently. Analysis and experimental results show that proposed query processing algorithms based on CCPI outperform other algorithms and have good scalability. This paper is partially supported by Natural Science Foundation of Heilongjiang Province, Grant No. zjg03-05 and National Natural Science Foundation of China, Grant No. 60473075 and Key Program of the National Natural Science Foundation of China, Grant No. 60533110.  相似文献   

19.
XML structural index, which acts as a schema, plays an important role in XML query optimization and formulation. To provide a reasonable structural index for branching path query under space constraint, we propose an adaptive index of multiple local branching depths and multiple local bisimilarities, which is constructed by maximizing marginal gain for given query load. It cannot only give good support to branching path queries but also have much smaller size compared with that of same sort of index. Detailed experiments have shown that the index is effective and efficient for XML branching path query.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号