Found 20 similar documents (search time: 171 ms)
1.
2.
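To address the drawbacks of existing maximal sequential pattern mining algorithms, namely too many candidate sequential patterns and poor scalability, a sequence-matching-based maximal sequential pattern mining algorithm, CSMS (compare sequence finding maximal sequential pattern), is proposed. The algorithm first builds a position information table for all frequent 1-sequences. It then finds potential maximal sequential patterns with a sequence extension matching method that searches the position information table both vertically and horizontally. During matching and extension, each potential maximal sequential pattern found is stored in an improved prefix tree, PStree (prefix sequential pattern tree). Each node of the tree links to an index hash table that holds the node's position information, so the position information of repeated sequences can be read directly from the hash table. Finally, pruning PStree yields MPStree (maximal sequential pattern tree), a prefix tree composed of maximal sequential patterns. Experimental results show that CSMS has good time efficiency and scalability.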
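The first CSMS step, building a position information table for the frequent 1-sequences, can be sketched as below. This is a minimal illustration only: the abstract does not give the table layout, so the `(sequence id, position)` pairs, the function name `position_table`, and the `min_support` parameter are all assumptions.

```python
from collections import defaultdict


def position_table(sequences, min_support):
    """Map each frequent item to the (sequence id, position) pairs where it occurs.

    Assumed layout: an item is frequent when it appears in at least
    `min_support` distinct sequences.
    """
    table = defaultdict(list)
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            table[item].append((sid, pos))
    # Keep only items supported by enough distinct sequences.
    return {
        item: occurrences
        for item, occurrences in table.items()
        if len({sid for sid, _ in occurrences}) >= min_support
    }


# Hypothetical toy database: each string is one sequence of single-character items.
table = position_table(["abcb", "acb", "bd"], min_support=2)
```

Here `d` occurs in only one sequence, so it is dropped, while `b` keeps all four of its occurrences. Later extension steps would consult these positions instead of rescanning the sequences.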
3.
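LinkNet: A New Approach for Lookup in Large-Scale P2P Systems  Total citations: 2, self-citations: 0, citations by others: 2
A new scalable distributed data structure, LinkNet, is proposed to support data lookup in large-scale P2P systems. In LinkNet, all elements are stored in a sorted doubly linked list, and each node of the list can store multiple elements. LinkNet uses virtual links to reduce storage overhead and speed up lookups. In a network of N nodes and M elements, the expected storage used by LinkNet is O(M), and when M is sufficiently large a lookup is expected to require only O(log N) messages.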
4.
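To make effective use of the path information in a document type definition (DTD) and reduce the number of structural joins, DTD elements and attributes are encoded with binary prefix codes, and the DTD encoding is incorporated into the XML node encoding. On this basis, a path expression query is decomposed into several query fragments, the result of each fragment is computed efficiently using bit operations on the binary prefix codes, and the fragment results are finally combined with structural joins. Experimental results show that the method is correct and efficient.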
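The abstract does not spell out the encoding, but the kind of bit operation a binary prefix code allows can be sketched as follows. The representation of a code as an integer plus a bit length, and the name `is_prefix`, are assumptions made for this illustration.

```python
def is_prefix(a_bits, a_len, b_bits, b_len):
    """Return True if the a_len-bit code a is a prefix of the b_len-bit code b.

    Codes are assumed to be stored as (integer value, bit length) so that
    leading zeros are preserved.
    """
    # Shift away b's trailing bits and compare what is left with a.
    return a_len <= b_len and (b_bits >> (b_len - a_len)) == a_bits


# Hypothetical codes: an element coded 10 and a descendant path coded 1011.
ancestor_match = is_prefix(0b10, 2, 0b1011, 4)
sibling_match = is_prefix(0b11, 2, 0b1011, 4)
```

One shift and one comparison decide whether a code extends another, which is why a prefix code can stand in for a structural join on some query fragments.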
5.
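Power consumption has become an important issue in the design of high-performance computer systems. The parallel storage system is an important component of such a system, and reducing its power consumption matters for reducing the power consumption of the whole parallel system. A parallel storage system consists of storage nodes, so reducing storage-node power is an important part of reducing the power of the storage system. This paper proposes a power optimization method for the processors of storage nodes: it adjusts processor voltage/frequency according to utilization information, and uses a frequency pre-adjustment algorithm guided by the metadata server to mitigate the response-time lag caused by frequency scaling. Analysis shows that the method can effectively reduce storage-node power consumption and thereby optimize the power consumption of the parallel storage system.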
6.
7.
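In practice, data such as accounting subjects are often stored in database tables as multi-level codes that embed each node's parent-child relationships in the code itself. Obtaining the name path from the root node to a given node in the tree, however, often has to be done programmatically. Based on an analysis of Oracle SQL statements, a method is proposed that generates the name path from the root node to a given node with a single Oracle SQL statement. The method has been applied in several systems. It is simple and practical, greatly improves program reliability, and offers a novel way to obtain absolute paths for other tree-structured relational data with multi-level catalog codes.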
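The underlying idea, that every ancestor's code is a prefix of the node's own code, can be illustrated in Python. This is not the paper's Oracle SQL statement, which the abstract does not reproduce. The fixed two-character level width, the sample account names, and the function `name_path` are assumptions for the sketch.

```python
def name_path(code, names, width=2, sep="/"):
    """Build the root-to-node name path for a multi-level code.

    Assumes each level contributes exactly `width` characters, so the
    ancestors of "100101" are "10" and "1001".
    """
    prefixes = [code[:end] for end in range(width, len(code) + 1, width)]
    return sep.join(names[prefix] for prefix in prefixes)


# Hypothetical chart of accounts keyed by multi-level code.
names = {"10": "Assets", "1001": "Cash", "100101": "Petty cash"}
path = name_path("100101", names)
```

With this table, `path` is `"Assets/Cash/Petty cash"`. A single SQL statement can follow the same prefix relationship by joining the table to itself on a code-prefix condition.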
8.
9.
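For collision detection algorithms that build a bounding-box hierarchy tree from traditional AABB bounding boxes, the main factors affecting detection efficiency are the number of levels in the hierarchy tree, the number of leaf nodes, and the number of bytes stored per node. To reduce the impact of node storage size and improve collision detection efficiency, this paper uses a B+ tree storage structure to store bounding boxes and related information. The storage indexes of the nodes are kept ordered before the bounding-box intersection tests, so no additional sorting of the nodes is needed, which reduces memory overhead and avoids unnecessary bounding-box tests. In addition, the non-leaf nodes of a B+ tree do not store concrete data, which reduces the storage space of the whole tree. Experiments show that, with the same detection environment and detected objects, the AABB collision detection algorithm using B+ tree storage has a markedly shorter detection time than the traditional AABB algorithm.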
10.
11.
12.
XML data can be represented by a tree or graph structure, and XML query processing requires information about the structural relationships among nodes. The basic structural relationships are parent-child and ancestor-descendant, and finding all occurrences of these relationships in XML data is clearly a core operation in XML query processing. Several node labeling schemes have been suggested to support the determination of ancestor-descendant or parent-child relationships simply by comparing the labels of nodes. However, previous node labeling schemes have some disadvantages, such as a large number of nodes that need to be relabeled when XML data is inserted, huge space requirements for node labels, and inefficient processing of structural joins. In this paper, we propose the nested tree structure, which eliminates these disadvantages while retaining the advantages of previous node labeling schemes. The nested tree structure makes it possible to use the dynamic interval-based labeling scheme, which supports XML data updates with almost no node relabeling as well as efficient structural join processing. Experimental results show that our approach is efficient in handling updates with the interval-based labeling scheme and also significantly improves the performance of structural join processing compared with recent methods.
13.
14.
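Encoding-Based Storage of XML in Relational Databases  Total citations: 2, self-citations: 0, citations by others: 2
As XML has developed, how to effectively use relational database technology to store and query XML data has become a research hotspot. A relational storage method for XML based on pre-order and post-order encoding is proposed. Its schema mapping allows XML documents based on different DTDs (or schemas) to be stored in the same relational table, supports fast XML path queries, and reconstructs XML documents efficiently. The handling of recursive schemas in this method is also discussed. Experiments show that, compared with XRel and with the relational XML storage methods proposed by Florescu, Kossman, and others, the method shortens the response time of complex XML path queries (such as path queries with predicate constraints).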
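Pre-order and post-order encoding can be sketched as follows. The abstract does not give the paper's table schema, so the `(pre, post)` pair per node, the dictionary-based tree, and the function names are assumptions. What the sketch relies on is the standard property that a node `a` is an ancestor of `b` exactly when `pre(a) < pre(b)` and `post(a) > post(b)`.

```python
def encode(tree, root):
    """Assign (pre-order, post-order) numbers to every node of a tree.

    `tree` maps a node name to the ordered list of its children.
    """
    codes = {}
    pre = post = 0

    def visit(node):
        nonlocal pre, post
        pre += 1
        my_pre = pre
        for child in tree.get(node, []):
            visit(child)
        post += 1
        codes[node] = (my_pre, post)

    visit(root)
    return codes


def is_ancestor(a, b):
    """True if the node coded `a` is a proper ancestor of the node coded `b`."""
    return a[0] < b[0] and a[1] > b[1]


# Hypothetical document tree: a has children b and c; b has child d.
codes = encode({"a": ["b", "c"], "b": ["d"]}, "a")
```

Stored as two integer columns in one relational table, these codes turn an ancestor-descendant step of a path query into a pair of range comparisons.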
15.
16.
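To improve the efficiency of XML query processing, it is proposed that XML data nodes be stored clustered by tag, with node path information stored in bit vectors. Node filter expressions are computed from the XML Schema and the query, and efficient operations between bit vectors eliminate the nodes that do not satisfy the filter expressions. A method is also given for filtering directly on the compressed data after the bit vectors are compressed. Experimental results show that this optimization is highly efficient for XML data queries.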
17.
XMin: Minimizing Tree Pattern Queries with Minimality Guarantee  Total citations: 1, self-citations: 0, citations by others: 1
Due to wide use of XPath, the problem of efficiently processing XPath queries has recently received a lot of attention. In
particular, a considerable effort has been devoted to minimizing XPath queries since the efficiency of query processing greatly
depends on the size of the query. Research work in this area can be classified into two categories: constraint-independent
minimization and constraint-dependent minimization. The former minimizes queries in the absence of integrity constraints while
the latter in the presence of them. For a linear path query, which is an XPath query without branching predicates, existing constraint-independent minimization methods are generally
known to be unable to minimize the query without processing the query itself. Most recently, however, by using the DataGuide, a representative structural summary of XML data, a constraint-independent method that minimizes linear path queries in a
top-down fashion has been proposed. Nevertheless, this method can fail to find a minimal query since it minimizes a query
by merely erasing labels from the original query whereas a minimal query could include labels that are not present in the
original query. In this paper, we propose a bottom-up approach called XMin that guarantees finding a minimal query for a given tree pattern query by using the DataGuide without processing the query itself. For the
linear path query, we first show that the sequence of labels occurring in the minimal query is a subsequence of every schema label sequence that matches the original query. Here, the schema label sequence for a node is the sequence of labels from the root of XML
data to the node. We then propose iterative subsequence generation that iteratively generates subsequences from the shortest schema label sequence matching the original query in a bottom-up
fashion and tests query equivalence. Using iterative subsequence generation, we can always find a minimal query and we formally
prove this guarantee. We also propose an extended algorithm that guarantees the minimality for the tree pattern query, which is a linear path query with branching predicates. These methods have been prototyped in a full-fledged object-relational
DBMS. The experimental results using real and synthetic data sets show the practicality of our method.
18.
XML documents are often viewed as trees (basically the parse tree
of the document), and queries over such documents typically test
for ancestor relationships among tree nodes. Search engines
process such queries using an index structure summarizing the
ancestor relations. In the index, each document item (tree node)
is identified using some logical id (node label), such that, given
two labels, the engine can determine the ancestor relationship
between the corresponding nodes. The length of the labels is a
main factor of the index size. Therefore, reducing this length,
even by a constant factor, is a critical issue. In this work we consider the
following problem. Given a rooted XML tree
T, label the nodes of T in the most compact way such that
given the labels of two nodes, one can determine in constant time, by
looking at the labels only, whether one node is an ancestor of the
other. Labelings currently being used are all variants of the
following interval scheme. Number the leaves say from left to right and label each
node with a pair consisting of the numbers of its smallest and largest
leaf descendants. An ancestor query then amounts to an interval
containment test on the labels. The maximum label length
using this scheme is 2 log n, where n is the number of nodes
in the tree. (All logarithms in this paper are to base 2.) The focus of this work is finding
a scheme that works best in practice on real XML data. We suggest an orthogonal prefix-based approach, where the labeling
is such that an ancestor query roughly amounts
to testing whether one label is a prefix of the other. We present
several new labeling schemes based on this approach and analyze
their performance both theoretically and empirically.
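The interval scheme described in this abstract can be sketched as below. The dictionary-based tree and the function names are assumptions. Note one limitation of the scheme as literally stated: a node with a single child receives the same interval as that child, so the sketch treats equal labels as "not an ancestor" and uses a tree without such chains.

```python
def interval_labels(tree, root):
    """Label each node with (smallest, largest) leaf number among its descendants.

    Leaves are numbered from left to right starting at 1.
    """
    labels = {}
    next_leaf = 0

    def visit(node):
        nonlocal next_leaf
        children = tree.get(node, [])
        if not children:
            next_leaf += 1
            labels[node] = (next_leaf, next_leaf)
            return
        for child in children:
            visit(child)
        labels[node] = (labels[children[0]][0], labels[children[-1]][1])

    visit(root)
    return labels


def is_ancestor(a, b):
    """Interval containment test: does label `a` properly contain label `b`?"""
    return a[0] <= b[0] and b[1] <= a[1] and a != b


# Hypothetical XML tree: book -> (title, chapter); chapter -> (section, para).
labels = interval_labels(
    {"book": ["title", "chapter"], "chapter": ["section", "para"]}, "book"
)
```

Each label holds two numbers of at most log n bits each, which is where the 2 log n bound quoted above comes from.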
19.
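Research on Data Partitioning Strategies in Parallel XML Database Systems  Total citations: 5, self-citations: 0, citations by others: 5
The data partitioning strategy is one of the important factors affecting the performance of a parallel database system. This paper focuses on partitioning large-scale XML documents in parallel XML database systems and proposes two new partitioning methods that differ from traditional database partitioning strategies: path instance balancing based on path schema (PSPIB) and node round-robin based on node schema (NSNRR). The idea of the former is to decluster path instances in the DOM tree that share the same path schema and assign them to different sites. The idea of the latter is to decluster element nodes in the DOM tree that have different node schemas across sites in round-robin fashion, while clustering element nodes with the same node schema at the same site. The implementation of the two strategies is also described, along with performance tests, analysis, and evaluation based on RPE queries.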
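The NSNRR idea (different node schemas go to sites in round-robin order, identical node schemas stay together) might look like the sketch below. The abstract does not define how a node schema is represented, so using the element tag as the schema, along with the function name `assign_sites`, is an assumption.

```python
def assign_sites(nodes, num_sites):
    """Place element nodes on sites: one site per node schema, chosen round-robin.

    `nodes` is a sequence of (node id, schema) pairs in document order.
    """
    site_of_schema = {}
    placement = {}
    for node_id, schema in nodes:
        if schema not in site_of_schema:
            # A schema seen for the first time takes the next site in turn.
            site_of_schema[schema] = len(site_of_schema) % num_sites
        placement[node_id] = site_of_schema[schema]
    return placement


# Hypothetical element nodes, with the tag name standing in for the node schema.
placement = assign_sites(
    [(1, "book"), (2, "title"), (3, "author"), (4, "title"), (5, "price")],
    num_sites=3,
)
```

Both `title` nodes land on the same site, while the four distinct schemas are spread over the three sites in turn.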
20.
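An efficient storage method is key to implementing a high-performance XML database. This paper proposes an XML document storage method based on access frequency. Built on the XPath model, the method analyzes the path expressions in XML query conditions to obtain the access frequency of each location path, and then decides the storage strategy for each node according to the access frequency of its location path in the document, so that XML document storage better matches actual query needs.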