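As research on and application of XML technology have deepened, large numbers of XML documents have emerged. To manage and query XML documents, most RDBMSs have been extended with XML-processing capabilities. This approach maps XML data onto relational tables, which destroys the tree structure of the XML data; queries then require multiple table joins, which lowers query efficiency. A native XML database uses the XML document as its basic logical storage unit and builds the underlying physical storage model around it. This paper proposes XBackend, a back-end implementation strategy for native XML databases that comprises a storage strategy, an indexing strategy, and a database recovery strategy. XBackend's underlying storage is designed specifically for XML data and, combined with suitable indexes, achieves higher storage and query efficiency. The recovery strategy keeps the data in the native XML database in a consistent state. Experimental results show that XBackend performs well.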

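Exploiting the tree structure of XML database documents and combining the Dewey encoding principle with the indexing properties of the B+ tree, this paper proposes a B+-tree-based structural index and query model for encrypted XML. During encryption of an XML document, the encrypted XML data is stored on the server together with a B+ tree index built over the encrypted data, so that structural indexing of the encrypted data can be carried out on the server side. Experimental results show that the method improves query efficiency, avoids decrypting irrelevant encrypted data, and effectively realizes structural indexing of encrypted XML data.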

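The native XML storage scheme directly affects query processing and data updates. Most current native XML storage schemes focus on query processing and seldom address support for updates. Unlike updates to relational tables, XML updates must take the document order of nodes into account. This paper proposes a new update mechanism for native XML storage that both preserves the document order of nodes and confines an update operation to a single page, which keeps updates efficient. By introducing forward-link records and relocation records, the mechanism keeps record storage addresses unchanged when a page splits, avoiding the I/O cost of updating indexes. An example illustrates that the update mechanism of this native XML storage scheme is effective.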

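Building efficient indexes to locate qualifying nodes quickly is essential to improving XML query efficiency. Aiming to reduce complexity and improve query efficiency, and building on the principles of path-based XML indexing, this paper proposes RTL-Index, a new index structure based on Dewey encoding. RTL-Index encodes document nodes to represent structural information and answers structural queries by prefix-path matching. It supports queries containing the wildcard "*" and the descendant axis "//", as well as queries over pattern trees whose sibling nodes are unordered. Simulation results show that RTL-Index has low time and space complexity, solves the branching-path lookup problem for XML documents, and is a reasonably effective XML index structure.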
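To make the idea of Dewey encoding with prefix matching concrete, here is a minimal Python sketch. It is not RTL-Index itself, whose internal layout the abstract does not describe. It only shows plain Dewey labels and the prefix test that such indexes rely on. The function names `dewey_labels`, `is_ancestor`, and `descendants_by_tag`, and the sample document, are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def dewey_labels(root):
    """Return (label, tag) pairs in document order.

    A label is a tuple of 1-based child positions; the root gets (1,).
    """
    out = []

    def visit(elem, label):
        out.append((label, elem.tag))
        # Iterating an Element yields its child elements in document order.
        for pos, child in enumerate(elem, start=1):
            visit(child, label + (pos,))

    visit(root, (1,))
    return out


def is_ancestor(a, b):
    """True if Dewey label a is a proper prefix of label b."""
    return len(a) < len(b) and b[:len(a)] == a


def descendants_by_tag(labels, anc_tag, desc_tag):
    """Answer a query shaped like //anc_tag//desc_tag by prefix matching only."""
    ancestors = [lab for lab, tag in labels if tag == anc_tag]
    return [lab for lab, tag in labels
            if tag == desc_tag and any(is_ancestor(a, lab) for a in ancestors)]


# Hypothetical sample document, used only for illustration.
DOC = "<lib><book><title/><author/></book><book><title/></book></lib>"
labels = dewey_labels(ET.fromstring(DOC))
titles = descendants_by_tag(labels, "book", "title")
```

Because the structural relationship is decided from the labels alone, the descendant-axis query never has to walk the tree again once the labels are stored.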

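How to process XML queries efficiently is an active research topic. Because current methods scan too many useless nodes and lose efficiency as a result, this paper designs a two-level index structure for XML data and gives a path query processing algorithm based on it. First, each node in the XML schema is assigned a code according to its path type, and nodes are then stored clustered by that code. At query time, the codes of the target nodes are derived from the schema information and the query, and only the parts of the two-level index that correspond to those codes are loaded into memory and filtered. The whole index therefore need not be scanned, which improves CPU and I/O efficiency. The paper also extends the two-level index so that the filtering index can be applied to queries with branching structure. Experimental results show that the proposed XML filtering algorithm is more efficient than the bit-vector-based filtering algorithm and that the index requires less storage space than the bit-vector index.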

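XML is becoming the preferred format for all kinds of databases, and for documents in particular. Because of the difference between the data models, however, using a relational database to query and process XML data poses new challenges for traditional database technology. This paper presents a DTD-based method for storing XML in a relational database. The method builds separate table structures and indexes for the DTD and for the XML documents, thereby improving query efficiency.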

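In keyword search, most index structures proposed so far mainly consider static XML documents. When an XML document is updated frequently, these structures may require large-scale re-encoding, which raises the cost of index maintenance. To keep the index structure stable while XML documents are updated dynamically, this paper proposes DLSS (DDE Level Structure Summary), an index structure that supports keyword search over dynamic XML documents. DLSS uses a Dewey encoding adapted for dynamic updates: when the document is updated, only the new nodes need to be assigned codes, and the existing codes do not have to be adjusted. Experiments show that DLSS remains relatively stable whether the XML document is updated frequently or rarely, and that it supports efficient keyword queries.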

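With the development of XML technology, how to store and query XML documents using existing database technology has become an active problem in XML data management. This paper introduces a new document encoding method and, based on it, a new storage method for XML documents. The method decomposes the tree structure of an XML document into nodes according to node type and stores them in the corresponding relational tables, so that documents of arbitrary structure can be stored under one fixed relational schema. To make querying easier, the simple path patterns that occur in the document are also stored in a table. The storage method effectively supports document queries and can correctly reconstruct the original XML document from the nodes' encoding information. Finally, the proposed storage method and reconstruction algorithm are validated experimentally.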

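How to represent time-dependent data in XML documents, track historical information, and restore a document to its state at any earlier moment has received considerable attention in recent research, and the literature proposes a variety of models. We treat this class of problems as the problem of indexing temporal XML documents. This paper maps a temporal XML document to points and straight lines in n-dimensional space, indexes them with a UB-tree, and proposes a new query algorithm for temporal queries. Experiments show that this index performs better than indexes previously proposed for temporal models.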

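A storage model for the data is a prerequisite for efficiently encoding, indexing, and querying XML documents. Given XML's characteristic tree structure, this article proposes a storage model for XML documents based on a three-pointer (trinary) linked list. On that basis it discusses how to implement query, update, insertion, deletion, and node-relationship tests on XML data, and analyzes the efficiency of the corresponding algorithms.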
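The abstract does not say which three links each node carries. A common reading of a three-pointer linked representation of a tree is parent, first child, and next sibling, and the Python sketch below assumes exactly that. The class `Node` and the helpers `build`, `children`, `is_ancestor`, and `insert_after` are hypothetical names chosen for illustration, not the paper's algorithms.

```python
import xml.etree.ElementTree as ET


class Node:
    """Tree node with three links: parent, first child, next sibling (assumed layout)."""
    __slots__ = ("tag", "parent", "first_child", "next_sibling")

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.first_child = None
        self.next_sibling = None


def build(elem, parent=None):
    """Convert an ElementTree element into the three-pointer structure."""
    node = Node(elem.tag, parent)
    prev = None
    for child in elem:
        c = build(child, node)
        if prev is None:
            node.first_child = c
        else:
            prev.next_sibling = c
        prev = c
    return node


def children(node):
    """Iterate over a node's children in document order."""
    c = node.first_child
    while c is not None:
        yield c
        c = c.next_sibling


def is_ancestor(a, d):
    """Node-relationship test: follow parent links upward from d."""
    p = d.parent
    while p is not None:
        if p is a:
            return True
        p = p.parent
    return False


def insert_after(sibling, tag):
    """Insert a new node right after `sibling`; only two links change."""
    new = Node(tag, sibling.parent)
    new.next_sibling = sibling.next_sibling
    sibling.next_sibling = new
    return new


root = build(ET.fromstring("<a><b><d/></b><c/></a>"))
b = root.first_child
insert_after(b, "x")
tags = [n.tag for n in children(root)]
```

With this layout an insertion touches a constant number of pointers, while an ancestor test costs time proportional to the depth of the descendant.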

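Constraint-preserving conversion of relational databases to XML documents   Total citations: 2, self-citations: 0, citations by others: 2
XML has become the technology trend on the Internet, and developing XML documents while retaining existing relational databases is currently the best option. This requires converting relational databases to XML documents while preserving data dependency constraints. In this process, schema translation must precede data conversion: existing relational databases are usually normalized, so the XML document tree structure has to be reconstructed before the conversion can take place. To this end, the normalized relations are first joined into a set of tables according to the existing data dependency constraints, which denormalizes them. The joined tables are then mapped to a set of DOMs and merged into an XML document tree. The root node chosen by the user, together with the nodes connected to it, forms the desired partial document tree, and the selected XML document tree is in turn mapped to an XML schema in DTD form. The joined tables can then be mapped to a set of DOMs, merged into a single DOM, and finally transformed into an XML document.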

XML is recognized by researchers and practitioners alike as the technology trend on the Internet, and companies need to adopt it. Having invested in their current relational database systems, they want to develop new XML documents while keeping existing relational databases running in production. They therefore need to reengineer the relational databases into XML documents while preserving constraints. In this process, schema translation must be done before data conversion. Because existing relational databases are usually normalized, they have to be reconstructed into XML document tree structures. This can be accomplished through denormalization, by joining the normalized relations into tables according to their data dependency constraints. The joined tables are mapped into DOMs, which are then integrated into XML document trees. The user specifies an XML document root with its relevant nodes to form a partitioned XML document tree that meets their requirements. The selected XML document tree is mapped into an XML schema in the form of a DTD. We then load the joined tables into DOMs, integrate them into a single DOM, and transform it into an XML document.

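An RDBMS-based storage method for XML data   Total citations: 1, self-citations: 0, citations by others: 1
The introduction of XML as a standard for data exchange on the Internet has made exchange between XML data and databases necessary, for two reasons: first, the large volume of diverse data on the Web needs to be stored and managed effectively; second, existing databases hold large amounts of data that need to be converted to XML and published on the Web. This paper proposes a data conversion framework based on a relational database and discusses XML storage strategies from the standpoint of data integrity. It builds a general XML data model, decomposes the document tree into nodes, and stores them in relational tables according to mapping rules, so that the document's schema information (DTD, XML Schema) need not be considered. Finally, a concrete document example illustrates the effectiveness of the strategy.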
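As an illustration of schema-independent storage, the following Python sketch decomposes a document tree into nodes and stores them in one generic relational table using the standard `sqlite3` module. The paper's actual mapping rules are not given in the abstract, so the table layout (`node` with `id`, `parent`, `ord`, `tag`, `text`), the function name `shred`, and the sample document are assumptions. Attributes are ignored to keep the sketch short.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(xml_text, conn):
    """Store every element of the document as one row; no DTD or schema is consulted."""
    conn.execute(
        "CREATE TABLE node ("
        "id INTEGER PRIMARY KEY, parent INTEGER, ord INTEGER, tag TEXT, text TEXT)"
    )
    counter = 0

    def visit(elem, parent_id, ord_):
        nonlocal counter
        counter += 1
        node_id = counter  # ids follow document order
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?, ?)",
            (node_id, parent_id, ord_, elem.tag, (elem.text or "").strip()),
        )
        for i, child in enumerate(elem, start=1):
            visit(child, node_id, i)

    visit(ET.fromstring(xml_text), None, 1)
    return counter


# Hypothetical sample document, used only for illustration.
DOC = "<lib><book><title>A</title></book><book><title>B</title></book></lib>"
conn = sqlite3.connect(":memory:")
n_nodes = shred(DOC, conn)

# The path step book/title becomes a self-join on the parent link.
rows = conn.execute(
    "SELECT c.text FROM node AS c JOIN node AS p ON c.parent = p.id "
    "WHERE p.tag = 'book' AND c.tag = 'title' ORDER BY c.id"
).fetchall()
```

The self-join in the query shows the trade-off that several of the abstracts here mention: a fixed table accepts any document, but each path step costs one join.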

We consider data exchange for XML documents: given source and target schemas, a mapping between them, and a document conforming to the source schema, construct a target document and answer target queries in a way that is consistent with the source information. The problem has primarily been studied in the relational context, in which data-exchange systems have also been built. Since many XML documents are stored in relations, it is natural to consider using a relational system for XML data exchange. However, there is a complexity mismatch between query answering in relational and in XML data exchange. This indicates that to make the use of relational systems possible, restrictions have to be imposed on XML schemas and mappings, as well as on XML shredding schemes. We isolate a set of five requirements that must be fulfilled in order to have a faithful representation of the XML data-exchange problem by a relational translation. We then demonstrate that these requirements naturally suggest the in-lining technique for data-exchange tasks. Our key contribution is to provide shredding algorithms for schemas, documents, mappings and queries, and demonstrate that they enable us to correctly perform XML data-exchange tasks using a relational system.

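Thanks to its many advantages, XML quickly became the standard for representing and exchanging information. The emergence of large amounts of XML data poses new challenges for data mining. Traditional association rule mining is based on relational databases, so many existing methods for mining association rules from XML data rely to some degree on a relational database, that is, they first map the XML documents into a relational database. After a careful study of the access interfaces for XML data, this paper presents a class interface based on the Apriori algorithm that mines association rules directly from XML documents, implemented in C# on the .NET platform.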
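The paper's class interface is written in C# and is not shown in the abstract. The Python sketch below only illustrates the general idea of running the frequent-itemset phase of Apriori on transactions read straight from an XML document, without an intermediate relational database. The element names `transaction` and `item`, the function names, and the sample data are assumptions, and rule generation from the frequent itemsets is left out.

```python
import xml.etree.ElementTree as ET
from itertools import combinations


def transactions(xml_text):
    """Read each <transaction> element as a set of <item> texts (assumed layout)."""
    root = ET.fromstring(xml_text)
    return [frozenset(i.text for i in t.findall("item"))
            for t in root.findall("transaction")]


def apriori(txns, min_support):
    """Return {itemset: support count} for all itemsets with count >= min_support."""
    current = {frozenset([i]) for t in txns for i in t}
    frequent = {}
    k = 1
    while current:
        # Count the support of each candidate of size k.
        counts = {c: sum(1 for t in txns if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


# Hypothetical sample data, used only for illustration.
DOC = """<db>
  <transaction><item>a</item><item>b</item><item>c</item></transaction>
  <transaction><item>a</item><item>b</item></transaction>
  <transaction><item>a</item><item>c</item></transaction>
  <transaction><item>b</item></transaction>
</db>"""
result = apriori(transactions(DOC), min_support=2)
```

Here the candidate {a, b, c} is generated in the join step but discarded in the prune step, because its subset {b, c} is not frequent.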

We are interested in specifying functional dependencies (FDs) for data-centric XML documents (XML documents that are used mainly for data storage). FDs are a natural constraint. Specifying FDs for XML documents is more difficult because unlike relational databases, XML documents do not have uniform structures. This paper introduces XML Template Functional Dependencies (XTFDs), which are able to specify FDs for XML documents. This paper also presents a necessary and sufficient condition for an XTFD to cause data redundancy in XML documents. Further, we propose Attribute Rule and Text String Rule as two procedures that can be repeatedly applied to remove redundancy caused by XTFDs. In addition, we prove that if an XML document has data redundancy with respect to an FD specified by using the tree tuple approach, it would have data redundancy with respect to an XTFD and show by example that XTFDs can specify some FDs for XML documents that the tree tuple approach cannot.

Schema-independent mapping of XML documents to relational data   Total citations: 4, self-citations: 0, citations by others: 4
周伟胜  鱼滨 《微机发展》2005,15(1):100-103
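The Internet contains large numbers of XML documents from different sources that need to be processed. Many of them have no fixed DTD or Schema defining their structure, and many schema-less XML documents also need to be handled. For this situation, the paper proposes a schema-independent method for mapping XML documents to a relational database; the method performs well for data operations.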

Extensible Markup Language (XML) is a simple, flexible text format derived from SGML and originally designed to support large-scale electronic publishing. Today XML plays a fundamental role in the exchange of a wide variety of data on the Web. Because XML lets designers create their own customized tags and enables the definition, transmission, validation, and interpretation of data between applications, devices, and organizations, much work in soft computing uses XML to take control of and responsibility for information (fuzzy markup language is one example), and there is accordingly a large amount of XML-based data and documents. However, most mobile and interactive ubiquitous multimedia devices have restricted hardware, such as CPU, memory, and display screen. It is therefore essential to compress an XML document or element collection into a brief summary before it is delivered to the user according to his or her information need. Query-oriented XML text summarization aims to give users a brief, readable substitute for the originally retrieved documents or elements according to the user’s query, which can effectively relieve the user’s reading burden. We propose QXMLSum, a query-oriented XML summarization system that extracts sentences and combines them into a summary based on three kinds of features: the user’s queries, the content of XML documents/elements, and the structure of XML documents/elements. Experiments on the IEEE-CS datasets used in the Initiative for the Evaluation of XML Retrieval show that the query-oriented XML summaries generated by QXMLSum are competitive.

Active XML (AXML) documents combine extensional XML data with intensional data defined through Web service calls. The dynamic properties of these documents pose challenges to both storage and data materialization techniques. In this paper, we present ARAXA, a non-intrusive approach to storing and managing AXML documents. We also define a methodology for materializing AXML documents at query time. The storage approach of ARAXA is based on plain relational tables and on user-defined functions of an object-relational DBMS that trigger the service calls. By using a DBMS we benefit from efficient storage tools and query optimization. Approaches without DBMS support have to process XML in main memory or provide virtual memory solutions. One of the main advantages of ARAXA is that AXML documents do not need to be loaded into main memory at query processing time, which is crucial when dealing with large documents. Experimental results with the ARAXA prototype show that our approach is scalable and capable of dealing with large AXML documents.

