
1.

DOM tree browsing of a very large XML document: Design and implementation

Seung Min Suk I. 《Journal of Systems and Software》2009,82(11):1843-1858

Browsing the DOM tree of an XML document means following the links among its nodes to find desired nodes without any prior knowledge to guide a search. When the structure of the XML document is not known to the user, browsing is the basic operation for consulting its contents. If the XML document is very large, however, using a general-purpose XML parser to browse the DOM tree and access arbitrary nodes may run out of memory while constructing the large DOM tree. To alleviate this problem, we suggest a method for browsing the DOM tree of a very large XML document by splitting the document into n small XML documents and sequentially generating the DOM tree of each of them. For later reference, information about some nodes accessed in the DOM trees already generated is also kept, using the concept of virtual nodes. With the suggested approach, the memory needed to browse the DOM tree of a very large XML document is reduced to a level that a personal computer can manage.
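The memory problem described in this abstract can be illustrated with a minimal sketch. It does not reproduce the authors' splitting and virtual-node method; it swaps in the Python standard library's incremental parser (`xml.etree.ElementTree.iterparse`) to show the general idea of visiting a large document piece by piece instead of building one complete tree. The function name `iter_records` and the `record_tag` parameter are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

def iter_records(source, record_tag):
    """Yield (tag, attributes, text) for each record_tag element, one at a time."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            yield elem.tag, dict(elem.attrib), (elem.text or "").strip()
            # Drop children that were already visited so the partial tree
            # held in memory stays small regardless of document size.
            root.clear()

# Hypothetical usage with a tiny in-memory document.
doc = io.BytesIO(b"<lib><book id='1'>A</book><book id='2'>B</book></lib>")
for record in iter_records(doc, "book"):
    print(record)
```

The trade-off is that nodes already discarded cannot be revisited, which is the gap the abstract's virtual nodes are meant to cover.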

2.

A validity verification algorithm for XML document schemas

董亚娟卑小贤刘西洋郑有才《计算机工程与应用》2005,41(16):86-89

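XML Schema, as the definition language for XML document schemas, cannot by its syntax alone guarantee that a defined schema is valid. This paper first analyzes the factors that affect the validity of an XML document schema and constructs an XML document schema graph. Based on the characteristics of XML instance documents, it analyzes the properties of the schema graph, including why recursive references may lead to deadlock. Finally, it gives a validity verification algorithm for XML document schemas, which makes up for the incomplete functionality of common validation tools.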

3.

Building a tree model for XML document parsing using relational tables   Total citations: 2, self-citations: 1, citations by others: 1

祝青阳王东《计算机应用》2009,29(6):1719-1721

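In research on data parsing and query operations for XML documents, the authors found that a tree reflects the hierarchical structure of an XML document well but is inefficient to query, whereas a relational table is a data structure suited to storing large amounts of data, with good query efficiency and operations. The paper presents a data model for storing XML documents that combines a tree with relational tables; during parsing, the model uses a callback, event-based, segmented parsing method to reduce parsing time and storage space. This preserves the structural characteristics of the XML document while improving query efficiency and ease of operation. Parsing and manipulation experiments on XML documents with large data volumes show that this data model has a clear advantage in processing large XML documents.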
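As a rough illustration of combining event-based parsing with relational storage, the sketch below uses the standard-library SAX parser to write one row per element into an in-memory SQLite table, recording each node's parent so the tree shape is preserved. The table layout (`node(id, parent_id, tag)`) and the names `NodeTableHandler` and `load` are assumptions made for this example, not the model defined in the paper.

```python
import sqlite3
import xml.sax

class NodeTableHandler(xml.sax.ContentHandler):
    """SAX callbacks that store each element as a row with a parent link."""

    def __init__(self, conn):
        super().__init__()
        self.conn = conn
        self.stack = []    # ids of the elements that are currently open
        self.next_id = 1

    def startElement(self, name, attrs):
        parent = self.stack[-1] if self.stack else None
        self.conn.execute(
            "INSERT INTO node(id, parent_id, tag) VALUES (?, ?, ?)",
            (self.next_id, parent, name),
        )
        self.stack.append(self.next_id)
        self.next_id += 1

    def endElement(self, name):
        self.stack.pop()

def load(xml_bytes):
    """Parse xml_bytes with SAX and return a connection holding the node table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, tag TEXT)"
    )
    xml.sax.parseString(xml_bytes, NodeTableHandler(conn))
    return conn

# Hypothetical usage: find the children of <a> with a self-join.
conn = load(b"<a><b/><c><d/></c></a>")
children = conn.execute(
    "SELECT child.tag FROM node AS child "
    "JOIN node AS parent ON child.parent_id = parent.id "
    "WHERE parent.tag = 'a' ORDER BY child.id"
).fetchall()
print(children)
```

Because SAX never builds the whole tree, memory use during loading depends on nesting depth rather than document size, while structural queries become ordinary SQL joins.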

4.

An XML schema validation method based on the XML Schema abstract model

王伟良施佺曹渠江《计算机应用与软件》2007,24(3):41-43,60

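XML schema validation plays an important role as the precondition and guarantee for processing XML data, yet XML Schema, as the definition language for XML, cannot by itself guarantee the validity of XML data. Using the XML Schema abstract model, the paper defines the schema information of each complex type in an XML Schema and describes an XML document as a set of ordered pairs of non-terminal nodes. Finally, it gives an XML schema validation algorithm that can effectively validate the organizational structure and content types of XML documents.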

5.

Research and implementation of a search engine for XML Repositories   Total citations: 1, self-citations: 0, citations by others: 1

乔娟杨炳儒《微计算机信息》2006,22(18):237-238

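Because XML developers can define their own elements at will, the same element may represent different information, or the same information may be represented by different elements, which makes exchanging XML documents quite difficult. To solve this problem, many groups and organizations have developed XML Repositories. Current mainstream search mechanisms are not suited to XML Repositories, so developing a search engine for them has become a new research topic. By analyzing the characteristics of XML Repositories and the limitations of mainstream search engines, and drawing on the concepts of "ontology" and "XML trees with incomplete information", this paper proposes XRDS, a new search engine model for XML document schemas, and validates it experimentally.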

6.

A study of the NXD approach to processing large XML data

李鹏飞吴洁丁秋林《微机发展》2006,16(3):179-181

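As a subset of the SGML markup language, XML represents structured and semi-structured data well and has therefore gradually become a standard for data exchange and information representation on the Internet and between applications. XML documents need to be analyzed and processed in more and more settings, and many methods and tools exist for doing so; for very large documents, however, traditional processing methods have many shortcomings. This paper proposes a new method of analyzing and processing XML documents that uses a Native XML Database (NXD) to improve processing performance.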

7.

Distributed XML design

S. Abiteboul G. Gottlob M. Manna 《Journal of Computer and System Sciences》2011,77(6):936-964

A distributed XML document is an XML document that spans several machines. We assume that a distribution design of the document tree is given, consisting of an XML kernel document T[f₁,…,fₙ] in which some leaves are “docking points” for external resources providing XML subtrees (f₁,…,fₙ, standing, e.g., for Web services or peers at remote locations). The top-down design problem consists in, given a type (a schema document that may vary from a DTD to a tree automaton) for the distributed document, “propagating” this type locally into a collection of types, which we call a typing, while preserving desirable properties. We also consider the bottom-up design problem, which consists in, given a type for each external resource, exhibiting a global type that is enforced by the local types, again with natural desirable properties. In the article, we lay out the fundamentals of a theory of distributed XML design, analyze problems concerning typing issues in this setting, and study their complexity.

8.

A Transaction Model for XML Databases   Total citations: 1, self-citations: 0, citations by others: 1

Dekeyser Stijn Hidders Jan Paredaens Jan 《World Wide Web》2004,7(1):29-57

The hierarchical and semistructured nature of XML data may cause complicated update behavior. Updates should not be limited to entire document trees, but should ideally involve subtrees and even individual elements. Providing a suitable scheduling algorithm for semistructured data can significantly improve collaboration systems that store their data (e.g., word processing documents or vector graphics) as XML documents. In this paper we show that concurrency control mechanisms in CVS, relational, and object-oriented database systems are inadequate for collaborative systems based on semistructured data. We therefore propose two new locking schemes based on path locks which are tightly coupled to the document instance. We also introduce two scheduling algorithms, each of which can be used with either of the two proposed path lock schemes. We prove that both schedulers guarantee serializability, and show that the conflict rules are necessary.

9.

A Data Model for XML Databases

Vilas Wuwongse Kiyoshi Akama Chutiporn Anutariya Ekawit Nantajeewarawat 《Journal of Intelligent Information Systems》2003,20(1):63-80


10.

Dynamically Updating XML Data: Numbering Scheme Revisited   Total citations: 2, self-citations: 0, citations by others: 2

Yu Jeffrey Xu Luo Daofeng Meng Xiaofeng Lu Hongjun 《World Wide Web》2005,8(1):5-26

Almost all existing approaches use some numbering scheme to encode XML elements in order to facilitate query processing when XML data is stored in databases. For example, under the most popular region-based numbering scheme, the starting and ending positions of an element in a document are used as the code that identifies the element, so that the ancestor/descendant relationship between two elements can be determined merely by examining their codes. While such a numbering scheme can greatly improve query performance, renumbering the large number of elements affected by updates becomes a performance bottleneck if XML documents are frequently updated. Unfortunately, no satisfactory work has been reported on efficient update of XML data. In this paper, we first formalize the XML data update problem by defining the basic operators needed to support most XML update queries. We then present a new numbering scheme that not only requires minimal code length in comparison with existing numbering schemes but also improves update performance when XML data is frequently updated at arbitrary positions. The fundamental difference between our new scheme and existing ones is that, instead of maintaining explicit codes for elements, we store only the necessary information and generate the codes when they are needed in query processing. In addition to presenting the basic scheme, we also discuss some optimization techniques to further reduce the update cost. Results of a comprehensive performance study are provided to show the advantages of the new scheme.
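The region-based numbering scheme that this abstract takes as its starting point can be sketched as follows. This shows only the existing scheme (start and end positions assigned by a depth-first walk), not the new scheme the paper proposes; the function names `region_codes` and `contains` are hypothetical.

```python
import xml.etree.ElementTree as ET

def region_codes(root):
    """Assign each element a (start, end) pair from a depth-first counter."""
    codes = {}
    counter = 0

    def visit(elem):
        nonlocal counter
        counter += 1
        start = counter          # position at which the element opens
        for child in elem:
            visit(child)
        counter += 1
        codes[elem] = (start, counter)  # counter is now the closing position

    visit(root)
    return codes

def contains(ancestor_code, descendant_code):
    """True if the first region strictly encloses the second."""
    return (ancestor_code[0] < descendant_code[0]
            and descendant_code[1] < ancestor_code[1])

# Hypothetical usage on a four-element document.
root = ET.fromstring("<a><b><c/></b><d/></a>")
codes = region_codes(root)
b, d = root[0], root[1]
c = b[0]
print(codes[root], codes[b], codes[c], codes[d])
print(contains(codes[root], codes[c]), contains(codes[b], codes[d]))
```

With consecutive integers, inserting a new element before `<d/>` leaves no free positions, so `<d/>` and `<a>` would both need new codes. That renumbering cost is the bottleneck the abstract refers to.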

11.

A prefix encoding scheme that supports data updates

魏东平贾楠徐瑞敏《计算机系统应用》2011,20(3):189-192

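Most current prefix encoding schemes do not support data updates to XML documents well. The proposed prefix encoding scheme efficiently supports structural queries, quickly and accurately determining the parent-child, ancestor-descendant, and sibling relationships between any two nodes in the structure tree of an XML document. It also applies a new encoding rule to inserted nodes, which avoids the code adjustments that update operations would otherwise cause and so effectively supports XML document updates.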
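How a prefix code answers structural queries can be shown with a generic Dewey-style labeling, in which a node's label is its parent's label plus its own position among its siblings. This is a minimal sketch of prefix encoding in general; the paper's own encoding rule for inserted nodes is not described in the abstract and is not reproduced here. Labels are modeled as tuples of integers, and all function names are hypothetical.

```python
def is_ancestor(a, d):
    """a is an ancestor of d if a is a proper prefix of d."""
    return len(a) < len(d) and d[:len(a)] == a

def is_parent(p, c):
    """p is the parent of c if c is p plus exactly one more component."""
    return len(c) == len(p) + 1 and c[:-1] == p

def are_siblings(x, y):
    """Two distinct nodes are siblings if they share the same parent prefix."""
    return x != y and len(x) == len(y) and x[:-1] == y[:-1]

# Hypothetical labels: root (1,), its children (1, 1) and (1, 2),
# and a grandchild (1, 2, 1).
root, first, second, grandchild = (1,), (1, 1), (1, 2), (1, 2, 1)
print(is_parent(root, second))         # root is the parent of (1, 2)
print(is_ancestor(root, grandchild))   # root is an ancestor of (1, 2, 1)
print(are_siblings(first, second))     # (1, 1) and (1, 2) share a parent
```

All three checks look only at the two labels, which is why prefix schemes suit structural queries. The weak point of plain integer positions is insertion between two existing siblings, which is the case an update-friendly rule has to handle.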

12.

Constructing ontologies by mining deep semantics from XML Schemas and XML instance documents

Fu Zhang Qiang Li 《International Journal of Intelligent Systems》2022,37(1):661-698

With the development of the Semantic Web and Artificial Intelligence techniques, ontology has become a very powerful way of representing not only knowledge but also its semantics. Therefore, how to construct ontologies from existing data sources has become an important research topic. In this paper, an approach for constructing ontologies by mining deep semantics from eXtensible Markup Language (XML) Schemas (including XML Schema 1.0 and XML Schema 1.1) and XML instance documents is proposed. Given an XML Schema and its corresponding XML instance document, 34 rules are first defined to mine deep semantics from the XML Schema. The mined semantics is formally stored in an intermediate conceptual model and is then used to generate an ontology at the conceptual level. Further, an ontology population approach at the instance level based on the XML instance document is proposed. At this point, a complete ontology is formed. Some corresponding core algorithms are also provided. Finally, a prototype system is implemented, which can automatically generate ontologies from XML Schemas and populate ontologies from XML instance documents. The paper also classifies and summarizes the existing work and makes a detailed comparison. Case studies on real XML data sets verify the effectiveness of the approach.

13.

Efficiently publishing relational data as XML documents

Jayavel Shanmugasundaram Eugene Shekita Rimon Barr Michael Carey Bruce Lindsay Hamid Pirahesh Berthold Reinwald 《The VLDB Journal The International Journal on Very Large Data Bases》2001,10(2-3):133-154

XML is rapidly emerging as a standard for exchanging business data on the World Wide Web. For the foreseeable future, however, most business data will continue to be stored in relational database systems. Consequently, if XML is to fulfill its potential, some mechanism is needed to publish relational data as XML documents. Towards that goal, one of the major challenges is finding a way to efficiently structure and tag data from one or more tables as a hierarchical XML document. Different alternatives are possible depending on when this processing takes place and how much of it is done inside the relational engine. In this paper, we characterize and study the performance of these alternatives. Among other things, we explore the use of new scalar and aggregate functions in SQL for constructing complex XML documents directly in the relational engine. We also explore different execution plans for generating the content of an XML document. The results of an experimental study show that constructing XML documents inside the relational engine can have a significant performance benefit. Our results also show the superiority of having the relational engine use what we call an “outer union plan” to generate the content of an XML document. Received: 15 October 2000 / Accepted: 15 April 2001 Published online: 28 June 2001

14.

CSBTT: an XML document encoding scheme based on binary tree traversal

万里勇陈颖《计算机系统应用》2013,22(2):151-154

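The encoding scheme for XML document data is the basis of XML query processing, and a good encoding scheme helps improve query efficiency. To address the low query efficiency of XML data and the need to support dynamic updates, this paper builds on binary-tree-traversal encoding and introduces the three-pointer linked storage structure of binary trees (left child, right child, and parent pointers) to encode XML document nodes. The encoding uses natural numbers as code numbers, so codes are short; the parent pointer of each node makes it easy to determine structural relationships between nodes; and storing nodes in this linked structure makes document update operations convenient.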

15.

Representation and generation of e-mail based on an XML structure

钱龙华钱培德《计算机工程》2006,32(8):76-78

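E-mail is usually encapsulated and transmitted in the semi-structured MIME format and stored in data formats specific to each mail user agent, so it cannot be shared with other applications and cannot be fully exploited as an information source. This paper proposes a mail-client-oriented XML representation of e-mail that refines the XML markup of a message further, with the aim of making e-mail easier to retrieve and reorganize. It also gives a method for converting the MIME source of a message into an XML document, providing a basis for subsequent operations on mail and for making full use of mail stores.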
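A conversion from MIME source to an XML document might look like the sketch below, which uses only the Python standard library (`email` and `xml.etree.ElementTree`). The element and attribute names (`mail`, `headers`, `header`, `part`, `type`) are invented for this example; the abstract does not give the paper's actual markup, so this should be read as one possible shape rather than the authors' representation.

```python
from email import message_from_string
import xml.etree.ElementTree as ET

def mime_to_xml(source):
    """Convert the MIME source of one message into an ElementTree element."""
    msg = message_from_string(source)
    mail = ET.Element("mail")

    # One <header> element per header field, in original order.
    headers = ET.SubElement(mail, "headers")
    for name, value in msg.items():
        header = ET.SubElement(headers, "header", name=name)
        header.text = value

    # One <part> element per leaf body part; containers are skipped.
    for part in msg.walk():
        if part.is_multipart():
            continue
        elem = ET.SubElement(mail, "part", type=part.get_content_type())
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        elem.text = payload.decode(charset, errors="replace")
    return mail

# Hypothetical usage with a minimal plain-text message.
source = "From: a@example.com\nSubject: Hi\nContent-Type: text/plain\n\nHello\n"
mail = mime_to_xml(source)
print(ET.tostring(mail, encoding="unicode"))
```

Once a message is in this form, retrieval by header or content type becomes a path query, for example `mail.find("headers/header[@name='Subject']")`.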

16.

A data model for XML databases   Total citations: 10, self-citations: 0, citations by others: 10

何震瀛李建中王朝坤《软件学报》2006,17(4):759-769

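The data model is one of the core problems in XML data management research. Existing data models still fall short in expressing the complex data structures and operations of XML databases. This paper proposes a new data model based on mappings. The model gives precise definitions of the complex data structures and semantics of XML databases and defines an operation algebra over those structures, including path expression operations and data maintenance operations. The model has been applied in an XML-based information integration system, and practice shows that it effectively supports XML data management applications.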

17.

A concurrency locking protocol for XML   Total citations: 3, self-citations: 0, citations by others: 3

庞引明谈子敬汪卫《计算机研究与发展》2004,41(7):1232-1239

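As research on XML database management systems (XML DBMS) deepens, studying concurrency control protocols for tree-structured XML data becomes very important. The tree locking protocol (tree protocol) proposed by Silberschatz and Kedem is defined on static tree-structured data, whereas XML data is dynamically changing tree-structured data. Given the characteristics of XML data, the paper defines an operation set that can transform one tree-structured XML document into another legal tree-structured XML document. The most notable feature of this operation set is that its operations act on a subtree rather than a single node. On top of this operation set, the paper defines the XML dynamic tree protocol (XDTP) and proves that it retains the desirable properties of the static tree protocol: serializability and deadlock freedom. Experiments on real data sets show that XDTP performs well.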

18.

A new encoding mechanism for XML documents   Total citations: 8, self-citations: 1, citations by others: 7

路燕张亮汪卫张彪施伯乐《计算机研究与发展》2004,41(3):500-503

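Implementing regular path expressions in XML queries requires quickly determining parent-child or ancestor-descendant relationships between elements. Encoding XML documents by tree traversal is currently a mainstream approach, but determining parent-child relationships requires auxiliary measures beyond the codes, and some implementations do not support document updates. This paper proposes a new encoding method that determines the parent-child and ancestor-descendant relationships between two elements in constant time, computes the generation difference between an ancestor and a descendant node, and supports document updates.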

19.

Efficient memory representation of XML document trees   Total citations: 1, self-citations: 0, citations by others: 1

Giorgio Busatto Markus Lohrey Sebastian Maneth 《Information Systems》2008,33(4-5):456-474

Implementations that load XML documents and give access to them via, e.g., the DOM, suffer from huge memory demands: the space needed to load an XML document is usually many times larger than the size of the document. A considerable amount of memory is needed to store the tree structure of the XML document. In this paper, a technique is presented that makes it possible to represent the tree structure of an XML document efficiently. The representation exploits the high regularity in XML documents by compressing their tree structure, that is, by detecting and removing repetitions of tree patterns. Formally, context-free tree grammars that generate only a single tree are used for tree compression. The functionality of basic tree operations, like traversal along edges, is preserved under this compressed representation. This makes it possible to execute queries (and in particular, bulk operations) directly, without prior decompression. The complexity of certain computational problems, such as validation against XML types or testing equality, is investigated for compressed input trees.
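The idea of removing repeated tree structure can be illustrated with a deliberately simpler technique than the tree grammars used in the paper: sharing identical subtrees, so that each distinct (tag, children) shape is stored once and referenced by id. The sketch below considers element tags only, ignoring text and attributes; the names `share_subtrees` and `count_nodes` are hypothetical.

```python
import xml.etree.ElementTree as ET

def share_subtrees(root):
    """Return (root_id, table); table[i] is (tag, tuple of child ids)."""
    ids = {}     # (tag, child ids) -> node id
    table = []   # node id -> (tag, child ids)

    def visit(elem):
        # Children are numbered first, so equal subtrees get equal keys.
        key = (elem.tag, tuple(visit(child) for child in elem))
        if key not in ids:
            ids[key] = len(table)
            table.append(key)
        return ids[key]

    return visit(root), table

def count_nodes(elem):
    """Number of elements in the uncompressed tree."""
    return 1 + sum(count_nodes(child) for child in elem)

# Hypothetical usage: three structurally identical <p> records.
doc = "<r><p><x/><y/></p><p><x/><y/></p><p><x/><y/></p></r>"
root = ET.fromstring(doc)
root_id, table = share_subtrees(root)
print(count_nodes(root), "elements stored as", len(table), "shared nodes")
print(table[root_id])
```

Traversal still works on the shared form by following child ids from `table[root_id]`, which mirrors the abstract's point that basic tree operations are preserved without decompression.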

20.

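Design and implementation of an XML-based extensible B2B data exchange standard framework   Total citations: 6, self-citations: 5, citations by others: 1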

李晓唐威特《计算机工程与设计》2005,26(3):764-767

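B2B e-commerce in the petrochemical domain involves distributed, heterogeneous data exchange among multiple large and very large enterprises. The paper proposes design principles for a data exchange standard framework for online material procurement in petrochemical e-commerce and gives the architecture of an implementing system. The data exchange standard is compatible with related international standards, and the exchanged data is carried in XML. The framework includes a workflow model that regulates the data exchange process, a business process model that regulates user roles and exchanged document types in business activities, a business vocabulary that regulates the business terms used in e-commerce data exchange in this domain, XML Schemas that regulate the structure of exchanged documents, and a message protocol that regulates how information is exchanged. In addition, the messaging mechanism in the framework is based on SOAP, providing extensibility to Web services.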