1.

On the equivalence between FDs in XML and FDs in relations

Millist W. Vincent Jixue Liu Mukesh Mohania 《Acta Informatica》2007,44(3-4):207-247

With the growing use of the eXtensible Markup Language (XML) in database technology as a format for the permanent storage of data, the topic of functional dependencies in XML (XFDs) has assumed increased importance because of its central role in database design. Recently, two different approaches have been proposed for defining an XFD. The first uses the concept of a ‘tree tuple’, whereas the second uses the concept of a ‘closest node’. In general, the two approaches are not comparable, but they are comparable when a Document Type Definition is present and there is no missing information in the XML document. The first contribution of this article shows that when the two XFD definitions are comparable, the definitions are equivalent, and so there is essentially a common definition of an XFD in complete XML documents. The second contribution is to provide justification for the definition of a ‘closest node’ XFD. We show that if a complete flat relation is mapped to an XML document by an arbitrary sequence of nest operations, the XML document satisfies a ‘closest node’ XFD if and only if the relation satisfies the corresponding functional dependency. The class of XML documents generated in this fashion is a subset of the class of XML documents for which the two definitions of XFDs coincide. Hence ‘tree tuple’ and ‘closest node’ XFDs both capture the semantics of FDs when a complete relation is mapped to an XML document via arbitrary nesting.
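
The relational side of this statement can be illustrated with a minimal sketch. It only checks an ordinary functional dependency on a complete flat relation and applies one nest operation; it does not reproduce the 'tree tuple' or 'closest node' XFD definitions, which are given in the article itself. The function names (`satisfies_fd`, `nest`, `unnest`) and the sample data are assumptions made for illustration.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if no two rows agree on every `lhs` attribute but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault keeps the first rhs value seen for this lhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True


def nest(rows, attr):
    """Group rows that agree on all other attributes; collect `attr` in a list."""
    groups = {}
    for row in rows:
        rest = tuple(sorted((k, v) for k, v in row.items() if k != attr))
        groups.setdefault(rest, []).append(row[attr])
    return [dict(rest, **{attr: values}) for rest, values in groups.items()]


def unnest(nested, attr):
    """Inverse of `nest` on a complete relation: one row per collected value."""
    return [dict(row, **{attr: value}) for row in nested for value in row[attr]]


# Hypothetical sample relation (illustrative data only).
rows = [
    {"course": "db", "teacher": "ann", "student": "s1"},
    {"course": "db", "teacher": "ann", "student": "s2"},
    {"course": "ai", "teacher": "bob", "student": "s1"},
]
nested = nest(rows, "student")
```

Here `course -> teacher` holds on `rows` and `student -> course` does not. Nesting on `student` yields one group per (course, teacher) pair, and `unnest` recovers the original rows, so the flat dependency can be rechecked after a round trip.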

2.

Mapping XML Documents and Their Functional Dependencies to Relations  Total citations: 16, self-citations: 2, citations by others: 16

王庆周俊梅吴红伟萧建昌周傲英《软件学报》2003,14(7):1275-1281

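Many papers have proposed methods for mapping XML to relations based on a DTD, but none of them takes the semantics of XML into account, even though semantic information is very important for storage schema design, query optimization, update anomaly checking, and similar tasks. If XML functional dependencies are specified on the DTD, they need to be taken into account when mapping to a relational database. Based on the Hybrid Inlining method and taking XML functional dependencies into account, this paper proposes a mapping method that preserves both the content and structure information of the XML document and its functional dependency information. The method can reduce storage redundancy, and the relations produced by the mapping are proved to satisfy third normal form.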

3.

An XML Pattern Matching Method Based on a Structural Index  Total citations: 2, self-citations: 0, citations by others: 2

乔健陈彤兵汪卫施伯乐《计算机科学》2005,32(10):95-99

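XML documents use a tree data model, and queries over them are usually evaluated by matching a pattern tree with selection predicates against the XML data. Finding all sets of elements in an XML document that conform to the pattern tree structure is therefore the core operation of XML query processing. This paper proposes a structural index, JoinGuide, and on that basis a new XML pattern matching method. The method uses JoinGuide to pre-match the pattern tree, so that a query over the XML document can use the matching results on the index to skip some join predicates and unnecessary candidate XML element sequences. The paper also proposes three concrete algorithms that use the index matching results for further query processing. Experimental results show that the pattern tree matching method outperforms previous matching methods and that the index requires very little space.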

4.

XML-Based Web Database Technology  Total citations: 3, self-citations: 0, citations by others: 3

万常选《计算机与现代化》2002,(4):49-53,57

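This paper discusses two languages for describing the conversion of relational data into XML documents, together with their implementation techniques. The first uses RXL (Relational to XML Transformation Language) to define an XML view of a relational database. The XML view is virtual; an application then uses the XML query language XML-QL to construct a query over the virtual view, extracting data fragments from the XML view and materializing the extracted part, thereby converting relational data into XML documents. The second uses and extends SQL to describe the conversion: nested SQL expressions describe nesting, and extended SQL functions describe XML element construction, thereby constructing XML documents from relational data.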

5.

Publishing Relational Data as XML Documents  Total citations: 2, self-citations: 0, citations by others: 2

万常选《计算机应用与软件》2002,19(8):30-33,50

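This paper surveys new techniques for publishing relational data as XML documents. It mainly analyzes two languages for describing the publication of relational data as XML documents, their implementation techniques, and their advantages and disadvantages. The first uses and extends SQL to describe the conversion: nested SQL expressions describe nesting, and extended SQL scalar and aggregate functions describe XML element construction, thereby converting relational data into XML documents. The second uses RXL (Relational to XML Transformation Language) to define an XML view of a relational database. The XML view is virtual; other applications can then use the XML query language XML-QL to construct a query over the virtual view, extracting data fragments from the XML view and materializing the extracted part, thereby converting relational data into XML documents.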
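
The general idea of publishing relational rows as a nested XML document can be sketched as follows. This is neither RXL nor the extended SQL described in the survey: it is a plain stand-in that performs the nesting in application code with Python's `sqlite3` and `xml.etree.ElementTree` modules. The `dept` and `emp` tables, the element names, and the function `publish_as_xml` are hypothetical and chosen only for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def publish_as_xml(conn):
    """Build one <dept> element per department with nested <emp> children."""
    root = ET.Element("departments")
    depts = conn.execute("SELECT id, name FROM dept ORDER BY id").fetchall()
    for dept_id, dept_name in depts:
        dept = ET.SubElement(root, "dept", name=dept_name)
        # The inner query plays the role of the nested expression:
        # it selects the rows that belong under the current parent element.
        emps = conn.execute(
            "SELECT name FROM emp WHERE dept_id = ? ORDER BY name", (dept_id,)
        )
        for (emp_name,) in emps:
            ET.SubElement(dept, "emp").text = emp_name
    return root


# Hypothetical sample database held in memory.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp (name TEXT, dept_id INTEGER REFERENCES dept(id));
    INSERT INTO dept VALUES (1, 'Research');
    INSERT INTO dept VALUES (2, 'Sales');
    INSERT INTO emp VALUES ('Li', 1);
    INSERT INTO emp VALUES ('Wang', 1);
    INSERT INTO emp VALUES ('Zhao', 2);
    """
)
xml_text = ET.tostring(publish_as_xml(conn), encoding="unicode")
```

The resulting document is materialized eagerly. A virtual view of the kind the abstract describes would instead defer this construction until a query over the view is evaluated.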

6.

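Research on Similarity Measures and Structural Indexes for XML Documents  Total citations: 20, self-citations: 0, citations by others: 20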

郑仕辉周傲英张龙《计算机学报》2003,26(9):1116-1122

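This paper proposes a method, called the XED distance, for quantitatively measuring the difference between XML documents. Using the simulation relation between nodes, an XML document can be represented as a compact, weighted structural index tree, and the similarity between two XML documents can be measured by computing the edit distance between their index trees. Using index trees greatly improves the efficiency of determining the structural similarity of two XML documents. The XED distance measure can be used for structural search over XML documents, XML document clustering, XML document structure extraction, change detection for XML documents, and incremental computation and maintenance of XML views.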

7.

Constraint Preserving Transformation from Relational Schema to XML Schema

Chengfei Liu Dr. Millist W. Vincent Jixue Liu 《World Wide Web》2006,9(1):93-110

XML has become the standard for publishing and exchanging data on the Web. However, most business data is managed, and will continue to be managed, by relational database management systems. As such, there is an increasing need to efficiently and accurately publish relational data as XML documents for Internet-based applications. One way to publish relational data is to provide virtual XML documents for relational data via an XML schema which is transformed from the underlying relational database schema such that users can access the relational database through the XML schema. In this paper, we discuss issues in transforming a relational database schema into the corresponding XML schema. We aim to preserve all integrity constraints defined in a relational database schema, to achieve a high level of nesting, and to avoid introducing data redundancy in the transformed XML schema. In the paper, we first propose a basic transformation algorithm which introduces no data redundancy, then we improve the algorithm by exploring further nesting of the transformed XML schema.

8.

A methodology for clustering XML documents by structure

Theodore Dalamagas Tao Cheng Klaas-Jan Winkel Timos Sellis 《Information Systems》2006

The processing and management of XML data are popular research issues. However, operations based on the structure of XML data have not received strong attention. These operations involve, among others, the grouping of structurally similar XML documents. Such grouping results from the application of clustering methods with distances that estimate the similarity between tree structures. This paper presents a framework for clustering XML documents by structure. Modeling the XML documents as rooted ordered labeled trees, we study the use of structural distance metrics in hierarchical clustering algorithms to detect groups of structurally similar XML documents. We suggest the use of structural summaries for trees to improve the performance of the distance calculation and at the same time to maintain or even improve its quality. Our approach is tested using a prototype testbed.
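
As a rough illustration of comparing documents by structure alone, the sketch below reduces each document to its set of distinct root-to-node label paths and compares two such sets with the Jaccard distance. This is a deliberately simple stand-in and not the framework's method: the paper works with tree distances and its own structural summaries, whereas the path set here is only a crude summary in which repeated sibling structures collapse. The function names and sample documents are assumptions.

```python
import xml.etree.ElementTree as ET


def label_paths(xml_text):
    """Return the set of distinct root-to-node tag paths of a document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)  # repeated structures map to the same path
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def structural_distance(xml_a, xml_b):
    """Jaccard distance between path sets: 0.0 identical, 1.0 disjoint."""
    paths_a, paths_b = label_paths(xml_a), label_paths(xml_b)
    return 1.0 - len(paths_a & paths_b) / len(paths_a | paths_b)


# Hypothetical sample documents (text content is ignored on purpose).
doc_a = "<lib><book><title/><author/></book><book><title/></book></lib>"
doc_b = "<lib><book><title/></book></lib>"
doc_c = "<shop><item><price/></item></shop>"
```

With these samples, `doc_a` and `doc_b` share three of four distinct paths, so their distance is 0.25, while `doc_a` and `doc_c` share none and are at distance 1.0. A pairwise distance matrix built this way could then be fed to any hierarchical clustering routine.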

9.

Coordinating Mobile Agents by the XML-Based Tuple Space  Total citations: 1, self-citations: 0, citations by others: 1

卢正鼎李春林李腊元《计算机科学技术学报》2002,17(6):0-0

This paper presents Xspace, a programmable coordination paradigm for Internet applications based on mobile agents. The Xspace system fully exploits the advantages of the XML language and Linda-like coordination. It supports XML documents as tuple fields and multiple matching routines implementing different relations among XML documents, including those given by XML query languages. Xspace uses Java as the implementation language; it is based on object-oriented XMLized tuple spaces to implement a portable and programmable coordination paradigm for mobile agents. The design and implementation procedures of Xspace are described in this paper. Experiments and a performance evaluation are also presented. Finally, some conclusions and remarks are given.

10.

A Heuristic Algorithm for Restructuring XML Structures

刘波杨路明邓云龙《计算机应用》2008,28(7):1696-1699

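To meet the needs of information association and diverse service requests when querying massive collections of XML documents, this paper proposes an algorithm for restructuring XML structures: the frequent-vector-selection incremental pattern tree (XFP-tree) algorithm. Based on XML keys, the algorithm uses vector matrix processing and projected frequent pattern trees to implement operations on XML structures such as splitting, merging, modification, and cancellation. The paper also discusses the partitioning rules for frequent itemsets of the XML key vector matrix, the corresponding heuristic strategies, and the support threshold. A series of simulation experiments comparing it with other association algorithms shows that the proposed algorithm is effective and reasonable to some extent and is a useful attempt at restructuring XML structures.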

11.

A survey on tree matching and XML retrieval

《Computer Science Review》2013

With the increasing number of available XML documents, numerous approaches for retrieval have been proposed in the literature. They usually use the tree representation of documents and queries to process them, whether in an implicit or explicit way. Although retrieving XML documents can be considered as a tree matching problem between the query tree and the document trees, only a few approaches take advantage of the algorithms and methods proposed in graph theory. In this paper, we aim at studying the theoretical approaches proposed in the literature for tree matching and at seeing how these approaches have been adapted to XML querying and retrieval, from both an exact and an approximate matching perspective. This study will allow us to highlight theoretical aspects of graph theory that have not yet been explored in XML retrieval.

12.

XML schema refinement through redundancy detection and normalization

Cong Yu H. V. Jagadish 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(2):203-223

As XML becomes increasingly popular, XML schema design has become an increasingly important issue. One of the central objectives of good schema design is to avoid data redundancies: redundantly stored information can lead not only to a higher data storage cost but also to increased costs for data transfer and data manipulation. Furthermore, such data redundancies can lead to potential update anomalies, rendering the database inconsistent. One strategy to avoid data redundancies is to design a redundancy-free schema from the start on the basis of known functional dependencies. We observe that XML databases are often “casually designed” and XML FDs may not be determined in advance. Under such circumstances, discovering XML data redundancies from the data itself becomes necessary and is an integral part of the schema refinement (or re-design) process. We present the design and implementation of the first system, DiscoverXFD, for efficient discovery of XML data redundancies. It employs a novel XML data structure and introduces a new class of partition-based algorithms. The XML data redundancies are defined on the basis of a new notion of XML functional dependency (XML FD) that (1) extends previous notions by incorporating set elements into the XML FD specification, and (2) maintains tuple-based semantics through the novel concept of Generalized Tree Tuple (GTT). Using this comprehensive XML FD notion, we introduce a new normal form (GTT-XNF) for XML documents, and provide comprehensive comparisons with previous studies. Given the set of data redundancies (in the form of redundancy-indicating XML FDs) discovered by DiscoverXFD, we describe a normalization algorithm for converting any original XML schema into one in GTT-XNF.

13.

Rafael C. Carrasco Juan Ramón Rico-Juan 《Pattern recognition》2003,36(9):2197-2199

We describe a general approach to compute a similarity measure between distributions generated by probabilistic tree automata that may be used in a number of applications in the pattern recognition field. In particular, we show how this similarity can be computed for families of structured (XML) documents. In such a case, the use of regular expressions to specify the right part of the expansion rules adds some complexity to the task.

14.

Research on XML Functional Dependencies

董东郭瑞强李红《计算机应用与软件》2006,23(10):47-49,99

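To eliminate data redundancy, functional dependency theory based on the relational data model has been widely accepted and applied in relational database design. Data redundancy also exists in XML databases. To design redundancy-free XML databases, a concise and easily understood method is needed to define the dependencies among XML data. This paper defines functional dependencies between XML subtrees on an unordered node-labeled tree data model and gives an axiom system for deriving functional dependencies, so as to solve the implication problem for functional dependencies. Finally, the axiom system is proved to be sound.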

15.

Research on Relation-Based Storage Methods for Probabilistic XML Data

王建卫郝忠孝《计算机工程与应用》2011,47(23):130-132

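According to how probabilistic data is described, probabilistic data models fall into two classes: relation-based and XML-based. The relation-based probabilistic data model introduces a probability-tag attribute for each tuple to represent uncertainty, which complicates the storage and query processing of tuples. The XML-based probabilistic data model adds nodes representing probability attributes to an ordinary XML tree and can represent probabilistic information at multiple granularities. This paper designs two PDTD-independent storage schemas, PXRel and PXParent, for probabilistic XML data mapped to relations, and verifies their effectiveness through experiments.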

16.

A comparison of two approaches to utilizing XML in parametric databases for temporal data

《Information and Software Technology》2006,48(9):807-819

The parametric data model captures an object in terms of a single tuple. This feature eliminates unnecessary self-join operations to combine tuples scattered in a temporal relation. Despite this advantage, this model is relatively difficult to implement on top of relational databases because the sizes of attributes are unfixed. Since data boundaries are not problematic in XML, XML can be an elegant solution to implement parametric databases for temporal data. There are two approaches to implementing parametric databases using XML: (1) a native XML database with an XQuery engine, and (2) an XML storage with a temporal query language. To determine which approach is appropriate in parametric databases, we consider four questions: the effectiveness of XML in modeling temporal data, the applicability of XML query languages, the user-friendliness of the query languages, and the system performance of the two approaches. By evaluating the four questions, we show that the latter approach is more appropriate for utilizing XML in parametric databases.

17.

Implementation Techniques of SDML, a New Storage Platform for XML Documents

洪晓光《计算机科学》2005,32(2):80-83

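The design and storage of native XML databases (NXD, Native XML DBMS) are currently receiving more and more attention, because they can flexibly represent all kinds of data, especially complex data that the relational model cannot express. Some NXD products have already appeared. How well XML documents are stored directly affects query efficiency; on this basis, we propose an efficient XML document storage platform of our own, SDML. The paper discusses its storage structure and implementation details. In particular, it proposes a storage method for large numbers of elements with the same structure and gives solutions for query, insertion, deletion, and index maintenance operations on it. It also gives the I/O cost of this structure and describes a corresponding implementation, offering a new approach to storage optimization for NXDs.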

18.

An XML Document Indexing Algorithm Based on s-DOM

王申康张雪燕《计算机应用研究》2005,22(2):87-89

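Modifying the XML tree model is an important way to improve XML query efficiency. By analyzing existing indexing algorithms, this paper modifies the XML tree model and proposes a signature-based indexing strategy (s-DOM). Preprocessing XML documents with this strategy greatly narrows the search range and thus improves query efficiency.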

19.

Collaborative clustering of XML documents

Sergio Greco Francesco Gullo Giovanni Ponti Andrea Tagarelli 《Journal of Computer and System Sciences》2011,77(6):988-1008

Clustering XML documents is extensively used to organize large collections of XML documents in groups that are coherent according to structure and/or content features. The growing availability of distributed XML sources and the variety of high-demand environments raise the need for clustering approaches that can exploit distributed processing techniques. Nevertheless, existing methods for clustering XML documents are designed to work in a centralized way. In this paper, we address the problem of clustering XML documents in a collaborative distributed framework. XML documents are first decomposed based on semantically cohesive subtrees, then modeled as transactional data that embed both XML structure and content information. The proposed clustering framework employs a centroid-based partitional clustering method that has been developed for a peer-to-peer network. Each peer in the network is allowed to compute a local clustering solution over its own data, and to exchange its cluster representatives with other peers. The exchanged representatives are used to compute representatives for the global clustering solution in a collaborative way. We evaluated the effectiveness and efficiency of our approach on real XML document collections, varying the number of peers. Results have shown that major advantages with respect to the corresponding centralized clustering setting are obtained in terms of runtime behavior, although clustering solutions can still be accurate with a moderately low number of nodes in the network. Moreover, the collaborativeness characteristic of our approach has proved to be a convenient feature in distributed clustering, as found in a comparative evaluation with a distributed non-collaborative clustering method.

20.

An XML Schema Design Method for Eliminating Data Redundancy

宋晓芸乐嘉锦《计算机工程》2004,30(17):70-72

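This paper analyzes currently popular XML Schema design methods. To address the data redundancy found in XML documents, it improves the design method and proposes an entity weighting algorithm based on the E-R model. The method puts the definition of data types on a quantified baseline, and thereby minimizes data redundancy while preserving the XML tree structure.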