Similar literature
Found 20 similar documents (search time: 109 ms)
1.
We study the typechecking problem for XML (eXtensible Markup Language) transformers: given an XML transformation program and a DTD for the input XML documents, check whether every result of the program conforms to a specified output DTD. We model XML transformers using a novel device called a k-pebble transducer, which can express most queries without data-value joins in XML-QL, XSLT, and other XML query languages. Types are modeled by regular tree languages, a robust extension of DTDs. The main result of the paper is that typechecking for k-pebble transducers is decidable. Consequently, typechecking can be performed for a broad range of XML transformation languages, including XML-QL and a fragment of XSLT.

2.
We investigate the typechecking problem for XML transformations: statically verifying that every answer to a transformation conforms to a given output schema, for inputs satisfying a given input schema. As typechecking quickly turns undecidable for query languages capable of testing equality of data values, we return to the limited framework where we abstract XML documents as labeled ordered trees. We focus on simple top-down recursive transformations motivated by XSLT and structural recursion on trees. We parameterize the problem by several restrictions on the transformations (deleting, non-deleting, bounded width) and consider both tree automata and DTDs as input and output schemas. The complexity of the typechecking problems in this scenario ranges from PTIME to EXPTIME.

3.
Typechecking consists of statically verifying whether the output of an XML transformation always conforms to an output type for documents satisfying a given input type. We focus on complete algorithms, which always produce the correct answer. We consider top-down XML transformations incorporating XPath expressions, and we abstract document types by grammars and tree automata. By restricting schema languages and transformations, we identify several practical settings for which typechecking can be done in polynomial time. Moreover, the resulting framework provides a rather complete picture, as we show that most scenarios cannot be enlarged without rendering the typechecking problem intractable. The present research thus sheds light on when to use fast complete algorithms and when to resort to sound but incomplete ones.

4.
XML documents are becoming popular for business process integration. To achieve interoperability between applications, XML documents must also conform to various commonly used document type definitions (DTDs). However, most business data are not maintained as XML documents. They are stored in various native formats, such as database tables or LDAP directories. Hence, middleware is needed to dynamically generate XML documents conforming to predefined DTDs from various data sources. As industrial consortia and large corporations have created various DTDs, it is both challenging and time-consuming to design the necessary middleware to conform to so many different DTDs. This problem is particularly acute for a small- or medium-sized enterprise because it lacks the IT skills to quickly develop such middleware. In this paper, we present XLE, an XML Lightweight Extractor, as a practical approach to dynamically extracting DTD-conforming XML documents from heterogeneous data sources. XLE is based on a framework called DTD source annotation (DTDSA). It treats a DTD as the control structure of a program. The annotations become the program statements, such as functions and assignments. DTD-conforming XML documents are generated by parsing annotated DTDs. Basically, DTD annotations describe declaratively the mappings between target XML documents and the source data. The XLE engine implements a few basic annotations, providing a practical solution for many small- and medium-sized enterprises. However, XLE is designed to be versatile. It allows sophisticated users to plug in their own implementations to access new types of data or to achieve better performance. Heterogeneous data sources can be simply specified in the annotations. A GUI tool is provided to highlight the places where annotations are needed.

5.
Measuring the structural similarity between an XML document and a DTD has many relevant applications that range from document classification and approximate structural queries on XML documents to selective dissemination of XML documents and document protection. The problem is harder than measuring structural similarity among documents, because a DTD can be considered as a generator of documents. Thus, the problem is to evaluate the similarity between a document and a set of documents. An effective structural similarity measure should meet requirements that range from considering the presence and absence of required elements, as well as the structure and level of the missing and extra elements, to handling vocabulary discrepancies due to the use of synonymous or syntactically similar tags. In the paper, starting from these requirements, we provide a definition of the measure and present an algorithm for matching a document against a DTD to obtain their structural similarity. Finally, experimental results to assess the effectiveness of the approach are presented.

6.
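Research on Normalization of XML DTDs with Multivalued Dependencies   Total citations: 1, self-citations: 0, citations by others: 1
Qiu Wei, Zhang Lichen. Computer Science (《计算机科学》), 2007, 34(2): 149-151
XML DTD documents may contain data redundancy and update anomalies caused by non-functional dependencies. Starting from the goal of eliminating data redundancy within DTD documents, this paper first studies document normalization, discusses how to normalize XML documents when multivalued dependencies exist in the DTD, and proposes the concept of multivalued dependencies for XML documents that use a DTD as their schema. Then, based on this concept, it proposes MXNF, a multivalued-dependency normal form for XML documents. Finally, it proposes a normalization algorithm that decomposes the DTD of an XML document, with lossless join, into a form conforming to MXNF, in order to normalize XML DTD documents with multivalued dependencies, and gives an analysis of the algorithm.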

7.
XML access control models proposed in the literature enforce access restrictions directly on the structure and content of an XML document. Therefore access authorization rules (authorizations, for short), which specify access rights of users on information within an XML document, must be revised if they do not match the changed structure of the XML document. In this paper, we present two authorization translation problems. The first is the problem of translating instance-level authorizations for an XML document. The second is the problem of translating schema-level authorizations for a collection of XML documents conforming to a DTD. For the first problem, we propose an algorithm that translates instance-level authorizations of a source XML document into those for a transformed XML document by using an instance-tree mapping from the transformed document instance to the source document instance. For the second problem, we propose an algorithm that translates value-independent schema-level authorizations of a non-recursive source DTD into those for a non-recursive target DTD by using a schema-tree mapping from the target DTD to the source DTD. The goal of authorization translation is to preserve authorization equivalence at the instance node level of the source document. The XML access control models use XPath path expressions to locate data in XML documents. We define a property of path expressions (called node-reducible path expressions) under which schema-level authorizations of value-independent type can be transformed by schema-tree mapping. To compute authorizations on instances of schema elements of the target DTD, we need to identify the schema elements whose instances are located by a node-reducible path expression of a value-independent schema-level authorization. We give an algorithm that carries out a path fragment containment test to identify the schema elements whose instances are located by a node-reducible path expression.

8.
The flexibility of the XML data model allows a more natural representation of uncertain data compared with the relational model. Matching a twig pattern against XML data is a fundamental problem in querying information from XML documents. For a probabilistic XML document, each twig answer has a probability value because of the uncertainty of the data. Twig answers with small probability values are useless to users, who usually want only the answers with the k largest probability values. Existing algorithms for ordinary XML documents are not directly applicable, because they must handle probability distributional nodes and efficiently calculate the top-k probabilities of answers in probabilistic XML. In this paper, we address the problem of finding twig answers with top-k probability values against probabilistic XML documents directly. We propose a new encoding scheme for probabilistic XML called PEDewey. Based on this encoding scheme, we then design two algorithms for finding answers of top-k probabilities for twig queries. One, called ProTJFast, processes probabilistic XML data based on element streams in document order; the other, called PTopKTwig, is based on element streams ordered by path probability values. Experiments have been conducted to study the performance of these algorithms.

9.
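Methods for Querying Databases with XML   Total citations: 14, self-citations: 0, citations by others: 14
Li Jing, Zhuang Chengsan. Computer Applications (《计算机应用》), 2000, 20(10): 21-24
This paper discusses a concrete implementation method for querying databases with XML. First, it proposes describing the relational data schema with a DTD and using ASP technology to convert database data into XML documents; then it uses XML-QL, a query language for XML, to perform operations such as querying and data integration on Web databases.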
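
The first step described in this abstract, describing a relational schema with a DTD and exporting rows as an XML document, can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` and `xml.etree.ElementTree` modules in place of the ASP technology the paper names; the `books` table, its columns, the `BOOK_DTD` text, and the `table_to_xml` helper are all hypothetical and chosen only for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical DTD describing the hypothetical "books" table:
# one <book> element per row, one child element per column.
BOOK_DTD = """<!ELEMENT books (book*)>
<!ELEMENT book (title, year)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT year (#PCDATA)>"""


def table_to_xml(conn, table, row_tag, columns):
    """Serialize every row of `table` as a <row_tag> element with one child per column."""
    root = ET.Element(table)
    # Identifiers are interpolated directly; acceptable only for trusted, fixed names.
    query = "SELECT {} FROM {} ORDER BY rowid".format(", ".join(columns), table)
    for row in conn.execute(query):
        item = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            ET.SubElement(item, name).text = str(value)
    return root


# Build a small in-memory database to export.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("XML Basics", 1999), ("Query Languages", 2000)],
)

doc = table_to_xml(conn, "books", "book", ["title", "year"])
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

The sketch does not validate the output against `BOOK_DTD`, since `xml.etree.ElementTree` has no DTD validator; the DTD string only shows the shape the exported document is meant to have.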

10.
We consider XML documents described by a document type definition (DTD). An XML-grammar is a formal grammar that captures the syntactic features of a DTD. We investigate properties of this family of grammars. We show that every XML-language basically has a unique XML-grammar. We give two characterizations of languages generated by XML-grammars, one is set-theoretic, the other is by a kind of saturation property. We investigate decidability problems and prove that some properties that are undecidable for general context-free languages become decidable for XML-languages. We also characterize those XML-grammars that generate regular XML-languages.


Received: 16 March 2001 / 19 March 2002

11.
12.
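A Normal Form for XML DTDs   Total citations: 5, self-citations: 0, citations by others: 5
This paper studies the normalization of XML DTDs. Because of shortcomings in DTD design, a DTD may contain anomalous dependencies similar to those found in relational database schemas, which cause XML documents to contain redundant data and various update anomalies. The paper proposes the concept of multivalued dependencies for DTDs and, based on it, a normal form for XML called XNF; it also uses a relational representation of DTDs to define lossless join decomposition of a DTD. Finally, it gives an algorithm that decomposes a DTD, with lossless join, into XNF.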

13.
With the growing use of XML as a format for the permanent storage of data, the study of functional dependencies in XML (XFDs) is of fundamental importance in a number of areas, such as understanding how to effectively design XML databases without redundancy or update problems, and data integration. In this article we investigate a particular type of XFD, called a weak ‘closest node’ XFD, that has been shown to extend the classical notion of a functional dependency in relational databases. More specifically, we investigate the implication problem for weak ‘closest node’ XFDs in the context of XML documents with no missing information. The implication problem is the most important one in dependency theory, and is the problem of determining if a set of dependencies logically implies another dependency. Our first, and main, contribution is to provide an axiom system for XFD implication. We prove that our axiom system is both sound and complete, and we then use this result to develop a sound and complete quadratic time closure algorithm for XFD implication. Our second contribution is to investigate the implication problem for XFDs in the presence of a Document Type Definition (DTD). We show that for a class of DTDs called structured DTDs, the implication problem for a set of XFDs and a structured DTD can be converted to the implication problem for a set of XFDs alone, and so is axiomatizable and efficiently solvable by the first contribution. We do this by augmenting the original set of XFDs with additional XFDs generated from the structure of the DTD.

14.
Due to an explosive increase in XML documents, it is imperative to manage XML data in an XML data warehouse. XML warehousing imposes challenges that are not found in relational data warehouses. In this paper, we first present a framework to build an XML data warehouse schema. To scale with increasing data volume, we propose a number of partitioning techniques for multi-version XML data warehouses, including document-based partitioning, schema-based partitioning, and a cascaded (mixed) partitioning model. Finally, we formulate cost models to evaluate various types of queries for an XML data warehouse.

15.
We consider data exchange for XML documents: given source and target schemas, a mapping between them, and a document conforming to the source schema, construct a target document and answer target queries in a way that is consistent with the source information. The problem has primarily been studied in the relational context, in which data-exchange systems have also been built. Since many XML documents are stored in relations, it is natural to consider using a relational system for XML data exchange. However, there is a complexity mismatch between query answering in relational and in XML data exchange. This indicates that to make the use of relational systems possible, restrictions have to be imposed on XML schemas and mappings, as well as on XML shredding schemes. We isolate a set of five requirements that must be fulfilled in order to have a faithful representation of the XML data-exchange problem by a relational translation. We then demonstrate that these requirements naturally suggest the in-lining technique for data-exchange tasks. Our key contribution is to provide shredding algorithms for schemas, documents, mappings and queries, and demonstrate that they enable us to correctly perform XML data-exchange tasks using a relational system.

16.
Path queries have been extensively used to query semistructured data, such as the Web and XML documents. In this paper we introduce weighted path queries, an extension of path queries enabling several classes of optimization problems (such as the computation of shortest paths) to be easily expressed. Weighted path queries are based on the notion of a weighted regular expression, i.e., a regular expression whose symbols are each associated with a weight. We characterize the problem of answering weighted path queries and provide an algorithm for computing their answer. We also show how weighted path queries can be effectively embedded into query languages for XML data to express several meaningful research problems in a simple and compact form.
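
To make the idea of a weighted regular expression concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes the expression has already been compiled by hand into a deterministic automaton whose transitions carry the symbol weights, and it runs Dijkstra's algorithm (a stand-in technique, valid only for non-negative weights) over the product of the data graph and that automaton. The function name `weighted_path_query`, the graph, and the labels `rail` and `road` are hypothetical.

```python
import heapq


def weighted_path_query(graph, dfa, start_state, accepting, source):
    """Return the minimum total weight of a matching path from `source` to each node.

    graph: node -> list of (edge_label, target_node)
    dfa:   (state, edge_label) -> (next_state, weight of that symbol)
    Weights must be non-negative, because this is plain Dijkstra on the
    product of the graph and the automaton.
    """
    settled = {}   # (node, state) -> minimum cost, once finalized
    answers = {}   # node -> minimum cost of a path accepted by the automaton
    heap = [(0, source, start_state)]
    while heap:
        cost, node, state = heapq.heappop(heap)
        if (node, state) in settled:
            continue
        settled[(node, state)] = cost
        # Pairs are popped in non-decreasing cost order, so the first
        # accepting visit of a node gives its minimum cost.
        if state in accepting and node not in answers:
            answers[node] = cost
        for label, target in graph.get(node, []):
            step = dfa.get((state, label))
            if step is None:
                continue  # this edge label is not allowed in this state
            next_state, weight = step
            if (target, next_state) not in settled:
                heapq.heappush(heap, (cost + weight, target, next_state))
    return answers


# Hypothetical query "rail* road" with symbol weights rail=1, road=5.
dfa = {(0, "rail"): (0, 1), (0, "road"): (1, 5)}
graph = {
    "A": [("road", "B"), ("rail", "C")],
    "C": [("rail", "B")],
    "B": [("road", "D")],
}
result = weighted_path_query(graph, dfa, start_state=0, accepting={1}, source="A")
print(result)  # {'B': 5, 'D': 7}
```

Here B is reached by the single `road` edge from A (weight 5), and D by `rail`, `rail`, `road` through C and B (weight 1 + 1 + 5 = 7).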

17.
This paper deals with the integration of multimedia and database technologies in order to describe web multimedia documents. We present a middleware to seamlessly handle database accesses as well as compositional, spatial and temporal constraints related to data presentation. Our approach is based on the concept of templates. A template is a logical presentation unit that merges database queries with layout specifications. We choose an XML and SMIL approach to implement templates. Template definition and invocation are mapped into an XML DTD. Each template is then translated into a SMIL document. In this paper, we give an example to show the advantages of our approach.

18.
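Processing Complex Twig Pattern Queries over XML Stream Data   Total citations: 2, self-citations: 0, citations by others: 2
XML stream data processing has attracted wide interest among researchers. This paper proposes a new method for processing complex Twig Pattern queries with nested AND/OR predicates over XML stream data. To improve query processing performance, all Twig Patterns are merged into a single query tree with shared prefixes, in which AND/OR predicates are represented as separate abstract syntax trees. Complex Twig Patterns can therefore be matched in document order in a single pass, avoiding the intermediate results that YFilter produces by post-processing nested predicates. Experimental results show that the method effectively improves Twig Pattern processing performance, especially for large documents. Based on existing…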

19.
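DTD is widely used as a schema language for XML document structure; it describes the structure of similar XML documents. DTD consistency is the problem of deciding, for a given DTD, whether at least one XML document satisfies it. After introducing a formal definition of DTD consistency, the paper analyzes the factors that cause DTD inconsistency and proposes a method for deciding DTD consistency.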
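
The consistency question defined in this abstract, whether at least one finite XML document satisfies a given DTD, can be illustrated with a minimal sketch. It is not the paper's decision method and covers only one cause of inconsistency: required content that can never terminate. It assumes a simplified DTD in which each content model is a small nested tuple (`seq`, `alt`, `star`, `plus`, `opt`, `elem`, `pcdata`, `empty`); this representation and the names `satisfiable` and `is_consistent` are hypothetical.

```python
def satisfiable(model, productive):
    """Can this content model be matched using only elements known to be productive?"""
    kind = model[0]
    if kind in ("pcdata", "empty", "star", "opt"):
        return True  # text, nothing, or zero occurrences always suffice
    if kind == "elem":
        return model[1] in productive
    if kind == "plus":
        return satisfiable(model[1], productive)
    if kind == "seq":
        return all(satisfiable(part, productive) for part in model[1])
    if kind == "alt":
        return any(satisfiable(part, productive) for part in model[1])
    raise ValueError("unknown content model: {!r}".format(kind))


def is_consistent(dtd, root):
    """Fixpoint computation: an element is productive if some finite tree can be rooted at it."""
    productive = set()
    changed = True
    while changed:
        changed = False
        for name, model in dtd.items():
            if name not in productive and satisfiable(model, productive):
                productive.add(name)
                changed = True
    return root in productive


# <!ELEMENT a (b, a*)>  <!ELEMENT b (#PCDATA)>  : some finite document exists.
good = {
    "a": ("seq", [("elem", "b"), ("star", ("elem", "a"))]),
    "b": ("pcdata",),
}
# <!ELEMENT a (b)>  <!ELEMENT b (a+)>  : every document would be infinitely deep.
bad = {
    "a": ("seq", [("elem", "b")]),
    "b": ("plus", ("elem", "a")),
}
print(is_consistent(good, "a"), is_consistent(bad, "a"))  # True False
```

An element that is referenced but never declared is simply never marked productive in this sketch, so a DTD whose root requires it is reported as inconsistent.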

20.
Program slicing is a well-known technique to extract the program statements that (potentially) affect the values computed at some point of interest. In this work, we introduce a novel slicing method for XML documents. Essentially, given an XML document (which is valid w.r.t. some DTD), we produce a new XML document (a slice) that contains the relevant information in the original XML document according to some criterion. Furthermore, we also output a new DTD such that the computed slice is valid w.r.t. this DTD. A prototype implementation of the XML slicer has been undertaken.
