首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
The capabilities of XSLT processing are widely used to transform XML documents into target XML documents. These target XML documents conform to output schemas of the used XSLT stylesheet. Output schemas of XSLT stylesheets can be used for a static analysis of the used XSLT stylesheet, to automatically detect the XSLT stylesheet of target XML documents or to reason on the output schema without access to the target XML documents. In this paper, we develop an approach to automatically determining the output schema of an XSLT stylesheet. We also describe several application scenarios of output schemas. The experimental evaluation shows that our prototype can determine the output schemas of nearly all typical XSLT stylesheets and the improvements in preciseness in several application scenarios when using output schemas in comparison to when not using output schemas.  相似文献   

2.
XML is a markup language used to describe data or documents. The main goal of XML is to facilitate the sharing of data across diverse information systems, especially via the Internet. XML Stylesheet Transformations (XSLT) is a standard approach to describing how to transform an XML document into another data format. The ever‐increasing number of Web technologies being used in our everyday lives commonly employs XSLT to support data exchange among heterogeneous environments, and the associated increasing burdens on XSLT processors have increased the demand for high‐performance XSLT processors. In this paper, we present an XSLT compiler, named Zebu, which can transform an XSLT stylesheet into the corresponding C program. The compiled program can be used to transform documents without the processing of XSLT stylesheets. The results of experimental testing using standard benchmarks show that the proposed XSLT compiler performs well in processing XML transformations. Copyright © 2011 John Wiley & Sons, Ltd.  相似文献   

3.
Caching stores the results of previously answered queries in order to answer succeeding queries faster by reusing these results. We propose two different approaches for using caches of XSLT transformed XML data in order to answer queries. The first approach checks whether or not a current query Q can be directly answered from the result of a previously answered query Qi stored in the cache. The new query is otherwise submitted to the source over the network, the answer of the query is determined, transmitted back to the client, and stored in the cache. The second approach determines only the intersection Q−Qi and integrates the result of Q−-Qi into the previous results in the cache, which requires applying a numbering scheme for the output of the XSLT stylesheet. We show by experimental results that the second approach can significantly speed up the answering time in comparison to the first approach, but is not significantly slower in few worst cases than the second approach.  相似文献   

4.
首先介绍用XSLT实现SML文档查询时,采用的处理模型以及相关XSLT指令在XML文档查询中的运用,接着给出一个查询示例,最后针对XML文档的查询,比较了XSLT与XQUERY的差别。  相似文献   

5.
People who classify and identify things based on their observable or deducible properties (called “characters” by biologists) can benefit from databases and keys that assist them in naming a specimen. This paper discusses our approach to generating an identification tool based on the field guide concept. Our software accepts character lists either expressed as XML (which biologists rarely provide knowingly—although most databases can now export in XML) or via ODBC connections to the data author’s relational database. The software then produces an Electronic Field Guide (EFG) implemented as a collection of Java servlets. The resulting guide answers queries made locally to a backend, or to Internet data sources via http, and returns XML. If, however, the query client requires HTML (e.g., if the EFG is responding to a human-centric browser interface that we or the remote application provides), or if some specialized XML is required, then the EFG forwards the XML to a servlet that applies an XSLT transformation to provide the look and feel that the client application requires. We compare our approach to the architecture of other taxon identification tools. Finally, we discuss how we combine this service with other biodiversity data services on the web to make integrated applications.  相似文献   

6.
XML path summaries are compact structures representing all the simple parent-child paths of an XML document. Such paths have also been used in many works as a basis for partitioning the document’s content in a persistent store, under the form of path indices or path tables. We revisit the notions of path summaries and path-driven storage model in the context of current-day XML databases. This context is characterized by complex queries, typically expressed in an XQuery subset, and by the presence of efficient encoding techniques such as structural node identifiers. We review a path summary’s many uses for query optimization, and given them a common basis, namely relevant paths. We discuss summary-based tree pattern minimization and present some efficient summary-based minimization heuristics. We consider relevant path computation and provide a time- and memory-efficient computation algorithm. We combine the principle of path partitioning with the presence of structural identifiers in a simple path-partitioned storage model, which allows for selective data access and efficient query plans. This model improves the efficiency of twig query processing up to two orders of magnitude over the similar tag-partitioned indexing model. We have implemented the path-partitioned storage model and path summaries in the XQueC compressed database prototype [8]. We present an experimental evaluation of a path summary’s practical feasibility and of tree pattern matching in a path-partitioned store.  相似文献   

7.
XSLT提供了一种将XML文档转换为HTML的强有力的工具。然而,当这种转换需要涉及更多逻辑的时候,就会显现出它的不足之处。文中讲述了如何使用Java扩展XSLT,从而更好地发挥两种语言的特色。最后给出一实例来具体展示如何将XML节点传送到Java类并返回到样式表以进一步处理。  相似文献   

8.
The gap between storing data in relational databases and transferring data in form of XML has been closed e.g. by SQL/XML queries that generate XML data out of relational data sources. However, only few relational database systems support the evaluation of SQL/XML queries. And even in those systems supporting SQL/XML, the evaluation of such queries is quite slow compared to the evaluation of SQL queries. In this paper, we present S2CX, an approach that allows to efficiently evaluate SQL/XML queries on any relational database system, no matter whether it supports SQL/XML or not. As a result to an SQL/XML query, S2CX supports different output formats ranging from plain XML to different compressed XML representations including a succinct encoding of XML data, schema-aware compressed XML to grammar compressed XML. In many cases, S2CX produces compressed XML as a result to an SQL/XML query even faster than the evaluation of SQL/XML queries into non-compressed XML as provided by Oracle 11 g and by DB2. Furthermore, our approach to query evaluation scales better, i.e., the larger the dataset, the faster is our approach compared to SQL/XML query evaluation in Oracle 11 g and in DB2.  相似文献   

9.
Query matching on XML streams is challenging work for querying efficiency when the amount of queried stream data is huge and the data can be streamed in continuously. In this paper, the method Syntactic Twig-Query Matching (STQM) is proposed to process queries on an XML stream and return the query results continuously and immediately. STQM matches twig queries on the XML stream in a syntactic manner by using a lexical analyzer and a parser, both of which are built from our lexical-rules and grammar-rules generators according to the user's queries and document schema, respectively. For query matching, the lexical analyzer scans the incoming XML stream and the parser recognizes XML structures for retrieving every twig-query result from the XML stream. Moreover, STQM obtains query results without a post-phase for excluding false positives, which are common in many streaming query methods. Through the experimental results, we found that STQM matches the twig query efficiently and also has good scalability both in the queried data size and the branch degree of the twig query. The proposed method takes less execution time than that of a sequence-based approach, which is widely accepted as a proper solution to the XML stream query.  相似文献   

10.
XML is the standard data interchange format and XSLT is the W3C proposed standard for transforming and restructuring XML documents. It turns out that XSLT has very powerful query capabilities as well. Hovewer, due to its complex syntax and lack of formal specification, it is not a trivial task to decide whether two XSLT stylesheets yield the same result, even if for an XSLT subset. We isolate such fragment, powerful enough for expressing several interesting queries and for manipulating XML documents and show how to translate them into queries expressed in a properly extended version of TAX, a powerful XML query algebra, for which we provide a collection of equivalence rules. It is then possible to reason about XSLT equivalences, by translating XSLT stylesheets into XTAX expressions and then statically verifying their equivalence, by means of the mentioned equivalence rules.  相似文献   

11.
XML技术在化学深层网数据提取中的应用   总被引:1,自引:1,他引:0  
Internet上的化学数据库是宝贵的化学信息资源,如何有效地利用这些数据是化学深层网所要解决的问题。本文总结了化学深层网的特点,基于XML技术实现从数据库检索返回的半结构化HTML页面中提取数据的目标,使之成为可供程序直接调用做进一步计算的数据。在数据提取过程中,先采用JTidy规范化HTML,得到格式上完整、内容无误的XHTML文档,利用包含着XPath路径语言的XSLT数据转换模板实现数据转换和提取。其中XPath表达式的优劣决定了XSLT数据转换模板能否长久有效地提取化学数据,文中着重介绍了如何编辑健壮的XPath表达式,强调了XPath表达式应利用内容和属性特征实现对源树中数据的定位,并尽可能地降低表达式之间的耦合度,前瞻性地预测化学站点可能出现的变化并在XSLT数据转换模板中采取相应的措施以提高表达式的长期有效性。为创建化学深层网数据提取的XSLT数据提取模板提供方法指导。  相似文献   

12.
XML is an ordered data model and XQuery expressions return results that have a well-defined order. However, little work on how order is supported in XML query processing has been done to date. In this paper we study the issues related to handling order in the XML context, namely challenges imposed by the XML data model, the variety of order requirements of the XQuery language, and the need to maintain order in the presence of updates to the XML data. We propose an efficient solution that addresses all these issues. Our solution is based on a key encoding for XML nodes that serves as node identity and at the same time encodes order. We design rules for encoding order of processed XML nodes based on the XML algebraic query execution model and the node key encoding. These rules do not require any actual sorting for intermediate results during execution. Our approach enables efficient order-sensitive incremental view maintenance as it makes most XML algebra operators distributive with respect to bag union. We prove the correctness of our order encoding approach. Our approach is implemented and integrated with Rainbow, an XML data management system developed at WPI. We have tested the efficiency of our approach using queries that have different order requirements. We have also measured the relative cost of different components related to our order solution in different types of queries. In general the overhead of maintaining order in our approach is very small relative to the query processing time.  相似文献   

13.
Web包装器将网页内容转换为XML格式,用于系统集成。进行XML转换的XSLT技术能较好地支持包装器的信息抽取和组织。本文从包含查询接口、结果模式和映射规则的包装器描述文件(XML)出发,给出了自动生成可执行代码的技术方案。包装器的执行及其生成过程完全基于XSLT技术,系统具有较强的可移植性。提出“元数据对齐”方法进行内
容辅助定位,提高了对页面变化的容忍度。原型系统的实现验证了以上技术的可行性。  相似文献   

14.
XML database systems emerge as a result of the acceptance of the XML data model. Recent works have followed the promising approach of building XML database management systems on underlying RDBMSs. Achieving query processing performance reduces to two questions: (i) How should the XML data be decomposed into data that are stored in the RDBMS? (ii) How should the XML query be translated into an efficient plan that sends one or more SQL queries to the underlying RDBMS and combines the data into the XML result? We provide a formal framework for XML Schema-driven decompositions, which encompasses the decompositions proposed in prior work and extends them with decompositions that employ denormalized tables and binary-coded XML fragments. We provide corresponding query processing algorithms that translate the XML query conditions into conditions on the relational tables and assemble the decomposed data into the XML query result. Our key performance focus is the response time for delivering the first results of a query. The most effective of the described decompositions have been implemented in XCacheDB, an XML DBMS built on top of a commercial RDBMS, which serves as our experimental basis. We present experiments and analysis that point to a class of decompositions, called inlined decompositions, that improve query performance for full results and first results, without significant increase in the size of the database.Received: 21 December 2001, Accepted: 1 July 2003, Published online: 23 June 2004Edited by: A. HalevyAndrey Balmin: Andrey Balmin has been supported by NSF IRI-9734548.Yannis Papakonstantinou: The authors built the XCacheDB system while on leave at Enosys Software, Inc., during 2000.  相似文献   

15.
Keyword search enables inexperienced users to easily search XML database with no specific knowledge of complex structured query languages and XML data schemas. Existing work has addressed the problem of selecting data nodes that match keywords and connecting them in a meaningful way, e.g., SLCA and ELCA. However, it is time-consuming and unnecessary to serve all the connected subtrees to the users because in general the users are only interested in part of the relevant results. In this paper, we propose a new keyword search approach which basically utilizes the statistics of underlying XML data to decide the promising result types and then quickly retrieves the corresponding results with the help of selected promising result types. To guarantee the quality of the selected promising result types, we measure the correlations between result types and a keyword query by analyzing the distribution of relevant keywords and their structures within the XML data to be searched. In addition, relevant result types can be efficiently computed without keyword query evaluation and any schema information. To directly return top-k keyword search results that conform to the suggested promising result types, we design two new algorithms to adapt to the structural sensitivity of the keyword nodes over the keyword search results. Lastly, we implement all proposed approaches and present the relevant experimental results to show the effectiveness of our approach.  相似文献   

16.
XML元素级加密是指对XML文档中的某一特定部分进行加密。在讨论基于XSLT的元素级加密技术的基础之上,提出了使用扩展函数来实现XML元素级加密 /解密的功能,并通过具体的例子来说明其可行性。相对于其他元素级加密技术,该方法最大的优点就是避免了部署与维护的麻烦,它把所有的安全操作都限制在了样式表内,而不涉及XSLT处理器,具有高可用性。  相似文献   

17.
Distributing data collections by fragmenting them is an effective way of improving the scalability of a database system. While the distribution of relational data is well understood, the unique characteristics of the XML data and query model present challenges that require different distribution techniques. In this paper, we show how XML data can be fragmented horizontally and vertically. Based on this, we propose solutions to two of the problems encountered in distributed query processing and optimization on XML data, namely localization and pruning. Localization takes a fragmentation-unaware query plan and converts it to a distributed query plan that can be executed at the sites that hold XML data fragments in a distributed system. We then show how the resulting distributed query plan can be pruned so that only those sites are accessed that can contribute to the query result. We demonstrate that our techniques can be integrated into a real-life XML database system and that they significantly improve the performance of distributed query execution.  相似文献   

18.
RRSi: indexing XML data for proximity twig queries   总被引:2,自引:2,他引:0  
Twig query pattern matching is a core operation in XML query processing. Indexing XML documents for twig query processing is of fundamental importance to supporting effective information retrieval. In practice, many XML documents on the web are heterogeneous and have their own formats; documents describing relevant information can possess different structures. Therefore some “user-interesting” documents having similar but non-exact structures against a user query are often missed out. In this paper, we propose the RRSi, a novel structural index designed for structure-based query lookup on heterogeneous sources of XML documents supporting proximate query answers. The index avoids the unnecessary processing of structurally irrelevant candidates that might show good content relevance. An optimized version of the index, oRRSi, is also developed to further reduce both space requirements and computational complexity. To our knowledge, these structural indexes are the first to support proximity twig queries on XML documents. The results of our preliminary experiments show that RRSi and oRRSi based query processing significantly outperform previously proposed techniques in XML repositories with structural heterogeneity.
Vincent T. Y. NgEmail:
  相似文献   

19.
The problem of decentralized data sharing, which is relevant to a wide range of applications, is still a source of major theoretical and practical challenges, in spite of many years of sustained research. In this paper we focus on the challenge of efficiency of query evaluation in information integration systems that use the global-as-view approach, with the objective of developing query-processing strategies that would be widely applicable and easy to implement in real-life applications. Our algorithms take into account important features of today’s data sharing applications: XML as likely interface or representation for data sources; the potential for information overlap across data sources; and the need for inter-source processing, as in joins of data across sources. The focus of this paper is on performance-related characteristics of several alternative approaches that we propose for efficient query processing in information integration, including an approach that uses materialized restructured views. We use synthetic and real-life datasets in our implementation of an information integration system shell to provide experimental results that demonstrate that our algorithms are efficient and competitive in the information integration setting. In addition, our experimental results allow us to make context-specific recommendations on selecting query-processing approaches from our proposed alternatives. As such, our approaches could form a basis for scalable query processing in information integration and interoperability in many practical settings.  相似文献   

20.
BusSEngine: a business search engine   总被引:1,自引:1,他引:0  
With the emergence of World Wide Web, business’ databases are increasingly being queried directly by customers. The customers may not be aware of the underlying data and its structure, and might have never learned a query language that enables them to issue structured queries. Some of the business’ employees who query the databases may also not be aware of the structure of the data, but they are likely to be aware of some labels of elements containing data. We propose in this article: (1) an XML Keyword-Based search engine for answering business’ customers called BusSEngine-K, and (2) an XML loosely Structured-Based search engine for answering business’ employees called BusSEngine-L. The two engines employ novel context-driven search techniques and are built on top of XQuery search engine. The two engines were evaluated experimentally and compared with three recently proposed XML search engines. The results showed marked improvement.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号