Similar literature
 Found 19 similar documents (search time: 187 ms)
1.
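Implementation of query decomposition and optimization in a heterogeneous data source integration system   Total citations: 54, self-citations: 0, citations by others: 54
王宁 (Wang Ning), 王能斌 (Wang Nengbin). 《软件学报》 (Journal of Software), 2000, 11(2): 222-228
A general-purpose heterogeneous data source integration system has to integrate all kinds of data sources, including the WWW. Some of these sources have neither a regular schema structure nor powerful query capabilities, which makes decomposing and optimizing global queries difficult. The heterogeneous data source integration system Versatile addresses this in two ways. First, it uses template operations on local dynamic dictionaries to construct the integration system's global dynamic dictionary, which serves as the basis for query decomposition and optimization. Second, it adopts a query decomposition and optimization strategy based on caching and on data source capabilities, so as to make full use of each source's query capabilities, simplify wrapper design, and achieve high query efficiency.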

2.
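Design of an XML-based wrapper for relational data sources   Total citations: 2, self-citations: 0, citations by others: 2
Semi-structured data represented in XML is increasingly common, while large volumes of structured data are stored in relational databases; integrating data of these different structures has become a research focus. This article designs a wrapper for relational data sources with two main functional modules: the query translator, which converts XQuery queries into SQL queries, and the result generator, which converts the tuples returned by the SQL queries into XML. A wrapper is a piece of software that acts as an outer layer around a data source without affecting the source itself, and it is an important component of a heterogeneous data integration system.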
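The result generator described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rows_to_xml`, the `result`/`row` tag names, and the sample `book` table are assumptions made for illustration, and column names are assumed to be valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(cursor, root_tag="result", row_tag="row"):
    """Turn the tuples of an executed SQL query into an XML string.

    Hypothetical helper: one element per tuple, one child element per column.
    """
    root = ET.Element(root_tag)
    columns = [desc[0] for desc in cursor.description]
    for values in cursor:
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, values):
            cell = ET.SubElement(row, name)
            cell.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


def demo():
    # Illustrative in-memory relational source with made-up data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE book (title TEXT, year INTEGER)")
    conn.execute("INSERT INTO book VALUES ('Data Integration', 2004)")
    cursor = conn.execute("SELECT title, year FROM book")
    xml_text = rows_to_xml(cursor)
    conn.close()
    return xml_text
```

The query translator (XQuery to SQL) is deliberately left out; it needs a parser and a schema mapping that the abstract does not specify.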

3.
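To reduce the impact of data migration on the performance of association (join) queries across multiple data sources, a multi-source association query optimization model (MAQM) is proposed. Wrappers encapsulate the storage systems to be queried and give users a unified interface for association queries across multiple data sources. A region partitioning strategy is proposed that takes the relational tables of the storage systems as the partitioning granularity, builds a region directed graph from the multi-source association query command, and partitions out query subtasks. On the basis of the region directed graph, a data migration cost model is built for each possible transfer direction of every intermediate result, which determines the execution order of the query subtasks. Comparative experiments show that MAQM improves query performance by 30%-40% on average relative to ODCH, Oracle's native association query tool.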

4.
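Because a heterogeneous data source integration system has to integrate all kinds of data sources, including the WWW, and some of these sources have neither a regular schema structure nor powerful query capabilities, query planning is difficult. Based on an analysis of the requirements for generating query plans in heterogeneous integration systems, the concept of data source capability description is introduced and a data source capability description framework is proposed. The framework is supported by the semantic mappings between each source's local schema and the mediated schema, together with a description of each source's query capabilities; it meets the needs of query planning well and provides a foundation for query optimization. On this basis, a framework for a query planning system based on data source capability descriptions is designed, and a complete example illustrates how the capability description framework is applied in query planning.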

5.
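Based on the characteristics of spatial data sources, a method is given for representing the capability information of a spatial data source, covering its export schema, query capabilities, and transformation capabilities. On this basis, the query computation engine integrates the capabilities of multiple distributed spatial data sources for a user query: by constructing a schema graph and a function graph, it builds the corresponding query transformation steps, so that the user only has to issue a single query and the system accesses the multiple spatial data sources fully automatically and returns the final result. The system can serve as an important module for spatial information integration and is highly extensible.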

6.
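Wrappers are an important component of integration systems for autonomous heterogeneous data sources. As XML-related standards continue to be defined and refined, more and more data is represented in XML. At the same time, it must be noted that now and for the foreseeable future, most application systems, even new Web-based ones, will still choose relational database systems first for data storage and querying, because of their reliability, technical maturity, rich tooling, and high performance. This paper discusses, in the context of integrating autonomous heterogeneous data sources, how to translate XML queries into SQL queries and how to convert relational data into an XML representation.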

7.
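程骄杰 (Cheng Jiaojie), 张忠能 (Zhang Zhongneng). 《计算机工程》 (Computer Engineering), 2004, 30(Z1): 640-642
The design and development of an integrated query system for distributed heterogeneous data sources is presented. The system provides transparent access to different data sources such as relational databases, text files, and XML documents. It reads the data dictionary of each source and builds an integrated schema through a schema manager. Users formulate queries against the integrated schema through a query processor; using connection and configuration information, the system decomposes each such query into subqueries for the individual sources, merges the results the sources return, and presents them to the user, achieving effective querying of heterogeneous data sources.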
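The decompose-then-merge flow in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the system from the paper: every source is assumed to expose records in the integrated schema already, the "subquery" is a plain filter predicate, and the names `make_list_source` and `integrated_query` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Predicate = Callable[[Record], bool]
Source = Callable[[Predicate], Iterable[Record]]


def make_list_source(records: List[Record]) -> Source:
    """Wrap an in-memory list so it answers filter subqueries like a real source."""
    def run_subquery(predicate: Predicate) -> List[Record]:
        return [r for r in records if predicate(r)]
    return run_subquery


def integrated_query(sources: Dict[str, Source], predicate: Predicate) -> List[Record]:
    """Send one subquery to every source and merge the partial results."""
    merged: List[Record] = []
    for name, run_subquery in sources.items():
        for record in run_subquery(predicate):
            # Tag each record with the source it came from.
            merged.append({**record, "source": name})
    return merged


# Usage with two made-up sources standing in for a database and an XML file.
sources = {
    "rdb": make_list_source([{"id": 1, "price": 5}, {"id": 2, "price": 20}]),
    "xml": make_list_source([{"id": 3, "price": 30}]),
}
result = integrated_query(sources, lambda r: r["price"] > 10)
```

A real system would also use the connection and configuration information mentioned in the abstract to decide which sources a query touches, instead of broadcasting to all of them.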

8.
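程骄杰 (Cheng Jiaojie), 张忠能 (Zhang Zhongneng). 《计算机工程》 (Computer Engineering), 2004, 30(12): 640-642
The design and development of an integrated query system for distributed heterogeneous data sources is presented. The system provides transparent access to different data sources such as relational databases, text files, and XML documents. It reads the data dictionary of each source and builds an integrated schema through a schema manager. Users formulate queries against the integrated schema through a query processor; using connection and configuration information, the system decomposes each such query into subqueries for the individual sources, merges the results the sources return, and presents them to the user, achieving effective querying of heterogeneous data sources.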

9.
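With the rapid development of keyword query techniques and the rapid growth of Internet data, efficient and accurate data source selection has become very valuable. A data source selection method based on inverted lists is proposed. It selects highly relevant data sources in a short time and runs the search only on those sources, which reduces query time and gives users a better query experience. Experimental results show that the method works well in real systems, for example an airline ticket query system. To implement the relevant algorithms efficiently on large data sets, the min-hash algorithm is applied to similarity estimation, which reduces the space and time consumed by queries. Comparison with traditional algorithms shows that min-hash achieves high accuracy and greatly reduces running time.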
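The min-hash similarity estimate mentioned in this abstract can be illustrated with a short sketch of the general technique. It is not the paper's implementation: the seeded BLAKE2b hash family, the default of 128 hash functions, and the function names are assumptions chosen for illustration.

```python
import hashlib


def _hash(token: str, seed: int) -> int:
    """One member of a seeded hash family (an assumption of this sketch)."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes=128):
    """Keep the minimum hash value per hash function; tokens must be non-empty."""
    unique = set(tokens)
    return [min(_hash(t, seed) for t in unique) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


# Two made-up keyword sets with true Jaccard similarity 50 / 150 = 1/3.
set_a = [str(i) for i in range(0, 100)]
set_b = [str(i) for i in range(50, 150)]
estimate = estimate_jaccard(minhash_signature(set_a), minhash_signature(set_b))
```

The signatures have a fixed length regardless of set size, which is where the space and time savings come from; the estimate becomes tighter as `num_hashes` grows.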

10.
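To query heterogeneous patent data sources in a unified and effective way, an ontology-based integration system for heterogeneous patent data sources is proposed. The system introduces an ontology to resolve the semantic heterogeneity that arises in data source integration, offers users a unified query interface through a global data schema, and rewrites user queries against the global schema into subqueries against the individual local sources. With this system, users can obtain correct query results from heterogeneous patent sources.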

11.
Transforming queries for efficient execution is particularly important in federated database systems since a more efficient execution plan can require many fewer data requests to be sent to the component databases. Also, it is important to do as much as possible of the selection and processing close to where the data are stored, making best use of facilities provided by the federation's component database management systems. In this paper we address the problem of processing complex queries including quantifiers, which have to be executed against different databases in an expanding heterogeneous federation. This is done by transforming queries within a mediator for global query improvement, and within wrappers to make the best use of the query processing capabilities of external databases. Our approach is based on pattern matching and query rewriting. We introduce a high level language for expressing rewrite rules declaratively, and demonstrate the use and flexibility of such rules in improving query performance for existentially quantified subqueries. Extensions to this language that allow generic rewrite rules to be expressed are also presented. The value of performing final transformations within a wrapper for a given remote database is shown in several examples that use AMOS II, an SQL3-like system.

12.
13.
14.
We present the design of ObjectGlobe, a distributed and open query processor for Internet data sources. Today, data is published on the Internet via Web servers which have, if at all, very localized query processing capabilities. The goal of the ObjectGlobe project is to establish an open marketplace in which data and query processing capabilities can be distributed and used by any kind of Internet application. Furthermore, ObjectGlobe integrates cycle providers (i.e., machines) which carry out query processing operators. The overall picture is to make it possible to execute a query with, in principle, unrelated query operators, cycle providers, and data sources. Such an infrastructure can serve as enabling technology for scalable e-commerce applications, e.g., B2B and B2C market places, to be able to integrate data and data processing operations of a large number of participants. One of the main challenges in the design of such an open system is to ensure privacy and security. We discuss the ObjectGlobe security requirements, show how basic components such as the optimizer and runtime system need to be extended, and present the results of performance experiments that assess the additional cost for secure distributed query processing. Another challenge is quality of service management so that users can constrain the costs and running times of their queries. Received: 30 October 2000 / Accepted: 14 March 2001 / Published online: 7 June 2001

15.
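Research on methods for constructing dynamic dictionaries in data source integration systems   Total citations: 2, self-citations: 1, citations by others: 1
From the perspective of heterogeneous data source integration systems, this paper introduces the concepts of templates and dynamic dictionaries to describe the schemas of all kinds of data sources uniformly. A dynamic dictionary describes not only the structural features of objects but also their behavioral features, and is fully consistent with object-oriented principles. In addition, the paper defines five template operations and proves that the templates of OIM object operations can be composed from the corresponding template operations. This yields a method that constructs the integration system's global dynamic dictionary from template operations on local dynamic dictionaries, without scanning the databases.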

16.
An XML-enabled data extraction toolkit for web sources   Total citations: 7, self-citations: 0, citations by others: 7
The amount of useful semi-structured data on the web continues to grow at a stunning pace. Often interesting web data are not in database systems but in HTML pages, XML pages, or text files. Data in these formats are not directly usable by standard SQL-like query processing engines that support sophisticated querying and reporting beyond keyword-based retrieval. Hence, the web users or applications need a smart way of extracting data from these web sources. One of the popular approaches is to write wrappers around the sources, either manually or with software assistance, to bring the web data within the reach of more sophisticated query tools and general mediator-based information integration systems. In this paper, we describe the methodology and the software development of an XML-enabled wrapper construction system—XWRAP for semi-automatic generation of wrapper programs. By XML-enabled we mean that the metadata about information content that are implicit in the original web pages will be extracted and encoded explicitly as XML tags in the wrapped documents. In addition, the query-based content filtering process is performed against the XML documents. The XWRAP wrapper generation framework has three distinct features. First, it explicitly separates tasks of building wrappers that are specific to a web source from the tasks that are repetitive for any source, and uses a component library to provide basic building blocks for wrapper programs. Second, it provides inductive learning algorithms that derive or discover wrapper patterns by reasoning about sample pages or sample specifications. Third and most importantly, we introduce and develop a two-phase code generation framework. The first phase utilizes an interactive interface facility to encode the source-specific metadata knowledge identified by individual wrapper developers as declarative information extraction rules. 
The second phase combines the information extraction rules generated at the first phase with the XWRAP component library to construct an executable wrapper program for the given web source.

17.
Many systems and strategies have been proposed for processing nonterminating data streams. Each approach has advantages and disadvantages, including the kinds of queries that can be executed. We present a framework for characterizing the kinds of queries that can be executed over streams based on a notion of compact sets from topology. We first apply our framework to queries over punctuated data streams. Previous work on punctuations focused primarily on the behavior of individual query operators. We use our framework to determine if an entire query can benefit from punctuations available from stream sources. We then consider other common strategies proposed in the literature for executing queries over streams, and we discuss how our framework can characterize the kinds of queries each strategy can answer.

18.
With the rise of Big Data, providing high-performance query processing capabilities through the acceleration of database analytics has gained significant attention. Leveraging Field Programmable Gate Array (FPGA) technology, this approach can lead to clear benefits. In this work, we present the design and implementation of AxleDB: An FPGA-based platform that enables fast query processing for database systems by melding novel database-specific accelerators with commercial-off-the-shelf (COTS) storage using modern interfaces, in a novel, unified, and programmable environment. AxleDB can perform a large subset of SQL queries through its set of instructions that can map compute-intensive database operations, such as filter, arithmetic, aggregate, group by, table join, or sort, onto the specialized high-throughput accelerators. To minimize the amount of SSD I/O operations required, AxleDB also supports hardware MinMax indexing for databases. We evaluated AxleDB with five decision support queries from the TPC-H benchmark suite and achieved a speedup from 1.8X to 34.2X and energy efficiency from 2.8X to 62.1X, in comparison to the state-of-the-art DBMS, i.e., PostgreSQL and MonetDB.
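The idea behind MinMax indexing, which the abstract credits with reducing SSD I/O, can be shown with a small software sketch. This is only an illustration of the general block-skipping principle, not AxleDB's hardware design; the block size, the function names, and the sample column are assumptions.

```python
def build_minmax_index(values, block_size):
    """Record (start offset, min, max) for every fixed-size block of a column."""
    index = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        index.append((start, min(block), max(block)))
    return index


def range_scan(values, index, block_size, low, high):
    """Return values in [low, high] and the number of blocks actually read."""
    hits, blocks_read = [], 0
    for start, block_min, block_max in index:
        if block_max < low or block_min > high:
            continue  # the whole block lies outside the range, so skip it
        blocks_read += 1
        block = values[start:start + block_size]
        hits.extend(v for v in block if low <= v <= high)
    return hits, blocks_read


# A made-up column stored in three blocks of four values each.
column = [1, 2, 3, 4, 10, 11, 12, 13, 20, 21, 22, 23]
index = build_minmax_index(column, 4)
hits, blocks_read = range_scan(column, index, 4, 11, 12)
```

In this example only one of the three blocks is read, because the index rules out the other two. The benefit depends on how well values cluster within blocks; on randomly ordered data most blocks overlap the queried range and little is skipped.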

19.
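A heterogeneous data source integration system aims to give users a consistent access interface. The participating data sources are highly autonomous, differ in schema, and are updated frequently, and each places its own special restrictions on query functionality, which makes locating data sources and optimizing queries during query processing difficult. Based on an analysis of the characteristics and functional requirements of heterogeneous integration systems, this paper proposes a KQML-based data source capability description framework that lets each data source publish its capabilities flexibly and dynamically. Formal specifications then characterize the structural and behavioral features of each data source, laying the foundation for locating the sources relevant to a query. This also helps the global query processor optimize query plans, shrink the query search space, and improve query efficiency.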
