Similar literature
 Found 20 similar documents; search took 31 ms
1.
This research investigates an approach to query processing in a multidatabase system that uses an object-oriented model to capture the semantics of other data models. The object-oriented model is used to construct a global schema, defining an integrated view of the different schemas in the environment. The model is also used as a self-describing model to build a meta-database for storing information about the global schema. A unique aspect of this work is that the object-oriented model is used to describe the different data models of the multidatabase environment, thereby extending the meta-database with semantic information about the local schemas. With the global and local schemas all represented in an object-oriented form, structural mappings between the global schema and each local schema are then easily supported. An object algebra then provides a query language for expressing global queries, using the structural mappings to translate object algebra queries into SQL queries over local relational schemas. The advantage of using an object algebra is that the object-oriented database can be viewed as a blackboard for temporary storage of local data and for establishing relationships between different databases. The object algebra can be used to directly retrieve temporarily stored data from the object-oriented database or to transparently retrieve data from local sources using the translation process described in this paper.

2.
To meet users' growing needs for accessing pre-existing heterogeneous databases, multidatabase systems (MDBSs), which integrate multiple databases, have recently attracted many researchers. A key feature of an MDBS is local autonomy. For a query retrieving data from multiple databases, global query optimization should be performed to achieve good system performance. There are a number of new challenges for global query optimization in an MDBS. A major one is that some local optimization information, such as local cost parameters, may not be available at the global level because of local autonomy. This makes it difficult to find a good decomposition of a global query during query optimization. To tackle this challenge, a new query sampling method is proposed in this paper. The idea is to group component queries into homogeneous classes, draw a sample of queries from each class, and use the observed costs of the sample queries to derive a cost formula for each class by multiple regression. The derived formulas can be used to estimate the cost of a query during query optimization. The relevant issues, such as query classification rules, sampling procedures, and cost model development and validation, are explored in this paper. To verify the feasibility of the method, experiments were conducted on three commercial database management systems supported in an MDBS. Experimental results demonstrate that the proposed method is quite promising in estimating local cost parameters in an MDBS.
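The regression step described in this abstract can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the paper's cost model: the explanatory variables (here two hypothetical numbers per sample query, such as operand size and result size), the function names `fit_cost_formula` and `estimate_cost`, and the sample data are all made up. The fit is ordinary least squares solved through the normal equations.

```python
def fit_cost_formula(samples):
    """Fit cost = b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    samples: list of (features, observed_cost) pairs for one query class,
    where features holds k explanatory variables (hypothetical choice).
    """
    rows = [[1.0] + list(x) for x, _ in samples]   # prepend intercept term
    costs = [c for _, c in samples]
    n = len(rows[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, costs))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("sample queries do not determine the coefficients")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def estimate_cost(coeffs, features):
    """Apply a fitted cost formula to the features of a new query."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], features))


# Made-up observations for one query class: ([x1, x2], observed cost).
samples = [([10, 1], 10.0), ([20, 2], 18.0), ([30, 1], 20.0),
           ([40, 3], 31.0), ([5, 0], 4.5)]
coeffs = fit_cost_formula(samples)
```

In this reading, one such formula would be fitted per query class, and the global optimizer would call `estimate_cost` with the features of a candidate component query.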

3.
A multidatabase system (MDBS) allows users to simultaneously access heterogeneous and autonomous databases using an integrated schema and a single global query language. The query optimization problem in MDBSs is quite different from the query optimization problem in distributed homogeneous databases due to schema heterogeneity and autonomy of local database systems. In this work, we consider the optimization of query distribution in case of data replication and the optimization of intersite joins, that is, the join of the results returned by the local sites in response to the global subqueries. The algorithms presented for the optimization of intersite joins try to maximize the parallelism in execution and take the federated nature of the problem into account. It has also been shown through a comparative performance study that the proposed intersite join optimization algorithms are efficient. The approach presented can easily be generalized to any operation required for intersite query processing. The query optimization scheme presented in this paper is being implemented within the scope of a multidatabase system which is based on OMG's object management architecture.

4.
On resolving schematic heterogeneity in multidatabase systems   Total citations: 4, self-citations: 0, citations by others: 4
The objective of a multidatabase system is to provide a single uniform interface for accessing multiple independent databases managed by multiple independent, and possibly heterogeneous, database systems. One crucial element in the design of a multidatabase system is the design of a data definition language for specifying a schema that represents the integration of the schemas of multiple independent databases. The design of such a language in turn requires a comprehensive classification of the conflicts (i.e., discrepancies) among the schemas of the independent databases and development of techniques for resolving (i.e., homogenizing) all of the conflicts in the classification. An earlier paper provided a comprehensive classification of schematic conflicts that may arise when integrating multiple independent relational database (RDB) schemas into a single multidatabase (MDB) schema. In this paper, we provide a comprehensive classification of techniques for resolving the schematic conflicts that may arise when integrating multiple RDB schemas, or RDB schemas and object-oriented database (OODB) schemas, or multiple OODB schemas. The classification of conflict resolution techniques includes not only those necessary for resolving schematic conflicts identified in the earlier paper, but also those for additional conflicts that arise when OODBs become part of the databases to be integrated. Most of the conflict resolution techniques discussed in the paper have already been incorporated into SQL/M, a multidatabase language implemented in UniSQL/M, a commercially available multidatabase system from UniSQL, Inc. that integrates SQL-based relational database systems and the UniSQL/X unified relational and object-oriented database system.

5.
A multidatabase system (MDBS) integrates information from multiple autonomous local databases. Performing global query optimization to achieve efficient query processing in such a system is challenging due to local autonomy of the data sources. Dynamic factors in the environment make the problem even more difficult. In this paper, we present two techniques, contention space partitioning and cost error controlling, to perform global query optimization in a dynamic MDBS. Both techniques generate an execution plan with multiple versions for a query in a dynamic MDBS, utilizing the multistate cost models built for the dynamic environment via our previous multistate query sampling method. The first technique partitions the contention space of a dynamic multidatabase environment into a given number of subspaces and chooses a good query execution plan version for each subspace, while the second technique selects a set of execution plan versions by using a given error tolerance to control query execution costs. Experiments demonstrate that the proposed techniques are quite promising for performing global query optimization in a dynamic MDBS. Compared with related work on dynamic query optimization, our approach has the advantage of avoiding the high overhead of modifying or regenerating an execution plan for a query based on dynamic runtime information. Research was supported by the US National Science Foundation under Grant # IIS-9811980 and The University of Michigan.

6.
New applications of information systems need to integrate a large number of heterogeneous databases over computer networks. Answering a query in these applications usually involves selecting relevant information sources and generating a query plan to combine the data automatically. As significant progress has been made in source selection and plan generation, the critical issue has been shifting to query optimization. This paper presents a semantic query optimization (SQO) approach to optimizing query plans of heterogeneous multidatabase systems. This approach provides global optimization for query plans as well as local optimization for subqueries that retrieve data from individual database sources. An important feature of our local optimization algorithm is that we prove necessary and sufficient conditions to eliminate an unnecessary join in a conjunctive query of arbitrary join topology. This feature allows our optimizer to utilize more expressive relational rules to provide a wider range of possible optimizations than previous work in SQO. The local optimization algorithm also features a new data structure called AND-OR implication graphs to facilitate the search for optimal queries. These features allow the global optimization to effectively use semantic knowledge to reduce the data transmission cost. We have implemented this approach in the PESTO (Plan Enhancement by SemanTic Optimization) query plan optimizer as a part of the SIMS information mediator. Experimental results demonstrate that PESTO can provide significant savings in query execution cost over query plan execution without optimization.

7.
To enforce global serializability in a multidatabase environment, the multidatabase transaction manager must take into account the indirect (transitive) conflicts between multidatabase transactions caused by local transactions. Such conflicts are difficult to resolve because the behavior or even the existence of local transactions is not known to the multidatabase system. To overcome these difficulties, we propose to incorporate additional data manipulation operations in the subtransactions of each multidatabase transaction. We show that if these operations create direct conflicts between subtransactions at each participating local database system, indirect conflicts can be resolved even if the multidatabase system is not aware of their existence. Based on this approach, we introduce optimistic and conservative multidatabase transaction management methods that require the local database systems to ensure only local serializability. The proposed methods do not violate the autonomy of the local database systems and guarantee global serializability by preventing multidatabase transactions from being serialized in different ways at the participating database systems. Refinements of these methods are also proposed for multidatabase environments where the participating database systems allow only schedules that are cascadeless or in which transactions have analogous execution and serialization orders. In particular, we show that forced local conflicts can be eliminated in rigorous local systems, that local cascadelessness simplifies the design of a global scheduler, and that local strictness offers no significant advantages over cascadelessness.

8.
Advances in networking and database technology have made global information sharing a reality. Multidatabase systems (MDBSs) represent a promising approach to addressing the challenges of achieving interoperability among multiple pre-existing databases that are highly autonomous and possibly heterogeneous. The performance of an MDBS is greatly dependent on the effectiveness of multidatabase query optimization (MQO). However, the unavailability of and uncertainty in the statistics essential to query optimization have made MQO significantly more challenging than distributed query optimization. This research undertook to develop a fuzzy statistics-based MQO approach to addressing statistics estimation and uncertainty problems in an MDBS environment. We analyzed the statistics needed in an MDBS environment and classified them into three categories: point-based, distribution-function-based, and dependency-based. Fuzzy numbers were adopted to represent point-based statistics, and a fuzzy polynomial regression method was developed for estimating distribution-function-based statistics (i.e., attribute or join selectivity) from a set of subquery results. For dependency-based statistics, a fuzzy regression method was employed for estimating logical-parameter-based local cost functions. Furthermore, methods for ranking the fuzzy numbers that are fundamental to fuzzy-statistics-based MQO were also discussed. The proposed fuzzy statistics estimation methods were illustrated using examples to demonstrate their applicability in supporting MQO.
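To make the idea of a fuzzy point-based statistic concrete, here is a minimal sketch. The abstract does not say which fuzzy-number shape or ranking method the paper uses, so both the triangular shape and the centroid ranking below are assumptions chosen for illustration. The class name and the cardinality figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """A fuzzy estimate: surely >= low, most plausibly mode, surely <= high."""
    low: float
    mode: float
    high: float

    def __add__(self, other):
        # Sum of two triangular fuzzy numbers, taken component by component.
        return TriangularFuzzyNumber(self.low + other.low,
                                     self.mode + other.mode,
                                     self.high + other.high)

    def centroid(self):
        # Centre of gravity of the triangular membership function.
        return (self.low + self.mode + self.high) / 3.0


# Two hypothetical fuzzy cost estimates for alternative plans.
plan_a = TriangularFuzzyNumber(90, 100, 140)
plan_b = TriangularFuzzyNumber(100, 105, 110)
ranked = sorted([plan_a, plan_b], key=TriangularFuzzyNumber.centroid)
```

With these made-up numbers, `plan_b` ranks first even though its most plausible value is higher, because `plan_a` carries a long upper tail of uncertainty.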

9.
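A multidatabase system generally has a four-level schema architecture. Global users can access only the global schema, while the actual data must be obtained from the individual local database systems. Schema mappings must therefore be established for the multidatabase system; they express the transformations by which local schemas are integrated, via export schemas, into the global schema. This paper presents a schema mapping method for multidatabase systems and uses a schema mapping tree to store and express these mappings.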

10.
In a multidatabase system that consists of object databases, the same real-world entity can be stored as objects in different databases with incompatible object identifiers. How to identify and integrate the objects representing the same entities, such that (a) object duplication in the query result can be avoided, (b) information for the entity can be gathered, and (c) the specialization of multiple classes can be built, is an important issue for providing a well-structured global object schema and a more informative query result. In this paper, we extend our results on probabilistic query processing and joining relations on incompatible keys to solve the problem. Various data and schema conflicts, such as missing data, inconsistent data, and domain mismatch, which may exist in classes from different databases, are considered in the process of identification.
Recommended by: Amit Sheth

11.
The existence of semantic conflicts between component databases severely impacts query processing in a multidatabase system. In this paper, we describe two types of semantic conflicts that have to be dealt with in the integration of databases modeling information about related sets of real-world entities. These are the entity identification problem and the attribute value conflict problem. While the two-way outerjoin operation has been commonly used for resolving the entity identification problem between two component relations, outerjoins using regular equality comparisons between component relation keys are shown to produce counter-intuitive entity identification results. We remedy this by defining a new key-equality comparator, in place of the regular equality comparator, for outerjoins. For the attribute value conflict problem, we define a Generalized Attribute Derivation (GAD) operation which allows user-defined attribute derivation functions to be used to compute new attributes from the component relations' attributes. With two-way outerjoin and GAD added to the set of relational operations, the traditional algebraic transformation framework for relational queries is no longer adequate for multidatabase query processing and optimization. As a result, we introduce the constrained query tree as the multidatabase query representation. We show that some knowledge about query predicates and attribute derivation functions can be used to simplify queries. Such knowledge is modeled as an outerjoin graph attached to every outerjoin operation in the query tree. Based on this, we further extend the traditional algebraic transformation framework to include two-way outerjoins and GAD operations. Our framework demonstrates that properties of selection/join predicates and attribute derivation functions can be used to provide interesting transformation alternatives. This framework also serves as a formal ground for developing optimization strategies for multidatabase queries.
Recommended by: Clement Yu
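The combination of a two-way outerjoin with a user-defined derivation function can be sketched as follows. This is a rough illustration only: the abstract does not define the key-equality comparator, so the comparator is passed in as a parameter and the example uses plain equality on a hypothetical `id` attribute. The function names, the `salary` attribute, and the averaging rule are all assumptions.

```python
def two_way_outerjoin(r, s, same_entity):
    """Pair tuples of r and s that denote the same entity; keep the rest.

    Returns (t, u) pairs in which t or u is None for unmatched tuples.
    """
    result, matched_s = [], set()
    for t in r:
        partners = [i for i, u in enumerate(s) if same_entity(t, u)]
        matched_s.update(partners)
        if partners:
            result.extend((t, s[i]) for i in partners)
        else:
            result.append((t, None))
    result.extend((None, u) for i, u in enumerate(s) if i not in matched_s)
    return result


def derive_attribute(pairs, key, name, func):
    """GAD-style step: compute one integrated attribute per entity."""
    out = []
    for t, u in pairs:
        source = t if t is not None else u
        out.append({key: source[key], name: func(t, u)})
    return out


def merged_salary(t, u):
    # Hypothetical conflict resolution: average whatever values are present.
    values = [x["salary"] for x in (t, u) if x is not None]
    return sum(values) / len(values)


r = [{"id": 1, "salary": 100}, {"id": 2, "salary": 200}]
s = [{"id": 2, "salary": 220}, {"id": 3, "salary": 300}]
pairs = two_way_outerjoin(r, s, lambda t, u: t["id"] == u["id"])
integrated = derive_attribute(pairs, "id", "salary", merged_salary)
```

Passing the comparator as an argument keeps the sketch open to a different notion of key equality without changing the join itself.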

12.
Heterogeneities exist in a multidatabase environment. For example, a real-world entity may be represented differently in relations of different databases. In particular, keys of these relations may be incompatible. In this paper, we consider processing entity join queries when data transmission cost dominates. An entity join operation 'integrates' tuples representing the same entities from different relations in which inconsistent data may exist. A natural way to process the entity join is to transmit both relations to one site, resolve the possible conflicts between corresponding attributes, and process the join, which is very costly. In this paper, an approach is proposed to correctly transform a global query into local subqueries so that entity join queries are preprocessed at multiple sites, in an attempt to lower the cost of data transmission. In addition, an extension of the traditional semijoin, named the extended semijoin, is proposed to further reduce the cost of data transmission for entity join query processing.
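For readers unfamiliar with the traditional semijoin that this abstract extends, the following sketch shows how it cuts transmission: only the join keys of one relation are shipped, and the other site returns just the tuples that will actually join. It assumes compatible keys and plain equality, so it does not model the extended semijoin or the entity join itself. The relation contents and function names are made up.

```python
def semijoin_reduce(r, s, key):
    """Keep only the tuples of s whose join key also occurs in r."""
    shipped_keys = {t[key] for t in r}   # projection sent from r's site
    return [u for u in s if u[key] in shipped_keys]


def join(r, s, key):
    """Nested-loop equijoin of two lists of dicts on `key`."""
    return [{**t, **u} for t in r for u in s if t[key] == u[key]]


orders = [{"cust": 1, "item": "disk"}, {"cust": 3, "item": "tape"}]
customers = [{"cust": n, "city": c}
             for n, c in [(1, "Oslo"), (2, "Rome"), (3, "Lima"), (4, "Kyiv")]]

# Only the reduced relation has to travel to the site that computes the join.
reduced = semijoin_reduce(orders, customers, "cust")
result = join(orders, reduced, "cust")
```

In this example two customer tuples are transmitted instead of four, and the join result is the same as joining against the full relation.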

13.
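Research on query decomposition algorithms in multidatabase systems   Total citations: 1, self-citations: 0, citations by others: 1
A multidatabase system allows users to access multiple heterogeneous, autonomous database systems simultaneously using an integrated schema and a simple global query language. Global query decomposition is an important problem in multidatabase systems. This paper presents a method for managing schema information in a multidatabase environment and, based on this schema information, proposes a query decomposition algorithm that is easy to implement. Because multidatabase query decomposition is closely tied to how schema integration is implemented, the paper also describes schema integration in multidatabase systems.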

14.
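Wang Jinpeng, Zhang Yafei, Miao Zhuang. Computer Science (《计算机科学》), 2010, 37(12): 134-137
To achieve semantic integration of heterogeneous relational databases and address the problems of traditional integration techniques, this paper analyzes the Semantic Web and related technologies, studies query processing in ontology-based relational data integration systems, and proposes an ontology-based framework for integrating relational databases. A method for describing relational data with ontologies is designed, in which an ontology serves as the global schema of the integration and describes the semantics of the relational schemas. A query rewriting algorithm is designed that rewrites SPARQL queries over the global schema into queries against specific relational databases, thereby integrating heterogeneous relational databases. Experiments show that the algorithm scales well.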

15.
16.
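A mainstream database integration system based on the distributed-object Web   Total citations: 2, self-citations: 0, citations by others: 2
Targeting the new generation of distributed-object Web architectures, this work combines global database technology, global database transaction management, object-oriented component technology for mainstream databases, and advanced CORBA object technology. With practical deployment as the goal, it studies and implements the following: the CORBA mechanisms that support heterogeneous distributed systems; the global database schema; the definition, syntax, and semantic analysis of the global query language; query optimization; object-based representation of mainstream databases; transaction management; concurrency control; security management; integrity handling for global transactions; integration tools for mainstream databases; and management and maintenance strategies for multidatabase systems.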

17.
Global query execution in a multidatabase system can be done in parallel, as all the local databases are independent. In this paper, a cost model that considers parallel execution of subqueries for a global query is developed. To obtain maximum parallelism in query execution, it is necessary to find a query execution plan that is represented in the form of a bushy tree, and this query tree should be balanced to the maximal possible extent with respect to execution time. A new bottom-up approach called the Agglomerative Approach (AA) is proposed to construct balanced bushy trees with respect to execution time. Because this approach is deterministic, it generates locally optimal solutions. This local minima problem is severe in the case of graph queries, i.e., queries that are represented with a graph structure. A Simulated Annealing Approach (SA) is employed to obtain a (near) optimal solution. These approaches (AA and SA) are suitable for handling on-line and off-line queries, respectively. A Hybrid Approach (HA), which is an integration of AA and SA, is proposed to optimize queries for which the estimated time to be spent on optimization is known a priori. Results obtained with AA and SA on both tree- and graph-structured queries are presented.
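A bottom-up construction of a time-balanced bushy tree can be sketched with a simple greedy merge. This is not the paper's Agglomerative Approach, whose details the abstract does not give; it is a stand-in that repeatedly joins the two subtrees that finish earliest. The cost model is an assumption: both inputs of a join run in parallel, each join adds a fixed `join_cost`, and any two subtrees may be joined (join predicates are ignored). The function name and timings are hypothetical.

```python
import heapq
import itertools


def build_bushy_tree(subquery_times, join_cost=1.0):
    """Greedily merge the two subtrees with the earliest finish times.

    subquery_times: non-empty dict mapping subquery name -> execution time.
    Returns (finish_time, tree), where tree is a nested tuple of names.
    """
    counter = itertools.count()   # tie-breaker so the heap never compares trees
    heap = [(t, next(counter), name) for name, t in subquery_times.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        t1, _, left = heapq.heappop(heap)
        t2, _, right = heapq.heappop(heap)
        # Inputs run in parallel, so the join starts when the slower one ends.
        finish = max(t1, t2) + join_cost
        heapq.heappush(heap, (finish, next(counter), (left, right)))
    finish, _, tree = heap[0]
    return finish, tree


finish, tree = build_bushy_tree({"A": 1, "B": 1, "C": 4, "D": 4})
```

Like any deterministic greedy scheme, this can stop at a locally optimal tree, which is the weakness the abstract attributes to the deterministic approach and addresses with simulated annealing.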

18.
The Object Data Management Group (ODMG) object model offers a standard for object-oriented database designers, while attempting to address some issues of interoperability. This research focuses on the viability of using the ODMG data model as a canonical data model in a multidatabase environment; where weaknesses are identified, we propose amendments that enable the model to suit the specific needs of this type of distributed database system. This paper describes our efforts to extend its relational-style algebra, and to provide query closure and a viewing mechanism for the object query language to construct multidatabase schemas.

19.
This paper presents an approach to query decomposition in a multidatabase environment. The unique aspect of this approach is that it is based on performing transformations over an object algebra that can be used as the basis for a global query language. In the paper, we first present our multidatabase environment and semantic framework, where a global conceptual schema based on the Object Data Management Group standard encompasses the information from heterogeneous data sources that include relational databases as well as object-oriented databases and flat file sources. The meta-data about the global schema is enhanced with information about virtual classes as well as virtual relationships and inheritance hierarchies that exist between multiple sources. The AQUA object algebra is used as the formal foundation for manipulation of the query expression over the multidatabase. AQUA is enhanced with distribution operators for dealing with data distribution issues. During query decomposition we perform an extensive analysis of traversals for path expressions that involve virtual relationships and hierarchies for access to several heterogeneous sources. The distribution operators defined in algebraic terms enhance the global algebra expression with semantic information about the structure, distribution, and localization of the data sources relevant to the solution of the query. By using an object algebra as the basis for query processing, we are able to define algebraic transformations and exploit rewriting techniques during the decomposition phase. Our use of an object algebra also provides a formal and uniform representation for dealing with an object-oriented approach to multidatabase query processing. 
As part of our query processing discussion, we include an overview of a global object identification approach for relating semantically equivalent objects from diverse data sources, illustrating how knowledge about global object identity is used in the decomposition and assembly processes.

20.
Query languages for relational multidatabases   Total citations: 2, self-citations: 0, citations by others: 2
With the existence of many autonomous databases widely accessible through computer networks, users will require the capability to jointly manipulate data in different databases. A multidatabase system provides such a capability through a multidatabase manipulation language, such as MSQL. We propose a theoretical foundation for such languages by presenting a multirelational algebra and calculus based on the relational algebra and calculus. The proposal is illustrated by various queries on an example multidatabase. It is shown that properties of the multirelational algebra may be used for optimization and that every multirelational algebra query can be expressed as a multirelational calculus query. The connection between the multirelational languages and MSQL, the multidatabase version of SQL, is also investigated.
