Similar articles
 20 similar articles found (search time: 10 ms)
1.
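Research on a heterogeneous database integration framework based on a multidatabase manipulation language   Total citations: 1, self-citations: 0, citations by others: 1
To address the complexity of maintaining the global schema in current heterogeneous database integration frameworks, this paper proposes HDIFBML, a heterogeneous database integration framework based on a multidatabase manipulation language. The work covers the design of the key modules and the concepts of global schema and schema mapping, and defines a multidatabase manipulation language, SMSQL. SMSQL defines a schema construction language set and a schema mapping language set for building and maintaining the global schema, and resolves structural and semantic conflicts that arise when integrating different schemas through field conversion. Practical application shows that HDIFBML hides the execution details of the underlying heterogeneous databases, allows the global schema to be maintained flexibly, and offers good operability, maintainability, and extensibility.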

2.
3.
This paper is concerned with the problem of integrating a number of existing off-the-shelf local database systems into a multidatabase system that maintains consistency in the face of concurrency and failures. The major difficulties in designing such systems stem from the requirements that local transactions be allowed to execute outside the multidatabase system control, and that the various local database systems cannot participate in the execution of a global commit protocol. A scheme based on the assumption that the component local database systems use the strict two-phase locking protocol is developed. Two major problems are addressed: how to ensure global transaction atomicity without the provision of a commit protocol, and how to ensure freedom from global deadlocks.
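
The strict two-phase locking assumption mentioned in this abstract can be illustrated with a minimal sketch. It is not the scheme from the paper: the class `StrictLockTable` and its methods are hypothetical names, requests are non-blocking, and every lock is held until the transaction commits or aborts (which is stronger than strictness requires, but satisfies it).

```python
from collections import defaultdict


class StrictLockTable:
    """Hypothetical lock table: locks are released only when a transaction finishes."""

    def __init__(self):
        # item -> {transaction id: "S" (shared) or "X" (exclusive)}
        self._holders = defaultdict(dict)

    def acquire(self, txn, item, mode):
        """Try to lock `item` in mode "S" or "X"; return True if granted."""
        holders = self._holders[item]
        others = [m for t, m in holders.items() if t != txn]
        if mode == "S":
            conflict = "X" in others   # readers conflict only with a writer
        else:
            conflict = bool(others)    # a writer conflicts with any other holder
        if conflict:
            return False
        if holders.get(txn) != "X":    # never downgrade an exclusive lock
            holders[txn] = mode
        return True

    def finish(self, txn):
        """Commit or abort: the only point at which locks are released."""
        for holders in self._holders.values():
            holders.pop(txn, None)
```

Because there is no way to release a single lock before `finish`, no other transaction can read or overwrite data written by an unfinished transaction, which is the property a multidatabase layer would be relying on under this assumption.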

4.
Fingerprints have been widely used in a variety of biometric identification systems in the past several years due to their uniqueness and immutability. With the rapid development of fingerprint identification techniques, many fingerprint identification systems urgently need to handle large-scale fingerprint storage and highly concurrent recognition queries, which pose huge challenges to the system. To address this, we design and implement a distributed, load-balancing fingerprint identification system named Pegasus, which includes a distributed feature extraction subsystem and a distributed feature storage subsystem. The feature extraction procedure uses the Hadoop Image Processing Interface (HIPI) library to increase its overall processing speed; the feature storage subsystem optimizes MongoDB’s default load balance strategy to improve the efficiency and robustness of Pegasus. Experiments and simulations are carried out, and results show that Pegasus can reduce the time cost by 70% during the feature extraction procedure. Pegasus also reduces the difference in access load among front-end mongos nodes to less than 5%. Additionally, Pegasus reduces data migration among back-end data shards by over 40%, obtaining a more reasonable data distribution based on the operation load (insertion, deletion, update, and query) of each shard.

5.
Peculiarity oriented multidatabase mining   Total citations: 2, self-citations: 0, citations by others: 2
Peculiarity rules are a new class of rules which can be discovered by searching relevance among a relatively small number of peculiar data. Peculiarity oriented mining in multiple data sources is different from, and complementary to, existing approaches for discovering new, surprising, and interesting patterns hidden in data. A theoretical framework for peculiarity oriented mining is presented. Within the proposed framework, we give a formal interpretation and comparison of three classes of rules, namely, association rules, exception rules, and peculiarity rules, and describe how to mine interesting peculiarity rules in multiple databases.

6.
This paper describes the integration of a multidatabase system and a knowledge-base system to support the data-integration component of a data warehouse. The multidatabase system integrates various component databases with a common query language; however, it does not provide capability for schema integration and other utilities necessary for data warehousing. In addition, the knowledge base system offers a declarative logic language with second-order syntax but first-order semantics for integrating the schemes of the data sources into the warehouse and for defining complex, recursively defined materialized views. Furthermore, deductive rules are also used for cleaning, checking the integrity and summarizing the data imported into the data warehouse. The knowledge base system features an efficient incremental view maintenance mechanism that is used for refreshing the data warehouse, without querying the data sources.

7.
Global committability in multidatabase systems   Total citations: 1, self-citations: 0, citations by others: 1
Develops a formal basis for research into the reliability aspects of transaction processing in multidatabase systems (MDBSs). We define a new correctness notion called `global committability' for the correct unilateral commit and the retry recovery of global transactions in an autonomous MDBS environment. This notion makes it easier to ensure the isolation property of global transactions when the retry approach is applied. The formalization work illustrates that the conventional serializability and recoverability notions are not sufficient to specify the correct execution (i.e. isolated execution and recovery) of global transactions when the unilateral commit and the retry recovery are used to ensure the atomicity of global transactions. This work is significant because the unilateral commit and the retry recovery are an attractive complementary means to the undo recovery (whose correct schedule is specified by the conventional recoverability notion) for advanced transaction applications with the characteristics of site autonomy and long-lived execution.

8.
Overview of multidatabase transaction management   Total citations: 8, self-citations: 0, citations by others: 8
A multidatabase system (MDBS) is a facility that allows users access to data located in multiple autonomous database management systems (DBMSs). In such a system, global transactions are executed under the control of the MDBS. Independently, local transactions are executed under the control of the local DBMSs. Each local DBMS integrated by the MDBS may employ a different transaction management scheme. In addition, each local DBMS has complete control over all transactions (global and local) executing at its site, including the ability to abort at any point any of the transactions executing at its site. Typically, no design or internal DBMS structure changes are allowed in order to accommodate the MDBS. Furthermore, the local DBMSs may not be aware of each other and, as a consequence, cannot coordinate their actions. Thus, traditional techniques for ensuring transaction atomicity and consistency in homogeneous distributed database systems may not be appropriate for an MDBS environment. The objective of this article is to provide a brief review of the most current work in the area of multidatabase transaction management. We first define the problem and argue that the multidatabase research will become increasingly important in the coming years. We then outline basic research issues in multidatabase transaction management and review recent results in the area. We conclude with a discussion of open problems and practical implications of this research.

9.
Failure recovery in a multidatabase environment is addressed. It is shown that local autonomy considerations force the designer of a multidatabase system to trade off certain desirable properties to achieve reliability for transaction management. Representative techniques in the research literature are contrasted and compared. The author's approach to the problem is described.

10.
Concurrency control in hierarchical multidatabase systems   Total citations: 1, self-citations: 0, citations by others: 1
Over the past decade, significant research has been done towards developing transaction management algorithms for multidatabase systems. Most of this work assumes a monolithic architecture of the multidatabase system with a single software module that follows a single transaction management algorithm to ensure the consistency of data stored in the local databases. This monolithic architecture is not appropriate in a multidatabase environment where the system spans multiple different organizations that are distributed over various geographically distant locations. In this paper, we propose an alternative multidatabase transaction management architecture, where the system is hierarchical in nature. Hierarchical architecture has consequences for the design of transaction management algorithms. An implication of the architecture is that the transaction management algorithms followed by a multidatabase system must be composable; that is, it must be possible to incorporate individual multidatabase systems as elements in a larger multidatabase system. We present a hierarchical architecture for a multidatabase environment and develop techniques for concurrency control in such systems. Edited by R. Sacks-Davis. Received June 27, 1994 / Accepted September 26, 1995

11.
In a multidatabase system that consists of object databases, the same real-world entity can be stored as objects in different databases with incompatible object identifiers. How to identify and integrate these objects representing the same entities such that (a) object duplication in the query result can be avoided, (b) information for the entity can be gathered, and (c) the specialization of multiple classes can be built is an important issue in providing a well-structured global object schema and a more informative query result. In this paper, we extend our results on probabilistic query processing and joining relations on incompatible keys to solve the problem. Various data and schema conflicts, such as missing data, inconsistent data and domain mismatch, which may exist in classes from different databases are considered in the process of identification.
Recommended by: Amit Sheth

12.
A multidatabase system is an interconnected collection of autonomous databases each managed by an autonomous database management system (DBMS). When integrating multiple DBMSs, the key is the autonomy of the underlying participants. Much research has been undertaken in the past five years aimed at describing and building an integrated multidatabase system, but to date the term autonomy has only been defined intuitively. This article provides a rigorous definition for autonomy tailored to the multidatabase environment specifically but applicable to any system environment that involves the collaboration of autonomous participants. The major contribution of this article is a technique that measures autonomy along multiple dimensions so a single numeric value describing the amount of autonomy violated by a particular system design is quantified. This has a two-fold implication. First, the technique described forces researchers to consider autonomy from several different aspects that may not be the central focus of their research, but must be considered because assumptions made regarding one aspect of a system may have implications in other areas. Second, the value can be used as a measure for direct comparison among different systems or proposals. Finally, the article demonstrates the quantification technique's applicability by applying it to several recent multidatabase research efforts.

13.
14.
Formal Methods in System Design - Continuous invariants are an important component in deductive verification of hybrid and continuous systems. Just like discrete invariants are used to reason about...

15.
16.
Replication is useful in multidatabase systems (MDBSs) because, as in traditional distributed database systems, it increases data availability in the presence of failures and decreases data retrieval costs by reading local or close copies of data. Concurrency control, however, is more difficult in replicated MDBSs than in ordinary distributed database systems. This is the case not only because local concurrency controllers may schedule global transactions inconsistently, but also because local transactions (at different sites) may access the same replicated data. In this article, we propose a decentralized concurrency control protocol for a replicated MDBS. The proposed strategy supports prompt and consistent updates of replicated data by both local and global applications without a central coordinator.

17.
In a multidatabase system, the participating databases are autonomous. The schemas of these databases may be different in various ways, while the same information is represented. A global query issued against the global database needs to be translated to a proper form before it can be executed in a local database. Since data requested by a query (or a part of a query) is sometimes available in multiple sites, the site (database) that processes the query with the least cost is the desired query processing site. The authors study the effect of differences in schemas on the cost of query processing in a multidatabase environment. They first classify schema conflicts into different types. For each type of conflict, they show how much more or less complex a translated query can become in comparison with the original user-issued global query. Based on this observation, they propose an analytical method that considers the conflicts between local databases and finds the database(s) that renders the least execution cost in processing a global query. This research introduces a new level of query optimization (termed the schema-level optimization) in multidatabase environments. The results provide a new dimension of enhancement for the capability of a query optimizer in multidatabase systems.

18.
Existence of semantic conflicts between component databases severely impacts query processing in a multidatabase system. In this paper, we describe two types of semantic conflicts that have to be dealt with in the integration of databases modeling information about related sets of real-world entities. These are the entity identification problem and the attribute value conflict problem. While the two-way outerjoin operation has been commonly used for resolving the entity identification problem between two component relations, outerjoins using regular equality comparisons between component relation keys are shown to produce counter-intuitive entity identification results. We remedy this by defining a new key-equality comparator in place of the regular equality comparator for outerjoins. For the attribute value conflict problem, we define a Generalized Attribute Derivation (GAD) operation which allows user-defined attribute derivation functions to be used to compute new attributes from the component relations' attributes. By adding two-way outerjoin and GAD to the set of relational operations, the traditional algebraic transformation framework for relational queries is no longer adequate for multidatabase query processing and optimization. As a result, we introduce the constrained query tree as the multidatabase query representation. We show that some knowledge about query predicates and attribute derivation functions can be used to simplify queries. Such knowledge is modeled as an outerjoin graph attached to every outerjoin operation in the query tree. Based on this, we further extend the traditional algebraic transformation framework to include two-way outerjoins and GAD operations. Our framework demonstrates that properties of selection/join predicates and attribute derivation functions can be used to provide interesting transformation alternatives. This framework also serves as a formal ground for developing optimization strategies for multidatabase queries.
Recommended by: Clement Yu
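
The two operations named in this abstract can be sketched roughly as follows. This is only an illustration under assumptions: the function names `two_way_outerjoin` and `derive_attribute` are hypothetical, the join uses regular key equality (the paper's key-equality comparator is not specified in the abstract and is not modeled here), and the derivation step merely stands in for a GAD-style user-defined function.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]
Pair = Tuple[Optional[Row], Optional[Row]]


def two_way_outerjoin(left: List[Row], right: List[Row], key: str) -> List[Pair]:
    """Full outer join on regular key equality; a missing side is None."""
    right_by_key: Dict[object, List[Row]] = {}
    for r in right:
        right_by_key.setdefault(r[key], []).append(r)
    matched = set()
    pairs: List[Pair] = []
    for l in left:
        partners = right_by_key.get(l[key], [])
        if partners:
            matched.add(l[key])
            pairs.extend((l, r) for r in partners)
        else:
            pairs.append((l, None))          # entity known only to the left relation
    for r in right:
        if r[key] not in matched:
            pairs.append((None, r))          # entity known only to the right relation
    return pairs


def derive_attribute(pairs: List[Pair], key: str, attr: str,
                     derive: Callable[[object, object], object]) -> List[Row]:
    """Compute one integrated attribute per joined pair with a user-defined function."""
    result: List[Row] = []
    for l, r in pairs:
        source = l if l is not None else r
        left_value = l[attr] if l is not None else None
        right_value = r[attr] if r is not None else None
        result.append({key: source[key], attr: derive(left_value, right_value)})
    return result


def average_if_both(a, b):
    """Example derivation function: average conflicting values, else take the known one."""
    if a is not None and b is not None:
        return (a + b) / 2
    return a if a is not None else b
```

For instance, joining two relations on a shared `name` key and calling `derive_attribute(pairs, "name", "rating", average_if_both)` yields one row per entity with a single reconciled `rating`, whichever side supplied it.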

19.
Toward multidatabase mining: identifying relevant databases   Total citations: 2, self-citations: 0, citations by others: 2
Various tools and systems for knowledge discovery and data mining have been developed and are available for applications. However, when there are many databases, an immediate question is where one should start mining. It is not true that data mining is better the more databases there are. It is only true when the databases involved are relevant to the task at hand. By breaking away from the conventional data mining assumption that many databases should be joined into one, we argue that the first step for multidatabase mining is to identify databases that are most relevant to an application; without doing so, the mining process can be lengthy, aimless, and ineffective. A measure of relevance is thus proposed for mining tasks with the objective of finding patterns or regularities of certain attributes. An efficient algorithm for identifying relevant databases is described. Experiments are conducted to verify the measure's performance and to exemplify its application.

20.
On resolving schematic heterogeneity in multidatabase systems   Total citations: 4, self-citations: 0, citations by others: 4
The objective of a multidatabase system is to provide a single uniform interface for accessing multiple independent databases being managed by multiple independent, and possibly heterogeneous, database systems. One crucial element in the design of a multidatabase system is the design of a data definition language for specifying a schema that represents the integration of the schemas of multiple independent databases. The design of such a language in turn requires a comprehensive classification of the conflicts (i.e., discrepancies) among the schemas of the independent databases and development of techniques for resolving (i.e., homogenizing) all of the conflicts in the classification. An earlier paper provided a comprehensive classification of schematic conflicts that may arise when integrating multiple independent relational database (RDB) schemas into a single multidatabase (MDB) schema. In this paper, we provide a comprehensive classification of techniques for resolving the schematic conflicts that may arise when integrating multiple RDB schemas, or RDB schemas and object-oriented database (OODB) schemas, or multiple OODB schemas. The classification of conflict resolution techniques includes not only those necessary for resolving schematic conflicts identified in the earlier paper, but also additional conflicts that arise when OODBs become part of the databases to be integrated. Most of the conflict resolution techniques discussed in the paper have already been incorporated into SQL/M, a multidatabase language implemented in UniSQL/M, a commercially available multidatabase system from UniSQL, Inc. which integrated SQL-based relational database systems and the UniSQL/X unified relational and object-oriented database system.
