首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
In schema integration, schematic discrepancies occur when data in one database correspond to metadata in another. We explicitly declare the context that is the meta information relating to the source, classification, property etc. of entities, relationships or attribute values in entity–relationship (ER) schemas. We present algorithms to resolve schematic discrepancies by transforming metadata into the attribute values of entity types, keeping the information and constraints of original schemas. Although focusing on the resolution of schematic discrepancies, our technique works seamlessly with the existing techniques resolving other semantic heterogeneities in schema integration.  相似文献   

2.
Matching query interfaces is a crucial step in data integration across multiple Web databases. The problem is closely related to schema matching that typically exploits different features of schemas. Relying on a particular feature of schemas is not sufficient. We propose an evidential approach to combining multiple matchers using Dempster–Shafer theory of evidence. First, our approach views the match results of an individual matcher as a source of evidence that provides a level of confidence on the validity of each candidate attribute correspondence. Second, it combines multiple sources of evidence to get a combined mass function that represents the overall level of confidence, taking into account the match results of different matchers. Our combination mechanism does not require the use of weighing parameters, hence no setting and tuning of them is needed. Third, it selects the top k attribute correspondences of each source attribute from the target schema based on the combined mass function. Finally it uses some heuristics to resolve any conflicts between the attribute correspondences of different source attributes. Our experimental results show that our approach is highly accurate and effective.  相似文献   

3.
Model independent assertions for integration of heterogeneous schemas   总被引:3,自引:0,他引:3  
Due to the proliferation of database applications, the integration of existing databases into a distributed or federated system is one of the major challenges in responding to enterprises' information requirements. Some proposed integration techniques aim at providing database administrators (DBAs) with a view definition language they can use to build the desired integrated schema. These techniques leave to the DBA the responsibility of appropriately restructuring schema elements from existing local schemas and of solving inter-schema conflicts. This paper investigates theassertion-based approach, in which the DBA's action is limited to pointing out corresponding elements in the schemas and to defining the nature of the correspondence in between. This methodology is capable of: ensuring better integration by taking into account additional semantic information (assertions about links); automatically solving structural conflicts; building the integrated schema without requiring conforming of initial schemas; applying integration rules to a variety of data models; and performing view as well as database integration. This paper presents the basic ideas underlying our approach and focuses on resolution of structural conflicts.  相似文献   

4.
Schemaless databases, and document-oriented databases in particular, are preferred to relational ones for storing heterogeneous data with variable schemas and structural forms. However, the absence of a unique schema adds complexity to analytical applications, in which a single analysis often involves large sets of data with different schemas. In this paper we propose an original approach to OLAP on collections stored in document-oriented databases. The basic idea is to stop fighting against schema variety and welcome it as an inherent source of information wealth in schemaless sources. Our approach builds on four stages: schema extraction, schema integration, FD enrichment, and querying; these stages are discussed in detail in the paper. To make users aware of the impact of schema variety, we propose a set of indicators inspired by the definition of attribute density. Finally, we experimentally evaluate our approach in terms of efficiency and effectiveness.  相似文献   

5.
6.
View integration: a step forward in solving structural conflicts   总被引:1,自引:0,他引:1  
Thanks to the development of the federated systems approach on the one hand and the emphasis on user involvement in database design on the other, the interest in schema integration techniques is significantly increasing. Theories, methods and tools have been proposed. Conflict resolution is the key issue. Different perceptions by schema designers may lead to different representations. A way must be found to support these different representations within a single system. Most current integration methodologies rely on modification of initial schemas to solve the conflicts. This approach needs a strong interaction with the database administrator, who has authority to modify the initial schemas. This paper presents an approach to view integration specifically intended to support the coexistence of different representations of the same real-world objects. The main characteristics of this approach are the following: automatic resolution of structural conflicts, conflict resolution performed without modification of initial views, use of a formal declarative approach for user definition of inter-view correspondences, applicability to a variety of data models, and automatic generation of structural and operational mappings between the views and the integrated schema. Allowing users' views to be kept unchanged should result in improved user satisfaction. Each user is able to define his own view of the database, without having to conform to some other user's view. Moreover, such a feature is essential in database integration if existing programs are to be preserved  相似文献   

7.
模式匹配就是在作为输入的模式中有对应语义关系的元素间产生一个映射.为了提高模式匹配的效率,提出了一种新型的模式匹配方法--源模式分裂模式匹配算法.它可以解决标准模式匹配难以解决的问题:1)源模式的某一个属性和多个目标模式的多个属性之间建立匹配关系;2)表格中的不同元组对应其他表格同一元组的不同属性值的匹配.在匹配过程中,该方法先搜索种类型属性,然后根据种类型属性建立选择条件,最后把源模式进行分裂形成视图,再重新生成候选匹配集合,从而提高模式匹配的质量.  相似文献   

8.
Schema integration aims to create a mediated schema as a unified representation of existing heterogeneous sources sharing a common application domain. These sources have been increasingly written in XML due to its versatility and expressive power. Unfortunately, these sources often use different elements and structures to express the same concepts and relations, thus causing substantial semantic and structural conflicts. Such a challenge impedes the creation of high-quality mediated schemas and has not been adequately addressed by existing integration methods. In this paper, we propose a novel method, named XINTOR, for automating the integration of heterogeneous schemas. Given a set of XML sources and a set of correspondences between the source schemas, our method aims to create a complete and minimal mediated schema: it completely captures all of the concepts and relations in the sources without duplication, provided that the concepts do not overlap. Our contributions are fourfold. First, we resolve structural conflicts inherent in the source schemas. Second, we introduce a new statistics-based measure, called path cohesion, for selecting concepts and relations to be a part of the mediated schema. The path cohesion is statistically computed based on multiple path quality dimensions such as average path length and path frequency. Third, we resolve semantic conflicts by augmenting the semantics of similar concepts with context-dependent information. Finally, we propose a novel double-layered mediated schema to retain a wider range of concepts and relations than existing mediated schemas, which are at best either complete or minimal, but not both. Performed on both real and synthetic datasets, our experimental results show that XINTOR outperforms existing methods with respect to (i) the mediated-schema quality using precision, recall, F-measure, and schema minimality; and (ii) the execution performance based on execution time and scale-up performance.  相似文献   

9.
The emergence of increasing number of collaborating organizations has made clear the need for supporting interoperability infrastructures, enabling sharing and exchange of data among organizations. Schema matching and schema integration are the crucial components of the interoperability infrastructures, and their semi-automation to interrelate or integrate heterogeneous and autonomous databases in collaborative networks is desired. The Semi-Automatic Schema Matching and INTegration (SASMINT) System introduced in this paper identifies and resolves several important syntactic, semantic, and structural conflicts among schemas of relational databases to find their likely matches automatically. Furthermore, after getting the user validation on the matched results, it proposes an integrated schema. SASMINT uses a combination of a variety of metrics and algorithms from the Natural Language Processing and Graph Theory domains for its schema matching. For the schema integration, it utilizes a number of derivation rules defined in the scope of the research work explained in this paper. Furthermore, a derivation language called SASMINT Derivation Markup Language (SDML) is defined for capturing and formulating both the results of matching and the integration that can be further used, for example for federated query processing from independent databases. In summary, the paper focuses on addressing: (1) conflicts among schemas that make automatic schema matching and integration difficult, (2) the main components of the SASMINT approach and system, (3) in-depth exploration of SDML, (4) heuristic rules designed and implemented as part of the schema integration component of the SASMINT system, and (5) experimental evaluation of SASMINT.  相似文献   

10.
This research investigates and approach to query processing in a multidatabase system that uses an objectoriented model to capture the semantics of other data models. The object-oriented model is used to construct a global schema, defining an integrated view of the different schemas in the environment. The model is also used as a self-describing model to build a meta-database for storing information about the global schema. A unique aspect of this work is that the object-oriented model is used to describe the different data models of the multidatabase environment, thereby extending the meta database with semantic information about the local schemas. With the global and local schemas all represented in an object-oriented form, structural mappings between the global schema and each local schema are then easily supported. An object algebra then provides a query language for expressing global queries, using the structural mappings to translate object algebra queries into SQL queries over local relational schema. The advantage of using an object algebra is that the object-oriented database can be viewed as a blackboard for temporary storage of local data and for establishing relationships between different databases. The object algebra can be used to directly retrieve temporarily-stored data from the object-oriented database or to transparently retrieve data from local sources using the translation process described in this paper.  相似文献   

11.
A global schema is a single, connected view of heterogeneous databases. Past research into the problem of global schema design has demonstrated the use of generalization for connecting disjoint schemas. Before generalization can be applied, the common attributes of the local schemas must be identified. We use the term attribute equivalence for the identification of such common attributes. This paper defines four types of attribute equivalences. The distinction between local and global equivalence is explained, and special attention is given to key attributes. We also discuss the placement of locally equivalent attributes in a global schema.  相似文献   

12.
王进鹏  张亚非  苗壮 《计算机科学》2010,37(12):134-137
为实现异构关系数据库的语义集成,针对传统集成技术存在的问题,在对语义网等相关技术进行分析的基础上,研究基于本体的关系数据集成系统中的查询处理问题,提出了一种基于本体的关系数据库集成框架。设计了基于本体的关系数据的描述方法,使用本体作为集成的全局模式来描述关系模式的语义。设计了查询重写算法,该算法可以将基于全局模式的SPARQL查询重写为针对具体关系数据库的查询,从而实现对异构关系数据库的集成。实验表明,该算法具有良好的可扩展性。  相似文献   

13.
This paper studies certain transformations of XML schemas, which are widely used in algorithms of the XML data management. In view of the fact that properties and functional characteristics of the XML documents considerably differ from those of data of other type, the solutions of a number of typical data management problems (such as the XML data validation, schema inference, and data translation to/from other models) for them are more complicated. The general idea of our approach to solving these problems is to transform the original structure (i.e., structural schema constraints) into another structure without loss of information about properties of the original data that are important for applications. The suggested technique has been successfully used in various algorithms for solving problems of this kind. In this paper, a systematic approach to solving these problems is discussed. Methods for reducing the XML schemas to several canonical forms are presented, and algorithms of solving the management problems for data satisfying schemas represented in the canonical forms are examined.  相似文献   

14.
A Methodology for Data Schema Integration in the Entity Relationship Model   总被引:1,自引:0,他引:1  
The conceptual design of databases is usually seen as divided into two steps: view modeling, during which user requirements are formally expressed by means of several user oriented conceptual schemata, and schema integration, whose goal is to merge such schemata into a unique global conceptual schema. This paper is devoted to describe a methodology for schema integration. An enriched entity relationship model is chosen as the data model. The integration process consists of three steps: first, several types of conflicts between the different user schemata are checked and solved; second, schemata are merged into a draft integrated schema, that is, third, enriched and restructured according to specific goals.  相似文献   

15.
Cooperative Information Systems have been proposed to allow a uniform access to heterogeneous data yet preserving their operational autonomy. They use global dictionaries defined on the basis of interscheme properties; these include nominal and structural properties, type conflicts and object cluster similarities. Whereas in the literature a certain number of techniques has been proposed for deriving nominal and structural properties, few approaches exist for detecting type conflicts and object cluster similarities. The type of an object indicates if it is an entity, a relationship or an attribute; type conflicts indicate the existence of objects representing the same concept yet having different types. Object cluster similarities denote similitudes between portions of different schemes. This paper proposes an automatic, probabilistic approach to the detection of type conflicts and object cluster similarities in database schemes. The method we are proposing here is based on considering pairs of objects having different types (resp., pairs of clusters), belonging to different schemes and on measuring their similarity. To this purpose object (resp., cluster) structures as well as object (resp., cluster) neighborhoods are analyzed to verify similitudes and differences. A number of examples shows the suitability of our techniques to effectively detect type conflicts and object cluster similarities.  相似文献   

16.
In this article we investigate an attribute-oriented induction approach for acquisition of abstract knowledge from data stored in a fuzzy database environment. We utilize a proximity-based fuzzy database schema as the medium carrying the original information, where lack of precise information about an entity can be reflected via multiple attribute values, and the classical equivalence relation is replaced with the broader fuzzy proximity relation. We analyze in detail the process of attribute-oriented induction by concept hierarchies, utilizing the original properties of fuzzy databases to support this established data mining technique. In our approach we take full advantage of the implicit knowledge about the similarity of original attribute values, included by default in the investigated fuzzy database schemas. © 2007 Wiley Periodicals, Inc. Int J Int Syst 22: 763–779, 2007.  相似文献   

17.
The integration of views and schemas is an important part of database design and evolution and permits the sharing of data across complex applications. The view and schema integration methodologies used to date are driven purely by semantic considerations, and allow integration of objects only if that is valid from both semantic and structural view points. We discuss a new integration method called structural integration that has the advantage of being able to integrate objects that have structural similarities, even if they differ semantically. This is possible by using the object-oriented Dual Model which allows separate representation of structure and semantics. Structural integration has several advantages, including the identification of shared common structures that is important for sharing of data and methods.  相似文献   

18.
Conceptual modelling as applied to database development can be described as a two stage process: schema modelling followed by schema integration. Schema modelling is the process of transforming individual user requirements into a conceptual schema: an implementation-independent map of data requirements. Schema integration is the process of combining individual conceptual schemas into a single, unified schema. Single-user tools for schema modelling have enjoyed much success partly because the process of schema modelling has become relatively well formalised. Although a number of formal approaches to conducting schema integration have been proposed, it appears that schema integration tools have not enjoyed the same level of success. This we attribute not so much to the problem of formalisation but to the inherent collaborative nature of schema integration work. This paper first discusses the importance of collaboration to schema integration work. It then describes SISIBIS, a demonstrator system employing the IBIS (Issue Based Information System) scheme to support collaborative database design.  相似文献   

19.
在给定关系模式的属性集及其函数依赖最小覆盖集的基础上,提出一种基于模式图的规范化XML模式设计方法。定义了模式图,在模式图中增加了Keys的描述信息,给出由函数依赖集构造模式图的算法。该模式图独立于具体的XML模式语言,经分析证明,所设计的模式满足XNF。  相似文献   

20.
《Information Systems》2002,27(4):245-275
Entity relationship (ER) schemas include cardinality constraints, that restrict the dependencies among entities within a relationship type. The cardinality constraints have direct impact on the application maintenance, since insertions or deletions of entities or relationships might affect related entities. Indeed, maintenance of a system or of a database can be strengthened to enforce consistency with respect to the cardinality constraints in a schema. Yet, once an ER schema is translated into a logical database schema, or translated within a system, the direct correlation between the cardinality constraints and maintenance transactions is lost, since the components of the ER schema might be decomposed among those of the logical database schema or the target system.In this paper, a full solution to the enforcement of cardinality constraints in EER schemas is given. We extend the enhanced ER (EER) data model with structure-based update methods that are fully defined by the cardinality constraints. The structure methods are provably terminating and cardinality faithful, i.e., they do not insert new inconsistencies and can only decrease existing ones. A refined approach towards measuring the cardinality consistency of a database is introduced. The contribution of this paper is in the automatic creation of update methods, and in building the formal basis for proving their correctness.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号