Similar literature
 Found 20 similar documents (search time: 66 ms)
1.
Model independent assertions for integration of heterogeneous schemas   Total citations: 3, self-citations: 0, citations by others: 3
Due to the proliferation of database applications, the integration of existing databases into a distributed or federated system is one of the major challenges in responding to enterprises' information requirements. Some proposed integration techniques aim at providing database administrators (DBAs) with a view definition language they can use to build the desired integrated schema. These techniques leave to the DBA the responsibility of appropriately restructuring schema elements from existing local schemas and of solving inter-schema conflicts. This paper investigates the assertion-based approach, in which the DBA's action is limited to pointing out corresponding elements in the schemas and to defining the nature of the correspondence between them. This methodology is capable of: ensuring better integration by taking into account additional semantic information (assertions about links); automatically solving structural conflicts; building the integrated schema without requiring conforming of initial schemas; applying integration rules to a variety of data models; and performing view as well as database integration. This paper presents the basic ideas underlying our approach and focuses on resolution of structural conflicts.

2.
The integration of information systems is becoming increasingly important. A common requirement in distributed data-intensive applications, such as data warehousing and data mining, is that the various databases involved be joined in a process called schema integration. The entity-relationship (ER) model or a variant of the ER model is often used as the common data model. To aid the schema conforming, merging and restructuring phases of the integration process, various transformations have been defined to map between various equivalent ER representations. In this paper, we describe a different approach to integrating ER schemas. We focus on the resolution of structural conflicts, that is, cases where related real-world concepts are modeled using different constructs in different schemas. Unlike previous work, our approach proposes to resolve the structural conflict between an entity type in one schema and an attribute in another schema, and we show that the other structural conflicts are then resolved automatically. This reduces the manual effort required in integration. We give a detailed algorithm to transform an attribute in one schema into an equivalent entity type in another schema without any loss of semantics, that is, our transformation is both information preserving and constraint preserving.

3.
In schema integration, schematic discrepancies occur when data in one database correspond to metadata in another. We explicitly declare the context, that is, the meta-information relating to the source, classification, property etc. of entities, relationships or attribute values in entity–relationship (ER) schemas. We present algorithms to resolve schematic discrepancies by transforming metadata into the attribute values of entity types, keeping the information and constraints of the original schemas. Although focusing on the resolution of schematic discrepancies, our technique works seamlessly with the existing techniques for resolving other semantic heterogeneities in schema integration.
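
As a rough illustration of the kind of schematic discrepancy this abstract describes (and not the authors' algorithm, which the abstract does not spell out), the sketch below turns column names that carry context, such as month names, into ordinary attribute values. The function name `metadata_to_values`, its parameters and the sample data are all hypothetical.

```python
from typing import Any, Dict, List


def metadata_to_values(
    rows: List[Dict[str, Any]],
    key: str,
    context_name: str,
    value_name: str,
    context_columns: List[str],
) -> List[Dict[str, Any]]:
    """Turn context carried by column names into attribute values.

    Each input row has one column per context value (e.g. "jan", "feb").
    The output has one row per (key, context value) pair, so the former
    column name becomes ordinary data. All names are illustrative.
    """
    result = []
    for row in rows:
        for column in context_columns:
            result.append(
                {key: row[key], context_name: column, value_name: row[column]}
            )
    return result


if __name__ == "__main__":
    # Hypothetical source schema: monthly sales stored as separate columns.
    wide = [{"product": "p1", "jan": 10, "feb": 12}]
    for record in metadata_to_values(wide, "product", "month", "sales", ["jan", "feb"]):
        print(record)
```

Every input value appears exactly once in the output, so no information is lost; preserving constraints, which the abstract also claims for its algorithms, is outside the scope of this sketch.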

4.
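To address XML's limited semantic expressiveness when representing unstructured information, this paper associates XML schemas with ontologies and proposes a new algorithm, the concept search algorithm, which builds mappings between XML schema elements and ontology concepts. A preliminary evaluation is then given, in which the algorithm is used to map real-world schemas from B2B communication to different ontologies.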

5.
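Schema integration and query decomposition in a multidatabase environment   Total citations: 6, self-citations: 0, citations by others: 6
Yu Hongqi (俞红奇), Ding Baokang (丁宝康). Computer Engineering (《计算机工程》), 2000, 26(10): 124-126
A type of structural conflict between table schemas in different databases is defined and, using a metadata-based method, schema integration between table schemas that exhibit this conflict is described. A query decomposition method that translates queries on the integrated table schema into queries on the original table schemas is also discussed.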

6.
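Because of insufficient semantic information, XML data conforming to different schemas are hard to make interoperate. To meet the XML data integration needs of oil and gas well engineering, a schema-independent XML semantic integration method based on a global domain ontology is proposed. The method first establishes semantic mappings between XML Path paths and the domain ontology to hide schema differences; it then stores the XML as relational data following a model-mapping approach; finally, query rewriting converts SPARQL into SQL statements to support semantic queries. The method semantically annotates XML schemas and uses a relational database to store and query XML data, and can effectively handle the semantic integration of domain XML data.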

7.
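Schema matching is a key and difficult problem in areas such as schema integration, the Semantic Web and e-commerce. To make effective use of expert knowledge to improve matching quality, a schema matching model based on partially verified correspondences is proposed. In this model, a small number of correspondences between elements of the schemas to be matched are first verified manually, from which some known correspondences for the current task and default weights for the individual matchers are inferred. Then, based on this prior knowledge, the similarity matrices produced by multiple matchers are merged and adjusted, and optimized globally. Finally, the selectivity of the optimized matrix is evaluated in order to recommend the most reasonable candidate match generation scheme for each matching task. Experimental results show that using partially verified correspondences helps improve schema matching quality.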

8.
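Wang Qian (王倩), Wang Hui (王辉). Computer Engineering (《计算机工程》), 2012, 38(4): 76-78
To resolve semantic conflicts during data exchange, an ontology-based semantic conflict resolution scheme is proposed. The ER model is used to realize the semantic mapping from relational schemas to XML schemas, and an ontology is used to semantically annotate the XML Schema obtained from this preliminary semantic conversion. Experimental results show that the scheme reduces ambiguity caused by differences in natural language or symbols and eliminates semantic conflicts to a certain extent.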

9.
This paper addresses the problem of handling semantic heterogeneity during database schema integration. We focus on the semantics of terms used as identifiers in schema definitions. Our solution does not rely on the names of the schema elements or the structure of the schemas. Instead, we utilize formal ontologies consisting of intensional definitions of terms represented in a logical language. The approach is based on similarity relations between intensional definitions in different ontologies. We present the definitions of similarity relations based on intensional definitions in formal ontologies. The extensional consequences of intensional relations are addressed. The paper shows how similarity relations are discovered by a reasoning system using a higher-level ontology. These similarity relations are then used to derive an integrated schema in two steps. First, we show how to use similarity relations to generate the class hierarchy of the global schema. Second, we explain how to enhance the class definitions with attributes. This approach reduces the cost of generating or re-generating global schemas for tightly-coupled federated databases.

10.
Current microarray databases use different terminologies and structures and thereby limit the sharing of data and collating of results between laboratories. Consequently, an effective integrated microarray data model is required. One important process in developing such an integrated database is schema matching. In this paper, we propose an effective schema matching approach called MDSM to syntactically and semantically map attributes of different microarray schemas. The contribution from this work will be used later to create microarray global schemas. Since microarray data is complex, we use a microarray ontology to improve the accuracy of the similarity measure between attributes. The similarity relations can be represented as weighted bipartite graphs. We determine the best schema matching by computing the optimal matching in a bipartite graph using the Hungarian optimisation method. Experimental results show that our schema matching approach is effective and flexible enough to use with different kinds of database models, such as database schemas, XML schemas, and web site maps. Finally, a case study on an existing public microarray schema is carried out using the proposed method.
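
The optimal-matching step named in this abstract can be sketched as follows. This is a textbook O(n^3) Hungarian (Kuhn-Munkres) implementation applied to a similarity matrix, not the MDSM code itself; the function names, attribute names and similarity values are made up for illustration, and how MDSM actually computes its similarities is not shown here.

```python
from typing import List, Tuple


def hungarian_min_cost(cost: List[List[float]]) -> List[Tuple[int, int]]:
    """Minimum-cost assignment of every row to a distinct column.

    Classic potentials-based Hungarian method; requires rows <= columns.
    Returns sorted (row, column) index pairs.
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("need at least as many columns as rows")
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (0 = none)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0)


def best_matching(
    sim: List[List[float]], sources: List[str], targets: List[str]
) -> List[Tuple[str, str, float]]:
    """Maximise total similarity by minimising (max_sim - similarity)."""
    top = max(max(row) for row in sim)
    cost = [[top - s for s in row] for row in sim]
    return [(sources[r], targets[c], sim[r][c]) for r, c in hungarian_min_cost(cost)]


if __name__ == "__main__":
    # Hypothetical attribute names and similarity scores.
    sim = [[0.9, 0.8], [0.8, 0.1]]
    print(best_matching(sim, ["gene_id", "probe"], ["identifier", "gene_name"]))
```

In the sample matrix a greedy choice of the single highest score (0.9) forces the second attribute onto a 0.1 match, for a total of 1.0, whereas the optimal assignment pairs both attributes at 0.8 for a total of 1.6. That difference is the reason for solving the assignment globally.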

11.
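Schema matching produces a mapping between elements of the input schemas that are semantically related. To improve the efficiency of schema matching, a new schema matching method, the source-schema-splitting matching algorithm, is proposed. It handles problems that standard schema matching struggles with: 1) establishing matches between one attribute of the source schema and multiple attributes of multiple target schemas; 2) matching cases in which different tuples in one table correspond to different attribute values of the same tuple in another table. During matching, the method first searches for categorical attributes, then builds selection conditions from them, and finally splits the source schema into views and regenerates the candidate match set, thereby improving the quality of schema matching.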

12.
In this work, we focus on XML data integration by studying rewritings of XML target schemas in terms of source schemas. Rewriting is very important in data integration systems where the system is asked to find and assemble XML documents from the data sources and produce documents that satisfy a target schema. As schema representation, we consider Visibly Pushdown Automata (VPAs), which accept Visibly Pushdown Languages (VPLs). The latter have been shown to coincide with the family of (word-encoded) regular tree languages, which are the basis of formalisms for specifying XML schemas. Furthermore, practical semi-formal XML schema specifications (defined by simple pattern conditions on XML) compile into VPAs that are exponentially more concise than other representations based on tree automata. Notably, VPLs enjoy a “well-behavedness” that helps us address rewriting problems for XML data integration. Based on VPAs, we positively solve these problems, and present detailed complexity analyses.

13.
The Semantic Web is the next step of the current Web where information will become more machine-understandable to support effective data discovery and integration. Hierarchical schemas, either in the form of tree-like structures (e.g., DTDs, XML schemas), or in the form of hierarchies on a category/subcategory basis (e.g., thematic hierarchies of portal catalogs), play an important role in this task. They are used to semantically enrich the available information. Up to now, hierarchical schemas have been treated rather as sets of individual elements, acting as semantic guides for browsing or querying data. Under that view, queries like “find the part of a portal catalog which is not present in another catalog” can be answered only in a procedural way, specifying which nodes to select and how to get them. For this reason, we argue that hierarchical schemas should be treated as full-fledged objects so as to allow for their manipulation. This work proposes models and operators to manipulate the structural information of hierarchies, considering them as first-class citizens. First, we explore the algebraic properties of trees representing hierarchies, and define a lattice algebraic structure on them. Then, turning this structure into a boolean algebra, we present the operators S-union, S-intersection and S-difference to support structural manipulation of hierarchies. These operators have certain algebraic properties to provide clear semantics and assist the transformation, simplification and optimization of sequences of operations using laws similar to those of set theory. Also, we identify the conditions under which this framework is applicable. Finally, we demonstrate an application of our framework for manipulating hierarchical schemas on tree-like hierarchies encoded as RDF/S files.

14.
Matching query interfaces is a crucial step in data integration across multiple Web databases. The problem is closely related to schema matching that typically exploits different features of schemas. Relying on a particular feature of schemas is not sufficient. We propose an evidential approach to combining multiple matchers using Dempster–Shafer theory of evidence. First, our approach views the match results of an individual matcher as a source of evidence that provides a level of confidence on the validity of each candidate attribute correspondence. Second, it combines multiple sources of evidence to get a combined mass function that represents the overall level of confidence, taking into account the match results of different matchers. Our combination mechanism does not require the use of weighting parameters, hence no setting and tuning of them is needed. Third, it selects the top k attribute correspondences of each source attribute from the target schema based on the combined mass function. Finally, it uses some heuristics to resolve any conflicts between the attribute correspondences of different source attributes. Our experimental results show that our approach is highly accurate and effective.
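
The combination step in this abstract rests on Dempster's rule of combination, which can be sketched as below. This is the standard rule only; how the paper derives each matcher's mass function and how it ranks candidates are not given in the abstract, so the mass values, the attribute names and the `top_k` helper (which simply ranks singleton hypotheses by combined mass) are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Tuple

Mass = Dict[FrozenSet[str], float]


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses, keep non-empty intersections,
    and renormalise by 1 - K, where K is the mass assigned to conflict."""
    combined: Mass = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        common = b & c
        if common:
            combined[common] = combined.get(common, 0.0) + mass_b * mass_c
        else:
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


def top_k(mass: Mass, k: int) -> List[Tuple[str, float]]:
    """Rank single-candidate hypotheses by combined mass (a simplification)."""
    singles = [(next(iter(s)), m) for s, m in mass.items() if len(s) == 1]
    return sorted(singles, key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    # Hypothetical candidates in the target schema for one source attribute.
    frame = frozenset({"name", "title", "label"})
    # Each matcher commits some mass to specific candidates and leaves the
    # rest on the whole frame, i.e. "cannot tell".
    matcher_1 = {frozenset({"name"}): 0.6, frame: 0.4}
    matcher_2 = {frozenset({"name"}): 0.5, frozenset({"title"}): 0.3, frame: 0.2}
    print(top_k(combine(matcher_1, matcher_2), k=2))
```

No per-matcher weights appear anywhere in the rule, which is consistent with the abstract's remark that the combination needs no weighting parameters: each matcher expresses its own uncertainty through the mass it leaves on the whole frame.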

15.
This work describes the architecture of Contorsion, a semantic XPath processor that acts over an RDF mapping of XML. It contributes to a recent research trend that defines an XML-to-RDF mapping allowing XML documents to interoperate at the semantic level. We use a model-mapping approach to represent instances of XML and XML Schema in RDF. This representation retains the node order, in contrast with the usual structure-mapping approach. The processor can be fed with an unlimited set of XML schemas and/or RDFS/OWL ontologies. The queries are resolved taking into consideration the structural and semantic connections described in the schemas and ontologies. Such behaviour, schema-awareness and semantic integration, can be useful for exploiting schema and ontology hierarchies in XPath queries.

16.
Schema matching is the task of providing correspondences between concepts describing the meaning of data in various heterogeneous, distributed data sources. It is recognized to be one of the basic operations required by the process of data and schema integration and its outcome serves in many tasks such as targeted content delivery and view integration. Schema matching research has been going on for more than 25 years now. An interesting research topic that was largely left untouched involves the automatic selection of schema matchers for an ensemble, that is, a set of schema matchers. To the best of our knowledge, none of the existing algorithmic solutions offer such a selection feature. In this paper we provide a thorough investigation of this research topic. We introduce a new heuristic, Schema Matcher Boosting (SMB). We show that SMB has the ability to choose among schema matchers and to tune their importance. As such, SMB introduces a new promise for schema matcher designers. Instead of trying to design a perfect schema matcher, a designer can focus on finding better-than-random schema matchers. For the effective utilization of SMB, we propose a complementary approach to the design of new schema matchers. We separate schema matchers into first-line and second-line matchers. First-line schema matchers were designed by and large as applications of existing works in other areas (e.g., machine learning and information retrieval) to schemata. Second-line schema matchers operate on the outcome of other schema matchers to improve their original outcome. SMB selects matcher pairs, where each pair contains a first-line matcher and a second-line matcher. We run a thorough set of experiments to analyze SMB's ability to effectively choose schema matchers and show that SMB performs better than other, state-of-the-art ensemble matchers.

17.
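Storing XML in relational databases is an important problem in XML research. Building on a survey of several mapping methods, a method is proposed that parses multiple similar XML documents, generates a relational schema for each according to the mapping rules, and analyzes and derives a single integrated relational schema; a relational database is then created, and the XML document data are extracted and stored in it on the basis of the mapping. The method preserves the data of the XML documents in a fairly compact structure, and its most notable feature is that it does not need the documents' schema information (DTD, XML Schema). A concrete experimental result illustrates the method's effectiveness.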

18.
The capabilities of XSLT processing are widely used to transform XML documents into target XML documents. These target XML documents conform to output schemas of the used XSLT stylesheet. Output schemas of XSLT stylesheets can be used for a static analysis of the used XSLT stylesheet, to automatically detect the XSLT stylesheet of target XML documents or to reason on the output schema without access to the target XML documents. In this paper, we develop an approach to automatically determine the output schema of an XSLT stylesheet. We also describe several application scenarios of output schemas. The experimental evaluation shows that our prototype can determine the output schemas of nearly all typical XSLT stylesheets, and it shows the improvements in preciseness obtained in several application scenarios when using output schemas compared with not using them.

19.
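Existing semantic information integration based on schema mapping can resolve schema heterogeneity among distributed data sources but cannot resolve the widespread problem of context heterogeneity. A method for formally describing implicit context semantics is first proposed; then, on top of schema-mapping-based semantic information integration, a context mediator is added to automatically detect and resolve context heterogeneity. The working principle, design ideas and implementation details of the context mediator are described in detail.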

20.
Intuitively, data management and data integration tools should be well suited for exchanging information in a semantically meaningful way. Unfortunately, they suffer from two significant problems: they typically require a common and comprehensive schema design before they can be used to store or share information, and they are difficult to extend because schema evolution is heavyweight and may break backward compatibility. As a result, many large-scale data sharing tasks are more easily facilitated by non-database-oriented tools that have little support for semantics. The goal of the peer data management system (PDMS) is to address this need: we propose the use of a decentralized, easily extensible data management architecture in which any user can contribute new data, schema information, or even mappings between other peers' schemas. PDMSs represent a natural step beyond data integration systems, replacing their single logical schema with an interlinked collection of semantic mappings between peers' individual schemas. This paper considers the problem of schema mediation in a PDMS. Our first contribution is a flexible language for mediating between peer schemas that extends known data integration formalisms to our more complex architecture. We precisely characterize the complexity of query answering for our language. Next, we describe a reformulation algorithm for our language that generalizes both global-as-view and local-as-view query answering algorithms. Then we describe several methods for optimizing the reformulation algorithm and an initial set of experiments studying its performance. Finally, we define and consider several global problems in managing semantic mappings in a PDMS. Received: 16 December 2002, Accepted: 14 April 2003, Published online: 12 December 2003. Edited by: V. Atluri
