
1.

Schema mediation for large-scale semantic data sharing

Alon Y. Halevy Zachary G. Ives Dan Suciu Igor Tatarinov 《The VLDB Journal The International Journal on Very Large Data Bases》2005,14(1):68-83

Intuitively, data management and data integration tools should be well suited for exchanging information in a semantically meaningful way. Unfortunately, they suffer from two significant problems: they typically require a common and comprehensive schema design before they can be used to store or share information, and they are difficult to extend because schema evolution is heavyweight and may break backward compatibility. As a result, many large-scale data sharing tasks are more easily facilitated by non-database-oriented tools that have little support for semantics. The goal of the peer data management system (PDMS) is to address this need: we propose the use of a decentralized, easily extensible data management architecture in which any user can contribute new data, schema information, or even mappings between other peers' schemas. PDMSs represent a natural step beyond data integration systems, replacing their single logical schema with an interlinked collection of semantic mappings between peers' individual schemas. This paper considers the problem of schema mediation in a PDMS. Our first contribution is a flexible language for mediating between peer schemas that extends known data integration formalisms to our more complex architecture. We precisely characterize the complexity of query answering for our language. Next, we describe a reformulation algorithm for our language that generalizes both global-as-view and local-as-view query answering algorithms. Then we describe several methods for optimizing the reformulation algorithm and an initial set of experiments studying its performance. Finally, we define and consider several global problems in managing semantic mappings in a PDMS. Received: 16 December 2002; accepted: 14 April 2003; published online: 12 December 2003. Edited by: V. Atluri.

2.

2DCMA: An Effective Maintenance Algorithm of Materialized Views in Peer Data Management Systems

Biao Qin Shan Wang and Xiao-Yong Du 《计算机科学技术学报》2006,21(4):503-512

Update management is very important for data integration systems, so update management in peer data management systems (PDMSs) is an active research area. This paper studies view maintenance in PDMSs. First, the definition of a view is extended, and the peer view, local view, and global view are proposed according to application requirements. Two main factors affect materialized views in PDMSs: changes to the schema mappings between peers, and updates to the peers' data. Based on these requirements, the paper proposes an algorithm called 2DCMA, which comprises two sub-algorithms, a data consistency maintenance algorithm and a definition consistency maintenance algorithm, to maintain views effectively. For data consistency maintenance, Mork's rules governing the use of updategrams and boosters are extended. The new rule system can be used to optimize the execution plan, and the data consistency maintenance algorithm is based on it. Furthermore, an ECA rule is adopted for definition consistency maintenance. Finally, extensive simulation experiments are conducted in SPDMS. The simulation results show that 2DCMA outperforms Mork's algorithm when maintaining data consistency and outperforms the centralized view maintenance algorithm when maintaining definition consistency.

3.

QUERY ROUTING IN A PEER-TO-PEER SEMANTIC LINK NETWORK (total citations: 9; self-citations: 0; citations by others: 9)

Hai Zhuge Jie Liu Liang Feng Xiaoping Sun Chao He 《Computational Intelligence》2005,21(2):197-216

A semantic link peer-to-peer (P2P) network specifies and manages semantic relationships between peers' data schemas and can be used as the semantic layer of a scalable Knowledge Grid. The proposed approach consists of an automatic semantic link discovery method, a tool for building and maintaining P2P semantic link networks (P2PSLNs), a semantic-based peer similarity measurement for efficient query routing, and the schema mapping algorithms for query reformulation and heterogeneous data integration. The proposed approach has three important aspects. First, it uses semantic links to enrich the relationships between peers' data schemas. Second, it considers not only nodes but also the XML structure in measuring the similarity between schemas to efficiently and accurately forward queries to relevant peers. Third, it copes with semantic and structural heterogeneity and data inconsistency so that peers can exchange and translate heterogeneous information within a uniform view.

4.

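A Routing Query Approach Based on a Peer-to-Peer Semantic Link Network

唐红梅 郑刚 《电脑开发与应用》2008,21(6):68-71

The decentralized structure, good autonomy, and fault tolerance of peer-to-peer (P2P) networks make them an effective model for sharing information on the Internet. This paper proposes a non-structured peer-to-peer semantic link network (NSPSLN) that specifies and manages the semantic relationships between peers' data schemas. On that basis it provides a peer-similarity measure for efficient query routing, together with schema mapping algorithms for query reformulation and the integration of heterogeneous data. The work offers a new approach to distributed resources and speeds up the propagation, fusion, and management of knowledge produced during cooperative research.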

5.

Object-oriented query language access to relational databases: A semantic framework for query translation

Susan D. Urban Taoufik Ben Abdellatif 《Journal of Systems Integration》1995,5(2):123-156

This research investigates an approach to query processing in a multidatabase system that uses an object-oriented model to capture the semantics of other data models. The object-oriented model is used to construct a global schema, defining an integrated view of the different schemas in the environment. The model is also used as a self-describing model to build a meta-database for storing information about the global schema. A unique aspect of this work is that the object-oriented model is used to describe the different data models of the multidatabase environment, thereby extending the meta-database with semantic information about the local schemas. With the global and local schemas all represented in an object-oriented form, structural mappings between the global schema and each local schema are then easily supported. An object algebra then provides a query language for expressing global queries, using the structural mappings to translate object algebra queries into SQL queries over local relational schemas. The advantage of using an object algebra is that the object-oriented database can be viewed as a blackboard for temporary storage of local data and for establishing relationships between different databases. The object algebra can be used to directly retrieve temporarily stored data from the object-oriented database or to transparently retrieve data from local sources using the translation process described in this paper.

6.

Data integration with uncertainty (total citations: 1; self-citations: 0; citations by others: 1)

Xin Luna Dong Alon Halevy Cong Yu 《The VLDB Journal The International Journal on Very Large Data Bases》2009,18(2):469-500

This paper reports our first set of results on managing uncertainty in data integration. We posit that data-integration systems need to handle uncertainty at three levels and do so in a principled fashion. First, the semantic mappings between the data sources and the mediated schema may be approximate because there may be too many of them to be created and maintained or because in some domains (e.g., bioinformatics) it is not clear what the mappings should be. Second, the data from the sources may be extracted using information extraction techniques and so may yield erroneous data. Third, queries to the system may be posed with keywords rather than in a structured form. As a first step to building such a system, we introduce the concept of probabilistic schema mappings and analyze their formal foundations. We show that there are two possible semantics for such mappings: by-table semantics assumes that there exists a correct mapping but we do not know what it is; by-tuple semantics assumes that the correct mapping may depend on the particular tuple in the source data. We present the query complexity and algorithms for answering queries in the presence of probabilistic schema mappings, and we describe an algorithm for efficiently computing the top-k answers to queries in such a setting. Finally, we consider using probabilistic mappings in the scenario of data exchange.
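
The by-table semantics described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the table contents, the attribute names (`contact`, `email`, `home_email`), and the function `by_table_answers` are all hypothetical, and the query is assumed to be a simple projection on one mediated attribute. Under by-table semantics a single candidate mapping is assumed to hold for the whole source table, so the probability of an answer is the sum of the probabilities of the candidate mappings that produce it.

```python
from collections import defaultdict

# Hypothetical source table; each row maps source attributes to values.
source_rows = [
    {"name": "Alice", "email": "alice@work.example", "home_email": "alice@home.example"},
    {"name": "Bob", "email": "bob@work.example", "home_email": "bob@home.example"},
]

# Hypothetical probabilistic mapping: each candidate maps the mediated
# attribute "contact" to one source attribute; probabilities sum to 1.
candidate_mappings = [
    ({"contact": "email"}, 0.7),
    ({"contact": "home_email"}, 0.3),
]


def by_table_answers(rows, mappings, mediated_attr):
    """Return {answer value: probability} for a projection on mediated_attr."""
    probs = defaultdict(float)
    for mapping, p in mappings:
        source_attr = mapping[mediated_attr]
        # By-table semantics: this one mapping applies to every row.
        answers = {row[source_attr] for row in rows}
        for value in answers:
            probs[value] += p
    return dict(probs)


result = by_table_answers(source_rows, candidate_mappings, "contact")
for value, p in sorted(result.items()):
    print(value, round(p, 2))
```

Under by-tuple semantics, by contrast, the mapping could be chosen independently for each row, so the possible worlds would range over sequences of mappings rather than over single mappings; that case is not covered by this sketch.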

7.

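An Object-Based Information Integration Framework for the CIMS Environment (total citations: 9; self-citations: 0; citations by others: 9)

卢昭泉 《计算机研究与发展》1998,35(5):436-441

Information integration is the key to realizing CIMS. This paper brings object-oriented ideas into both the choice of integration model and the design of the integration system, proposes an object-based information integration framework, and focuses on the key techniques for designing and implementing the local object schemas and the global object schema. Compared with traditional information integration methods, object-based integration expresses the semantic information among existing databases more naturally, preserves the independence of existing applications, and is readily extensible.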

8.

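Research on Detecting Failed Data Schema Mappings in Deep Web Integration

缪嘉嘉 李爱平 贾焰 吴泉源 《计算机研究与发展》2008,45(Z1):222-227

Query interface integration is the key to Deep Web data integration. In a dynamic environment, changes to Web data sources can invalidate data schema mappings and make integrated query interfaces harder to maintain, so detecting failed schema mappings is an active research problem in Deep Web data integration. To address the limitations of existing detection methods, this paper builds on research into fuzzy aggregation operators and proposes a result fusion algorithm suited to detecting failed data schema mappings. Comparative experiments analyze the performance and efficiency of the detection methods, and the results show that the proposed method is effective for failure detection.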

9.

GridVine: An Infrastructure for Peer Information Management

Cudre-Mauroux P. Agarwal S. Aberer K. 《Internet Computing, IEEE》2007,11(5):36-44

GridVine is a semantic overlay infrastructure based on a peer-to-peer (P2P) access structure. Built following the principle of data independence, it separates a logical layer - in which data, schemas, and schema mappings are managed - from a physical layer consisting of a structured P2P network supporting decentralized indexing, key load-balancing, and efficient routing. The system is decentralized, yet fosters semantic interoperability through pair-wise schema mappings and query reformulation. GridVine's heterogeneous but semantically related information sources can be queried transparently using iterative query reformulation. The authors discuss a reference implementation of the system and several mechanisms for resolving queries collaboratively.
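
The idea of iterative query reformulation over pair-wise schema mappings can be sketched as follows. This is an illustration only and not GridVine's implementation or API: the schema names, the attribute renamings in `mappings`, and the function `reformulate` are hypothetical, and each mapping is assumed to be a plain attribute renaming from one schema to another. A query on one attribute is propagated breadth-first along the mappings, and every schema that can express the attribute receives its own rewritten form.

```python
from collections import deque

# Hypothetical pair-wise mappings:
# (source schema, target schema) -> {source attribute: target attribute}
mappings = {
    ("A", "B"): {"title": "name"},
    ("B", "C"): {"name": "label"},
    ("A", "D"): {"author": "creator"},
}


def reformulate(start_schema, attribute):
    """Propagate an attribute through pair-wise mappings, breadth-first.

    Returns {schema: name of the attribute in that schema} for every
    schema reachable through mappings that cover the attribute.
    """
    reached = {start_schema: attribute}
    queue = deque([start_schema])
    while queue:
        schema = queue.popleft()
        current_attr = reached[schema]
        for (src, dst), renaming in mappings.items():
            if src != schema or dst in reached:
                continue  # wrong direction, or schema already rewritten
            if current_attr in renaming:
                reached[dst] = renaming[current_attr]
                queue.append(dst)
    return reached


rewrites = reformulate("A", "title")
print(rewrites)
```

In this toy data, schema D is not reached because the mapping from A to D says nothing about `title`; a query is only forwarded where a mapping covers the attribute it uses.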

10.

Instance-based domain ontological view creation towards semantic integration

Yunjiao Xue Hamada H. Ghenniwa Weiming Shen 《Expert systems with applications》2011,38(2):1193-1202

In many domains today there are very limited explicit ontologies established for implementing information systems. Traditional ontology-driven semantic integration approaches cannot be directly applied in integrating these information systems. Usually, the information systems have schemas, a type of formal information model, for their information repositories which to some extent imply the semantics of the information. Each schema actually reflects a specific view of the domain conceptualization. This paper investigates the theoretical foundation of ontologies and extends the traditional ontology concept to the ontological view concept. It proposes to use ontological views to address the challenge of semantic integration. The proposed approach adopts the schemas to create local ontological views, uses data instances of the information systems to discover semantic relationships between the concepts within the ontological views, and builds a domain ontological view based on the discovered equivalence mappings. It applies the hierarchical clustering technique on the data instances and, in the further analysis, uses the clusters to reduce the cost of processing a large amount of data. The matching of concept properties is based on the probability distribution of the data instances. The experimental results have demonstrated the effectiveness of this approach.

11.

MapMerge: correlating independent schema mappings

Bogdan Alexe Mauricio Hernández Lucian Popa Wang-Chiew Tan 《The VLDB Journal The International Journal on Very Large Data Bases》2012,21(2):191-211

One of the main steps toward integration or exchange of data is to design the mappings that describe the (often complex) relationships between the source schemas or formats and the desired target schema. In this paper, we introduce a new operator, called MapMerge, that can be used to correlate multiple, independently designed schema mappings of smaller scope into larger schema mappings. This allows a more modular construction of complex mappings from various types of smaller mappings such as schema correspondences produced by a schema matcher or pre-existing mappings that were designed by either a human user or via mapping tools. In particular, the new operator also enables a new “divide-and-merge” paradigm for mapping creation, where the design is divided (on purpose) into smaller components that are easier to create and understand and where MapMerge is used to automatically generate a meaningful overall mapping. We describe our MapMerge algorithm and demonstrate the feasibility of our implementation on several real and synthetic mapping scenarios. In our experiments, we make use of a novel similarity measure between two database instances with different schemas that quantifies the preservation of data associations. We show experimentally that MapMerge improves the quality of the schema mappings, by significantly increasing the similarity between the input source instance and the generated target instance. Finally, we provide a new algorithm that combines MapMerge with schema mapping composition to correlate flows of schema mappings.

12.

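Semantic Integration and Query Processing over Multiple XML Data Sources

韩恺 《计算机工程与应用》2006,42(17):167-170,217

This paper proposes an approach to semantic integration and query processing over multiple XML data sources. A sequence of steps integrates the local DTD schemas into a global schema and, at the same time, generates the mappings from the global schema to the local schemas. For query processing, queries are represented as query trees, the concepts of complementary queries and connectors are introduced, and concrete algorithms for query decomposition and execution are given. The paper also identifies and analyzes, for the first time, the situations in which an XML integration environment produces uncertain query results.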

13.

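Design of Wrappers in an Ontology-Based Information Integration Framework (total citations: 1; self-citations: 0; citations by others: 1)

张慧 黄刘生 《计算机工程与应用》2004,40(19):119-122

Applying an ontology in an information integration framework can eliminate the heterogeneity of the underlying data sources at the semantic level. An ontology, however, is only a knowledge base: to define a user interface it must be given a syntactic structure, which can then serve as the global schema that users interact with. The conversion from the ontology to the global schema can be implemented by a wrapper. The global schema must also be mapped to the local schemas of the individual data sources, and these mappings can likewise be implemented by wrappers. This paper presents a wrapper design for an ontology-based information integration framework: the ontology is converted into an XML Schema that serves as the global schema, and XSLT implements the mappings between the global schema and the local schemas, thereby hiding the heterogeneity of the data sources.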

14.

Model independent assertions for integration of heterogeneous schemas (total citations: 3; self-citations: 0; citations by others: 3)

Prof. Stefano Spaccapietra Christine Parent Yann Dupont 《The VLDB Journal The International Journal on Very Large Data Bases》1992,1(1):81-126

Due to the proliferation of database applications, the integration of existing databases into a distributed or federated system is one of the major challenges in responding to enterprises' information requirements. Some proposed integration techniques aim at providing database administrators (DBAs) with a view definition language they can use to build the desired integrated schema. These techniques leave to the DBA the responsibility of appropriately restructuring schema elements from existing local schemas and of solving inter-schema conflicts. This paper investigates the assertion-based approach, in which the DBA's action is limited to pointing out corresponding elements in the schemas and to defining the nature of the correspondence between them. This methodology is capable of: ensuring better integration by taking into account additional semantic information (assertions about links); automatically solving structural conflicts; building the integrated schema without requiring conforming of initial schemas; applying integration rules to a variety of data models; and performing view as well as database integration. This paper presents the basic ideas underlying our approach and focuses on resolution of structural conflicts.

15.

Mapping between heterogeneous XML and OWL transaction representations in B2B integration

Jorge Cardoso Christoph Bussler 《Data & Knowledge Engineering》2011,70(12):1046-1069

16.

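A Formal Foundation for Schema Integration in the SCOPE/CIMS System (total citations: 3; self-citations: 0; citations by others: 3)

石祥滨 张斌 王国仁 于戈 郑怀远 赖翔飞 《计算机学报》1998,21(11):1015-1021

In multidatabase systems, schema integration is the process of integrating several existing schemas into one unified schema, and it is one of the key problems in heterogeneous information integration. To meet the schema integration needs of SCOPE/CIMS, an object-oriented multi-data-source integration system, this paper proposes a formal foundation that supports schema integration and lays the groundwork for a semi-automatic schema integration tool. The main contents are: (1) a correspondence description model that supports schema analysis and comparison; (2) a set of schema integration rules that provide principles for schema merging and restructuring; (3) equivalence classes …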

17.

A self-organizing knowledge representation scheme for extensible heterogeneous information environment

Sull W. Kashyap R.L. 《Knowledge and Data Engineering, IEEE Transactions on》1992,4(2):185-191

The self-organizing knowledge representation aspects in heterogeneous information environments involving object-oriented databases, relational databases, and rulebases are investigated. The authors consider a facet of self-organizability which sustains the structural semantic integrity of an integrated schema regardless of the dynamic nature of local schemata. To achieve this objective, they propose an overall scheme for schema translation and schema integration with an object-oriented data model as the common data model, and it is shown that integrated schemata can be maintained effortlessly by propagating updates in local schemata to integrated schemata unambiguously.

18.

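A Solution to Semantic Heterogeneity Based on Context Mediation (total citations: 1; self-citations: 0; citations by others: 1)

周建芳 徐海银 卢正鼎 《计算机工程》2008,34(20):10-12

Ontology-based semantic information integration mainly resolves schema-level heterogeneity and part of the data heterogeneity (including synonyms and homonyms) among distributed, heterogeneous data sources. This paper adds a context mechanism on top of ontology-based semantic information integration in order to resolve semantic heterogeneity among heterogeneous data sources comprehensively. The approach compensates for the shortcomings of semantic information integration based on schema mapping and offers good adaptability and extensibility.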

19.

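Schema Matching Based on the Semantic Similarity of Attribute Instance Sets

李蓉蓉 王晖 陈冉 《计算机科学》2011,38(12):151-155

In recent years, schema matching has attracted wide attention and research as an important problem in the management and application of Web information integration. Most existing schema matching methods rely on schema information and make little use of data instance information. To address incorrect matching results caused by incomplete or conflicting schema information in data integration environments, the paper gives a method for computing the semantic similarity between attributes to improve schema matching performance, analyzes the semantic differences among semantically close attributes within a schema, and further gives a schema matching method based on a weighted bipartite graph maximization algorithm. Experiments show that schema matching based on the semantic similarity of instance sets yields more complete and more accurate matches when schema information is incomplete or conflicting.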
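
Instance-based matching followed by a maximum-weight bipartite assignment can be sketched roughly as follows. This is not the paper's method: the similarity measure used here is plain Jaccard overlap of instance sets (swapped in for the paper's semantic similarity, which the abstract does not specify), the assignment is found by brute force rather than by a dedicated bipartite matching algorithm, and the schemas, attribute names, and helper functions are hypothetical.

```python
from itertools import permutations


def jaccard(a, b):
    """Jaccard overlap of two collections of instance values."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matching(source, target):
    """Find the attribute assignment with the highest total similarity.

    source, target: {attribute name: list of instance values}.
    Assumes len(source) <= len(target). Brute force over all
    assignments, so it is only suitable for very small schemas.
    """
    src_attrs = list(source)
    tgt_attrs = list(target)
    best_score, best_pairs = -1.0, []
    for perm in permutations(tgt_attrs, len(src_attrs)):
        score = sum(jaccard(source[s], target[t]) for s, t in zip(src_attrs, perm))
        if score > best_score:
            best_score, best_pairs = score, list(zip(src_attrs, perm))
    return best_pairs, best_score


# Hypothetical schemas whose attribute names give no hint of the match.
source = {"city": ["Paris", "Rome", "Oslo"], "zip": ["75001", "00100"]}
target = {"postcode": ["75001", "00100", "0150"], "town": ["Paris", "Oslo", "Lima"]}

pairs, score = best_matching(source, target)
print(pairs, round(score, 3))
```

For schemas of realistic size the brute-force loop would be replaced by a polynomial-time assignment algorithm such as the Hungarian method; the scoring and the shape of the result would stay the same.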

20.

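Research on Semantic Heterogeneity in Information Integration (total citations: 4; self-citations: 0; citations by others: 4)

周建芳 徐海银 卢正鼎 《计算机应用研究》2008,25(8):2349-2353

Seamless access to multiple distributed, heterogeneous data sources requires resolving the semantic heterogeneity among them. The paper analyzes semantic heterogeneity at three levels (schema heterogeneity, context heterogeneity, and individual heterogeneity) and concentrates on methods for eliminating context heterogeneity and individual heterogeneity. Because existing approaches to semantic information integration each address only one of the three levels, the paper proposes a semantic information integration architecture that resolves all three levels comprehensively.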