Similar literature
1.
Global viewing of heterogeneous data sources   Total citations: 10, self-citations: 0, citations by others: 10
The problem of defining global views of heterogeneous data sources to support querying and cooperation activities is becoming more and more important due to the availability of multiple data sources within complex organizations and in global information systems. Global views are defined to provide a unified representation of the information in the different sources by analyzing conceptual schemas associated with them and resolving possible semantic heterogeneity. We propose an affinity based unification method for global view construction. In the method: (1) the concept of affinity is introduced to assess the level of semantic relationship between elements in different schemas by taking into account semantic heterogeneity; (2) schema elements are classified by affinity levels using clustering procedures so that their different representations can be analyzed for unification; (3) global views are constructed starting from selected clusters by unifying representations of their elements. Experiences of applying the proposed unification method and the associated tool environment ARTEMIS on databases of the Italian Public Administration information systems are described.

2.
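Li Runzhou (李润洲), Fang Ming (方明). Computer Engineering (《计算机工程》), 2007, 33(17): 111-113
Based on the need to integrate multiple distributed heterogeneous relational databases within an enterprise, a metadata dictionary schema is designed that supports the integrated access interface above it and describes the network location, data schema, and data content of the databases below it. A general query request representation is given for the integration environment and for each heterogeneous database. Based on the metadata dictionary, a dynamic query statement construction algorithm is proposed for the case where changing access requirements cause the set of relevant database relation tables to change dynamically, and the correctness of the algorithm is argued.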

3.
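Research and progress on query processing in Deep Web data integration   Total citations: 2, self-citations: 0, citations by others: 2
With the emergence of large numbers of online databases on the Web, Deep Web data integration has become a research hotspot in the information field, and query processing is an important part of it. Because Web databases are large in scale, autonomous, heterogeneous, and dynamic, query processing in Deep Web data integration is more challenging than query processing in traditional distributed environments. Focusing on the three key research issues of query processing in Deep Web data integration (schema matching, Web database selection, and query translation), this paper surveys relevant and representative international research results of recent years, analyzes the strengths and weaknesses of these methods, and summarizes and looks ahead to future directions.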

4.
A multidatabase system (MDBS) allows users to simultaneously access heterogeneous and autonomous databases using an integrated schema and a single global query language. The query optimization problem in MDBSs is quite different from the query optimization problem in distributed homogeneous databases due to schema heterogeneity and autonomy of local database systems. In this work, we consider the optimization of query distribution in case of data replication and the optimization of intersite joins, that is, the join of the results returned by the local sites in response to the global subqueries. The algorithms presented for the optimization of intersite joins try to maximize the parallelism in execution and take the federated nature of the problem into account. It has also been shown through a comparative performance study that the proposed intersite join optimization algorithms are efficient. The approach presented can easily be generalized to any operation required for intersite query processing. The query optimization scheme presented in this paper is being implemented within the scope of a multidatabase system which is based on OMG's object management architecture.

5.
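Existing semantic information integration based on schema mapping can resolve schema heterogeneity among distributed data sources, but it cannot resolve the context heterogeneity that is commonly present. This paper first proposes a method for formally describing implicit context semantics and then, on top of schema-mapping-based semantic information integration, adds a context mediator that automatically detects and resolves context heterogeneity. The working principle, design, and implementation details of the context mediator are described in detail.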

6.
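A load-balancing clustering algorithm for multi-level energy-heterogeneous sensor networks   Total citations: 2, self-citations: 0, citations by others: 2
In multi-level energy-heterogeneous wireless sensor networks, the initial energy of nodes is randomly distributed within a certain range, and balancing load while reducing energy consumption is a major challenge for clustering algorithms in energy-heterogeneous networks. Existing distributed clustering algorithms are mainly designed for energy-homogeneous or two-level heterogeneous networks and cannot balance load when node energy is heterogeneous at multiple levels. A load balance clustering algorithm (LBCA) suited to multi-level energy-heterogeneous sensor networks is therefore proposed. LBCA selects cluster head nodes according to the energy distribution of the sensor network so as to achieve load balancing, which can effectively extend the stability period of the network. During cluster head selection, when energy in the sensing region is evenly distributed, nodes with lower average communication energy consumption are given priority to become cluster heads, which helps reduce total communication energy consumption in the region; when energy in the sensing region is unevenly distributed, nodes with higher residual energy are given priority to become cluster heads, which helps balance load in the region. LBCA is compared with the main distributed clustering schemes, and simulation results show that in multi-level energy-heterogeneous sensor networks LBCA achieves better load balancing and greatly extends the stability period of the network.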
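
The two cluster head selection rules described in this abstract can be illustrated with a small sketch. This is not the LBCA algorithm itself, which is distributed and whose details the abstract does not give; it is a centralized simplification for one sensing region. The `Node` fields, the function names, and the balance test (coefficient of variation of residual energy against a threshold `cv_threshold`) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Node:
    node_id: int
    residual_energy: float  # remaining energy of the node (assumed unit: joules)
    avg_comm_cost: float    # average energy cost of communicating with neighbours


def energy_is_balanced(nodes, cv_threshold=0.2):
    """Hypothetical balance test: coefficient of variation of residual energy."""
    energies = [n.residual_energy for n in nodes]
    avg = mean(energies)
    if avg == 0:
        return True  # every node is empty, so there is nothing to balance
    return pstdev(energies) / avg <= cv_threshold


def pick_cluster_head(nodes, cv_threshold=0.2):
    """Pick one cluster head for a sensing region using the two stated rules."""
    if energy_is_balanced(nodes, cv_threshold):
        # Balanced region: prefer the node with the lowest average communication cost.
        return min(nodes, key=lambda n: n.avg_comm_cost)
    # Unbalanced region: prefer the node with the highest residual energy.
    return max(nodes, key=lambda n: n.residual_energy)
```

Calling `pick_cluster_head` on a region whose nodes have nearly equal energy returns the cheapest communicator, while a region with widely spread energy returns the node with the most energy left. The threshold value is arbitrary and would need tuning in any real setting.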

7.
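Research on query decomposition algorithms in multidatabase systems   Total citations: 1, self-citations: 0, citations by others: 1
A multidatabase system allows users to access multiple heterogeneous, autonomous database systems simultaneously using an integrated schema and a simple global query language. Global query decomposition is a very important problem in multidatabase systems. This paper presents a method for managing schema information in a multidatabase environment and, based on this schema information, proposes a query decomposition algorithm that is easy to implement. Because query decomposition in multidatabase systems is closely tied to how schema integration is implemented, the paper also describes schema integration in multidatabase systems.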

8.
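Wang Jinpeng (王进鹏), Zhang Yafei (张亚非), Miao Zhuang (苗壮). Computer Science (《计算机科学》), 2010, 37(12): 134-137
To achieve semantic integration of heterogeneous relational databases and address the problems of traditional integration techniques, this paper analyzes the Semantic Web and related technologies, studies query processing in ontology-based relational data integration systems, and proposes an ontology-based framework for integrating relational databases. An ontology-based method for describing relational data is designed, in which the ontology serves as the global schema of the integration and describes the semantics of the relational schemas. A query rewriting algorithm is designed that rewrites SPARQL queries posed against the global schema into queries against the specific relational databases, thereby integrating the heterogeneous relational databases. Experiments show that the algorithm has good scalability.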

9.
In statistical databases and data warehousing applications it is commonly the case that aggregate views are maintained as an underlying mechanism for summarising information. Where the databases or applications are distributed, or arise from independent data collections or system developments, there may be incompatibility, heterogeneity, and data inconsistency. These challenges need to be overcome if federations of aggregated databases are to be successfully incorporated into systems for database management, querying, retrieval, and knowledge discovery. In this paper we address the issue of integrating aggregate views that have semantically heterogeneous classification schemes. In previous work we have developed a methodology that is efficient but that cannot easily handle data inconsistencies. Our previous approach is therefore not particularly well-suited to very large databases or federations of large numbers of databases. We now address these scalability issues by introducing a methodology for heterogeneous aggregate view integration that constructs a dynamic shared ontology to which each of the aggregate views can be explicitly related. A maximum likelihood technique, implemented using the EM (Expectation-Maximisation) algorithm, is used to inherently handle data inconsistencies in the computation of integrated aggregates that are described in terms of the dynamic shared ontology.

10.
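To address the problem of communication between heterogeneous databases, an XML-based data exchange technique is proposed. In the concrete implementation, methods for converting between XML schemas and relational schemas are studied further, and conversion algorithms and conversion rules are given. Methods for converting between the semantic constraints of XML schemas and the integrity constraints of relational schemas are also analyzed.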

11.
This paper presents a query processing algorithm, formulated and developed in support of the prototype architecture of the Distributed Access View Integrated Database (DAVID) which is a heterogeneous distributed database management system. The objective of the proposed query processing algorithm is to produce an inexpensive strategy for a given query. The inexpensive query strategy is obtained primarily by computing the most profitable semi-joins and by determining the best sequence of join operations per processing site. The latter is obtained by applying a zero-one integer linear program that uses a non-parametric statistical estimation technique to compute the sizes of the temporary clusters. A cluster is a subset of the cartesian product of a list of atomic and non-atomic domains and is the structure that can represent in a uniform way data stored in relational, hierarchical and network databases. Following some background information on the development of the DAVID prototype, this paper introduces the schema architecture. The schema architecture describes the mechanism by which the component heterogeneous database schemata are mapped into the uniform global schema. This is followed by the formulation of the query processing algorithm, its implementation and an illustration of its use in the context of NASA's Astrophysics Data System. Recommended by: Y. Breitbart
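
To make the notion of a "profitable semi-join" concrete, here is a minimal sketch of a semi-join over rows held as Python dictionaries, together with a crude profit estimate. It is not DAVID's algorithm: the zero-one integer linear program and the non-parametric size estimation mentioned in the abstract are not reproduced, and the cost model in `semi_join_profit` (rows that no longer need shipping, minus distinct keys that must be shipped) is an assumption made for illustration.

```python
def semi_join(left_rows, right_rows, key):
    """Return the rows of left_rows whose key value also appears in right_rows."""
    right_keys = {row[key] for row in right_rows}
    return [row for row in left_rows if row[key] in right_keys]


def semi_join_profit(left_rows, right_rows, key):
    """Hypothetical profit of reducing left_rows by right_rows before shipping.

    Benefit: left rows eliminated by the semi-join (they need not be shipped).
    Cost: distinct join keys that must be sent from the right site first.
    """
    kept = semi_join(left_rows, right_rows, key)
    shipped_keys = len({row[key] for row in right_rows})
    return (len(left_rows) - len(kept)) - shipped_keys
```

Under this toy model a semi-join is worth performing only when the profit is positive, that is, when it removes more rows at the left site than the number of keys that have to be transferred to compute it.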

12.
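To achieve data integration in heterogeneous environments, an implementation scheme for data sharing among heterogeneous enterprise databases is proposed, based on XML and a three-tier B/S (browser/server) architecture, and a general-purpose heterogeneous data integration system is designed and implemented. The paper introduces the system's core architecture, workflow, and the functions of each module; describes the design and implementation of key modules, including validation and extraction of XML document schemas, mapping between XML documents, and mapping between XML document schemas and relational database schemas; and finally outlines the Java technologies used to implement the system.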

13.
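Research on metadata-based integration of distributed heterogeneous data   Total citations: 2, self-citations: 0, citations by others: 2
The "virtual approach" and the "data warehouse approach" are combined, and the problem of integrating distributed heterogeneous data for a digital gas field is solved through a unified data exchange center. A metadata dictionary containing the global schema and the global-to-local mappings is built to hide the details of the heterogeneous data and thereby resolve the various conflicts among them. An algorithm is proposed that merges the local schemas of the distributed data sources to build the global schema and the mapping table, improving the degree of automation of schema integration.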
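
The idea of merging local schemas into a global schema plus a mapping table can be sketched as follows. The abstract does not describe the paper's merging algorithm, so this sketch makes a strong simplifying assumption: tables and columns are treated as the same concept whenever their names match case-insensitively. The function name, the input layout, and the tuple layout of the mapping table are all hypothetical.

```python
def merge_local_schemas(local_schemas):
    """Merge local schemas into a global schema and a global-to-local mapping table.

    local_schemas: {source_name: {table_name: [column_name, ...]}}
    Returns (global_schema, mapping) where
      global_schema: {global_table: [global_column, ...]}
      mapping: list of (global_table, global_column, source, local_table, local_column)
    """
    global_schema = {}
    mapping = []
    for source, tables in local_schemas.items():
        for table, columns in tables.items():
            g_table = table.lower()  # assumption: same name ignoring case = same table
            g_columns = global_schema.setdefault(g_table, [])
            for column in columns:
                g_column = column.lower()  # same assumption for columns
                if g_column not in g_columns:
                    g_columns.append(g_column)
                mapping.append((g_table, g_column, source, table, column))
    return global_schema, mapping
```

A query against a global column can then be routed by filtering the mapping table for that column and reading off the source, local table, and local column. Real schema conflicts (synonyms, unit differences, structural differences) would need much more than name matching.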

14.
In this paper, a new approach for centralised and distributed learning from spatial heterogeneous databases is proposed. The centralised algorithm consists of a spatial clustering followed by local regression aimed at learning relationships between driving attributes and the target variable inside each region identified through clustering. For distributed learning, similar regions in multiple databases are first discovered by applying a spatial clustering algorithm independently on all sites, and then identifying corresponding clusters on participating sites. Local regression models are built on identified clusters and transferred among the sites for combining the models responsible for identified regions. Extensive experiments on spatial data sets with missing and irrelevant attributes, and with different levels of noise, resulted in a higher prediction accuracy of both centralised and distributed methods, as compared to using global models. In addition, experiments performed indicate that both methods are computationally more efficient than the global approach, due to the smaller data sets used for learning. Furthermore, the accuracy of the distributed method was comparable to the centralised approach, thus providing a viable alternative to moving all data to a central location.

15.
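Design and implementation of an XML-based extensible B2B data exchange standard framework   Total citations: 6, self-citations: 5, citations by others: 1
B2B e-commerce in the petrochemical sector involves distributed heterogeneous data exchange among many large and very large enterprises. Design criteria are proposed for a data exchange standard framework for online material procurement in petrochemical e-commerce, and the architecture of the implemented system is given. The data exchange standard is compatible with related international standards, and the exchanged data are carried in XML. It comprises a workflow model that regulates the data exchange process; a business process model that regulates the user roles and exchanged document types in business activities; a business vocabulary that standardizes the business terms used in e-commerce data exchange in this domain; XML Schema as the schema that regulates the exchanged documents; and a message protocol that regulates how information is exchanged. In addition, the messaging mechanism of the framework is based on SOAP, providing extensibility to Web services.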

16.
Information Sciences, 2005, 169(1-2): 27-46
Information integration for distributed and heterogeneous data sources is still an open challenge, and schema matching is critical in this process. This paper presents an approach to automatic element matching between XML application schemas using similarity measure and relaxation labeling. The semantic modeling of XML application schema has also been presented. The similarity measure method considers element categories and their properties. In an effort to achieve an optimal matching, contextual constraints are used in the relaxation labeling method. Based on the semantic modeling of XML application schemas, the compatible constraint coefficients are devised in terms of the structures and semantic relationships as defined in the semantic model. To examine the effectiveness of the proposed methods, an algorithm for XML schema matching has been developed, and corresponding computational experiments show that the proposed approach has a high degree of accuracy.

17.
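Ontology modeling is applied to the results of cross-database retrieval over multiple information sources on the network in order to extract and merge data from heterogeneous distributed data sources. Data extraction first maps the result pages of an information source to a finite labeled tree and then applies extraction rules to obtain the required data. A per-database merging algorithm is given so that the results returned by multiple network data sources can be merged efficiently. Experimental data show that applying ontology modeling to the processing of cross-database retrieval results is effective and correct, and extraction accuracy can reach 100%.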

18.
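With the development of information technology, data in materials science and related fields are multi-source, heterogeneous, highly extensible, and growing explosively, and traditional relational databases cannot store such data; the schemaless storage and high scalability of NoSQL can be used to solve this problem. JSON, a data storage format commonly used by NoSQL databases, is popular for its simplicity and flexibility. However, NoSQL databases lack schema information, so JSON documents need to be validated and analyzed before they are stored in the database. At present, most methods check the well-formedness of JSON documents against a JSON schema and cannot effectively handle anomaly detection or semantic ambiguity in JSON documents. This paper therefore proposes doctorJSON, a model for anomaly detection and semantic disambiguation of JSON documents for NoSQL databases. Based on JSON schema, the model provides an anomaly detection algorithm, deoutJSON, and a semantic disambiguation algorithm, disemaJSON, to detect anomalies and ambiguities in the JSON documents being stored. Experiments on real and synthetic data sets verify the effectiveness and execution efficiency of the proposed model.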

19.
Establishing semantic interoperability among heterogeneous information sources has been a critical issue in the database community for the past two decades. Despite the critical importance, current approaches to semantic interoperability of heterogeneous databases have not been sufficiently effective. We propose a common ontology called semantic conflict resolution ontology (SCROL) that addresses the inherent difficulties in the conventional approaches, i.e., federated schema and domain ontology approaches. SCROL provides a systematic method for automatically detecting and resolving various semantic conflicts in heterogeneous databases. SCROL provides a dynamic mechanism of comparing and manipulating contextual knowledge of each information source, which is useful in achieving semantic interoperability among heterogeneous databases. We show how SCROL is used for detecting and resolving semantic conflicts between semantically equivalent schema and data elements. In addition, we present evaluation results to show that SCROL can be successfully used to automate the process of identifying and resolving semantic conflicts.

20.
Schemaless databases, and document-oriented databases in particular, are preferred to relational ones for storing heterogeneous data with variable schemas and structural forms. However, the absence of a unique schema adds complexity to analytical applications, in which a single analysis often involves large sets of data with different schemas. In this paper we propose an original approach to OLAP on collections stored in document-oriented databases. The basic idea is to stop fighting against schema variety and welcome it as an inherent source of information wealth in schemaless sources. Our approach builds on four stages: schema extraction, schema integration, FD enrichment, and querying; these stages are discussed in detail in the paper. To make users aware of the impact of schema variety, we propose a set of indicators inspired by the definition of attribute density. Finally, we experimentally evaluate our approach in terms of efficiency and effectiveness.
