Similar Documents
20 similar documents found (search time: 31 ms)
1.
Integration of knowledge is one of the most important tasks for knowledge management in distributed environments. For the same subject in the real world, different sources may generate different versions of the data. With respect to local integrity constraints these data are consistent, but they may be inconsistent with global integrity constraints. This is a common phenomenon when processing data from distributed sources. In this work we investigate the processing of knowledge inconsistency and its integration using a temporal model of indeterminate valid time and probability. This data model describes events that will take place in the future with some degree of certainty. Under this model, a conflict is a situation in which agents give different time intervals and probabilities for the occurrence of the same event. The integration process must determine a proper time interval and probability (based on a consensus method) that adequately represent that event. To this end, two kinds of distance functions are defined and analyzed. In addition, postulates and algorithms for knowledge integration are worked out and analyzed.
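The abstract's two distance functions are not reproduced here, but the general idea of measuring disagreement between agents' (interval, probability) estimates and forming a consensus can be sketched as follows. The `EventEstimate` type, the weighting parameter `alpha`, and the median-based consensus are illustrative assumptions, not the paper's actual definitions.

```python
import statistics
from dataclasses import dataclass

@dataclass
class EventEstimate:
    """One agent's estimate: a valid-time interval plus an occurrence probability."""
    start: float  # interval begin
    end: float    # interval end
    prob: float   # probability the event occurs in that interval

def distance(a: EventEstimate, b: EventEstimate, alpha: float = 0.5) -> float:
    # Hypothetical distance: weighted sum of the interval-endpoint gap
    # and the probability gap (not the paper's actual functions).
    interval_d = abs(a.start - b.start) + abs(a.end - b.end)
    return alpha * interval_d + (1 - alpha) * abs(a.prob - b.prob)

def consensus(estimates: list[EventEstimate]) -> EventEstimate:
    # Component-wise median: minimizes the total L1 distance to all
    # estimates, a simple stand-in for a consensus-based integration step.
    return EventEstimate(
        statistics.median(e.start for e in estimates),
        statistics.median(e.end for e in estimates),
        statistics.median(e.prob for e in estimates),
    )
```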

2.
This paper describes a new approach to heterogeneous data source fusion. Data sources are either static or active: static data sources can be structured or semi-structured, whereas active sources are services. In order to develop data source fusion systems in dynamic contexts, we need to study all the issues raised by the matching paradigms. This challenging problem becomes crucial with the dominating role of the Internet. Classical approaches to data integration, based on schema mediation, are not suitable for the World Wide Web (WWW) environment, where data is frequently modified or deleted. We therefore develop a loosely integrated approach that takes into consideration both conflict management and the semantic rules that must be enriched in order to integrate new data sources. Moreover, we introduce an XML-based Multi-data source Fusion Language (MFL) that aims to define and retrieve conflicting data from multiple data sources. The system developed according to this approach is called MDSManager (Multi-Data Source Manager). The benefit of the proposed framework is shown through a real-world application, based on web data source fusion, dedicated to tracking online market indices. Finally, we give an evaluation of our MFL language. The results show that our language significantly improves on the XQuery language, especially in terms of expressive power and performance.

3.
Discovering and reconciling value conflicts for numerical data integration (cited 5 times: 0 self-citations, 5 by others)
The build-up of information technology capital, fueled by the Internet and the cost-effectiveness of new telecommunications technologies, has led to a proliferation of information systems that are in dire need of exchanging information but incapable of doing so due to the lack of semantic interoperability. It is now evident that physical connectivity (the ability to exchange bits and bytes) is no longer adequate: the integration of data from autonomous and heterogeneous systems calls for the prior identification and resolution of any semantic conflicts that may be present. Unfortunately, this requires the system integrator to sift through the data from disparate systems in a painstaking manner. We suggest that this process can be partially automated, and present a methodology and technique for discovering potential semantic conflicts as well as the underlying data transformations needed to resolve them. Our methodology begins by classifying data value conflicts into two categories: context independent and context dependent. While context-independent conflicts are usually caused by unexpected errors, context-dependent conflicts are primarily a result of the heterogeneity of the underlying data sources. To facilitate data integration, data value conversion rules are proposed to describe the quantitative relationships among data values involved in context-dependent conflicts. A general approach is proposed to discover data value conversion rules from the data itself. The approach consists of five major steps: relevant attribute analysis, candidate model selection, conversion function generation, conversion function selection, and conversion rule formation. It is being implemented in a prototype system, DIRECT, for business data using statistics-based techniques. A preliminary study using both synthetic and real-world data indicates that the proposed approach is promising.
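As an illustration of the "conversion function generation" and "conversion function selection" steps, a linear conversion between matching numeric attribute values from two sources (say, prices with and without tax) can be fitted by least squares and accepted as a rule only when the residuals are small. The function names and the tolerance threshold below are hypothetical sketches, not the DIRECT system's actual API.

```python
def fit_conversion(xs, ys):
    # Least-squares fit of a linear conversion y = a*x + b between
    # matching attribute values from two sources.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

def is_context_dependent(xs, ys, tol=1e-6):
    # Residuals near zero suggest a systematic, context-dependent
    # conversion; large residuals point to context-independent errors.
    a, b = fit_conversion(xs, ys)
    return all(abs(y - (a * x + b)) <= tol for x, y in zip(xs, ys))
```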

4.
Because object-relational mapping (ORM) adopts a completely different way of accessing data, preserving data integrity and consistency when migrating data from a legacy information system to a new ORM-based system is a challenge. In the proposed scheme, structured data is read from Excel files using POI, the data is assembled into objects via Java reflection, and the objects are imported into the database using the open-source ORM framework Hibernate. Results show that this scheme guarantees data integrity after system initialization, thereby solving the data initialization problem for information systems built on ORM frameworks.

5.
Many practical applications require integrated access to heterogeneous data sources whose schemas are unknown and subject to change. To address this problem, we designed a self-describing data model for semi-structured data, SSDM (SemiStructured Data Model), in the DM heterogeneous data source integration system, and on top of it built the query language SDQL (SemiStructured Data Query Language) and wrappers for the different data sources, which solves the problem well.

6.
7.
Many practical applications require integrated access to heterogeneous data sources. In the DM heterogeneous data source integration system we adopt a self-describing data model, DEM, and on top of it construct a query language (MSL) and wrappers for the different data sources, thereby achieving effective integration of heterogeneous data sources.

8.
In this article, we describe a new approach to applying distributed artificial intelligence techniques to manufacturing processes. The construction of intelligent systems is one of the most important areas of artificial intelligence research. Our goal is to develop an integrated intelligent system for real-time manufacturing processes. An integrated intelligent system is a large knowledge integration environment that consists of several symbolic reasoning systems (expert systems) and numerical computation packages. These software programs are controlled by a meta-system, which manages their selection, operation, and communication. A meta-system can be implemented in different language environments and applied to many disciplines. This new architecture can serve as a universal configuration for developing high-performance intelligent systems for complicated industrial applications in real-world domains.

9.
Multiphysics simulations are playing an increasingly important role in computational science and engineering for applications ranging from aircraft design to medical treatments. These simulations require integration of techniques and tools from multiple disciplines, and in turn demand new advanced technologies to integrate independently developed physics solvers effectively. In this paper, we describe some numerical, geometrical, and system software components required by such integration, with a concrete case study of detailed, three-dimensional, parallel rocket simulations involving system-level interactions among fluid, solid, and combustion, as well as subsystem-level interactions. We package these components into a software framework that provides common-refinement based methods for transferring data between potentially non-matching meshes, novel and robust face-offsetting methods for tracking Lagrangian surface meshes, as well as integrated support for parallel mesh optimization, remeshing, algebraic manipulations, performance monitoring, and high-level data management and I/O. From these general, reusable framework components we construct domain-specific building blocks to facilitate integration of parallel, multiphysics simulations from high-level specifications that are easy to read and can also be visualized graphically. These reusable building blocks are integrated with independently developed physics codes to perform various multiphysics simulations.

10.
Integrated access to multiple data sources requires a homogeneous interface provided by a federated schema. Such a federated schema should correctly reflect the semantics of the component schemata of which it is composed. Since the semantics of a database schema is also determined by a set of semantic integrity constraints, a correct schema integration has to deal with the integrity constraints existing in the different component schemata. Traditionally, most schema integration approaches concentrate solely on the structural integration of the given database schemata. Local integrity constraints are often simply neglected, and their relationship to global extensional assertions, which form the basic integration constraints, is sometimes ignored completely. In this paper, we discuss the impact of global extensional assertions and local integrity constraints on federated schemata. In particular, we point out the correspondence between local integrity constraints and global extensional assertions. The knowledge about the correspondences between the given integrity constraints and extensional assertions can then be utilized for an augmented schema integration process.

11.
12.
Irresponsible and negligent use of natural resources over the last five decades has made it an important priority to adopt more intelligent ways of managing existing resources, especially those related to energy. The main objective of this paper is to explore the opportunities for integrating internal data already stored in data warehouses with external Big Data to improve energy consumption predictions. We propose an architecture that makes use of already-stored energy data together with external unstructured information to improve knowledge acquisition and allow managers to make better decisions. This external knowledge is a torrent of information that, in many cases, is hidden across heterogeneous and unstructured data sources, from which it is retrieved by an information extraction system; alternatively, it is present in social networks in the form of user opinions. Furthermore, our approach applies data mining techniques to exploit the integrated data. The approach has been applied to a real case study and shows promising results. The experiments carried out in this work are twofold: (i) using and comparing diverse artificial intelligence methods, and (ii) validating our approach with data source integration.

13.
Design of a CORBA-Based Heterogeneous Data Source Integration System (cited 28 times: 0 self-citations, 28 by others)
This paper presents the design of a plug-and-play heterogeneous multi-data-source integration system based on CORBA (Common Object Request Broker Architecture). Because the system adopts OIM (Object Model for Integration), an object model with strong descriptive power, as its common data model, it can integrate not only a variety of heterogeneous data sources, including database systems, file systems, and data in HTML files on the WWW, but also data from new sources plugged in at any time. The paper focuses on the overall system architecture, the OIM object model, query processing, and interface design.

14.
Dealing with discrepancies in data is still a big challenge in data integration systems. The problem occurs both when eliminating duplicates from semantically overlapping sources and when combining complementary data from different sources. Though using SQL operations like grouping and join seems a viable way, these operations fail if the attribute values of the potential duplicates or related tuples are not equal but only similar by certain criteria. As a solution to this problem, we present similarity-based variants of the grouping and join operators. The extended grouping operator produces groups of similar tuples; the extended join combines tuples satisfying a given similarity condition. We describe the semantics of these operators, discuss efficient implementations for edit distance similarity, and present evaluation results. Finally, we give application examples from the context of a data reconciliation project for looted art.
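The idea behind similarity-based grouping can be sketched minimally as follows, assuming a greedy strategy in which each value joins the first group whose representative is within a given edit distance. This is an illustrative simplification; the paper's actual grouping semantics and efficient implementation may differ.

```python
def edit_distance(s: str, t: str) -> int:
    # Classic dynamic-programming Levenshtein distance.
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (cs != ct)))   # substitution
        prev = cur
    return prev[-1]

def similarity_group(values, max_dist=1):
    # Greedy grouping: the first member of each group serves as its
    # representative; new values join the first group within max_dist.
    groups = []
    for v in values:
        for g in groups:
            if edit_distance(v, g[0]) <= max_dist:
                g.append(v)
                break
        else:
            groups.append([v])
    return groups
```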

15.
Data integration with uncertainty (cited 1 time: 0 self-citations, 1 by others)
This paper reports our first set of results on managing uncertainty in data integration. We posit that data-integration systems need to handle uncertainty at three levels and do so in a principled fashion. First, the semantic mappings between the data sources and the mediated schema may be approximate because there may be too many of them to be created and maintained or because in some domains (e.g., bioinformatics) it is not clear what the mappings should be. Second, the data from the sources may be extracted using information extraction techniques and so may yield erroneous data. Third, queries to the system may be posed with keywords rather than in a structured form. As a first step to building such a system, we introduce the concept of probabilistic schema mappings and analyze their formal foundations. We show that there are two possible semantics for such mappings: by-table semantics assumes that there exists a correct mapping but we do not know what it is; by-tuple semantics assumes that the correct mapping may depend on the particular tuple in the source data. We present the query complexity and algorithms for answering queries in the presence of probabilistic schema mappings, and we describe an algorithm for efficiently computing the top-k answers to queries in such a setting. Finally, we consider using probabilistic mappings in the scenario of data exchange.
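By-table semantics can be illustrated with a small sketch: each candidate attribute mapping carries a probability, and an answer's probability is the total probability of the mappings under which it is returned. The data layout and function below are a deliberate simplification for illustration, not the paper's formal model or algorithms.

```python
def by_table_answers(source_tuples, mappings):
    """By-table semantics sketch.

    mappings: list of (prob, attr_map) pairs, where attr_map maps
    source attribute names to mediated-schema attribute names and the
    probabilities sum to 1 over the candidate mappings.
    Returns {answer: probability}, where an answer is a sorted tuple of
    (mediated_attribute, value) pairs.
    """
    scores = {}
    for prob, attr_map in mappings:
        for t in source_tuples:
            answer = tuple(sorted((attr_map[k], v)
                                  for k, v in t.items() if k in attr_map))
            # The whole mapping applies to the whole table, so this
            # answer gains the mapping's full probability mass.
            scores[answer] = scores.get(answer, 0.0) + prob
    return scores
```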

16.
This paper introduces the overall design of a security monitoring platform for the commodity circulation sector. Based on VPN secure channels, dynamic-ID technology, and an independently developed PKI-based electronic seal, the platform embeds regulatory mechanisms into enterprise management systems through application integration and heterogeneous data source consolidation. The platform's data sharing, system encryption, and monitoring implementation are described in detail. The platform has proved effective in further regulating enterprise behavior and safeguarding the public interest, and provides strong support and services to government and society.

17.
In this paper we investigate techniques for decreasing the overhead of semantic integrity enforcement or equivalently the overhead of transaction validation with respect to a set of semantic integrity (SI) assertions. We discuss the problem of semantic integrity enforcement from two points of view. First we describe three approaches to decrease the overhead of SI enforcement. Second we analyze the cost of several SI enforcement methods in centralized and distributed database systems based on slow and fast (local) networks.

18.
Different from traditional association-rule mining, a new paradigm called Ratio Rule (RR) was proposed recently. Ratio rules aim to capture quantitative association knowledge. We extend this framework to mining ratio rules from distributed and dynamic data sources, a novel and challenging problem. The traditional technique used for ratio rule mining is an eigen-system analysis, which can often fall victim to noise; this has greatly limited the application of ratio rule mining. Distributed data sources impose additional constraints: because it is difficult to clean all the data sources in real time in real-world tasks, the mining procedure must be robust in the presence of noise. In addition, the traditional batch methods for ratio rule mining cannot cope with dynamic data. In this paper, we propose an integrated method for mining ratio rules from distributed and changing data sources: we first mine the ratio rules from each data source separately through a novel, robust, and adaptive one-pass algorithm called Robust and Adaptive Ratio Rule (RARR), and then integrate the rules of each data source in a simple probabilistic model. In this way, we can acquire the global rules from all the local information sources adaptively. We show that the RARR technique converges to a fixed point and is robust as well. Moreover, the integration of rules is efficient and effective. Both theoretical analysis and experiments illustrate that the performance of RARR and the proposed information integration procedure is satisfactory for the purpose of discovering latent associations in distributed, dynamic data sources.
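The classic batch eigen-system formulation that RARR improves on can be sketched as a principal-component computation: the dominant eigenvector of the data's covariance matrix gives the ratio between attributes (e.g. "customers who spend $1 on bread tend to spend $2 on milk"). This is the standard batch method only, not the one-pass RARR algorithm itself.

```python
import numpy as np

def ratio_rule(data: np.ndarray) -> np.ndarray:
    """Return the dominant ratio direction of the (rows x attributes) data,
    normalized so the largest-magnitude component equals 1."""
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / len(data)
    # eigh returns eigenvalues in ascending order for symmetric matrices,
    # so the last column is the eigenvector of the largest eigenvalue.
    _, eigvecs = np.linalg.eigh(cov)
    v = eigvecs[:, -1]
    # Dividing by the largest-magnitude entry also fixes the arbitrary
    # sign of the eigenvector.
    return v / v[np.argmax(np.abs(v))]
```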

19.
Enterprise Resource Planning (ERP) systems tend to deploy Supply Chain Management and/or Customer Relationship Management techniques in order to successfully channel information to customers, suppliers, manufacturers, and warehouses, and thereby minimize system-wide costs while satisfying service-level requirements. Although efficient, these systems are neither versatile nor adaptive, since newly discovered customer trends cannot be easily integrated with existing knowledge. Advancing the way the above-mentioned techniques are applied to ERP systems, we have developed a multi-agent system that introduces adaptive intelligence as a powerful add-on for ERP software customization. The system can be thought of as a recommendation engine that takes advantage of knowledge gained through the use of data mining techniques and incorporates it into the resulting company selling policy. The intelligent agents of the system can be periodically retrained as new information is added to the ERP. In this paper, we present the architecture and development details of the system, and demonstrate its application on a real test case.

20.
This paper introduces three new XML integrity constraint techniques proposed in the CoXML system: a method for translating functional dependencies in XML-based data exchange, a method for translating key constraints oriented to XML Schema, and an XPath-based method for validating key constraints on XML documents. The effectiveness of these techniques is verified through the implementation of the CoXML system.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号