Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
The architecture of an intelligent multistrategy assistant for knowledge discovery from facts, INLEN, is described and illustrated by an exploratory application. INLEN integrates a database, a knowledge base, and machine learning methods within a uniform user-oriented framework. A variety of machine learning programs are incorporated into the system to serve as high-level knowledge generation operators (KGOs). These operators can generate diverse kinds of knowledge about the properties and regularities existing in the data. For example, they can hypothesize general rules from facts, optimize the rules according to problem-dependent criteria, determine differences and similarities among groups of facts, propose new variables, create conceptual classifications, determine equations governing numeric variables and the conditions under which the equations apply, derive statistical properties and use them for qualitative evaluations, etc. The initial implementation of the system, INLEN 1b, is described, and its performance is illustrated by applying it to a database of scientific publications.

2.
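Fast Mining of Constrained Association Rules in Distributed Environments   Total citations: 2, self-citations: 0, citations by others: 2
Researchers have proposed algorithms for mining constrained association rules in single-machine environments, but these algorithms are not suitable for distributed environments. This paper therefore discusses techniques for fast mining of constrained association rules in distributed environments and proposes DCAR, a fast mining algorithm for constrained association rules in distributed environments. DCAR comprises MLFC, an algorithm for mining local constrained frequent itemsets, and MGFC, an algorithm for mining global constrained frequent itemsets. The algorithm generates a guide set from the Boolean constraint and uses a new candidate-itemset generation function, Reorder-gen, which uses the guide set to efficiently generate a small yet complete set of candidate itemsets that satisfy the constraint in the distributed environment. When the global constrained frequent itemsets are computed, the communication cost of transmitting the support counts of local candidate itemsets is O(n), which improves mining efficiency. The algorithm was implemented, and experimental results show that DCAR is efficient and feasible, with an efficiency roughly 2 to 3 times that of the DMA-IC algorithm.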
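
The global step described in this abstract, in which each site reports the support counts of its local candidate itemsets and the counts are summed, can be sketched roughly as follows. This is only an illustration of the general idea, not the DCAR, MLFC, MGFC, or Reorder-gen procedures themselves. The function names, the `required_item` parameter (a simple stand-in for a Boolean constraint), and the sample data are all assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def local_support_counts(transactions, size, required_item):
    """Count, at one site, every itemset of the given size that contains required_item.

    required_item is a hypothetical stand-in for a Boolean constraint.
    """
    counts = Counter()
    for transaction in transactions:
        for itemset in combinations(sorted(set(transaction)), size):
            if required_item in itemset:
                counts[itemset] += 1
    return counts


def global_frequent_itemsets(sites, size, required_item, min_support):
    """Sum the per-site counts and keep itemsets whose global count reaches min_support."""
    total = Counter()
    for transactions in sites:
        total.update(local_support_counts(transactions, size, required_item))
    return {itemset: n for itemset, n in total.items() if n >= min_support}


# Hypothetical data: two sites, each holding a few transactions.
site_a = [["bread", "milk"], ["bread", "milk", "eggs"], ["milk", "eggs"]]
site_b = [["bread", "milk"], ["bread", "eggs"]]
result = global_frequent_itemsets(
    [site_a, site_b], size=2, required_item="bread", min_support=3
)
print(result)
```

Only the counts travel between sites in this sketch, never the transactions, which is the property that keeps communication small in distributed mining schemes of this kind.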

3.
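Data mining algorithms are widely used in data analysis. Industry, science, and commerce need to analyze large, geographically distributed data sets, and grids can effectively provide high-performance applications and distributed infrastructure. To use the grid for data mining and knowledge representation, this paper follows the concept of the knowledge grid and, building on the Globus Toolkit, analyzes the architecture of the knowledge grid and its main components. It designs a software model of a grid data mining system according to the data mining process and identifies the services the model should provide. These services hide all details of the underlying grid, so that end users need only be concerned with the knowledge discovery process.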

4.
Statistical Models for Data Mining   Total citations: 1, self-citations: 0, citations by others: 1
We review the background to the papers presented in this special issue and give a short introduction to each. We also briefly describe the workshop on Statistical Models for Data Mining, held in Pavia (Italy) in October 2000, where the papers were presented.

5.
Query Decomposition for a Distributed Object-Oriented Mediator System   Total citations: 2, self-citations: 0, citations by others: 2
The mediator-wrapper approach to integrating data from heterogeneous data sources has usually been centralized, in the sense that a single mediator system is placed between a number of data sources and applications. As the number of data sources increases, the centralized mediator architecture becomes an administrative and performance bottleneck. This paper presents a query decomposition algorithm for a distributed mediation architecture where the communication among the mediators is on a higher level than the communication between a mediator and a data source. Some of the salient features of the proposed approach are: (i) exploring query execution schedules that contain data flow to the sources, necessary when integrating object-oriented sources that provide services (programs) and not only data; (ii) handling of functions with multiple implementations at more than one mediator or source; (iii) multi-phase query decomposition using a combination of heuristics and cost-based strategies; (iv) query plan tree rebalancing by distributed query recompilation.

6.
We describe MediaWeaver, a software framework for composing distributed media in the context of university research and instruction. Authors compose networked media, software tools and mediastreams, and can freely annotate media by media of any form using schema of their own design. Faculty and student authors compose distributed media using common Macintosh, World Wide Web and NeXTSTEP applications, supported by services from UNIX workstations. The MediaWeaver system mediates between network multimedia services and interface kits with which novice programmers and non-programmers may easily create radically different interactive views into shared mediabases. The network services include search engine abstractions, filters, and relational modeling frameworks. MediaWeaver has supported collaborative projects in history, drama, music, art, anthropology, environmental studies, and other fields since 1993. Applications range from traditional relational text databases and indexed HTML WWW sites to course readers, research archives, journals and seminar spaces.

7.
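Research on the Architecture of a Distributed Comprehensive Knowledge Discovery System   Total citations: 2, self-citations: 0, citations by others: 2
Using multi-agent technology and a multi-level structure, this paper builds an overall architectural model of a distributed comprehensive knowledge discovery system, DKD(D&K), grounded in research on inner mechanisms. The model defines a knowledge discovery route for distributed KDD* based on a double-base cooperating mechanism, so that research on the preprocessing of distributed data, mining algorithms, and the evaluation and navigation of mining results is integrated into one complete system. The model inherits the main features of the original comprehensive knowledge discovery system KD(D&K), closely incorporates mature techniques from distributed databases, and shows certain advantages over typical distributed knowledge discovery systems currently found internationally.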

8.
This paper presents a multi-agent model of a distributed information system, using what is described as an engineering approach to a real-world application environment. The objective is to define, using proven ideas in the industrial context, the agent-based behaviour of the distributed system, which must operate correctly and effectively in an error-prone environment. Issues such as stability, robustness and scalability have also been addressed, along with some new ideas on high-level communication strategies, as distinct from protocol-based communications. The work is being carried out under the DREAM theme at Keele, an earlier version of the approach having been successfully applied to agent-based manufacturing in an international project called HMS, in which some of the world’s major manufacturing industries participated.

9.
The problem of retrieving information from a collection of heterogeneous distributed databases has attracted a number of solutions. However, the task of integrating established database systems is complicated not only by the differences between the database systems themselves, but also by the differences in structure and semantics of the information contained within them. The problem is exacerbated when one needs to provide access to such a system for naive end-users. This paper is concerned with a Knowledge-Based Systems approach to solving this problem for clearly bounded situations, in which both the domain and the types of query are constrained. At the user interface, dialogue is conducted in terms of concepts with which the user is familiar, and these are then mapped into appropriate database queries. To achieve this, a model for query decomposition and answer construction has been used. This model is based around the development of an Intensional Structure containing information necessary for the recapture of semantic information lost in the query decomposition process and required in the answer construction process. The model has been successfully implemented in combination with an embedded KBS, within a five-layer representation model.

10.
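Research Progress in Remote Sensing Image Data Mining   Total citations: 3, self-citations: 0, citations by others: 3
Zhou Xiaocheng (周小成), Wang Xiaoqin (汪小钦). Remote Sensing Information (《遥感信息》), 2005, (3): 58-62, 42
Remote sensing image data mining is a research field with broad application prospects. Because remote sensing image databases are massive, remote sensing image data mining has become the mainstream of spatial data mining. Based on the methods and purposes of remote sensing image data mining, this paper reviews the state of research at home and abroad in six areas: image indexing and retrieval, image classification, image clustering, spatial association rule mining, image change detection, and hyperspectral data mining. It also points out several issues that deserve focused attention in remote sensing image data mining and knowledge discovery.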

11.
An implication rule Q → R is a statement of the form "for all objects in the database, if an object has the attribute–value pairs Q then it also has the attribute–value pairs R." This simple type of rule is theoretically interesting, because it supports reasoning similar to functional dependencies in database theory, and it may be of practical significance because the size of the set of implication rules that hold in a relation can remain substantially high even when mining real data and considering only most general covers; i.e., covers containing rules with unredundant right and left sizes. Motivated by these observations, we focus on the extraction of short-rule covers, which cannot be efficiently mined by standard rule miners. We present an algorithm driven by "negative examples" (i.e., objects that satisfy Q but not R) to prune the rule-candidate lattice associated with each "positive example" (i.e., an object that satisfies both Q and R). The algorithm scales up quite well with respect to the number of objects and it is particularly suitable for databases with attributes described by large domains. Furthermore, a perfect hash function ensures extraction of short-rule covers even from databases containing a large number of attributes.
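
The definitions of an implication rule and of positive and negative examples given in this abstract can be illustrated with a small sketch. It only checks whether a given rule holds in a relation; it is not the lattice-pruning algorithm the abstract describes. The function names and the sample objects are assumptions made for illustration.

```python
def satisfies(obj, pairs):
    """True if the object has every attribute-value pair in pairs."""
    return all(attr in obj and obj[attr] == value for attr, value in pairs.items())


def negative_examples(objects, q, r):
    """Objects that satisfy Q but not R; any such object refutes Q -> R."""
    return [obj for obj in objects if satisfies(obj, q) and not satisfies(obj, r)]


def rule_holds(objects, q, r):
    """Q -> R holds when the relation contains no negative example."""
    return not negative_examples(objects, q, r)


# Hypothetical relation with three objects.
objects = [
    {"color": "red", "shape": "round", "size": "small"},
    {"color": "red", "shape": "round", "size": "large"},
    {"color": "blue", "shape": "square", "size": "small"},
]

print(rule_holds(objects, {"color": "red"}, {"shape": "round"}))
print(negative_examples(objects, {"color": "red"}, {"size": "small"}))
```

In this data every red object is round, so the first rule holds, while the second red object is a negative example for the rule from color red to size small.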

12.
The Internet has solved the age-old problem of network connectivity, thus enabling potential access to, and data sharing among, large numbers of databases. However, enabling users to discover useful information requires an adequate metadata infrastructure that must scale with the diversity and dynamism of both users' interests and Internet-accessible databases. In this paper, we present a model that partitions the information space into distributed, highly specialized domain ontologies. We also introduce inter-ontology relationships to cater for user-based interests across ontologies defined over Internet databases. We also describe an architecture that implements these two fundamental constructs over Internet databases. The aim of the proposed model and architecture is to eventually facilitate data discovery and sharing for Internet databases.

13.
Efficient Rule-Based Attribute-Oriented Induction for Data Mining   Total citations: 3, self-citations: 0, citations by others: 3
Data mining has become an important technique which has tremendous potential in many commercial and industrial applications. Attribute-oriented induction is a powerful mining technique and has been successfully implemented in the data mining system DBMiner (Han et al. Proc. 1996 Int'l Conf. on Data Mining and Knowledge Discovery (KDD'96), Portland, Oregon, 1996). However, its induction capability is limited by the unconditional concept generalization. In this paper, we extend the concept generalization to a rule-based concept hierarchy, which greatly enhances its induction power. When a previously proposed induction algorithm is applied to the more general rule-based case, a problem of induction anomaly occurs which impacts its efficiency. We have developed an efficient algorithm to facilitate induction on the rule-based case which can avoid the anomaly. Performance studies have shown that the algorithm is superior to a previously proposed algorithm based on backtracking.
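
For orientation, the unconditional concept generalization that this abstract takes as its starting point can be sketched as below: each value of one attribute is replaced by its parent concept, and tuples that become identical are merged with a count. This shows only the unconditional baseline, not the rule-based extension or the anomaly-avoiding algorithm of the paper. The function name `generalize`, the hierarchy, and the rows are assumptions made for illustration.

```python
from collections import Counter


def generalize(rows, attribute_index, hierarchy):
    """Replace one attribute by its parent concept and merge identical rows, keeping counts.

    Values without an entry in the hierarchy are left unchanged.
    """
    merged = Counter()
    for row in rows:
        value = row[attribute_index]
        parent = hierarchy.get(value, value)
        new_row = row[:attribute_index] + (parent,) + row[attribute_index + 1:]
        merged[new_row] += 1
    return merged


# Hypothetical one-level concept hierarchy: city -> country.
city_to_country = {
    "Vancouver": "Canada",
    "Toronto": "Canada",
    "Montreal": "Canada",
    "Seattle": "USA",
}
rows = [
    ("Vancouver", "CS"),
    ("Toronto", "CS"),
    ("Seattle", "Math"),
    ("Montreal", "CS"),
]
summary = generalize(rows, attribute_index=0, hierarchy=city_to_country)
print(summary)
```

In a rule-based hierarchy the parent concept would depend on other attribute values of the tuple rather than on a fixed lookup, which is the case the paper addresses.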

14.
The probability reasoning method, fuzzy reasoning method, evidential reasoning method and other reasoning methods are the main techniques employed in intelligent systems for processing uncertain and vague information. The concept of inclusion degree was proposed earlier and it has been proved that the methods mentioned above are examples of inclusion degrees. In this paper, we introduce type S1 and type S2 inclusion degrees, discuss the relationship between them, and further propose inclusion degrees on interval numbers, divisions, vectors and set vectors. This paper addresses an uncertainty analysis method with different inclusion degrees for intelligent systems and other systems such as fuzzy relational databases.
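
As a concrete reference point, one simple set-based measure of how far a finite set A is contained in a set B is the ratio |A ∩ B| / |A|. The sketch below computes it. This measure is an assumption chosen for illustration; it is not the type S1 or type S2 definition introduced in the paper, which the abstract does not spell out. The function name and the convention for the empty set are likewise assumptions.

```python
def inclusion_degree(a, b):
    """Degree to which finite set a is included in set b: |a & b| / |a|.

    Returns 1.0 for an empty a, since the empty set is contained in every set.
    """
    a, b = set(a), set(b)
    if not a:
        return 1.0
    return len(a & b) / len(a)


print(inclusion_degree({1, 2}, {1, 2, 3}))
print(inclusion_degree({1, 2, 3, 4}, {1, 2}))
```

The value is 1 exactly when A is a subset of B and falls toward 0 as fewer elements of A appear in B.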

15.
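A Distributed Association Rule Mining Algorithm Based on Grid Services   Total citations: 4, self-citations: 0, citations by others: 4
Distributed association rule mining plays an important role in knowledge discovery. Building on earlier distributed algorithms, this paper proposes PDDM, an algorithm with priority weights, and combines the modified algorithm with a sampling algorithm and the knowledge grid concept to form the GDS algorithm. GDS alleviates the communication overload and poor scalability of earlier distributed algorithms, and it scans the database only once, easing the I/O problem in mining large data sets. Theoretical analysis and experimental results show that the proposed algorithm is effective and feasible.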

16.
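Research on Association Rule Mining in Distributed Systems   Total citations: 5, self-citations: 0, citations by others: 5
How to mine association rules in distributed systems is an important topic in data mining research. This paper examines distributed mining of association rules, presents the architecture of DAMINER, a distributed association rule mining system, and proposes ARDM, a distributed association rule mining algorithm based on DAMINER. The algorithm has low communication cost and low time overhead.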

17.
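This paper proposes HSDM, a hyper-structure-based distributed algorithm for mining association rules in distributed systems, which has clear advantages over existing distributed mining algorithms. The algorithm does not need to generate candidate itemsets and requires only two scans of the local database at each site, so mining is fast. It also mines bottom-up and can effectively prune its hyper-structure, which greatly reduces data exchange among sites and improves the efficiency of the algorithm.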

18.
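A Web Services-Based Architecture for Distributed Data Mining   Total citations: 4, self-citations: 0, citations by others: 4
Distributed data mining is an emerging research topic in data mining, and its main problems are knowledge sharing and software component reuse. Drawing on the advantages of Web services technology (cross-platform operation, a uniform data representation format, and support for reuse of software components and data), this paper proposes a Web services-based distributed data mining architecture that enables mining of large volumes of data in distributed heterogeneous environments. The aim is to offer a useful exploration of data mining over heterogeneous databases.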

19.
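Cui Jian (崔建), Li Qiang (李强), Yang Longpo (杨龙坡). Computer Science (《计算机科学》), 2011, 38(4): 216-220
To further address the high CPU time and frequent I/O operations incurred when mining association rules from large transaction databases, this paper presents an improved association rule mining algorithm based on a vertical data layout, called VARMLDb. The algorithm first divides the database into a number of partitions that fit in memory, and then combines a directed acyclic graph with the vertical diffset (difference set) representation to store and compute frequent itemsets. This greatly reduces the memory needed for intermediate results and overcomes the low efficiency of traditional vertical mining algorithms on dense databases, making the algorithm suitable for association rule mining in large dense databases. The algorithm draws on the strengths of the CARMA algorithm and completes the mining process with only two database scans. Experimental results show that the algorithm is correct and that VARMLDb runs efficiently on large dense databases.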
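
The vertical layout and difference sets mentioned in this abstract can be illustrated with a minimal sketch. In a vertical layout each item maps to the set of transaction IDs that contain it (its tidset); the diffset of XY relative to X is t(X) minus t(Y), and support(XY) = support(X) - |diffset|. The code below shows only that identity. It is not VARMLDb: partitioning, the directed acyclic graph, and the two-scan scheme are omitted, and all names and data are assumptions made for illustration.

```python
def vertical_layout(transactions):
    """Map each item to its tidset: the set of transaction ids containing it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def diffset(tidset_x, tidset_y):
    """Transactions that contain X but not Y, i.e. t(X) - t(Y)."""
    return tidset_x - tidset_y


def support_via_diffset(tidset_x, tidset_y):
    """support(XY) = support(X) - |diffset(XY)|, without storing t(X) & t(Y)."""
    return len(tidset_x) - len(diffset(tidset_x, tidset_y))


# Hypothetical dense transaction database.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["a", "b", "c"],
    ["b", "c"],
]
tidsets = vertical_layout(transactions)
support_ab = support_via_diffset(tidsets["a"], tidsets["b"])
print(diffset(tidsets["a"], tidsets["b"]), support_ab)
```

On dense data the diffset is typically much smaller than the intersected tidset, which is why storing differences reduces the memory needed for intermediate results.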

20.
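This paper proposes a design model for a data mining-based distributed intrusion detection system, describes the model's architecture, and analyzes several data mining algorithms.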
