Similar literature
 19 similar articles found
1.
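Multi-level decision rule acquisition algorithm under attribute-oriented induction   Total citations: 1, self-citations: 0, citations by others: 1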
梁德翠  胡培 《信息与控制》2012,41(1):69-74,82
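To address poor fault tolerance, small sample sizes, and inconsistent decisions under identical conditions in information systems, this paper proposes a multi-level decision rule acquisition algorithm that works under attribute-oriented induction and is based on the variable precision rough set model. First, under the concept hierarchies of the condition attributes, it analyzes how the lower approximation, positive region, boundary region, and negative region of the higher-level and lower-level decision tables relate to one another in the variable precision model. Then, based on the relationship graph of the decision tables at each level, deterministic rules are acquired top-down from the highest-level decision table using the classical rough set model, after which more abstract rules are acquired bottom-up from the lowest-level decision table. A worked example demonstrates the feasibility of the algorithm.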
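The regions named in this abstract can be illustrated with a minimal sketch of the variable precision rough set model. It assumes the common formulation in which an equivalence class belongs to the positive region when at least a fraction `beta` of its objects carry the target decision, and to the negative region when at most `1 - beta` do. The decision table, attribute names, and the function `vprs_regions` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def equivalence_classes(rows, attrs):
    """Group row indices by their values on the given condition attributes."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    return list(classes.values())

def vprs_regions(rows, attrs, decision, target, beta):
    """Return (positive, boundary, negative) regions for one decision value."""
    assert 0.5 < beta <= 1.0
    pos, bnd, neg = set(), set(), set()
    for block in equivalence_classes(rows, attrs):
        # Fraction of the class that carries the target decision value.
        ratio = sum(rows[i][decision] == target for i in block) / len(block)
        if ratio >= beta:
            pos.update(block)
        elif ratio <= 1 - beta:
            neg.update(block)
        else:
            bnd.update(block)
    return pos, bnd, neg

# Hypothetical decision table: rows 0-4 share identical conditions,
# but row 4 disagrees on the decision (an inconsistent sample).
rows = (
    [{"city": "Chengdu", "grade": "high", "d": "yes"}] * 4
    + [{"city": "Chengdu", "grade": "high", "d": "no"}]
    + [{"city": "Xi'an", "grade": "low", "d": "no"}] * 2
    + [{"city": "Xi'an", "grade": "high", "d": "yes"},
       {"city": "Xi'an", "grade": "high", "d": "no"}]
)

tolerant = vprs_regions(rows, ["city", "grade"], "d", "yes", beta=0.75)
classical = vprs_regions(rows, ["city", "grade"], "d", "yes", beta=1.0)
```

With `beta=0.75` the inconsistent class of rows 0 to 4 still falls in the positive region, whereas with `beta=1.0` (the classical model) it drops into the boundary region, which is the fault-tolerance effect the abstract refers to.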

2.
Attribute-oriented induction and conceptual clustering   Total citations: 3, self-citations: 1, citations by others: 3
伍小荣  谢立宏 《计算机工程》2003,29(5):92-93,123
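Attribute-oriented induction is a recently proposed method that is widely used for knowledge discovery in databases. This article points out the close connection between this method and a machine learning method, conceptual clustering, and describes how to use a conceptual clustering algorithm to perform attribute-oriented induction.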

3.
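Attribute-oriented induction and conceptual clustering   Total citations: 2, self-citations: 0, citations by others: 2
Attribute-oriented induction is a recently proposed method that is widely used for knowledge discovery in databases. The paper presents the close connection between this method and a machine learning method, conceptual clustering, and describes how to use a conceptual clustering algorithm to perform attribute-oriented induction.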

4.
侯丽珊  苗夺谦 《计算机科学》2002,29(12):127-128
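1 Introduction. Since rough set theory was proposed by the Polish scientist Z. Pawlak in 1982, knowledge reduction has become one of its core research topics. Knowledge reduction comprises attribute reduction and attribute value reduction. Attribute reduction applies to the whole knowledge base: redundant attributes are removed while guaranteeing that the amount of information does not decrease. Attribute value reduction applies to the description of each individual piece of information (decision): the values of certain attributes are ignored as long as the ability to classify or decide is unaffected. Unless stated otherwise, "reduction" in this paper means attribute reduction. In general a reduct is not unique, and one naturally hopes to find a reduct with the fewest attributes, i.e. a minimal reduct. Unfortunately, finding a minimal reduct is an NP-complete problem, and no non-exhaustive algorithm can guarantee that its result is optimal; in that case one has no choice but to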
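A minimal sketch of the exhaustive search mentioned above may help make "minimal reduct" concrete. It assumes one common criterion, namely that a reduct must preserve the positive region of the full condition attribute set; the excerpt itself only says that the amount of information must not decrease. The table and the function names `positive_region` and `minimal_reducts` are hypothetical.

```python
from itertools import combinations

def positive_region(rows, attrs, decision):
    """Indices whose condition-equivalence class maps to a single decision value."""
    blocks = {}
    for i, row in enumerate(rows):
        blocks.setdefault(tuple(row[a] for a in attrs), []).append(i)
    pos = set()
    for block in blocks.values():
        if len({rows[i][decision] for i in block}) == 1:
            pos.update(block)
    return pos

def minimal_reducts(rows, conditions, decision):
    """Try subsets in order of size; return every smallest one that keeps the positive region."""
    full = positive_region(rows, conditions, decision)
    for size in range(len(conditions) + 1):
        found = [set(c) for c in combinations(conditions, size)
                 if positive_region(rows, list(c), decision) == full]
        if found:
            return found
    return []

# Hypothetical decision table in which attribute c duplicates attribute a.
table = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]
reducts = minimal_reducts(table, ["a", "b", "c"], "d")
```

Here both `{a, b}` and `{b, c}` are returned, which shows that reducts are not unique. The search visits up to 2^n attribute subsets, so it is only practical for small tables.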

5.
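Attribute-oriented quantitative induction   Total citations: 6, self-citations: 0, citations by others: 6
Data reduction is an important research direction in knowledge discovery in databases, and attribute-oriented induction (AOG) can be used for data reduction. The paper first analyzes AOG and its shortcomings from the perspective of data reduction: AOG's control by a single attribute threshold is Boolean, so when exceptions are present it may over-reduce the data and defeat the purpose of data reduction. Building on AOG, the paper then proposes attribute-oriented quantitative induction (QAOG) to remedy these shortcomings. QAOG introduces the notion of a record threshold and applies the attribute threshold and the record threshold together, turning the control from Boolean to quantitative. It gives the same results as AOG when no exceptions exist and better results than AOG when exceptions exist. An efficient QAOG algorithm is also given.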
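The attribute-threshold control discussed here can be sketched as follows. This is only a rough illustration of basic attribute-oriented induction, assuming each concept hierarchy is given as a child-to-parent mapping; the record threshold of QAOG is not modeled because the abstract does not define it. The data and the name `generalize` are hypothetical.

```python
from collections import Counter

def generalize(rows, hierarchies, attr_threshold):
    """Climb each attribute's hierarchy until it has at most attr_threshold distinct values."""
    rows = [dict(r) for r in rows]  # work on copies
    for attr, parent in hierarchies.items():
        while len({r[attr] for r in rows}) > attr_threshold:
            changed = False
            for r in rows:
                if r[attr] in parent:
                    r[attr] = parent[r[attr]]  # replace value by its parent concept
                    changed = True
            if not changed:  # hierarchy exhausted, stop climbing
                break
    # Merge identical generalized tuples and keep a count for each.
    return Counter(tuple(sorted(r.items())) for r in rows)

# Hypothetical relation and concept hierarchy (child -> parent).
students = [
    {"city": "Chengdu", "major": "physics"},
    {"city": "Xi'an", "major": "physics"},
    {"city": "Shanghai", "major": "math"},
    {"city": "Nanjing", "major": "math"},
    {"city": "Nanjing", "major": "physics"},
]
hierarchies = {"city": {"Chengdu": "West", "Xi'an": "West",
                        "Shanghai": "East", "Nanjing": "East"}}
summary = generalize(students, hierarchies, attr_threshold=2)
```

The five tuples collapse into three generalized tuples with counts 2, 2, and 1. The tuple with count 1 is the kind of exception that a purely Boolean threshold cannot distinguish from the well-supported ones.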

6.
杨兴旺 《数字社区&智能家居》2009,5(7):5196-5197,5209
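Timetabling algorithms have interested many experts and scholars for years and have produced numerous research results, such as graph-theory-based timetabling and timetabling with artificial intelligence. However, these algorithms are all relatively complex and somewhat difficult to implement in software. This paper uses a backtracking algorithm to solve the timetabling problem; the method is simple and easy to implement in software.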
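As a rough illustration of the backtracking idea, the sketch below assigns each course a time slot and undoes an assignment whenever a later course cannot be placed. The constraint model (a symmetric table of courses that must not share a slot) and all names are assumptions made for the example, not details from the paper.

```python
def schedule(courses, slots, conflicts):
    """Give each course a slot so that conflicting courses never share one; None if impossible."""
    assignment = {}

    def place(i):
        if i == len(courses):
            return True  # every course has a slot
        course = courses[i]
        for slot in slots:
            clashes = any(assignment.get(other) == slot
                          for other in conflicts.get(course, ()))
            if not clashes:
                assignment[course] = slot
                if place(i + 1):
                    return True
                del assignment[course]  # backtrack and try the next slot
        return False

    return assignment if place(0) else None

# Hypothetical instance: physics clashes with both other courses.
timetable = schedule(
    ["math", "physics", "chemistry"],
    ["Mon-1", "Mon-2"],
    {"math": ["physics"], "physics": ["math", "chemistry"], "chemistry": ["physics"]},
)
# Three mutually conflicting courses cannot fit into two slots.
impossible = schedule(
    ["a", "b", "c"],
    ["s1", "s2"],
    {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]},
)
```

The search is exponential in the worst case, but it needs only a recursive function and a dictionary, which matches the abstract's point about ease of implementation.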

7.
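A feasibility study of knowledge discovery using an attribute-oriented induction algorithm   Total citations: 1, self-citations: 0, citations by others: 1
The attribute-oriented induction method carries out the knowledge discovery process through relational or attribute-oriented operations. Attribute-oriented induction also means that the core and foundation of the knowledge acquisition tool is a set of action primitives, including problem representation, knowledge lifting, and human-computer interaction. The paper focuses on knowledge generation rules and a confidence-based induction strategy, and designs an algorithm based on a wild soybean database.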

8.
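Timetabling algorithms have interested many experts and scholars for years and have produced numerous research results, such as graph-theory-based timetabling and timetabling with artificial intelligence. However, these algorithms are all relatively complex and somewhat difficult to implement in software. This paper uses a backtracking algorithm to solve the timetabling problem; the method is simple and easy to implement in software.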

9.
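In the knowledge discovery process, users are often interested in high-level, suitably generalized, simplified information. Attribute-oriented induction is currently the main data reduction method, and it generally considers only the simple statistical information provided by the raw data. The attribute induction algorithm proposed in this paper, based on a quantitative extended concept lattice, generalizes by climbing concepts to achieve multi-level, multi-attribute induction. Compared with the attribute-oriented induction algorithm, its generalization path is not unique: suitable generalization paths and thresholds are easy to find in the Hasse diagram of the quantitative extended concept lattice, which yields reasonable attribute induction results that meet user requirements and provides knowledge at the different granularities users need.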

10.
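Optimization of the Banker's algorithm based on the idea of backtracking   Total citations: 1, self-citations: 0, citations by others: 1
To overcome the limitation that the traditional Banker's algorithm allocates resources one request at a time, this paper proposes using the idea of backtracking to check system safety as a whole, thereby optimizing the algorithm.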
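For context, the sketch below shows the safety check of the classical Banker's algorithm, which is the baseline this abstract sets out to improve. It does not reproduce the paper's backtracking-based optimization, whose details the abstract does not give. The resource figures are a hypothetical example and `is_safe` is an illustrative name.

```python
def is_safe(available, allocation, need):
    """Classical safety check: return a safe completion order, or None if the state is unsafe."""
    work = list(available)
    finished = [False] * len(allocation)
    order = []
    progressed = True
    while progressed:
        progressed = False
        for p, (alloc, req) in enumerate(zip(allocation, need)):
            # A process can finish if its remaining need fits in the free resources.
            if not finished[p] and all(r <= w for r, w in zip(req, work)):
                work = [w + a for w, a in zip(work, alloc)]  # it releases what it holds
                finished[p] = True
                order.append(p)
                progressed = True
    return order if all(finished) else None

# Hypothetical state with five processes and three resource types.
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
need = [[7, 4, 3], [1, 2, 2], [6, 0, 0], [0, 1, 1], [4, 3, 1]]
safe_order = is_safe([3, 3, 2], allocation, need)
unsafe = is_safe([0, 0, 0], allocation, need)
```

In this state the check finds the completion order 1, 3, 4, 0, 2, while the same allocation with no free resources is reported as unsafe.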

11.
The recent progress in high-speed communication networks and large-capacity storage devices has led to a tremendous increase in the number of databases and the volume of data in them. This has created a need to discover structural equivalence relationships from the databases since queries tend to access information from structurally equivalent media objects residing in different databases. The more databases there are, the more query-processing performance improvement can be achieved when the structural equivalence relationships are automatically discovered. In response to such a demand, association rule mining has emerged and proven to be a highly successful technique for discovering knowledge from large databases. In this paper, we explore a generalized affinity-based association rule mining approach to discover the quasi-equivalence relationships from a network of databases. The algorithm is implemented and two empirical studies on real databases are conducted. The results show that the proposed generalized affinity-based association rule mining approach not only correctly exploits the set of quasi-equivalent media objects from the databases, but also outperforms the basic association rule mining approach in the discovery of the quasi-equivalent media object pairs. Received 16 September 1999 / Revised 12 September 2000 / Accepted in revised form 9 January 2001

12.
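A knowledge mining algorithm based on genetic algorithms   Total citations: 14, self-citations: 0, citations by others: 14
Data mining is a research hotspot that has emerged in the database field in recent years. Traditional data mining acquires knowledge from large amounts of data. Based on genetic algorithms, this paper proposes a new knowledge acquisition method, GAKDK, which obtains undiscovered knowledge from existing rules through crossover and mutation operations on knowledge trees.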

13.
ESTIMATING DBLEARN'S POTENTIAL FOR KNOWLEDGE DISCOVERY IN DATABASES   Total citations: 1, self-citations: 0, citations by others: 1
We propose a procedure for estimating DBLEARN's potential for knowledge discovery, given a relational database and concept hierarchies. This procedure is most useful for evaluating alternative concept hierarchies for the same database. The DBLEARN knowledge discovery program uses an attribute-oriented inductive-inference method to discover potentially significant high-level relationships in a database. A concept forest, with at most one concept hierarchy for each attribute, defines the possible generalizations that DBLEARN can make for a database. The potential for discovery in a database is estimated by examining the complexity of the corresponding concept forest. Two heuristic measures are defined based on the number, depth, and height of the interior nodes. Higher values for these measures indicate more complex concept forests and arguably more potential for discovery. Experimental results using a variety of concept forests and four commercial databases show that in practice both measures permit quite reliable decisions to be made; thus, the simplest may be most appropriate.

14.
Large databases are becoming increasingly common in civil infrastructure applications. Although it is relatively simple to specifically query these databases at a low level, more abstract questions like ‘How does the environment affect pavement cracking?’ are difficult to answer with traditional methods. Data mining techniques can provide a solution for learning abstract knowledge from civil infrastructure databases. However, data mining needs to be performed within a systematic process to ensure correct and reproducible results. Many decisions must be made during this process, making it difficult for novice analysts to apply data mining techniques thoroughly. This paper presents an application of a knowledge discovery process to data collected for an ‘intelligent’ building. The knowledge discovery process is illustrated and explained through this case study. Additionally, we discuss the importance of this case study in the context of a research effort to develop an interactive guide for the knowledge discovery process.

15.
刘洋  张卓  周清雷 《计算机科学》2014,41(12):164-167
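Medical and health data usually have many attributes and mix continuous and discrete values, which greatly limits the efficiency of knowledge discovery methods when mining such data. Based on fuzzy rough set theory, this paper studies classification rule mining on mixed data. By introducing a generalization threshold into the rule acquisition algorithm, it controls the size and complexity of the acquired rule set and improves the classification efficiency of rough-set knowledge discovery methods on medical and health data. Finally, comparative experiments verify the effectiveness of the algorithm for mining rules from medical decision tables.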

16.
A Multistrategy Approach to Relational Knowledge Discovery in Databases   Total citations: 1, self-citations: 0, citations by others: 1
When learning from very large databases, the reduction of complexity is extremely important. Two extremes of making knowledge discovery in databases (KDD) feasible have been put forward. One extreme is to choose a very simple hypothesis language, thereby being capable of very fast learning on real-world databases. The opposite extreme is to select a small data set, thereby being able to learn very expressive (first-order logic) hypotheses. A multistrategy approach allows one to include most of these advantages and exclude most of the disadvantages. Simpler learning algorithms detect hierarchies which are used to structure the hypothesis space for a more complex learning algorithm. The better structured the hypothesis space is, the better learning can prune away uninteresting or losing hypotheses and the faster it becomes. We have combined inductive logic programming (ILP) directly with a relational database management system. The ILP algorithm is controlled in a model-driven way by the user and in a data-driven way by structures that are induced by three simple learning algorithms.

17.
EDM: A general framework for Data Mining based on Evidence Theory   Total citations: 16, self-citations: 0, citations by others: 16
Data Mining or Knowledge Discovery in Databases [1, 15, 23] is currently one of the most exciting and challenging areas where database techniques are coupled with techniques from Artificial Intelligence and mathematical sub-disciplines to great potential advantage. It has been defined as the non-trivial extraction of implicit, previously unknown and potentially useful information from data. A lot of research effort is being directed towards building tools for discovering interesting patterns which are hidden below the surface in databases. However, most of the work being done in this field has been problem-specific and no general framework has yet been proposed for Data Mining. In this paper we seek to remedy this by proposing, EDM — Evidence-based Data Mining — a general framework for Data Mining based on Evidence Theory.

Having a general framework for Data Mining offers a number of advantages. It provides a common method for representing knowledge which allows prior knowledge from the user or knowledge discovered by another discovery process to be incorporated into the discovery process. A common knowledge representation also supports the discovery of meta-knowledge from knowledge discovered by different Data Mining techniques. Furthermore, a general framework can provide facilities that are common to most discovery processes, e.g. incorporating domain knowledge and dealing with missing values.

The framework presented in this paper has the following additional advantages. The framework is inherently parallel. Thus, algorithms developed within this framework will also be parallel and will therefore be expected to be efficient for large data sets — a necessity as most commercial data sets, relational or otherwise, are very large. This is compounded by the fact that the algorithms are complex. Also, the parallelism within the framework allows its use in parallel, distributed and heterogeneous databases. The framework is easily updated and new discovery methods can be readily incorporated within the framework, making it ‘general’ in the functional sense in addition to the representational sense considered above. The framework provides an intuitive way of dealing with missing data during the discovery process using the concept of Ignorance borrowed from Evidence Theory.

The framework consists of a method for representing data and knowledge, and methods for data manipulation or knowledge discovery. We suggest an extension of the conventional definition of mass functions in Evidence Theory for use in Data Mining, as a means to represent evidence of the existence of rules in the database. The discovery process within EDM consists of a series of operations on the mass functions. Each operation is carried out by an EDM operator. We provide a classification for the EDM operators based on the discovery functions performed by them and discuss aspects of the induction, domain and combination operator classes.

The application of EDM to two separate Data Mining tasks is also addressed, highlighting the advantages of using a general framework for Data Mining in general and, in particular, using one that is based on Evidence Theory.
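To make the notion of operating on mass functions more concrete, the following sketch implements Dempster's rule of combination from conventional Evidence Theory. It uses the standard definition of a mass function rather than the extended definition the abstract proposes, and it is not one of the EDM operators. The frame of discernment, the mass values, and the name `combine` are hypothetical.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule for two mass functions keyed by frozenset focal elements."""
    combined, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y  # mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the evidence cannot be combined")
    # Renormalize so that the remaining masses sum to one.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical frame: either a rule holds in the data or it does not.
HOLDS, FAILS = frozenset({"holds"}), frozenset({"fails"})
FRAME = HOLDS | FAILS  # mass on the whole frame expresses ignorance

m1 = {HOLDS: 0.6, FRAME: 0.4}
m2 = {HOLDS: 0.5, FAILS: 0.2, FRAME: 0.3}
m12 = combine(m1, m2)
```

Mass assigned to the whole frame is the usual way to express ignorance, for instance when a value is missing, and combining the two sources raises the belief that the rule holds above either source alone.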


18.
Efficient Rule-Based Attribute-Oriented Induction for Data Mining   Total citations: 3, self-citations: 0, citations by others: 3
Data mining has become an important technique which has tremendous potential in many commercial and industrial applications. Attribute-oriented induction is a powerful mining technique and has been successfully implemented in the data mining system DBMiner (Han et al. Proc. 1996 Int'l Conf. on Data Mining and Knowledge Discovery (KDD'96), Portland, Oregon, 1996). However, its induction capability is limited by the unconditional concept generalization. In this paper, we extend the concept generalization to rule-based concept hierarchy, which greatly enhances its induction power. When the previously proposed induction algorithm is applied to the more general rule-based case, a problem of induction anomaly occurs which impacts its efficiency. We have developed an efficient algorithm to facilitate induction on the rule-based case which can avoid the anomaly. Performance studies have shown that the algorithm is superior to a previously proposed algorithm based on backtracking.

19.
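This paper describes the basic ideas, principles, and steps of the traditional genetic algorithm and its application to data mining (rule set discovery), and presents the basic ideas and key issues of a genetic-algorithm-based knowledge rule mining algorithm, including knowledge rule representation and the definition of the fitness function. It then proposes a multi-population parallel evolution structure that uses an elite recombination strategy, a generation-pool evolution model, and adaptive parameters to tune the parallel genetic algorithm for data mining. In the implementation, techniques such as dynamic mutation and crossover probabilities are used, which effectively avoid premature convergence in the parallel genetic algorithm. Using North American mushroom data as an example, the parallel genetic algorithm is applied to mine classification rules, and the experiments demonstrate the algorithm's effectiveness in discovering and evolving rules.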
