Similar Literature
 20 similar documents found (search time: 0 ms)
1.
An Incremental Attribute Reduction Algorithm Based on Rough Set Theory   Cited by: 3 (self: 1; others: 2)
In incremental learning, when the objects and decision attributes of an information system stay fixed while condition attributes are added over time, the usual way to obtain the reducts of the system is to recompute over all the data in the decision table, which is clearly inefficient and unnecessary. Building on rough set theory, this paper defines the relative discernibility matrix and the absolute discernibility matrix and proposes a new incremental attribute reduction algorithm. Worked examples show that the algorithm yields the same reducts as the traditional algorithm while lowering the time complexity, and its classification quality is generally better than the original, so the reduction is of practical value.
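A minimal Python sketch of the incremental idea, assuming a decision table stored as a list of dicts (attribute name → value); the function names are illustrative, not the paper's own implementation. The point is that when a new condition attribute arrives, only the new column needs to be compared — the existing matrix entries stay valid:

```python
from itertools import combinations

def discernibility_matrix(table, cond, dec):
    """Entry (i, j): condition attributes on which objects i and j differ,
    kept only for pairs with different decision values."""
    m = {}
    for i, j in combinations(range(len(table)), 2):
        if table[i][dec] != table[j][dec]:
            m[(i, j)] = {a for a in cond if table[i][a] != table[j][a]}
    return m

def add_attribute(m, table, new_attr):
    """Incremental step: compare only the new column; old entries stay valid."""
    for (i, j), entry in m.items():
        if table[i][new_attr] != table[j][new_attr]:
            entry.add(new_attr)
    return m

def greedy_reduct(m):
    """Cover every non-empty matrix entry with as few attributes as possible."""
    entries = [e for e in m.values() if e]
    reduct = set()
    while any(not (e & reduct) for e in entries):
        counts = {}
        for e in entries:
            if not (e & reduct):
                for a in e:
                    counts[a] = counts.get(a, 0) + 1
        reduct.add(max(counts, key=counts.get))
    return reduct
```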

2.
Attribute reduction based on evidence theory in incomplete decision systems   Cited by: 3 (self: 0; others: 3)
Wei-Zhi Wu, Information Sciences, 2008, 178(5): 1355-1371
Attribute reduction is a basic issue in knowledge representation and data mining. This paper deals with attribute reduction in incomplete information systems and incomplete decision systems based on the Dempster-Shafer theory of evidence. The concepts of plausibility reduct and belief reduct in incomplete information systems, as well as relative plausibility reduct and relative belief reduct in incomplete decision systems, are introduced. It is shown that in an incomplete information system an attribute set is a belief reduct if and only if it is a classical reduct, and a plausibility consistent set must be a classical consistent set. In a consistent incomplete decision system, the concepts of relative reduct, relative plausibility reduct, and relative belief reduct are all equivalent. In an inconsistent incomplete decision system, an attribute set is a relative plausibility reduct if and only if it is a relative reduct; a plausibility consistent set must be a belief consistent set, but a belief consistent set is not a plausibility consistent set in general.
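In finite rough-set models, the belief and plausibility of a set are the normalized sizes of its lower and upper approximations. A small sketch under the standard tolerance relation for incomplete tables (a missing value, written '*', matches anything); names and the '*' convention are assumptions for illustration:

```python
def tolerance_classes(table, attrs, missing='*'):
    """S_A(x): objects indistinguishable from x when '*' matches any value."""
    def match(x, y):
        return all(x[a] == y[a] or missing in (x[a], y[a]) for a in attrs)
    return {i: {j for j in range(len(table)) if match(table[i], table[j])}
            for i in range(len(table))}

def belief_plausibility(table, attrs, X, missing='*'):
    """Bel(X) = |lower(X)| / |U| and Pl(X) = |upper(X)| / |U|, with the
    approximations taken w.r.t. the tolerance classes; X is a set of row indices."""
    cls = tolerance_classes(table, attrs, missing)
    n = len(table)
    lower = sum(1 for i in range(n) if cls[i] <= X)
    upper = sum(1 for i in range(n) if cls[i] & X)
    return lower / n, upper / n
```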

3.
This paper studies attribute reduction in rough set theory from the perspective of granular computing. The concepts of granularity difference and granularity entropy are defined based on the algebraic method and the information-theoretic method, respectively, and two new attribute reduction algorithms are proposed on this basis. Experimental analysis shows that both of these reliable and effective granular computing methods can obtain a minimal reduct of an information table, providing a feasible approach for further research on knowledge granularity computation.
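A sketch of one common pair of granularity measures — an algebraic one and an information-theoretic one — over the partition induced by an attribute subset; the paper's exact definitions of granularity difference and granularity entropy may differ in detail:

```python
import math
from collections import defaultdict

def partition(table, attrs):
    """Indiscernibility classes: group objects by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(table):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def granularity(table, attrs):
    """Algebraic measure: GK(P) = sum(|X_i|^2) / |U|^2; coarser knowledge gives a larger value."""
    n = len(table)
    return sum(len(b) ** 2 for b in partition(table, attrs)) / n ** 2

def granular_entropy(table, attrs):
    """Information-theoretic counterpart: Shannon entropy of the partition."""
    n = len(table)
    return -sum(len(b) / n * math.log2(len(b) / n) for b in partition(table, attrs))
```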

4.
A Rule Extraction Method Based on Rough Set Theory   Cited by: 2 (self: 1; others: 2)
Rule extraction is an important step in building intelligent information systems, and a difficult one. For the rule extraction problem in information systems, this paper proposes a rough-set-based approach and studies the attribute reduction and attribute-value reduction it involves. A discernibility vector is constructed from the indiscernibility relation of rough sets, and the core attributes and attribute significance are obtained via the addition rule of discernibility vectors. Then, starting from the core attributes and using attribute significance as heuristic information, an attribute reduct of the information table is computed. On this basis, using the correspondence between condition attributes and decision attributes, redundant attribute values are deleted from each rule in the information table to complete the attribute-value reduction, and the rules are finally extracted. Numerical examples and experiments show that the algorithm is effective and feasible.
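The value-reduction step can be sketched as a greedy pass over one rule: drop a condition value whenever every object matching the remaining values still carries the same decision. A minimal sketch under the same list-of-dicts table assumption; `value_reduce_rule` is an illustrative name:

```python
def value_reduce_rule(table, cond, dec, row_idx):
    """Greedily drop condition values from one rule while every object that
    matches the remaining values still has the same decision (rule stays certain)."""
    row = table[row_idx]
    kept = list(cond)
    for a in list(kept):
        trial = [b for b in kept if b != a]
        matches = [r for r in table if all(r[b] == row[b] for b in trial)]
        if all(r[dec] == row[dec] for r in matches):
            kept = trial  # the value of attribute a is redundant for this rule
    return {a: row[a] for a in kept}, row[dec]
```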

5.
Tabu search for attribute reduction in rough set theory   Cited by: 2 (self: 0; others: 2)
In this paper, we consider a memory-based tabu search heuristic to solve the attribute reduction problem in rough set theory. The proposed method, called tabu search attribute reduction (TSAR), is a high-level tabu search with long-term memory; in addition to the standard tabu neighborhood search methodology, TSAR invokes diversification and intensification search schemes. TSAR shows promising and competitive performance compared with other computational intelligence tools in terms of solution quality. Moreover, TSAR shows superior performance in saving computational costs.
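TSAR itself adds long-term memory with diversification and intensification; the sketch below shows only the core short-term-memory loop, under assumed conventions (a move flips one attribute, subset quality is the positive-region dependency gamma). All names are illustrative:

```python
from collections import defaultdict

def dependency(table, attrs, dec):
    """gamma(B): fraction of objects whose B-indiscernibility class is decision-pure."""
    if not attrs:
        return 0.0
    blocks = defaultdict(list)
    for row in table:
        blocks[tuple(row[a] for a in attrs)].append(row[dec])
    pure = sum(len(v) for v in blocks.values() if len(set(v)) == 1)
    return pure / len(table)

def tabu_reduct(table, cond, dec, iters=200, tenure=7):
    """Short-term-memory tabu loop: recently flipped attributes are tabu for
    `tenure` steps; feasible (full-dependency) subsets are preferred, then smaller ones."""
    full = dependency(table, list(cond), dec)
    cur, best = set(cond), set(cond)
    tabu = {}
    for t in range(iters):
        moves = [a for a in cond if tabu.get(a, -1) < t]
        if not moves:
            continue
        def rank(a):
            nxt = cur ^ {a}
            return (dependency(table, nxt, dec) < full, len(nxt))
        a = min(moves, key=rank)
        cur ^= {a}
        tabu[a] = t + tenure
        if len(cur) < len(best) and dependency(table, cur, dec) >= full:
            best = set(cur)
    return best
```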

6.
In this paper, we present a new method of data decomposition that avoids the need to reason from data with missing attribute values. We first define a general binary relation on the original incomplete dataset. This binary relation generates data subsets without missing values, which are used to generate a topological base relation that decomposes datasets. We investigate a new approach to finding the missing values in incomplete datasets. New pre-topological approximations are introduced and some of their properties are proved; pre-topological measures are also defined and studied. Finally, the reducts and the core of an incomplete information system are determined.

7.
A Fast Parallel Attribute Reduction Algorithm Based on Rough Set Theory   Cited by: 2 (self: 0; others: 2)
By bringing the idea of parallel computing into fast rough-set-based attribute reduction, this paper proposes a fast parallel attribute reduction algorithm based on rough set theory. While guaranteeing that the result is a Pawlak reduct, the algorithm partitions the reduction task across multiple processors to be handled simultaneously, greatly improving the efficiency of attribute reduction. Simulation results demonstrate the efficiency of the algorithm.
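One natural way to parallelize a heuristic reduction is to score candidate attributes on separate processes in each greedy round. A sketch reusing `dependency()` from the tabu-search sketch after item 5; `parallel_reduct` is an illustrative name, not the paper's algorithm, and on platforms that spawn workers the call must sit under `if __name__ == "__main__":`:

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial

def significance(table, dec, base, attr):
    """Gain in dependency when attr joins the current subset."""
    return dependency(table, base | {attr}, dec) - dependency(table, base, dec)

def parallel_reduct(table, cond, dec, workers=4):
    """Greedy forward selection; each round's per-candidate significance
    scores are evaluated on separate processes."""
    full = dependency(table, set(cond), dec)
    red = set()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        while dependency(table, red, dec) < full:
            rest = [a for a in cond if a not in red]
            scores = pool.map(partial(significance, table, dec, red), rest)
            red.add(max(zip(rest, scores), key=lambda p: p[1])[0])
    return red
```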

8.
9.
A Heuristic Attribute Reduction Method Based on the Discernibility Matrix and Its Application   Cited by: 23 (self: 2; others: 23)
Building on discernibility-matrix-based attribute reduction algorithms, this paper proposes a method for computing attribute significance from the discernibility matrix and uses it as heuristic knowledge to remove redundant attributes from decision tables. The method derives directly from the evaluation data and follows a clear line of reasoning; fitting results show that the reduction algorithm is reasonable and reliable.
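A common form of such a heuristic, sketched against the matrix built by `discernibility_matrix` from the sketch after item 1: an attribute counts for more the more entries it appears in, with short entries weighted more heavily (the exact weighting here is an assumption):

```python
from collections import Counter

def significance_from_matrix(m):
    """Heuristic significance: sum of 1/|entry| over matrix entries containing
    the attribute; short entries carry more weight."""
    sig = Counter()
    for entry in m.values():
        for a in entry:
            sig[a] += 1.0 / len(entry)
    return sig

def core_from_matrix(m):
    """Core = attributes appearing in some singleton entry: each is the only
    way to tell that pair of objects apart."""
    return {next(iter(e)) for e in m.values() if len(e) == 1}
```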

10.
Incremental attribute reduction is a new type of attribute reduction method aimed at dynamic data sets. However, little existing work on incremental attribute reduction addresses incomplete hybrid information systems. For this class of problem, an incremental attribute reduction algorithm for the case of added attributes is proposed. A neighborhood tolerance relation is introduced for incomplete hybrid information systems. Based on the granulation monotonicity of the neighborhood tolerance relation, an incremental update method for the neighborhood tolerance conditional entropy when attributes are added to the information system is proposed, together with an incomplete hybrid informa…
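A sketch of the neighborhood tolerance relation for hybrid (numeric + nominal) data with missing values, plus one common form of conditional entropy over it; the entropy formula here is an assumption for illustration, not necessarily the paper's definition:

```python
import math

def nt_related(x, y, numeric, nominal, delta=0.2):
    """Neighborhood tolerance for hybrid data: numeric values match within
    delta, nominal values match exactly; a missing value (None) matches anything."""
    for a in numeric:
        if x[a] is not None and y[a] is not None and abs(x[a] - y[a]) > delta:
            return False
    for a in nominal:
        if x[a] is not None and y[a] is not None and x[a] != y[a]:
            return False
    return True

def nt_conditional_entropy(table, numeric, nominal, dec, delta=0.2):
    """One common form: H(D|B) = -(1/|U|) * sum_i log2(|n(x_i) ∩ D(x_i)| / |n(x_i)|)."""
    n = len(table)
    total = 0.0
    for i in range(n):
        nb = [j for j in range(n)
              if nt_related(table[i], table[j], numeric, nominal, delta)]
        same = sum(1 for j in nb if table[j][dec] == table[i][dec])
        total += math.log2(same / len(nb))  # i is in its own class, so same >= 1
    return -total / n
```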

11.
12.
An Improved Attribute Reduction Algorithm Based on Granular Computing   Cited by: 1 (self: 0; others: 1)
Granular computing is a multi-level granular-structure method for problem solving, pattern classification, and information processing; it is a new discipline at the intersection of rough sets, fuzzy sets, data mining, artificial intelligence, and other fields. After discussing the basic concepts and properties of knowledge granularity, the paper describes a method for reducing information systems by computing the significance SigCore(A)(a) of an attribute with respect to the reduct core. Considering that some information systems have no reduct core, an improvement of the granular-computing-based reduction algorithm is proposed. The improved algorithm applies both to systems that have a reduct core and to systems that do not. Numerical experiments confirm the effectiveness of the algorithm.

13.
An Incremental Attribute Reduction Algorithm for Decision Tables   Cited by: 11 (self: 0; others: 11)
胡峰, 代劲, 王国胤, 《控制与决策》, 2007, 22(3): 268-272
To perform attribute reduction on dynamically changing decision tables, an incremental attribute reduction algorithm is proposed on the basis of an improved discernibility matrix: after new records are added to a decision table, all reducts and the minimal reduct of the new table can be obtained quickly. In addition, by decomposing the original decision table according to the decision values of the positive region and the boundary region of an inconsistent decision table, a distributed incremental attribute reduction model is obtained. Simulation studies demonstrate the correctness and efficiency of the algorithm.

14.
Probabilistic Discernibility Matrix and Attribute Reduction in Incomplete Information Systems   Cited by: 1 (self: 0; others: 1)
闫德勤, 《计算机科学》, 2005, 32(8): 164-166
The discernibility matrix is an important construct for rough-set-based attribute reduction of information systems. For attribute reduction in incomplete information systems, this paper proposes the concept of a probabilistic discernibility matrix together with a construction method, and gives the related theorems. On this basis, a method for attribute reduction of incomplete information systems using the probabilistic discernibility matrix is proposed, and an application example is given.

15.
Partition Closeness Degree of Rough Sets and an Attribute Reduction Algorithm Based on It   Cited by: 1 (self: 0; others: 1)
Rough set theory is a mathematical tool, developed in recent years, for handling uncertain, imprecise, and incomplete data, and attribute reduction is one of its core topics. This paper proposes a new uncertainty measure, the partition closeness degree, and based on it presents attribute reduction algorithms for general information systems and for decision information systems. The algorithm for decision information systems can effectively reduce not only consistent decision tables but also inconsistent ones.

16.
Rough set reduction has been used as an important preprocessing tool for pattern recognition, machine learning, and data mining. As classical Pawlak rough sets can only evaluate categorical features, a neighborhood rough set model is introduced to deal with numerical data sets. Three-way decision theory, proposed by Yao, derives from Pawlak rough sets and probabilistic rough sets and trades off different types of classification error to obtain a minimum-cost ternary classifier. In this paper, we discuss reduction problems based on three-way decisions and neighborhood rough sets. First, the three-way decision reducts of positive region preservation, boundary region preservation, and negative region preservation are introduced into the neighborhood rough set model. Second, three conditional entropy measures are constructed from the three-way decision regions by considering variants of neighborhood classes. The monotonicity of these entropy measures is proved, from which heuristic reduction algorithms in neighborhood systems are obtained. Finally, experimental results show that the three-way decision reduction approaches are effective feature selection techniques for numerical data sets.
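A sketch of the three-way regions under a neighborhood rough set model, assuming numeric feature vectors and a Euclidean neighborhood; the thresholds alpha and beta play the usual three-way-decision roles, and alpha=1, beta=0 recovers the classical Pawlak regions:

```python
def neighbors(data, i, radius):
    """Euclidean neighborhood of object i over numeric feature vectors."""
    xi = data[i]
    return [j for j, xj in enumerate(data)
            if sum((a - b) ** 2 for a, b in zip(xi, xj)) ** 0.5 <= radius]

def three_way_regions(data, labels, target, radius, alpha=1.0, beta=0.0):
    """POS / BND / NEG for class `target`: accept if P(target | neighborhood)
    >= alpha, reject if <= beta, defer to the boundary otherwise."""
    pos, bnd, neg = set(), set(), set()
    for i in range(len(data)):
        nb = neighbors(data, i, radius)
        p = sum(labels[j] == target for j in nb) / len(nb)
        (pos if p >= alpha else neg if p <= beta else bnd).add(i)
    return pos, bnd, neg
```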

17.
The imbalanced data problem emerged more than two decades ago and remains one of the most important and challenging problems. Indeed, missing information about the minority class leads to a significant degradation in classifier performance. Moreover, comprehensive research has shown that certain factors increase the problem's complexity; these additional difficulties are closely related to the data distribution over decision classes. Despite the numerous methods that have been proposed, the flexibility of existing solutions needs further improvement. We therefore offer a novel rough-granular computing approach (RGA for short) to address these issues. New synthetic examples are generated only in specific regions of feature space; this selective oversampling approach is applied to reduce the number of misclassified minority class examples. A strategy relevant for a given problem is obtained by forming information granules and analysing their degrees of inclusion in the minority class. Potential inconsistencies are eliminated by an editing phase based on a similarity relation. The most significant algorithm parameters are tuned in an iterative process; the evaluated parameters include the number of nearest neighbours, the complexity threshold, the distance threshold, and the cardinality redundancy, and each data model is built with different parameter values. The results of an experimental study on datasets from the UCI repository show that the proposed method of inducing the neighbourhoods of examples is crucial for the proper creation of synthetic positive instances, and that the proposed algorithm outperforms related methods on most of the tested datasets. A set of valid parameters for the RGA technique is established.
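A much-simplified sketch of the selective-oversampling idea — not the authors' RGA, which also includes granule formation, an editing phase, and iterative parameter tuning. Here the "granule" of a minority example is its k nearest neighbours, and synthesis is allowed only where the granule's inclusion degree in the minority class reaches a threshold:

```python
import random

def selective_oversample(data, labels, minority, k=5, incl_min=0.3, n_new=100, seed=0):
    """Interpolate new minority points only around minority examples whose
    k-NN granule is sufficiently included in the minority class, skipping
    likely-noise examples surrounded by the majority class."""
    rng = random.Random(seed)
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    eligible = []
    for i, y in enumerate(labels):
        if y != minority:
            continue
        nn = sorted((j for j in range(len(data)) if j != i),
                    key=lambda j: dist(data[i], data[j]))[:k]
        mates = [j for j in nn if labels[j] == minority]
        if len(mates) / k >= incl_min:  # inclusion degree of the granule
            eligible.append((i, mates))
    eligible = [e for e in eligible if e[1]]
    synthetic = []
    for _ in range(n_new if eligible else 0):
        i, mates = rng.choice(eligible)
        j = rng.choice(mates)
        t = rng.random()  # interpolate between the example and a minority neighbour
        synthetic.append([a + t * (b - a) for a, b in zip(data[i], data[j])])
    return synthetic
```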

18.
The study introduces and discusses the principle of justifiable granularity, which supports a coherent way of designing information granules in the presence of experimental evidence (of either numerical or granular character). The term "justifiable" pertains to the construction of an information granule formed so that it is (a) highly legitimate (justified) in light of the experimental evidence, and (b) specific enough, meaning that it comes with well-articulated semantics. The design process is associated with a well-defined optimization problem that balances the two requirements of experimental justification and specificity. A series of experiments is provided, and constructs realized for various formalisms of information granules (intervals, fuzzy sets, rough sets, and shadowed sets) are discussed.
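A minimal sketch of the interval case, assuming one common form of the optimization: grow each bound away from the median so that V(b) = coverage(b) * exp(-alpha * |b - med|) is maximal, where coverage rewards experimental justification and the exponential term rewards specificity:

```python
import math

def justifiable_interval(data, alpha=1.0):
    """Build an interval granule [lower, upper] around the median of the data
    by maximizing V(b) = coverage(b) * exp(-alpha * |b - med|) per bound."""
    xs = sorted(data)
    med = xs[len(xs) // 2]
    def value(b):
        lo, hi = min(b, med), max(b, med)
        cov = sum(lo <= x <= hi for x in xs)   # evidence covered by the bound
        return cov * math.exp(-alpha * abs(b - med))
    upper = max((x for x in xs if x >= med), key=value)
    lower = max((x for x in xs if x <= med), key=value)
    return lower, upper
```

Larger alpha values favour specificity and yield tighter granules; alpha near zero favours coverage and lets the interval swallow most of the evidence.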

19.
The algebraic structures of generalized rough set theory   Cited by: 1 (self: 0; others: 1)
Rough set theory is an important technique for knowledge discovery in databases, and its algebraic structure is part of its foundation. In this paper, we present the structures of the lower and upper approximations based on arbitrary binary relations. Some existing results concerning the interpretation of belief functions in rough set settings are also extended. Based on the concept of definable sets in rough set theory, two important Boolean subalgebras in the generalized rough sets are investigated, and an algorithm to compute the atoms of these two Boolean algebras is presented.
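The generalized approximations replace equivalence classes with successor neighborhoods of an arbitrary binary relation. A small sketch with illustrative names; a set X is then definable exactly when its lower and upper approximations both equal X:

```python
def successor_map(universe, relation):
    """r(x) = {y : (x, y) in R} for an arbitrary binary relation R,
    given as an iterable of (x, y) pairs."""
    succ = {x: set() for x in universe}
    for x, y in relation:
        succ[x].add(y)
    return succ

def lower_apx(universe, succ, X):
    """Lower approximation: {x : r(x) is a subset of X}."""
    return {x for x in universe if succ[x] <= X}

def upper_apx(universe, succ, X):
    """Upper approximation: {x : r(x) meets X}; dual of the lower approximation."""
    return {x for x in universe if succ[x] & X}
```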

20.
Of all the challenges facing the effective application of computational intelligence technologies to pattern recognition, dataset dimensionality is undoubtedly one of the primary impediments. For pattern classifiers to be efficient, a dimensionality reduction stage is usually performed prior to classification. Much use has been made of rough set theory for this purpose, as it is completely data-driven and requires no other information, whereas most other methods require some additional knowledge. However, traditional rough-set-based methods in the literature are restricted by the requirement that all data must be discrete, so real-valued or noisy data cannot be considered directly. This is usually addressed by employing a discretisation method, which can result in information loss. This paper proposes a new approach based on the tolerance rough set model, which can deal with real-valued data whilst simultaneously retaining dataset semantics. More significantly, the paper describes the underlying mechanism by which this new approach utilises the information contained within the boundary region, or region of uncertainty. Using this information can lead to the discovery of more compact feature subsets and improved classification accuracy. These results are supported by an experimental evaluation comparing the proposed approach with a number of existing feature selection techniques.
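A sketch of one standard formulation of the tolerance model for real-valued data: objects are related when their mean range-normalized similarity reaches a threshold tau, and subset quality is the fraction of objects whose tolerance class is decision-pure. The similarity form and names are assumptions, not necessarily the paper's exact definitions; `ranges[a]` (the value range of attribute a) must be nonzero:

```python
def tolerance_dependency(table, attrs, dec, ranges, tau=0.8):
    """gamma_tau(B): fraction of objects whose tolerance class (mean
    range-normalized similarity >= tau over attrs) is pure in the decision."""
    def similar(x, y):
        s = sum(1 - abs(x[a] - y[a]) / ranges[a] for a in attrs) / len(attrs)
        return s >= tau
    n = len(table)
    pure = 0
    for i in range(n):
        cls = [j for j in range(n) if similar(table[i], table[j])]
        if all(table[j][dec] == table[i][dec] for j in cls):
            pure += 1
    return pure / n
```

Lowering tau coarsens the tolerance classes (more objects become related), which typically lowers the dependency; a feature selection loop can then grow a subset until gamma_tau matches that of the full attribute set.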
