Similar Documents
20 matching records found (search time: 15 ms)
1.
A feature selection method based on granular computing (Cited: 1; self: 0; others: 1)
Starting from the partition model of granular computing, this paper redefines the reduct of a consistent decision table and presents a new attribute reduction algorithm based on granular computing. Using information entropy as heuristic information, the algorithm incrementally adds attributes to form a reduct of the condition attribute set relative to the decision attribute, and then deletes all unnecessary attributes from that reduct to obtain a minimal reduct. The algorithm effectively reduces the time complexity of computing attribute reducts and can be used for feature selection on relatively large data sets. Experiments on five public gene expression data sets show that the algorithm finds feature subsets with high discriminative power.
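The add-then-prune procedure sketched in the abstract (grow a reduct greedily by conditional entropy, then drop attributes that became redundant) can be illustrated roughly as follows. This is a minimal sketch under assumptions, not the paper's algorithm: rows are dicts, attribute names and the numeric tolerance are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2

def cond_entropy(rows, attrs, dec):
    # H(D | B): partition rows by their values on attrs, sum block entropies
    blocks = defaultdict(list)
    for r in rows:
        blocks[tuple(r[a] for a in attrs)].append(r[dec])
    n, h = len(rows), 0.0
    for labels in blocks.values():
        p_block = len(labels) / n
        for c in Counter(labels).values():
            p = c / len(labels)
            h -= p_block * p * log2(p)
    return h

def entropy_reduct(rows, cond, dec, eps=1e-12):
    target = cond_entropy(rows, cond, dec)  # H(D | C), the goal to match
    red = []
    # forward phase: greedily add the attribute lowering H(D | red) the most
    while cond_entropy(rows, red, dec) > target + eps:
        best = min((a for a in cond if a not in red),
                   key=lambda a: cond_entropy(rows, red + [a], dec))
        red.append(best)
    # backward phase: remove attributes that are no longer necessary
    for a in list(red):
        rest = [b for b in red if b != a]
        if cond_entropy(rows, rest, dec) <= target + eps:
            red.remove(a)
    return red
```

On a consistent table the loop stops once the candidate set discerns the decision as well as the full condition set does; the backward pass guarantees minimality with respect to single-attribute removal only.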

2.
Adaptive multilevel rough entropy evolutionary thresholding (Cited: 1; self: 0; others: 1)
In this study, comprehensive research into rough set entropy-based thresholding techniques for image segmentation has been performed, producing new and robust algorithmic schemes. Segmentation is the low-level image transformation routine that partitions an input image into distinct, disjoint and homogeneous regions; thresholding algorithms are the ones most often applied in practice, especially when there is a pressing need for implementation simplicity, high segmentation quality, and robustness. Combining entropy-based thresholding with rough sets yields the rough entropy thresholding (RET) algorithm. The authors propose a new algorithm based on granular multilevel rough entropy evolutionary thresholding (MRET) that operates on a multilevel domain. MRET's performance has been compared with the iterative RET algorithm and standard k-means clustering on the basis of the β-index as a representative validation measure. The experimental assessment suggests that MRET segmentations are of high quality, comparable with and often better than k-means clustering based segmentations. In this context, the MRET algorithm is suitable for specific segmentation tasks that seek solutions incorporating spatial data features with particular characteristics.
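The granular core of rough thresholding can be shown with a deliberately simplified single-threshold sketch: split the image into square granules, let each candidate threshold induce lower and upper approximations of object and background, and keep the threshold minimizing combined roughness. Everything here is an illustrative reduction (window size, the roughness objective, and the absence of the evolutionary multilevel search are all simplifications relative to MRET):

```python
def rough_threshold(img, win=2):
    """Pick a grey-level threshold by minimizing combined granule roughness."""
    h, w = len(img), len(img[0])
    granules = []
    for i in range(0, h, win):
        for j in range(0, w, win):
            granules.append([img[x][y]
                             for x in range(i, min(i + win, h))
                             for y in range(j, min(j + win, w))])
    levels = sorted({v for row in img for v in row})
    best_t, best_r = None, float("inf")
    for t in levels[:-1]:                     # candidate thresholds
        lo_o = up_o = lo_b = up_b = 0
        for g in granules:
            above = [v > t for v in g]
            if all(above): lo_o += 1          # granule fully object
            if any(above): up_o += 1          # granule possibly object
            if not any(above): lo_b += 1      # granule fully background
            if not all(above): up_b += 1      # granule possibly background
        r_o = 1 - lo_o / up_o if up_o else 0.0
        r_b = 1 - lo_b / up_b if up_b else 0.0
        if r_o + r_b < best_r:
            best_r, best_t = r_o + r_b, t
    return best_t
```

A threshold that makes every granule crisp (all pixels on one side) drives both roughness values to zero, which is exactly the clean-separation case.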

3.
More than two decades ago the imbalanced data problem turned out to be one of the most important and challenging problems. Indeed, missing information about the minority class leads to a significant degradation in classifier performance. Moreover, comprehensive research has proved that certain factors increase the problem's complexity; these additional difficulties are closely related to the data distribution over the decision classes. Despite the numerous methods that have been proposed, the flexibility of existing solutions needs further improvement. We therefore offer a novel rough-granular computing approach (RGA for short) to address these issues. New synthetic examples are generated only in specific regions of the feature space; this selective oversampling is applied to reduce the number of misclassified minority class examples. A strategy relevant for a given problem is obtained by forming information granules and analysing their degrees of inclusion in the minority class. Potential inconsistencies are eliminated by an editing phase based on a similarity relation. The most significant algorithm parameters (the number of nearest neighbours, the complexity threshold, the distance threshold and the cardinality redundancy) are tuned in an iterative process, and each data model is built with different parameter values. Results of an experimental study on datasets from the UCI repository are presented. They show that the proposed way of inducing neighbourhoods of examples is crucial for the proper creation of synthetic positive instances; the proposed algorithm outperforms related methods on most of the tested datasets, and a set of valid parameters for the RGA technique is established.
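The selective oversampling idea, generating synthetic minority examples only where the neighbourhood granule is sufficiently included in the minority class, can be sketched as a SMOTE-style toy. This is hypothetical illustration, not the RGA algorithm: the parameter names, the inclusion degree and the interpolation rule are all assumptions.

```python
import random
from math import dist

def selective_oversample(X, y, minority, k=3, inclusion=0.5, n_new=10, seed=0):
    """Interpolate new minority points inside 'safe' neighbourhood granules."""
    rng = random.Random(seed)
    minority_pts = [x for x, lab in zip(X, y) if lab == minority]
    safe = []
    for x in minority_pts:
        # k nearest neighbours, skipping the point itself at distance 0
        neigh = sorted(zip(X, y), key=lambda p: dist(x, p[0]))[1:k + 1]
        # keep x only if its granule is included in the minority class enough
        if sum(lab == minority for _, lab in neigh) / k >= inclusion:
            safe.append(x)
    synthetic = []
    while safe and len(synthetic) < n_new:
        a = rng.choice(safe)
        b = min((s for s in safe if s is not a),
                key=lambda s: dist(a, s), default=a)
        t = rng.random()   # random point on the segment between a and b
        synthetic.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic
```

Restricting generation to the safe set is what keeps synthetic points out of the overlap regions that the editing phase would otherwise have to clean up.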

4.
Wei-Zhi Wu, Information Sciences, 2011, 181(18): 3878-3897
Granular computing and the acquisition of if-then rules are two basic issues in knowledge representation and data mining. A formal approach to granular computing with multi-scale data measured at different levels of granulation is proposed in this paper. The concept of labelled blocks determined by a surjective function is first introduced. Lower and upper label-block approximations of sets are then defined. Multi-scale granular labelled partitions and multi-scale decision granular labelled partitions, as well as their derived rough set approximations, are further formulated to analyze hierarchically structured data. Finally, the concept of multi-scale information tables in the context of rough sets is proposed, and the unravelling of decision rules at different scales in multi-scale decision tables is discussed.

5.
Research on uncertainty measures of knowledge is a hot topic in artificial intelligence. Building on a review of several classical uncertainty measures of knowledge, this paper systematically studies the connections and differences among these measures. The results show that measures such as information granularity are equivalent to knowledge granularity, while rough entropy and collaborative entropy can be viewed as derivatives of information entropy; the correctness of these conclusions is verified through examples.

6.
The algebraic structures of generalized rough set theory (Cited: 1; self: 0; others: 1)
Rough set theory is an important technique for knowledge discovery in databases, and its algebraic structure is part of the foundation of the theory. In this paper, we present the structures of the lower and upper approximations based on arbitrary binary relations. Some existing results concerning the interpretation of belief functions in rough set settings are also extended. Based on the concept of definable sets in rough set theory, two important Boolean subalgebras in the generalized rough sets are investigated, and an algorithm to compute the atoms of these two Boolean algebras is presented.

7.
Zhang Xianyong, Computer Science, 2013, 40(9): 216-220
In approximation spaces, the double quantification formed by combining precision and grade is an innovative topic. Using the Cartesian product to synthesize quantitative information, this paper investigates the double-quantification boundary and its algorithms based on the variable-precision upper approximation and the graded lower approximation. First, a double-quantification expanded rough set model is naturally constructed from these two approximations, and the double-quantification expanded boundary is defined. Next, the double-quantification semantics of this boundary are analyzed, and its precise characterization and mathematical properties are obtained. To compute the boundary, an approximation-set algorithm and an information-granule algorithm are proposed, analyzed and compared, leading to the important conclusion that the information-granule algorithm has better space complexity. Finally, a medical example illustrates the boundary and its algorithms. The boundary expands the classical Pawlak boundary and provides a complete and fine-grained double-quantification characterization of local uncertainty, which is significant for double-quantification uncertainty analysis and its applications.

8.
Granular computing is a theoretical approach to handling uncertain data that encompasses rough sets, fuzzy sets, quotient space theory, computing with words, and more. At present, data granulation and computation with granules mainly involve set operations and measures, and the inefficiency of set operations restricts the application domains of granular-computing algorithms. To address this, a binary granular computing model is proposed, with a three-layer granule structure consisting of granules, granule swarms and a granule library. Binary granules and their operations are defined, converting traditional set operations into computations on binary numbers. A distance measure on binary granules is then given, converting the set representation of equivalence classes into a distance-based representation, together with the relevant properties of granule distance. The model further defines the distance between binary granule swarms, gives a method for computing it, proposes an attribute reduction method based on binary granule-swarm distance, proves the equivalence of this method to classical rough set reduction, and presents two reduction algorithms that use the granule-swarm distance as heuristic information.
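The bit-vector encoding at the heart of such a model is easy to sketch: an equivalence class over row indices becomes an integer bitmask, set operations become bitwise arithmetic, and granule distance becomes a popcount. This is an illustrative toy under assumed representations, not the paper's model, and the granule-swarm reduction criterion is omitted.

```python
from collections import defaultdict

def granules(rows, attrs):
    # each equivalence class encoded as an integer bitmask over row indices
    blocks = defaultdict(int)
    for i, r in enumerate(rows):
        blocks[tuple(r[a] for a in attrs)] |= 1 << i
    return set(blocks.values())

def granule_distance(g1, g2):
    # |g1 symmetric-difference g2| = popcount of the XOR of the two bitmasks
    return bin(g1 ^ g2).count("1")

def refines(p, q):
    # partition p refines q iff every granule of p sits inside a granule of q
    return all(any(g & h == g for h in q) for g in p)
```

Replacing per-element set membership with single machine-word operations is where the claimed efficiency gain over naive set manipulation comes from.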

9.
Applying rough sets to market timing decisions (Cited: 1; self: 0; others: 1)
A lot of research has been done on predicting economic development. The problem studied here is stock prediction for the use of investors; more specifically, the stock market's movements are analyzed and predicted, with the aim of retrieving knowledge that could guide investors on when to buy and sell. Through a detailed case study on trading the S&P 500 index, rough sets are shown to be an applicable and effective tool for achieving this goal. Some problems concerning time series transformation, indicator selection and trading system building in a real implementation are also discussed.

10.
An α-decision logic language for granular computing (Cited: 1; self: 0; others: 1)
This paper proposes an α-decision logic language for granular computing. The language is a special classical predicate logic described by models and satisfiability in the sense of Tarski. Generalized information systems, obtained by replacing the classical single-valued information function with fuzzy subsets of attribute value domains, correspond to models; using the concept of level cut sets from fuzzy set theory, the satisfaction of a formula by an object at a given threshold level is defined inductively. Finally, the paper discusses how to use the α-decision logic language to describe different granular worlds and to analyze formal concepts and decision rules.

11.
A survey of granular computing research (Cited: 29; self: 4; others: 25)
Granular computing (GrC) is a new concept and computing paradigm for information processing, covering all theories, methods, techniques and tools related to granularity. It is a superset of the theory of computing with words, rough set theory, quotient space theory, interval computing and related approaches, as well as a branch of soft computing; it has become an important tool for processing fuzzy, incomplete, imprecise and massive information, and one of the hot topics in artificial intelligence research. This paper surveys the motivation, state of the art and development trends of granular computing, focusing on its main theoretical models and methods and their applications in different fields, analyzes the open problems, and suggests directions for further research.

12.
In this paper, we discuss the importance of information systems in modeling interactive computations performed on (complex) granules, and we propose a formal approach to interactive computations based on generalized information systems and rough sets, which can be combined with other soft computing paradigms such as fuzzy sets or evolutionary computing, as well as with machine learning and data mining techniques. Information systems are treated as dynamic granules used for representing the results of the interaction of attributes with the environment. Two kinds of attributes are distinguished, namely perception attributes (including sensory attributes) and action attributes. Sensory attributes are the basic perception attributes; other perception attributes are constructed on the basis of the sensory ones. Actions are activated when their guards, often complex and vague concepts, are satisfied to a satisfactory degree. The guards can be approximated on the basis of measurements performed by sensory attributes rather than defined exactly; satisfiability degrees for guards are results of reasoning called adaptive judgment, and the approximations are induced using hierarchical modeling. We show that information systems can be used for modeling more advanced forms of interactions in hierarchical modeling, and the role of hierarchical interactions in modeling interactive computations is emphasized. Some illustrative examples of interactions used in the ACT-R 6.0 system are reported; ACT-R 6.0 is based on a cognitive architecture and can be treated as an example of a highly interactive complex granule that can be involved in hierarchical interactions. For modeling interactive computations, we propose information systems much more general than the previously studied dynamic information systems (see, e.g., Ciucci (2010) [8] and Pałasiński and Pancerz (2010) [32]). For example, dynamic information systems make it possible to consider incremental changes in information systems, but they do not contain the perception and action attributes necessary for modeling interactive computations, in particular for modeling intrastep interactions.

13.
Soft sets and soft rough sets (Cited: 4; self: 0; others: 4)
In this study, we establish an interesting connection between two mathematical approaches to vagueness: rough sets and soft sets. Soft set theory is utilized, for the first time, to generalize Pawlak's rough set model. Based on novel granulation structures called soft approximation spaces, soft rough approximations and soft rough sets are introduced. Basic properties of soft rough approximations are presented and supported by illustrative examples. We also define new types of soft sets, such as full soft sets, intersection complete soft sets and partition soft sets. The notion of soft rough equal relations is proposed and related properties are examined. We also show that Pawlak's rough set model can be viewed as a special case of soft rough sets, and that the two notions coincide provided the underlying soft set in the soft approximation space is a partition soft set. Moreover, an example containing a comparative analysis between rough sets and soft rough sets is given.
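The soft rough approximations described above admit a very direct transcription if one reads a soft set (F, A) over a universe U as a mapping from parameters to subsets of U: the lower approximation collects the classes F(a) contained in the target set, the upper approximation collects those that meet it. A minimal sketch of these definitions (the dict representation is an assumption of this illustration):

```python
def soft_lower(F, X):
    # union of the parameter classes F(a) that are non-empty and inside X
    lo = set()
    for v in F.values():
        if v and v <= X:
            lo |= v
    return lo

def soft_upper(F, X):
    # union of the parameter classes F(a) that intersect X
    up = set()
    for v in F.values():
        if v & X:
            up |= v
    return up
```

When the classes F(a) form a partition of U, the two operators reduce to the classical Pawlak approximations, which is the coincidence result stated in the abstract.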

14.
Classification of objects and background in a complex scene is a challenging early vision problem, compounded under poor illumination conditions. In this paper, the problem of object and background classification is addressed under uneven illumination, where the challenge is to extract the actual boundary in poorly illuminated portions. This is addressed using the notions of rough sets and granular computing. To account for differently illuminated portions of the image, an adaptive window growing approach is employed to partition the image into windows, and the optimum classification threshold over a given window is determined by four proposed granular computing based schemes. Within a particular window, illumination can still vary significantly, posing a challenge for classification. To deal with these issues, we propose four schemes based on heterogeneous and non-homogeneous granulation: (i) Heterogeneous Granulation based Window Growing (HGWG), (ii) Empirical Non-homogeneous Granulation based Window Growing (ENHWG), (iii) Fuzzy Gradient Non-homogeneous based Window Growing (FNHWG), and (iv) Fuzzy Gradient Non-homogeneous Constrained Neighbourhood based Window Growing (FNHCNGWG). The proposed schemes have been tested on images from the Berkeley image database, specifically on unevenly illuminated images with single and multiple objects, and their performance has been evaluated with four metrics. The performance of the FNHWG and FNHCNGWG schemes has been compared with the Otsu, K-means, FCM, PCM, Pal's method, HGWG, ENHWG and Fuzzy Non-homogeneous Neighbourhood based Window Growing (FNHNGWG) schemes and found to be superior to the existing ones.

15.
A parallel mining method for rough set rules based on meta-information (Cited: 1; self: 0; others: 1)
Su Jian, Gao Ji, Computer Science, 2003, 30(3): 35-39
1. Introduction. In the current information age, data mining has become a research hotspot as a means of extracting useful knowledge from the large volumes of accumulated historical data. Rough set theory, proposed by Professor Pawlak and refined through the research of many scholars, has become an important tool for data mining. In big-data environments, the speed of a data mining method directly affects the performance of the whole mining system, and how to effectively improve that speed is a problem in urgent need of a solution. At the same time, computer networks hold a large amount of computing resources, and making full use of them is an effective way to speed up data mining methods. To this end, this paper proposes

16.
Rough set theory and the self-organizing feature map (SOFM) neural network each have their own strengths and weaknesses in cluster analysis. This paper proposes an algorithm combining the two: attribute reduction from rough set theory removes redundant attributes from the samples, and the processed data are then used as training samples for the SOFM neural network. This reduces the scale of the SOFM network and thus improves the clustering efficiency of the samples.

17.
Complexity analysis of attribute reduction based on divide and conquer (Cited: 2; self: 0; others: 2)
Attribute reduction is one of the main research topics in rough set theory. This paper adopts a divide-and-conquer strategy and proposes a new attribute reduction method that transforms the problem of computing attribute reducts over the whole universe into computing attribute reducts over the sub-regions of a corresponding partition. The complexity of the existing algorithm for computing POS_X0(Y), O(|A||U|^2) [4], is reduced to O(|A|(|Y1|^2 + |Y2|^2 + … + |Yn|^2)), which for the typically large |U| clearly improves the computability and efficiency of attribute reduction.
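The divide-and-conquer idea of computing rough-set constructs per sub-region rather than over the whole universe can be illustrated on the positive region: splitting the universe by one attribute's value classes and solving each block with the remaining attributes yields the same positive region, but each block is quadratically cheaper under a naive pairwise implementation. A minimal sketch with hypothetical helper names, not the paper's algorithm:

```python
from collections import defaultdict

def positive_region(rows, attrs, dec):
    # indices of rows whose attrs-equivalence class is pure on the decision
    blocks = defaultdict(list)
    for i, r in enumerate(rows):
        blocks[tuple(r[a] for a in attrs)].append(i)
    pos = set()
    for idx in blocks.values():
        if len({rows[i][dec] for i in idx}) == 1:
            pos |= set(idx)
    return pos

def positive_region_dc(rows, attrs, dec, split):
    # divide and conquer: partition U by the value of `split`, solve each
    # sub-region with the remaining attributes, then take the union
    parts = defaultdict(list)
    for i, r in enumerate(rows):
        parts[r[split]].append(i)
    pos = set()
    for idx in parts.values():
        sub = [rows[i] for i in idx]
        rest = [a for a in attrs if a != split]
        for j in positive_region(sub, rest, dec):
            pos.add(idx[j])       # map local index back to the global universe
    return pos
```

The equality holds because the equivalence classes of attrs are exactly the classes of the remaining attributes taken inside each split-value block.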

18.
While P2P online lending has grown explosively, it also faces enormous credit risk. To address this problem, this paper first proposes a comprehensive evaluation method for credit assessment and validates it using data from Rong360. Second, it proposes an information fusion method based on granular computing to fuse third-party data with existing credit evaluation values, which not only supplements the original credit evaluation information but also allows the two sources to corroborate each other. Finally, the granular-computing information fusion method is validated on simulated third-party data.

19.
Nowadays, multi-label classification methods are of increasing interest in areas such as text categorization, image annotation and protein function classification. Due to the correlation among labels, traditional single-label classification methods are not directly applicable to the multi-label classification problem. This paper presents two novel multi-label classification algorithms based on variable precision neighborhood rough sets, called multi-label classification using rough sets (MLRS) and MLRS using local correlation (MLRS-LC). The proposed algorithms consider two important factors that affect the accuracy of prediction, namely the correlation among labels and the uncertainty within the mapping between the feature space and the label space. MLRS provides a global view of label correlation, while MLRS-LC deals with label correlation at the local level. Given a new instance, MLRS determines its location and then computes the label probabilities according to that location; MLRS-LC first identifies the instance's topic and then computes the probability of the instance belonging to each class within that topic. A series of experiments on seven multi-label datasets shows that MLRS and MLRS-LC achieve promising performance when compared with some well-known multi-label learning algorithms.

20.
Feature selection (attribute reduction) from large-scale incomplete data is a challenging problem in areas such as pattern recognition, machine learning and data mining. In rough set theory, feature selection from incomplete data aims to retain the discriminatory power of the original features. Many feature selection algorithms have been proposed to address this issue; however, they are often computationally time-consuming. To overcome this shortcoming, we introduce a theoretic framework based on rough set theory, called positive approximation, which can be used to accelerate a heuristic process for feature selection from incomplete data. As an application of the proposed accelerator, a general feature selection algorithm is designed. By integrating the accelerator into a heuristic algorithm, we obtain several modified representative heuristic feature selection algorithms in rough set theory. Experiments show that these modified algorithms outperform their original counterparts, and that their advantage becomes more pronounced when dealing with larger data sets.
