18 similar documents found.
1.
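Attribute relative reduction algorithm based on a subset-type ant colony model  Total citations: 2, self-citations: 0, citations by others: 2
Attribute reduction in rough sets is a typical NP-hard problem. This paper proposes an attribute relative reduction algorithm based on a subset-type ant colony model. The algorithm uses transition probabilities to search over the attributes at random until it obtains an attribute subset whose classification ability matches that of the decision attributes. The proposed ant colony algorithm with pheromone mutation both improves solution quality and effectively avoids premature convergence. Experiments on 106 sets of medical case data show that the algorithm finds good relative reducts of the decision table and good decision rules.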
2.
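Feature selection algorithm based on association rules  Total citations: 2, self-citations: 0, citations by others: 2
Association rules can uncover associations among the attributes in a database. By preferring short rules when selecting relevant attributes, it may be possible to obtain a minimal attribute subset. On this basis, this paper proposes a feature selection algorithm based on association rules. Experimental results show that it outperforms several feature selection methods in both attribute subset size and classification accuracy. The paper also explores how support and confidence affect the algorithm: high support and confidence do not lead to high classification accuracy or small feature subsets, whereas a sufficient number of rules is a necessary condition for the algorithm to be effective.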
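The support and confidence measures mentioned in this abstract can be illustrated with a minimal sketch. The transaction encoding (items such as `"a=1"`) and the function names are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    ante = support(transactions, antecedent)
    both = support(transactions, frozenset(antecedent) | frozenset(consequent))
    return both / ante if ante else 0.0


# Hypothetical toy data: each transaction is a set of attribute=value items.
transactions = [
    frozenset({"a=1", "b=0", "class=yes"}),
    frozenset({"a=1", "b=1", "class=yes"}),
    frozenset({"a=0", "b=1", "class=no"}),
    frozenset({"a=1", "b=1", "class=no"}),
]

print(support(transactions, {"a=1", "class=yes"}))
print(confidence(transactions, {"a=1"}, {"class=yes"}))
```

A short rule such as `a=1 -> class=yes` involves a single condition attribute, which is why preferring short rules tends to favor small attribute subsets.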
3.
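Rough set theory is an effective mathematical tool for handling inconsistent, imprecise, and incomplete information. Attribute reduction is one of the key techniques of rough set theory, an important topic in data mining research, and a key problem in knowledge acquisition. Finding attribute reducts has been shown to be NP-hard; reduction usually serves as a preprocessing stage that prepares a decision table for classification analysis. This paper proposes an effective method, SEGMENT-SIG, which obtains a minimal reduct while preserving the classification consistency of the decision table. The paper analyzes the worst-case time complexity of the algorithm. Its output is two different classifiers: an IF-THEN rule system and a decision tree.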
4.
5.
6.
7.
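The nearest neighbor (NN) algorithm is a simple and practical supervised classification algorithm. To classify an instance with an unknown class label, however, NN must store the entire training set and compute the distance from that instance to every training instance, so its computational cost is very high. To overcome this drawback, P. Hart proposed the condensed nearest neighbor (CNN) rule, which searches the training set for a consistent subset of the original instances (a consistent subset is a subset that correctly classifies the remaining training instances). Its computational cost is still fairly high; for large databases in particular, finding a consistent subset is very time-consuming. To address this problem, a condensed nearest neighbor rule algorithm based on rough set techniques is proposed. The algorithm has three steps. First, rough set methods are used to compute an attribute reduct (feature selection), removing redundant attributes. Next, instances close to the boundary region are selected, removing redundant instances. Finally, a consistent subset is computed from the selected instances. The algorithm thus reduces the data in both the vertical and the horizontal direction. Experimental results show that the proposed method is effective.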
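As an illustration of the consistent-subset idea, here is a minimal sketch of Hart's original CNN rule only; it does not include the rough set steps (attribute reduction and boundary-region selection) that the abstract adds. The function names and the Euclidean distance are assumptions made for this example.

```python
import math


def nearest_label(store, point):
    """Label of the stored sample closest to `point` (Euclidean distance)."""
    return min(store, key=lambda s: math.dist(s[0], point))[1]


def condensed_nn(samples):
    """Return a subset of `samples` that classifies the training set with 1-NN.

    `samples` is a list of (point, label) pairs, with points given as tuples.
    """
    store = [samples[0]]
    changed = True
    while changed:
        changed = False
        for sample in samples:
            point, label = sample
            # Absorb every sample the current store misclassifies.
            # The membership check guarantees termination even when the
            # data contain identical points with conflicting labels.
            if nearest_label(store, point) != label and sample not in store:
                store.append(sample)
                changed = True
    return store


# Hypothetical one-dimensional toy data with two well-separated classes.
samples = [((0.0,), "a"), ((0.1,), "a"), ((1.0,), "b"), ((1.1,), "b")]
subset = condensed_nn(samples)
print(subset)
```

Each pass scans the whole training set, which is the cost that the rough-set-based variant tries to reduce by shrinking the data before this step.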
8.
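This paper proposes a rule extraction algorithm for incomplete decision tables based on attribute discernibility degree; it is a specialization-direction method. Starting from the empty set, it repeatedly selects the currently most important condition attribute to partition the object set, extracting certain rules from consistent blocks whose generalized decision value is unique and uncertain rules from the other consistent blocks. An attribute necessity check is then applied to remove redundant attributes from each rule. Finally, a rule reduction procedure simplifies the resulting rules and strengthens their generalization ability. Experimental results show that the proposed algorithm is more efficient and that the rules it produces are concise and effective.
9.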
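Computer Applications and Software (《计算机应用与软件》), 2014, (8)
Correlation-based attribute selection is an attribute subset evaluation method. It uses a heuristic evaluation to suppress correlation among the attributes within a subset, and uses the evaluation value to select a subset whose attributes correlate strongly with the class attribute and weakly with each other. This paper proposes adding the effect of the variance of inter-attribute correlations to the correlation-based attribute selection algorithm, so that, starting from the subset chosen by the original algorithm, attributes that correlate strongly with the other attributes in the subset can be removed. Experiments show that the subset selected by the improved algorithm contains no more attributes than the subset selected by the original algorithm. Subsets selected by the improved algorithm give higher classification efficiency with very little effect on classifier accuracy.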
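The heuristic evaluation described here can be sketched with the merit function of the standard correlation-based feature selection (CFS) method. This assumes the abstract refers to that formulation; the variance-based modification it proposes is not specified in the abstract and is not implemented below. The function name and input format are illustrative.

```python
import math


def cfs_merit(class_corr, feature_corr):
    """Merit of a k-attribute subset under the standard CFS heuristic.

    class_corr   -- one attribute-class correlation per attribute in the subset
    feature_corr -- one correlation per attribute pair in the subset (may be empty)
    merit = k * mean(class_corr) / sqrt(k + k * (k - 1) * mean(feature_corr))
    """
    k = len(class_corr)
    mean_cf = sum(class_corr) / k
    mean_ff = sum(feature_corr) / len(feature_corr) if feature_corr else 0.0
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)


# Hypothetical correlations: adding a second attribute that is strongly
# correlated with the first changes the merit only slightly.
print(cfs_merit([0.8], []))
print(cfs_merit([0.8, 0.6], [0.5]))
```

The numerator rewards attributes that predict the class, while the denominator penalizes redundancy among the attributes already in the subset.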
10.
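This paper studies algorithms for mining classification rules in consistent classification problems and proposes a new method for deriving classification rules based on attribute significance and partitions. An attribute is first selected according to an attribute significance measure, and partitions of the information system are constructed from that attribute and the decision attribute. Each partition is then described, and each description is simplified according to whether the simplification would introduce conflicts in the information system. Finally, classification rules are formed from the simplified descriptions.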
11.
The dominance-based rough set approach is proposed as a methodology for plunge grinding process diagnosis. The process is analyzed, and its diagnosis is then treated as a multi-criteria decision-making problem based on modelling the relationships between different process states and their symptoms using a set of rules induced from measured process data. The development of the diagnostic system is characterized by three phases. First, the process experimental data are prepared in the form of a decision table. Using selected methods of signal processing, each process run is described by 17 process state features (condition attributes) and 5 criteria evaluating process state and results (decision attributes). The semantic correlation between all the attributes is modelled. Next, the phases of condition attribute selection and knowledge extraction are tightly integrated with the phase of model evaluation using an iterative approach. After each loop of the iterative feature selection procedure, rules are induced using the VC-DomLEM algorithm. The classification capability of the induced rules is evaluated using the leave-one-out method and a set of measures. The classification accuracy of individual models is in the range of 80.77–98.72%. The induced set of rules constitutes a classifier for assessing new process run cases.
12.
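In a decision table, the confidence and the object coverage of decision rules are important indicators of decision-making ability. Building on the rough entropy of knowledge, this paper introduces the concept of decision entropy and defines attribute significance in terms of it. The decision entropy of a condition attribute subset is then used to measure its importance for decision classification, and a decision tree is constructed recursively from the top down. Finally, the decision tree is traversed and the resulting decision rules are simplified. The advantage of this method is that no attribute reduction is performed before constructing the decision tree and extracting rules, the computation is intuitive, and the time complexity is low. Analysis of an example shows that the method obtains simpler and more effective decision rules.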
13.
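Decision trees are a common classification method in data mining tasks. When constructing a decision tree, the criterion used to select the splitting attribute directly affects classification performance, and traditional decision tree algorithms usually rely on information-theoretic measures. Based on rough set theory, this paper proposes a decision tree rule extraction algorithm that uses attribute significance and dependency as the attribute selection criterion. The algorithm extracts explicit classification rules, yields a simpler structure than the traditional ID3 algorithm, and improves classification efficiency.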
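The dependency and significance measures used as the selection criterion can be sketched with the standard rough set definitions: the dependency of the decision on a condition attribute set is the fraction of objects in the positive region, and the significance of an attribute is the drop in dependency when it is removed. This is an illustrative sketch under those textbook definitions, not the authors' algorithm; the table and names are hypothetical.

```python
from collections import defaultdict


def dependency(rows, cond, dec):
    """gamma = |POS_cond(dec)| / |U| for a decision table given as a list of dicts."""
    blocks = defaultdict(list)
    for row in rows:
        # Objects with equal values on `cond` fall into the same equivalence class.
        blocks[tuple(row[a] for a in cond)].append(row[dec])
    # A class belongs to the positive region when all its decisions agree.
    positive = sum(len(v) for v in blocks.values() if len(set(v)) == 1)
    return positive / len(rows)


def significance(rows, cond, attr, dec):
    """Drop in dependency when `attr` is removed from `cond`."""
    rest = [a for a in cond if a != attr]
    return dependency(rows, cond, dec) - dependency(rows, rest, dec)


# Hypothetical decision table: `play` depends only on `windy`.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
]
cond = ["outlook", "windy"]
print(significance(rows, cond, "windy", "play"))
print(significance(rows, cond, "outlook", "play"))
```

A tree builder using this criterion would split on the attribute with the highest significance, here `windy`.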
14.
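C4.5 is a highly influential decision tree induction algorithm, but the trees it generates have limited classification accuracy, many branches, and large size. To address these problems, this paper proposes an improved C4.5 algorithm based on rough set theory and the CAIM criterion. The algorithm discretizes continuous attributes with a CAIM-based discretization method, which reduces the information lost during discretization and improves classification accuracy. The discretized samples then undergo attribute reduction based on rough set theory, which removes redundant attributes and reduces the size of the generated tree. Experiments confirm that the algorithm effectively improves the classification accuracy of the decision trees generated by C4.5 and reduces their size.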
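The CAIM criterion scores a candidate discretization by how strongly each interval is dominated by a single class. The sketch below computes that score using the commonly cited definition, CAIM = (1/n) Σ_r max_r² / M_r, where n is the number of intervals, max_r is the largest class count in interval r, and M_r is the number of samples in interval r. The interval convention (each cut point belongs to the interval on its left) and the treatment of empty intervals as contributing zero are assumptions of this example.

```python
from bisect import bisect_left
from collections import Counter


def caim(values, labels, cuts):
    """CAIM score of the discretization of `values` defined by inner cut points."""
    cuts = sorted(cuts)
    counts = [Counter() for _ in range(len(cuts) + 1)]
    for v, y in zip(values, labels):
        # bisect_left counts the cuts strictly below v, so a value equal
        # to a cut point falls into the interval on the left of that cut.
        counts[bisect_left(cuts, v)][y] += 1
    total = 0.0
    for interval in counts:
        size = sum(interval.values())
        if size:  # empty intervals contribute nothing (assumption)
            total += max(interval.values()) ** 2 / size
    return total / len(counts)


# Hypothetical data: the cut at 2.5 separates the two classes perfectly.
values = [1, 2, 3, 4]
labels = ["a", "a", "b", "b"]
print(caim(values, labels, [2.5]))
print(caim(values, labels, [1.5]))
print(caim(values, labels, []))
```

A CAIM-based discretizer greedily adds the cut point that raises this score the most, which is what keeps class information from being lost before tree induction.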
15.
Rough set feature selection and rule induction for prediction of malignancy degree in brain glioma  Total citations: 1, self-citations: 0, citations by others: 1
The degree of malignancy in brain glioma is assessed based on magnetic resonance imaging (MRI) findings and clinical data before operation. These data contain irrelevant features, while uncertainties and missing values also exist. Rough set theory can deal with vagueness and uncertainty in data analysis, and can efficiently remove redundant information. In this paper, a rough set method is applied to predict the degree of malignancy. As feature selection can improve the classification accuracy effectively, rough set feature selection algorithms are employed to select features. The selected feature subsets are used to generate decision rules for the classification task. A rough set attribute reduction algorithm that employs a search method based on particle swarm optimization (PSO) is proposed in this paper and compared with other rough set reduction algorithms. Experimental results show that reducts found by the proposed algorithm are more efficient and can generate decision rules with better classification performance. The rough set rule-based method can achieve higher classification accuracy than other intelligent analysis methods such as neural networks, decision trees and a fuzzy rule extraction algorithm based on Fuzzy Min-Max Neural Networks (FRE-FMMNN). Moreover, the decision rules induced by the rough set rule induction algorithm can reveal regular and interpretable patterns of the relations between glioma MRI features and the degree of malignancy, which are helpful for medical experts.
16.
17.
An incremental algorithm generating satisfactory decision rules and a rule post-processing technique are presented. The rule induction algorithm is based on the Apriori algorithm. It is extended to handle preference-ordered domains of attributes (called criteria) within the Variable Consistency Dominance-based Rough Set Approach. It deals, moreover, with the problem of missing values in the data set. The algorithm has been designed for medical applications which require: (i) a careful selection of the set of decision rules representing medical experience, (ii) an easy update of these decision rules because the data set evolves in time, and (iii) not only a high predictive capacity of the set of decision rules but also a thorough explanation of a proposed decision. To satisfy all these requirements, we propose an incremental algorithm for induction of a satisfactory set of decision rules and a post-processing technique on the generated set of rules. User's preferences with respect to attributes are also taken into account. A measure of the quality of a decision rule is proposed. It is used to select the most interesting representatives in the final set of rules.
18.
Wani M.A. 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2001,31(4):650-657
This paper describes a new algorithm for obtaining rules automatically from training examples. The algorithm is applicable to examples involving objects with both discrete and continuous-valued attributes. The paper explains a new quantization procedure for continuous-valued attributes and shows how appropriate ranges of values of various attributes are obtained. The algorithm uses a decision-tree-based approach for obtaining rules, but unlike other tree-based algorithms such as ID3, it allows more than one attribute at a node, which greatly improves its performance. The ability of the algorithm to obtain a measure of partial match further enhances its generalization characteristic. The algorithm produces the same rules irrespective of the order of presentation of training examples. The algorithm has been demonstrated on classification problems. The results have compared favorably with those obtained by existing inductive learning algorithms.