Similar literature
 20 similar documents found (search time: 46 ms)
1.
Rough set theory is one of the effective methods for feature selection, and it preserves the meaning of the features. The essence of the rough set approach to feature selection is to find a subset of the original features. Since finding a minimal subset of the features is an NP-hard problem, it is necessary to investigate effective and efficient heuristic algorithms. Ant colony optimization (ACO) has been successfully applied to many difficult combinatorial problems such as quadratic assignment, traveling salesman and scheduling. ACO is particularly attractive for feature selection because no heuristic information can guide the search to the optimal minimal subset every time, whereas ants can discover the best feature combinations as they traverse the graph. In this paper, we propose a new rough set approach to feature selection based on ACO, which adopts mutual-information-based feature significance as heuristic information. A novel feature selection algorithm is also given. Jensen and Shen proposed an ACO-based feature selection approach that starts from a random feature. Our approach starts from the feature core, which reduces the complete graph to a smaller one. To verify the efficiency of our algorithm, experiments are carried out on some standard UCI datasets. The results demonstrate that our algorithm provides an efficient way to find a minimal subset of the features.

2.
The Intelligent Water Drop (IWD) algorithm is a recent stochastic swarm-based method that is useful for solving combinatorial and function optimization problems. In this paper, we investigate the effectiveness of the selection method in the solution construction phase of the IWD algorithm. Instead of the fitness proportionate selection method in the original IWD algorithm, two ranking-based selection methods, namely linear ranking and exponential ranking, are proposed. Both ranking-based selection methods aim to address the identified limitations of the fitness proportionate selection method, to enable the IWD algorithm to escape from local optima, and to maintain its search diversity. To evaluate the usefulness of the proposed ranking-based selection methods, a series of experiments on three combinatorial optimization problems, i.e., rough set feature subset selection, multiple knapsack and travelling salesman problems, is conducted. The results demonstrate that the exponential ranking selection method is able to preserve the search diversity, thereby improving the performance of the IWD algorithm.
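The two ranking-based selection schemes named in this abstract can be illustrated with a minimal Python sketch. It assumes the common textbook formulations of linear and exponential ranking, which may differ in detail from the paper's variants; the function names and the parameters `eta_minus` and `c` are illustrative choices, not taken from the source.

```python
import random

def linear_ranking_probs(fitness, eta_minus=0.5):
    """Selection probabilities under linear ranking (higher fitness is better).

    Candidates are ranked from worst to best. eta_minus in [0, 1] is the
    expected share of the worst candidate and eta_plus = 2 - eta_minus.
    """
    n = len(fitness)
    if n == 1:
        return [1.0]
    eta_plus = 2.0 - eta_minus
    order = sorted(range(n), key=lambda i: fitness[i])  # worst first
    probs = [0.0] * n
    for rank, idx in enumerate(order):  # rank runs from 0 (worst) to n - 1 (best)
        probs[idx] = (eta_minus + (eta_plus - eta_minus) * rank / (n - 1)) / n
    return probs

def exponential_ranking_probs(fitness, c=0.8):
    """Selection probabilities under exponential ranking, with 0 < c < 1.

    The best candidate gets weight 1, the next best c, then c**2, and so on.
    """
    n = len(fitness)
    order = sorted(range(n), key=lambda i: fitness[i])  # worst first
    weights = [0.0] * n
    for rank, idx in enumerate(order):
        weights[idx] = c ** (n - 1 - rank)
    total = sum(weights)
    return [w / total for w in weights]

def select(candidates, probs, rng=random):
    """Draw one candidate according to the given probabilities."""
    return rng.choices(candidates, weights=probs, k=1)[0]
```

Because both schemes depend only on rank order, a single candidate with a very large fitness value cannot dominate the draw the way it can under fitness proportionate selection.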

3.
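Zhai Junhai, Liu Bo, Zhang Sufang. CAAI Transactions on Intelligent Systems (《智能系统学报》), 2017, 12(3): 397-404
Feature selection is the process of selecting a feature subset from the full initial feature set according to given criteria, and it is an important preprocessing step in data mining. Removing redundant attributes reduces algorithmic complexity and improves performance. For the problem of selecting discrete-valued features, this paper proposes a feature selection method that combines rough-set relative classification information entropy with particle swarm optimization: the particle swarm algorithm drives the search, with relative classification information entropy as the fitness function. The method is compared experimentally with other feature selection methods based on evolutionary algorithms, and the results show that the proposed method has certain advantages.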

4.
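The fuzzy decision-theoretic rough set is an important extension of decision-theoretic rough set theory to the fuzzy set setting, but the model tolerates noisy data poorly. To address this, a limiting threshold is introduced into the traditional fuzzy similarity relation, yielding an improved fuzzy similarity relation. On this basis the original fuzzy decision-theoretic rough set is reconstructed, giving an improved fuzzy decision-theoretic rough set model. Using different feature selection modes, minimum-decision-cost feature selection algorithms with two search strategies are designed on top of the improved model. Experimental analysis shows that the proposed algorithms outperform traditional algorithms.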

5.
Selecting an optimal subset from a large original feature set is an important and difficult problem in the design of pattern classifiers. In this paper, we use tabu search to solve this feature selection problem and compare it with classic algorithms, such as sequential methods and the branch and bound method, and with most other suboptimal methods proposed recently, such as the genetic algorithm and sequential forward (backward) floating search methods. Based on the experimental results, tabu search is shown to be a promising tool for feature selection in terms of both the quality of the obtained feature subset and computational efficiency. The effects of the parameters of tabu search are also analyzed experimentally.

6.
In recent years, fuzzy rough set theory has emerged as a suitable tool for performing feature selection. Fuzzy rough feature selection enables us to analyze the discernibility of the attributes, highlighting the most attractive features in the construction of classifiers. However, its results can be enhanced even more if other data reduction techniques, such as instance selection, are considered. In this work, a hybrid evolutionary algorithm for data reduction, using both instance and feature selection, is presented. A global process of instance selection, carried out by a steady-state genetic algorithm, is combined with a fuzzy rough set based feature selection process, which searches for the most interesting features to enhance both the evolutionary search process and the final preprocessed data set. The experimental study, the results of which have been contrasted through nonparametric statistical tests, shows that our proposal obtains high reduction rates on training sets which greatly enhance the behavior of the nearest neighbor classifier.

7.
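Ant colony feature selection method based on fuzzy rough set information entropy   Total citations: 1 (self-citations: 0, citations by others: 1)
Zhao Junyang, Zhang Zhili. Journal of Computer Applications (《计算机应用》), 2009, 29(1): 109-111,
Most heuristic algorithms currently proposed for feature selection on high-dimensional data easily fall into local optima and cannot search the whole feature space effectively. To improve parallel search over the feature domain, the search strategy, pheromone update and state transition rule of the ant colony model are improved on the basis of the information entropy principle of fuzzy rough sets, and an ant colony feature selection method is proposed. Experiments on UCI data show that the algorithm selects features better than traditional feature selection algorithms and is effective.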

8.
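Yang Zhenyu, Ye Jun, Ji Yuxuan, Ao Jiaxin, Wang Lei. Application Research of Computers (《计算机应用研究》), 2022, 39(4): 1118-1123+1131
Most existing feature selection methods optimized by ant colony algorithms use attribute dependency or information-entropy attribute significance as the heuristic search factor on paths. On some decision tables, however, such search converges prematurely or returns feature subsets containing redundant features, which significantly reduces selection accuracy. To address this, an attribute significance measure is proposed based on the proportion of discernibility matrix entries in which a condition attribute appears; with this discernibility-matrix significance as the path heuristic factor, a feature subset search method based on the discernibility matrix and ant colony optimization is designed. The algorithm starts from the feature core, and the ants successively add the features with the highest selection probability to the core set; the algorithm terminates when a minimal feature subset is found. A worked example and experiments on UCI datasets show that, compared with feature selection methods based on attribute dependency and information-entropy attribute significance, the algorithm can in general find a minimal feature subset at lower cost and effectively reduce the computational workload.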
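The core-first construction step described in this abstract can be sketched in Python using the standard ACO transition rule, in which the probability of choosing a feature is proportional to pheromone raised to `alpha` times heuristic value raised to `beta`. This is a simplified illustration under stated assumptions: pheromone is stored per feature rather than per edge, the next feature is sampled rather than chosen greedily, and `heuristic`, `is_reduct` and all other names are hypothetical placeholders, not the paper's definitions.

```python
import random

def transition_probs(candidates, pheromone, heuristic, alpha=1.0, beta=1.0):
    """Probability of an ant picking each candidate feature next.

    pheromone[j] and heuristic[j] are non-negative scores for feature j;
    alpha and beta weight the two terms.
    """
    weights = [(pheromone[j] ** alpha) * (heuristic[j] ** beta) for j in candidates]
    total = sum(weights)
    if total == 0:  # no information at all: fall back to a uniform choice
        return [1.0 / len(candidates)] * len(candidates)
    return [w / total for w in weights]

def build_subset(core, features, pheromone, heuristic, is_reduct, rng=random):
    """Start from the feature core and add features until is_reduct(subset) holds."""
    subset = list(core)
    candidates = [f for f in features if f not in subset]
    while candidates and not is_reduct(subset):
        probs = transition_probs(candidates, pheromone, heuristic)
        chosen = rng.choices(candidates, weights=probs, k=1)[0]
        subset.append(chosen)
        candidates.remove(chosen)
    return subset
```

Starting from the core shrinks the candidate list each ant has to consider, which is the source of the reduced search cost that the abstract reports.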

9.
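For the problem of optimally selecting network traffic feature attributes, a network traffic feature selection method combining rough sets and tabu search (RS-TS) is proposed. The method first reduces the traffic feature attributes with a rough set algorithm, uses the resulting feature subset as the initial solution of tabu search, and then applies tabu search to obtain the optimal feature subset. Experiments verify that RS-TS outperforms GA-based and IG-based feature selection methods, effectively removes redundant feature attributes of network traffic, and improves traffic classification accuracy.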

10.
We propose a new feature selection strategy based on rough sets and particle swarm optimization (PSO). Rough sets have been used as a feature selection method with much success, but current hill-climbing rough set approaches to feature selection are inadequate at finding optimal reductions, as no perfect heuristic can guarantee optimality. On the other hand, complete searches are not feasible for even medium-sized datasets. Stochastic approaches therefore provide a promising feature selection mechanism. Like Genetic Algorithms, PSO is a new evolutionary computation technique, in which each potential solution is seen as a particle with a certain velocity flying through the problem space. The particle swarms find optimal regions of the complex search space through the interaction of individuals in the population. PSO is attractive for feature selection in that particle swarms will discover the best feature combinations as they fly within the subset space. Compared with GAs, PSO does not need complex operators such as crossover and mutation; it requires only primitive and simple mathematical operators, and is computationally inexpensive in terms of both memory and runtime. Experiments on UCI data compare the proposed algorithm with a GA-based approach and other deterministic rough set reduction algorithms. The results show that PSO is efficient for rough set-based feature selection.

11.
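Application of Tabu search in feature selection   Total citations: 25 (self-citations: 0, citations by others: 25)
This paper studies the use of Tabu search to select a set of effective features from a large feature set. The effects of parameters such as the tabu list length, the neighborhood size and the number of candidate solutions on Tabu search are analyzed. For two feature selection problems, experimental comparison with classic and recently proposed feature selection methods such as SFS, SBS, GSFS, GSBS, PTA, BB, GA, SFFS and SFBS shows that Tabu search achieves satisfactory results in both solution time and solution quality.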

12.
The degree of malignancy in brain glioma is assessed based on magnetic resonance imaging (MRI) findings and clinical data before operation. These data contain irrelevant features, as well as uncertainties and missing values. Rough set theory can deal with vagueness and uncertainty in data analysis, and can efficiently remove redundant information. In this paper, a rough set method is applied to predict the degree of malignancy. As feature selection can improve the classification accuracy effectively, rough set feature selection algorithms are employed to select features. The selected feature subsets are used to generate decision rules for the classification task. A rough set attribute reduction algorithm that employs a search method based on particle swarm optimization (PSO) is proposed in this paper and compared with other rough set reduction algorithms. Experimental results show that reducts found by the proposed algorithm are more efficient and can generate decision rules with better classification performance. The rough set rule-based method can achieve higher classification accuracy than other intelligent analysis methods such as neural networks, decision trees and a fuzzy rule extraction algorithm based on Fuzzy Min-Max Neural Networks (FRE-FMMNN). Moreover, the decision rules induced by the rough set rule induction algorithm can reveal regular and interpretable patterns of the relations between glioma MRI features and the degree of malignancy, which are helpful for medical experts.

13.
Feature selection is viewed as an important preprocessing step for pattern recognition, machine learning and data mining. Traditional hill-climbing search approaches to feature selection have difficulty finding optimal reducts. Current stochastic search strategies, such as GA, ACO and PSO, provide a more robust solution but at the expense of increased computational effort. It is necessary to investigate fast and effective search algorithms. Rough set theory provides a mathematical tool to discover data dependencies and reduce the number of features contained in a dataset by purely structural methods. In this paper, we define a structure called power set tree (PS-tree), which is an order tree representing the power set, and each possible reduct is mapped to a node of the tree. Then, we present a rough set approach to feature selection based on PS-tree. Two kinds of pruning rules for PS-tree are given, together with two novel feature selection algorithms based on PS-tree. Experimental results demonstrate that our algorithms are effective and efficient.

14.
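Because feature selection in an unsupervised setting cannot rely on class information, a disagreement measure (DAM) is proposed using fuzzy rough set theory to quantify how much the fuzzy equivalence classes induced by any two feature sets or features differ in meaning. On this basis the unsupervised feature selection algorithm DAMUFS is implemented; without supervision it selects feature subsets that carry more information while keeping attribute redundancy within the subset as small as possible. In the experiments, DAMUFS is compared in classification performance with several unsupervised and supervised feature selection algorithms on multiple datasets, and the results demonstrate its effectiveness.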

15.
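Classification problems are ubiquitous in modern industrial production. Using feature selection to filter useful information before the classification task can effectively improve classification efficiency and accuracy. The minimal-redundancy maximal-relevance algorithm (mRMR) maximizes the relevance between features and the class while minimizing redundancy among features, and selects feature subsets effectively; however, it suffers from large deviations in feature significance in its middle and late stages and cannot directly output a feature subset. To address these problems, a feature selection algorithm combining the neighborhood rough set discernibility matrix with the mRMR principle is proposed. Following the maximal-relevance and minimal-redundancy principle, feature significance is defined with neighborhood entropy and neighborhood mutual information so as to handle mixed data types better. A dynamic discernibility set is defined from the discernibility matrix; its dynamic evolution removes redundant attributes, narrows the search range and refines the feature subset, and the discernibility matrix is used to determine the stopping condition of the iteration. SVM, J48, KNN and MLP are used as classifiers to evaluate the algorithm. Experiments on public datasets show that, compared with existing algorithms, the proposed algorithm improves average classification accuracy by about 2% and shortens feature selection time on datasets with many features. The proposed algorithm inherits the advantages of the discernibility matrix and mRMR and handles feature selection problems effectively.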
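The plain mRMR criterion that this abstract builds on can be sketched in Python as a greedy loop that scores each remaining feature by relevance minus mean redundancy. The sketch assumes discrete features and ordinary Shannon mutual information; it does not implement the neighborhood entropy, neighborhood mutual information or discernibility-set machinery of the proposed algorithm, and the function names and the parameter `k` are illustrative.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Shannon mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(c / n * log2(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())

def mrmr(columns, labels, k):
    """Greedily pick k feature indices by relevance minus mean redundancy."""
    selected = []
    remaining = list(range(len(columns)))
    while remaining and len(selected) < k:
        def score(j):
            relevance = mutual_information(columns[j], labels)
            if not selected:
                return relevance
            redundancy = sum(mutual_information(columns[j], columns[s])
                             for s in selected) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Note that the caller must supply `k`: plain mRMR only ranks features, which is the "cannot directly output a feature subset" limitation mentioned in the abstract.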

16.
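Selecting a good feature subset from a massive text collection is an NP-hard problem in text classification, and genetic algorithms can often handle such problems effectively. To overcome the "drift" and "premature convergence" problems of traditional genetic algorithms, rough sets are first introduced, and on this basis a fitness function, an adaptive crossover operator, an adaptive mutation operator and a reasonable termination condition are designed in detail. A feature selection algorithm is built on this genetic algorithm and validated on the corpus provided by Fudan University. The experimental results show that the feature selection algorithm performs well.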

17.
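Sun Lin, Zhao Jing, Xu Jiucheng, Wang Xinya. Journal of Computer Applications (《计算机应用》), 2022, 42(5): 1355-1366
To address the problems that the classic monarch butterfly optimization (MBO) algorithm handles continuous data poorly and that rough set models lack the capacity to process large-scale, high-dimensional, complex data, a feature selection algorithm based on neighborhood rough sets (NRS) and MBO is proposed. First, local disturbance and a group division strategy are combined with MBO, and a transmission mechanism is constructed to form a binary MBO (BMBO) algorithm. Second, a mutation operator is introduced to strengthen exploration, giving a BMBO algorithm based on the mutation operator (BMBOM). Then, a fitness function is constructed from the NRS neighborhood degree, and the fitness values of the initialized feature subsets are evaluated and ranked. Finally, BMBOM iteratively searches for the optimal feature subset, and a metaheuristic feature selection algorithm is designed. The optimization performance of BMBOM is evaluated on benchmark functions, and the classification ability of the proposed feature selection algorithm is evaluated on UCI datasets. Experimental results show that on 5 benchmark functions the best value, worst value, mean and standard deviation of BMBOM are clearly better than those of MBO and particle swarm optimization (PSO). On UCI datasets, compared with rough-set-based optimized feature selection algorithms, feature selection algorithms combining rough sets with optimization algorithms, feature selection algorithms combining NRS with optimization algorithms, and a feature selection algorithm based on binary grey wolf optimization, the proposed algorithm performs well on three metrics (classification accuracy, number of selected features and fitness value) and can select optimal feature subsets with few features and high classification accuracy.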

18.
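This paper first briefly analyzes several classic feature selection methods and summarizes their shortcomings, then proposes the concept of feature concentration degree. Next, the set of discernible object pairs is introduced into rough sets and an attribute reduction algorithm based on it is proposed. Finally, this attribute reduction algorithm is combined with feature concentration degree into a comprehensive feature selection method. The method first uses feature concentration degree for a preliminary selection that filters out some terms and reduces the sparsity of the feature space, then applies the proposed attribute reduction algorithm to eliminate redundancy, yielding a more representative feature subset. Experimental results show that the comprehensive method performs well.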

19.
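In practical applications, data are often incomplete and dynamic. For feature selection on dynamic incomplete data, an incremental feature selection method based on the tolerance rough set model and information entropy theory is proposed. First, the dynamic update patterns of the condition partition and the decision classification over the universe are established for the case where feature values in an incomplete information system are updated dynamically; the incremental computation mechanism of the incomplete tolerance information entropy, which serves as the evaluation criterion of feature significance, is analyzed; and this mechanism is introduced into the heuristic optimal...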

20.
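Rough set theory, a mathematical tool for handling imprecise and inconsistent data, is widely used in feature subset selection and attribute reduction. Most existing algorithms use attribute dependency to measure the significance of a feature subset, but on inconsistent information systems dependency may fail to find any feature subset. This paper discusses the shortcomings of attribute dependency as a measure, introduces a consistency measure, analyzes its relationship to dependency, redefines redundant attributes and reducts of an information system, and constructs a forward greedy search algorithm based on the consistency measure. Experiments on UCI datasets verify that the algorithm handles inconsistent information systems effectively.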
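The contrast drawn in this abstract between dependency and a consistency measure can be illustrated with a small Python sketch. The dependency degree below is the standard rough set definition (the fraction of objects in the positive region). The abstract does not state its consistency measure, so the majority-decision version used here is an assumption, and `forward_greedy` is an illustrative search loop rather than the paper's algorithm.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Fraction of objects lying in blocks with a single decision value."""
    pos = sum(len(b) for b in partition(rows, attrs)
              if len({labels[i] for i in b}) == 1)
    return pos / len(rows)

def consistency(rows, labels, attrs):
    """Fraction of objects agreeing with their block's majority decision (assumed definition)."""
    total = 0
    for b in partition(rows, attrs):
        counts = defaultdict(int)
        for i in b:
            counts[labels[i]] += 1
        total += max(counts.values())
    return total / len(rows)

def forward_greedy(rows, labels, measure):
    """Add the attribute with the largest score until the measure stops improving."""
    n_attrs = len(rows[0])
    selected, best = [], measure(rows, labels, [])
    while len(selected) < n_attrs:
        scored = [(measure(rows, labels, selected + [a]), a)
                  for a in range(n_attrs) if a not in selected]
        score, attr = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:  # no attribute improves the measure: stop
            break
        selected.append(attr)
        best = score
    return selected

# Toy inconsistent table: rows 0 and 1 are identical but carry different decisions.
rows = [(0, 0), (0, 0), (0, 1), (1, 0), (1, 1), (1, 1)]
labels = [0, 1, 1, 1, 0, 0]
print(forward_greedy(rows, labels, dependency))   # dependency-driven search
print(forward_greedy(rows, labels, consistency))  # consistency-driven search
```

On this toy table every single attribute has dependency 0, so the dependency-driven search stops with an empty subset, whereas the consistency-driven search keeps improving and selects both attributes.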
