Similar Articles
20 similar articles found (search time: 0 ms)
1.
Neural Computing and Applications - Feature selection (FS) is considered one of the core concepts in the areas of machine learning and data mining, and it immensely impacts the performance of...

2.
Fast orthogonal forward selection algorithm for feature subset selection   (Cited: 2 in total; 0 self-citations, 2 by others)
Feature selection is an important issue in pattern classification. In this study, we develop a fast orthogonal forward selection (FOFS) algorithm for feature subset selection. The FOFS algorithm employs an orthogonal transform to decompose correlations among candidate features, but it performs the orthogonal decomposition implicitly. Consequently, the fast algorithm demands less computational effort than conventional orthogonal forward selection (OFS).
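A minimal sketch of the orthogonal forward selection idea, with the Gram-Schmidt decomposition written out explicitly for clarity (the paper's FOFS performs it implicitly; all names here are illustrative):

```python
import numpy as np

def orthogonal_forward_select(X, y, k):
    """Greedy forward selection: orthogonalize each remaining candidate
    against the already-selected features (Gram-Schmidt) and keep the one
    whose residual correlates best with the target."""
    n, d = X.shape
    y = y - y.mean()
    selected, basis = [], []
    for _ in range(k):
        best_j, best_score, best_q = None, -np.inf, None
        for j in range(d):
            if j in selected:
                continue
            r = X[:, j] - X[:, j].mean()
            for q in basis:                   # strip components along chosen features
                r = r - (q @ r) * q
            norm = np.linalg.norm(r)
            if norm < 1e-12:                  # candidate is redundant with selection
                continue
            score = (r @ y) ** 2 / norm ** 2  # squared correlation with target
            if score > best_score:
                best_j, best_score, best_q = j, score, r / norm
        selected.append(best_j)
        basis.append(best_q)
    return selected
```

Because each candidate is scored on its residual after projecting out the chosen features, an exact duplicate of an already-selected feature contributes nothing and is skipped.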

3.
Feature selection is an important filtering method for data analysis, pattern classification, data mining, and related tasks; it reduces the number of features by removing irrelevant and redundant data. In this paper, we propose a hybrid filter-wrapper feature subset selection algorithm called maximum Spearman minimum covariance cuckoo search (MSMCCS). First, a filter algorithm called maximum Spearman minimum covariance (MSMC) is proposed, based on the Spearman correlation and the covariance. Second, three parameters are introduced in MSMC to adjust the weights of correlation and redundancy, improving the relevance of feature subsets while reducing their redundancy. Third, in the improved cuckoo search algorithm, a weighted combination strategy selects candidate feature subsets, a crossover-mutation scheme adjusts them, and the filtered features are finally assembled into optimal feature subsets. MSMCCS thus combines the efficiency of filters with the greater accuracy of wrappers. Experimental results on eight common data sets from the University of California at Irvine Machine Learning Repository showed that MSMCCS achieved better classification accuracy than seven wrapper methods, one filter method, and two hybrid methods, and performed favourably on the Wilcoxon signed-rank test and the sensitivity-specificity test.
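A toy sketch of the filter half of the idea: score a candidate subset by its rank correlation with the class minus the pairwise covariance inside the subset. The `alpha`/`beta` weights and all names are illustrative, not the paper's three-parameter formulation:

```python
import numpy as np
from itertools import combinations

def rankdata(a):
    """Rank transform (ties broken by position) for Spearman correlation."""
    order = np.argsort(a)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(a))
    return ranks

def spearman(a, b):
    """Spearman correlation = Pearson correlation of the ranks."""
    ra, rb = rankdata(a).astype(float), rankdata(b).astype(float)
    ra -= ra.mean(); rb -= rb.mean()
    return (ra @ rb) / (np.linalg.norm(ra) * np.linalg.norm(rb))

def msmc_score(X, y, subset, alpha=1.0, beta=1.0):
    """Filter criterion in the spirit of MSMC: reward rank correlation with
    the class, penalize average absolute covariance inside the subset."""
    rel = np.mean([abs(spearman(X[:, j], y)) for j in subset])
    if len(subset) < 2:
        red = 0.0
    else:
        red = np.mean([abs(np.cov(X[:, i], X[:, j])[0, 1])
                       for i, j in combinations(subset, 2)])
    return alpha * rel - beta * red
```

Adding a feature that is a scaled copy of one already in the subset leaves the relevance term unchanged but inflates the covariance penalty, so the redundant subset scores lower.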

4.
Feature selection based on a genetic algorithm and support vector machine   (Cited: 3 in total; 0 self-citations, 3 by others)
To obtain feature subsets with high classification accuracy, a feature selection method based on a genetic algorithm and a support vector machine is proposed. Using prior information supplied by the ReliefF algorithm, the SVM parameters are encoded into the feature selection chromosome, and the genetic algorithm then searches for the optimal combination of feature subset and SVM parameters. Experimental results show that the feature subsets and SVM parameters selected by this method achieve high classification accuracy with small feature subsets.
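One common way to realize this kind of joint encoding is to concatenate a binary feature mask with fixed-point bit fields for the SVM parameters; the layout below is an assumed illustration (bit widths and parameter ranges are not from the paper):

```python
import numpy as np

def decode(chrom, n_features, c_range=(0.01, 100.0), g_range=(1e-4, 10.0)):
    """Split a binary chromosome into a feature mask and SVM (C, gamma).
    The tail bits are read as fixed-point numbers mapped into each range."""
    mask = np.asarray(chrom[:n_features], dtype=bool)

    def bits_to_float(bits, lo, hi):
        v = int("".join(map(str, bits)), 2)           # bits -> integer
        return lo + (hi - lo) * v / (2 ** len(bits) - 1)

    half = (len(chrom) - n_features) // 2
    C = bits_to_float(chrom[n_features:n_features + half], *c_range)
    gamma = bits_to_float(chrom[n_features + half:], *g_range)
    return mask, C, gamma
```

A GA fitness function would then train an SVM on `X[:, mask]` with the decoded `(C, gamma)` and return cross-validated accuracy, so selection pressure acts on the feature subset and the parameters jointly.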

5.
A feature selection algorithm for text clustering based on class information   (Cited: 2 in total; 0 self-citations, 2 by others)
Text clustering is an unsupervised learning method, and the lack of class information makes it difficult to apply supervised feature selection directly. This paper therefore proposes a feature selection algorithm based on class information: information-gain feature selection is applied on top of the result of a density-based clustering algorithm to re-select the features with the strongest discriminating power. Experiments verify the feasibility and effectiveness of the algorithm.
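The information-gain re-selection step can be sketched as follows, treating the cluster ids produced by the density-based step as pseudo-class labels (a minimal illustration, not the paper's code):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a discrete label sequence, in bits."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(feature_values, labels):
    """Information gain of a discrete feature with respect to the
    (pseudo-)class labels: H(labels) - H(labels | feature)."""
    total = entropy(labels)
    n = len(labels)
    cond = 0.0
    for v in set(feature_values):
        idx = [i for i, fv in enumerate(feature_values) if fv == v]
        cond += len(idx) / n * entropy([labels[i] for i in idx])
    return total - cond
```

Ranking features by this score and keeping the top-k gives the "most discriminating" subset relative to the clustering.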

6.
Dimensionality reduction has attracted extensive attention in machine learning. It usually takes two forms: feature selection and subspace learning. Many researchers have demonstrated that dimensionality reduction is meaningful in real applications; unfortunately, much of this work uses feature selection and subspace learning independently. This paper explores a novel supervised feature selection algorithm that takes subspace learning into account. Specifically, it employs an ℓ2,1-norm and an ℓ2,p-norm regularizer, respectively, to conduct sample denoising and feature selection by exploring the correlation structure of the data. It then uses two constraints (hypergraph and low-rank) to capture the local and the global structure of the data, respectively. Finally, an optimization framework iteratively optimizes each parameter while fixing the others until the algorithm converges. Extensive experiments show that the new supervised feature selection method obtains strong results on eighteen public data sets.

7.
8.

Feature selection is one of the significant steps in classification tasks: a pre-processing step that selects a small subset of significant features able to contribute the most to the classification process. Many metaheuristic optimization algorithms have been successfully applied to feature selection. The genetic algorithm (GA), a fundamental optimization tool, has been widely used in feature selection tasks; however, it suffers from hyperparameter sensitivity, high computational complexity, and the randomness of the selection operation. We therefore propose a new rival genetic algorithm, as well as a fast version of it, to enhance the performance of GA in feature selection. The proposed approaches use a competition strategy combining new selection and crossover schemes, aiming to improve global search capability, and a dynamic mutation rate to enhance the search behaviour during mutation. The approaches are validated on 23 benchmark datasets collected from the UCI machine learning repository and Arizona State University. In comparison with other competitors, the proposed approaches provide highly competitive results and overtake other algorithms in feature selection.
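One simple way to realize a dynamic mutation rate is a schedule that decays over generations, exploring early and exploiting late; the linear form and constants below are illustrative assumptions, not the paper's formula:

```python
import random

def dynamic_mutation_rate(gen, max_gen, p_start=0.3, p_end=0.01):
    """Illustrative linear schedule from a high, exploratory mutation
    rate down to a low, exploitative one."""
    return p_start + (p_end - p_start) * gen / max_gen

def mutate(chrom, rate, rng=random.Random(0)):
    """Flip each bit of a binary chromosome independently with the
    current mutation rate."""
    return [1 - b if rng.random() < rate else b for b in chrom]
```

In a GA loop one would call `mutate(child, dynamic_mutation_rate(gen, max_gen))` each generation, so mutation pressure shrinks as the population converges.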


9.
The bacterial foraging optimization (BFO) algorithm is a new swarm intelligence method with satisfactory performance on continuous optimization problems, based on chemotaxis, swarming, reproduction, and elimination-dispersal steps. However, BFO is rarely used for feature selection. In this paper, we propose two novel BFO algorithms: an adaptive chemotaxis bacterial foraging optimization algorithm (ACBFO) and an improved swarming and elimination-dispersal bacterial foraging optimization algorithm (ISEDBFO). ACBFO contains two improvements: the data structure of each bacterium is redefined to establish a mapping between the bacterium and a feature subset, solving the discrete problem, and an adaptive method for evaluating feature importance is designed so that the primary features in a subset are preserved. ISEDBFO builds on ACBFO with two further modifications: the swarming representation introduces the hyperbolic tangent function to describe the cell-to-cell attraction-repulsion relationship more accurately, and a roulette technique is applied in the elimination-dispersal phase to retain the primary features of eliminated bacteria. ACBFO and ISEDBFO are tested on 10 public UCI data sets, and their performance is compared with approaches based on particle swarm optimization, genetic algorithms, simulated annealing, ant lion optimization, the binary bat algorithm, and cuckoo search. The experimental results demonstrate that the average classification accuracy of the proposed algorithms is nearly 3 percentage points higher than the other tested methods, and that they reduce the length of the feature subset by almost 3 features. The modified methods also achieve excellent performance on the Wilcoxon signed-rank test and the sensitivity-specificity test. In conclusion, the novel BFO algorithms can provide important support for expert and intelligent systems.
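The roulette technique mentioned for the elimination-dispersal phase is standard fitness-proportional selection; a minimal sketch (names are illustrative):

```python
import random

def roulette_pick(weights, rng=random.Random(42)):
    """Fitness-proportional (roulette-wheel) pick: index i is chosen with
    probability weights[i] / sum(weights)."""
    total = sum(weights)
    r = rng.random() * total          # spin the wheel: a point in [0, total)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r < acc:
            return i
    return len(weights) - 1           # guard against floating-point rounding
```

Used in elimination-dispersal, this biases which bacteria (and hence which feature subsets) survive toward the fitter ones instead of dispersing uniformly at random.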

10.
Research of multi-population agent genetic algorithm for feature selection   (Cited: 2 in total; 0 self-citations, 2 by others)
The search algorithm is an essential part of a feature selection algorithm. In this paper, by constructing a double chain-like agent structure and using improved genetic operators, the authors propose a novel multi-population agent genetic algorithm (MPAGAFS) for feature selection. The double chain-like agent structure resembles a local environment in the real world, and its introduction helps maintain population diversity. Moreover, the structure supports a multi-population agent GA, enabling parallel search for the optimal feature subset. Several groups of experiments evaluate the performance of MPAGAFS; the results show that MPAGAFS can be used not only for serial feature selection but also for parallel feature selection, with satisfactory precision and feature counts.

11.
Databases usually contain many redundant features, and finding the important features is called feature extraction. This paper proposes a heuristic feature selection algorithm based on attribute significance, which uses attribute significance as the iteration criterion to obtain a minimal reduct of the attribute set.
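A rough-set-style greedy reduct that uses attribute significance (dependency gain) as the iteration criterion might look like the following sketch; representing objects as dicts is an assumption for illustration:

```python
from collections import defaultdict

def dependency(rows, attrs, decision):
    """Fraction of objects whose values on `attrs` determine the decision
    (the size of the rough-set positive region)."""
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[a] for a in attrs)].add(row[decision])
    consistent = sum(1 for row in rows
                     if len(groups[tuple(row[a] for a in attrs)]) == 1)
    return consistent / len(rows)

def greedy_reduct(rows, attrs, decision):
    """Greedily add the attribute with the largest significance
    (dependency gain) until the full dependency is reached."""
    target = dependency(rows, attrs, decision)
    chosen = []
    while dependency(rows, chosen, decision) < target:
        gains = [(dependency(rows, chosen + [a], decision), a)
                 for a in attrs if a not in chosen]
        chosen.append(max(gains)[1])
    return chosen
```

The loop stops as soon as the chosen attributes preserve the full attribute set's dependency, so redundant attributes are never added.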

12.
A CFN-based feature selection and weighting algorithm   (Cited: 1 in total; 0 self-citations, 1 by others)
Selecting features with a combination of TF and DF, and computing weights with the TF-IDF algorithm, are common practice in text classification. When the training set is small, however, this feature selection scheme filters out low-frequency words with strong discriminating power, which directly distorts the feature weights. This paper proposes a feature selection and weighting algorithm based on the Chinese FrameNet (CFN). Experiments show that the algorithm raises classification accuracy to 67.3%, a large improvement over traditional algorithms, demonstrating that it can meet the accuracy requirements of text classification with small training sets.
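The baseline weighting the abstract refers to is the textbook TF-IDF scheme, which can be sketched as:

```python
import math
from collections import Counter

def tfidf(docs):
    """Textbook TF-IDF weights: tf(t, d) * log(N / df(t)), where N is the
    number of documents and df(t) counts documents containing term t.
    Many smoothed variants exist; this is the plain form."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))           # each doc counts a term at most once
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights
```

Note the small-corpus pathology the paper targets: a term that appears in every document gets weight 0 regardless of how discriminative it is, and rare discriminative terms are at the mercy of the DF threshold.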

13.
This paper presents a knowledge-based system (KBS) developed to allow users who may not be knowledgeable about sensors to select sensors suitable for their specific needs. The KBS runs on a micro-computer. The selection criteria are user specified and based on the desired measurement parameters. The system output includes all operational and dimensional parameters of the recommended sensor, its price, and vendor information.

14.
Microarray technology can be used as an efficient diagnostic system to recognise diseases such as tumours or to discriminate between different types of cancer in normal tissues. It has received increasing attention from the bioinformatics community because of its potential for powerful decision-making tools in cancer diagnosis. However, the presence of thousands or tens of thousands of genes hurts the predictive accuracy of classification, so a key issue in microarray data is identifying the smallest possible set of genes that still achieves good predictive accuracy. In this work, we propose a two-stage selection algorithm for gene selection in microarray datasets, called the symmetrical uncertainty filter and harmony search algorithm wrapper (SU-HSA). Experimental results show that SU-HSA is more accurate than HSA in isolation on all datasets and selects fewer genes on 6 of 10 instances. Furthermore, comparison with state-of-the-art methods shows that our approach obtains 5 (out of 10) new best results in the number of selected genes, with competitive classification accuracy.
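The symmetrical uncertainty measure used in the filter stage is a standard normalization of mutual information; a minimal sketch for discrete variables:

```python
import math
from collections import Counter

def H(xs):
    """Shannon entropy in bits of a discrete sequence."""
    n = len(xs)
    return -sum(c / n * math.log2(c / n) for c in Counter(xs).values())

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), ranging from 0
    (independent) to 1 (deterministically related)."""
    ig = H(x) + H(y) - H(list(zip(x, y)))   # mutual information
    denom = H(x) + H(y)
    return 2 * ig / denom if denom else 0.0
```

Genes would be ranked by `symmetrical_uncertainty(gene_expression_bins, class_labels)` before the harmony search wrapper refines the surviving set.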

15.
To address the "curse of dimensionality" caused by the very high dimensionality of the initial feature space in Chinese text classification, and to improve classification accuracy and efficiency, this paper proposes an optimized feature selection algorithm based on simulated annealing and the bee colony algorithm. The bee colony algorithm forms the main loop, exploiting the foraging behaviour of bee swarms to find good solutions quickly, while a simulated annealing mechanism is introduced to counter the bee colony algorithm's tendency to fall into local optima. The algorithm thus retains the population-based search of the bee colony while effectively avoiding local optima. With suitable choices of gain function and cooling schedule, experimental comparisons with chi-square statistics, information gain, and mutual information demonstrate the algorithm's feasibility and effectiveness.
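The simulated annealing mechanism grafted into the bee colony loop is the usual Metropolis acceptance rule; a sketch under an assumed maximization setting (the paper's gain function and cooling schedule are omitted):

```python
import math
import random

def sa_accept(delta, temperature, rng=random.Random(7)):
    """Metropolis criterion: always accept an improvement (delta >= 0);
    accept a worse candidate with probability exp(delta / T), which lets
    the swarm escape local optima while the temperature is still high."""
    if delta >= 0:
        return True
    return rng.random() < math.exp(delta / temperature)
```

As the temperature is lowered each iteration, the probability of accepting a worsening move shrinks toward zero, so the search gradually hardens into pure hill climbing.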

16.
This paper studies a new feature selection method for data classification that efficiently combines the discriminative capability of features with the ridge regression model. It first captures the global structure of the training data with linear discriminant analysis, which assists in identifying discriminative features. The ridge regression model is then employed to assess feature representation and discrimination information, yielding a representative coefficient matrix from which the importance of each feature can be calculated. Finally, the selected feature subset is fed to a linear support vector machine for classification. To validate efficiency, experiments are conducted on twenty benchmark datasets. The results show that the proposed approach performs much better than state-of-the-art feature selection algorithms on the classification measures, while remaining competitive in computational cost.
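The core computation, scoring features by the magnitude of their ridge regression coefficients, can be sketched as follows (the LDA and SVM stages are omitted, and all names are illustrative):

```python
import numpy as np

def ridge_feature_scores(X, y, lam=1.0):
    """Closed-form ridge solution W = (X^T X + lam*I)^(-1) X^T y; the
    magnitude of each coefficient row is read as that feature's
    importance score."""
    d = X.shape[1]
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    return np.linalg.norm(W.reshape(d, -1), axis=1)   # one score per feature
```

Features with near-zero coefficient magnitude contribute little to reconstructing the targets and are dropped; the regularizer `lam` keeps the solve well conditioned even when features are correlated.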

17.
Xu Ruohao, Li Mengmeng, Yang Zhongliang, Yang Lifang, Qiao Kangjia, Shang Zhigang. Applied Intelligence (2021) 51(10): 7233-7244

Feature selection is a technique for improving the classification accuracy of classifiers and a convenient aid to data visualization. As an incremental, task-oriented, model-free learning algorithm, Q-learning is well suited to feature selection, and this study proposes a dynamic feature selection algorithm that combines feature selection and Q-learning in a single framework. First, Q-learning is used to construct a discriminant function for each class of the data. Next, features are ranked from the discriminant-function vectors of all classes taken together, with the ranking updated as the vectors are updated. Finally, experiments compare the proposed algorithm with four feature selection algorithms on benchmark data sets. The results verify its effectiveness: its classification performance exceeds that of the other feature selection algorithms, it removes redundant features well, and experiments on the effect of learning rates show that its parameters are simple to choose.
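The tabular Q-learning update at the heart of such a framework is the standard one-step rule; a minimal sketch (the paper's state/action design for feature selection is not reproduced here):

```python
def q_update(q, state, action, reward, next_q_values, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    old = q.get((state, action), 0.0)
    target = reward + gamma * max(next_q_values)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]
```

Because each update blends the old estimate with the bootstrapped target, repeated visits to the same state-action pair converge the table incrementally, which is what makes the method usable online during feature ranking.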


18.
The Salp Swarm Algorithm (SSA) is one of the most recently proposed algorithms, driven by the simulated behaviour of salps. Like most meta-heuristic algorithms, however, it suffers from stagnation in local optima and a low convergence rate. Chaos theory has recently been applied successfully to such problems. In this paper, a novel hybrid of SSA and chaos theory is proposed. The resulting Chaotic Salp Swarm Algorithm (CSSA) is applied to 14 unimodal and multimodal benchmark optimization problems and 20 benchmark datasets, with ten different chaotic maps employed to enhance the convergence rate and the resulting precision. Simulation results show that CSSA is a promising algorithm: it finds optimal feature subsets that maximize classification accuracy while minimizing the number of selected features. Moreover, the logistic chaotic map proves the best of the ten used, significantly boosting the performance of the original SSA.
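The logistic map singled out as the best of the ten is the classic chaotic recurrence x_{k+1} = r * x_k * (1 - x_k); a minimal sketch:

```python
def logistic_map(x0=0.7, r=4.0, n=5):
    """Iterate the logistic map x_{k+1} = r * x_k * (1 - x_k).
    With r = 4 the map is fully chaotic on (0, 1), which is what makes
    it useful for perturbing a meta-heuristic's control parameters."""
    xs = [x0]
    for _ in range(n):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs
```

In a chaotic variant of SSA, such a sequence typically replaces the uniform random numbers that drive position updates, trading repeatable pseudo-randomness for ergodic, non-repeating coverage of the search range.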

19.
Gu Xiangyuan, Guo Jichang, Xiao Lijun, Li Chongyi. Applied Intelligence (2022) 52(2): 1436-1447
Applied Intelligence - There are many feature selection algorithms based on mutual information and three-dimensional mutual information (TDMI) among features and the class label, since these...

20.
A new local search based hybrid genetic algorithm for feature selection   (Cited: 2 in total; 0 self-citations, 2 by others)
This paper presents a new hybrid genetic algorithm (HGA) for feature selection (FS), called HGAFS. The vital aspect of this algorithm is the selection of a salient feature subset of reduced size. HGAFS incorporates a new local search operation, devised and embedded in the HGA, to fine-tune the search in the FS process. The local search works on the basis of the distinct and informative nature of input features, computed from their correlation information; the aim is to guide the search so that newly generated offspring are adjusted using the less correlated (distinct) features, capturing both the general and the special characteristics of a given dataset. The proposed HGAFS thus reduces the redundancy of information among the selected features. In addition, HGAFS emphasizes selecting a subset of salient features of reduced size, using a subset-size determination scheme. We have tested HGAFS on 11 real-world classification datasets with dimensions varying from 8 to 7129, comparing its performance with ten other well-known FS algorithms. HGAFS consistently performs better at selecting subsets of salient features, with resulting better classification accuracies.


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号