Similar literature
 20 similar documents found (search time: 15 ms)
1.
“Dimensionality” is one of the major problems affecting the quality of the learning process in most machine learning and data mining tasks. Training a classification model on a high-dimensional dataset may lead to “overfitting” of the learned model to the training data. Overfitting reduces the generalization of the model and therefore causes poor classification accuracy on new test instances. Another disadvantage of high dimensionality is the high CPU time required for learning and testing the model. Applying feature selection to the dataset before the learning process is essential to improve the performance of the classification task. In this study, a new hybrid method that combines the artificial bee colony optimization technique with the differential evolution algorithm is proposed for feature selection in classification tasks. The developed hybrid method is evaluated on fifteen datasets from the UCI Repository that are commonly used in classification problems. To make a complete evaluation, the proposed hybrid feature selection method is compared with feature selection methods based on artificial bee colony optimization alone and on differential evolution alone, as well as with three of the most popular feature selection techniques: information gain, chi-square, and correlation feature selection. In addition, the performance of the proposed method is compared with studies in the literature that use the same datasets. The experimental results show that the developed hybrid method is able to select good features for classification tasks, improving the run-time performance and accuracy of the classifier. The proposed hybrid method may also be applied to other search and optimization problems, as its performance for feature selection is better than that of pure artificial bee colony optimization and pure differential evolution.
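The abstract does not say how the two optimizers are combined, so the following is only a minimal sketch of the two standard building blocks such a hybrid would draw on: the usual ABC neighbour move and the DE/rand/1 mutation. The `to_mask` decoding and its 0.5 threshold are assumptions made for illustration, not details taken from the paper.

```python
def abc_neighbor(x, partner, j, phi):
    """Standard ABC move: perturb dimension j of x relative to a partner solution."""
    v = list(x)
    v[j] = x[j] + phi * (x[j] - partner[j])
    return v


def de_rand_1(a, b, c, f=0.5):
    """DE/rand/1 mutation: base vector a plus a scaled difference of b and c."""
    return [ai + f * (bi - ci) for ai, bi, ci in zip(a, b, c)]


def to_mask(vector, threshold=0.5):
    """Hypothetical decoding: feature i counts as selected when vector[i] > threshold."""
    return [v > threshold for v in vector]


if __name__ == "__main__":
    candidate = abc_neighbor([0.5, 1.0], [1.0, 0.0], j=0, phi=0.5)
    mutant = de_rand_1([1.0, 0.0], [0.5, 0.5], [0.0, 1.0], f=0.5)
    print(candidate, mutant, to_mask(mutant))
```

In a wrapper-style feature selector, each mask produced this way would typically be scored by training a classifier on the selected columns; that scoring step is omitted here because the abstract does not name the classifier.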

2.
Protein function prediction is an important problem in functional genomics. Typically, protein sequences are represented by feature vectors. A major problem of protein datasets that increases the complexity of classification models is their large number of features. Feature selection (FS) techniques are used to deal with this high-dimensional feature space. In this paper, we propose a novel feature selection algorithm that combines genetic algorithms (GA) and ant colony optimization (ACO) for faster and better search capability. The hybrid algorithm makes use of the advantages of both the ACO and GA methods. The proposed algorithm is easy to implement and, because it uses a simple classifier, its computational complexity is very low. The performance of the proposed algorithm is compared with that of two prominent population-based algorithms, ACO and genetic algorithms. Experimentation is carried out using two challenging biological datasets, involving the hierarchical functional classification of GPCRs and enzymes. The criteria used for comparison are maximizing predictive accuracy and finding the smallest subset of features. The experimental results indicate the superiority of the proposed algorithm.

3.
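Fu Xiao. Computer Science (《计算机科学》), 2018, 45(Z6): 290-294
To improve the utilization of virtual machines (VMs) in cloud computing and reduce task completion time, a hybrid swarm intelligence optimization algorithm incorporating a sharing mechanism is proposed to achieve dynamic scheduling of cloud tasks. First, virtual machine scheduling is encoded as bees, ants, and genetic individuals. Then, the artificial bee colony algorithm (ABC), ant colony optimization (ACO), and the genetic algorithm (GA) each search for the optimal solution within their own neighbourhoods. Finally, through a sharing mechanism, the three algorithms periodically exchange the solutions they have found, and the best solution obtained is taken as the current optimum for the next iteration, which accelerates convergence and improves convergence accuracy. A cloud task scheduling simulation experiment was carried out in CloudSim, and the results show that the proposed hybrid algorithm schedules tasks reasonably and effectively and performs well in terms of task completion time and stability.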

4.
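The classical artificial bee colony (ABC) algorithm suffers from slow convergence and a tendency to become trapped in local optima, so feature selection based on it still has many problems. To address this, a feature selection method based on granular rough entropy and an improved bee colony algorithm, FS_GREIABC, is proposed. First, knowledge granularity and rough entropy from rough set theory are combined to propose a new information entropy model, granular rough entropy. Second, granular rough entropy is applied to the ABC algorithm to define a fitness function based on it, yielding a new fitness calculation strategy. Third, to improve the local search ability of the ABC algorithm, the cloud model is introduced into the onlooker bee phase. Experiments on several UCI datasets and software defect prediction datasets show that, compared with existing feature selection algorithms, FS_GREIABC not only selects fewer features but also achieves better classification performance.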

5.
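Xu Wei, Leng Jing. Computer Applications and Software (《计算机应用与软件》), 2021, 38(3): 314-318, 333
To reduce the false alarm rate of network intrusion detection systems, a hybrid network intrusion detection method is proposed in which the artificial bee colony (ABC) algorithm is used for feature extraction and the XGBoost algorithm is used for feature classification and evaluation. Different scenarios and attack types are selected and defined, and a hybrid network topology is designed. On the preprocessed data, the ABC algorithm performs feature extraction and the XGBoost algorithm classifies the features to be evaluated. The optimal feature subset is then obtained, and these features are used to perform network anomaly detection. Experimental results on several public datasets show that the hybrid method outperforms other methods in accuracy and detection rate, has low time and space complexity, and exhibits high detection efficiency.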

6.
Feature selection is the basic pre-processing task of eliminating irrelevant or redundant features by investigating complicated interactions among the features in a feature set. Due to its critical role in classification and computational time, it has attracted researchers’ attention for the last five decades. However, it still remains a challenge. This paper proposes a binary artificial bee colony (ABC) algorithm for feature selection problems, developed by integrating evolutionary-based similarity search mechanisms into an existing binary ABC variant. The performance of the proposed algorithm is demonstrated by comparing it with some well-known variants of the particle swarm optimization (PSO) and ABC algorithms, including standard binary PSO, new velocity based binary PSO, quantum inspired binary PSO, discrete ABC, modification rate based ABC, angle modulated ABC, and genetic algorithms, on 10 benchmark datasets. The results show that the proposed algorithm can obtain higher classification performance on both training and test sets, and can eliminate irrelevant and redundant features more effectively than the other approaches. Note that all the algorithms used in this paper except for standard binary PSO and GA are employed for the first time in feature selection.

7.
Data from many real-world applications can be high dimensional, and the features of such data are usually highly redundant. Identifying informative features has become an important step in data mining, not only to circumvent the curse of dimensionality but also to reduce the amount of data to be processed. In this paper, we propose a novel feature selection method based on bee colony and gradient boosting decision tree to address problems such as the efficiency and informative quality of the selected features. Our method achieves global optimization of the inputs of the decision tree using the bee colony algorithm to identify the informative features. The method initializes the feature space spanned by the dataset. Less relevant features are suppressed according to the information they contribute to decision making, using an artificial bee colony algorithm. Experiments are conducted with two breast cancer datasets and six datasets from the public data repository. Experimental results demonstrate that the proposed method effectively reduces the dimensions of the dataset and achieves superior classification accuracy using the selected features.

8.
Support vector machine (SVM) is a state-of-the-art classification tool with good accuracy due to its ability to generate nonlinear models. However, the nonlinear models generated are typically regarded as incomprehensible black-box models. This lack of explanatory ability is a serious problem for practical SVM applications that require comprehensibility. Therefore, this study applies a C5 decision tree (DT) to extract rules from SVM results. In addition, a metaheuristic algorithm is employed for feature selection. Both SVM and C5 DT are computationally expensive, and applying the two algorithms together to high-dimensional data increases the computational cost. This study applies the artificial bee colony optimization (ABC) algorithm to select the important features. The proposed algorithm, ABC–SVM–DT, is applied to extract comprehensible rules from SVMs. The ABC algorithm is applied to implement feature selection and parameter optimization before SVM–DT. The proposed algorithm is evaluated on eight datasets to demonstrate its effectiveness. The results show that the classification accuracy and the complexity of the final decision tree can be improved simultaneously by the proposed ABC–SVM–DT algorithm, compared with the genetic algorithm and the particle swarm optimization algorithm.

9.
Decision trees have been widely used in data mining and machine learning as a comprehensible knowledge representation. While ant colony optimization (ACO) algorithms have been successfully applied to extract classification rules, decision tree induction with ACO algorithms remains an almost unexplored research area. In this paper we propose a novel ACO algorithm to induce decision trees, combining commonly used strategies from both traditional decision tree induction algorithms and ACO. The proposed algorithm is compared against three decision tree induction algorithms, namely C4.5, CART and cACDT, on 22 publicly available data sets. The results show that the predictive accuracy of the proposed algorithm is statistically significantly higher than the accuracy of both C4.5 and CART, which are well-known conventional algorithms for decision tree induction, and the accuracy of the ACO-based cACDT decision tree algorithm.

10.
The artificial bee colony algorithm (ABC), which is inspired by the foraging behavior of honey bee swarms, is a biologically inspired optimization algorithm. It has been shown to be more effective than the genetic algorithm (GA), particle swarm optimization (PSO) and ant colony optimization (ACO). However, ABC is good at exploration but poor at exploitation, and its convergence speed is also an issue in some cases. To address these shortcomings, we propose an improved ABC algorithm called I-ABC. In I-ABC, the best-so-far solution, an inertia weight and acceleration coefficients are introduced to modify the search process. The inertia weight and acceleration coefficients are defined as functions of the fitness. In addition, to further balance the search processes, the modification forms of the employed bees and the onlooker bees differ in the second acceleration coefficient. Experiments show that, for most functions, I-ABC has a faster convergence speed and better performance than both ABC and the gbest-guided ABC (GABC). However, I-ABC still could not achieve the best solution for all optimization problems. In a few cases, it could not find better results than ABC or GABC. In order to inherit the strengths of ABC, GABC and I-ABC, a high-efficiency hybrid ABC algorithm, called PS-ABC, is proposed. PS-ABC has the abilities of prediction and selection. Results show that PS-ABC has a fast convergence speed like I-ABC and better search ability than other relevant methods for almost all functions.
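The abstract does not state I-ABC's update equations, so the sketch below shows only the gbest-guided ABC (GABC) move that I-ABC is compared against, in the form it is commonly given in the literature: the standard ABC perturbation plus a term that pulls the candidate towards the best-so-far solution. The function name and argument layout are illustrative assumptions; in GABC `phi` is usually drawn from [-1, 1] and `psi` from [0, C] for a constant C.

```python
def gabc_candidate(x, partner, best, j, phi, psi):
    """GABC-style move on dimension j.

    x       : current food source (list of floats)
    partner : another randomly chosen food source
    best    : best-so-far solution
    phi     : random weight for the ABC difference term
    psi     : random weight for the pull towards the best-so-far solution
    """
    v = list(x)
    v[j] = x[j] + phi * (x[j] - partner[j]) + psi * (best[j] - x[j])
    return v


if __name__ == "__main__":
    # With psi = 0 this reduces to the plain ABC neighbour move.
    print(gabc_candidate([1.0, 2.0], [3.0, 2.0], [0.0, 0.0], j=0, phi=0.5, psi=1.0))
```

According to the abstract, I-ABC goes further by adding an inertia weight and making the coefficients functions of fitness; those details would have to be taken from the paper itself.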

11.
In recent years, a few sequential covering algorithms for classification rule discovery based on the ant colony optimization meta-heuristic (ACO) have been proposed. This paper proposes a new ACO-based classification algorithm called AntMiner-C. Its main feature is a heuristic function based on the correlation among the attributes. Other highlights include the manner in which class labels are assigned to the rules prior to their discovery, a strategy for dynamically stopping the addition of terms in a rule’s antecedent part, and a strategy for pruning redundant rules from the rule set. We study the performance of our proposed approach on twelve commonly used data sets and compare it with the original AntMiner algorithm, the decision tree builder C4.5, Ripper, logistic regression, and an SVM. Experimental results show that the accuracy rate obtained by AntMiner-C is better than that of the compared algorithms. However, the average number of rules and the average number of terms per rule are higher.

12.
This paper presents a hybrid approach to medical decision support systems based on feature selection, fuzzy weighted pre-processing and the artificial immune recognition system (AIRS). We have used the heart disease and hepatitis disease datasets from the UCI machine learning database as medical datasets. The artificial immune recognition system has shown effective performance on several problems, such as machine learning benchmark problems and medical classification problems like breast cancer, diabetes, and liver disorders classification. The proposed approach consists of three stages. In the first stage, the dimensions of the heart disease and hepatitis disease datasets are reduced from 13 and 19, respectively, to 9 in the feature selection (FS) sub-program by means of the C4.5 decision tree algorithm (CBA program). In the second stage, the heart disease and hepatitis disease datasets are normalized to the range [0,1] and weighted via fuzzy weighted pre-processing. In the third stage, the weighted input values obtained from fuzzy weighted pre-processing are classified using the AIRS classifier system. The classification accuracies obtained by our system are 92.59% and 81.82% using a 50-50% training-test split for the heart disease and hepatitis disease datasets, respectively. With these results, the proposed method can be used in medical decision support systems.
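The second-stage normalization and the 50-50% split can be sketched as follows. This is a minimal illustration assuming plain min-max scaling per feature and a seeded random split; the abstract names neither the exact scaling formula nor how the split was drawn, and the handling of constant columns is likewise an assumption.

```python
import random


def min_max_normalize(column):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def split_half(rows, seed=0):
    """Shuffle a copy of the rows and return (training half, test half)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]


if __name__ == "__main__":
    print(min_max_normalize([2, 4, 6]))
    train, test = split_half(range(10))
    print(len(train), len(test))
```

The fuzzy weighting applied after normalization is specific to the paper and is not reproduced here.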

13.
Microarray technology presents a challenge due to the high dimensionality of the data, which can be difficult to interpret. To address this challenge, the article proposes a feature extraction-based cancer classification technique coupled with the artificial bee colony optimization (ABC) algorithm. The ABC-support vector machine (SVM) method is used to classify lung cancer datasets and is compared with existing techniques in terms of precision, recall, F-measure, and accuracy. The proposed ABC-SVM has the advantage of handling complex nonlinear data while providing good flexibility. Simulation analysis was conducted with 30% of the data reserved for testing the proposed method. The results indicate that the proposed attribute classification technique, which uses fewer genes, performs better than other modalities. Classifiers such as naïve Bayes, multi-class SVM, and linear discriminant analysis were also compared, and the proposed method outperformed these classifiers and state-of-the-art techniques. Overall, this study demonstrates the potential of using intelligent algorithms and feature extraction techniques to improve the accuracy of cancer diagnosis using microarray gene expression data.

14.
Clustering is a popular data analysis and data mining technique. A popular clustering technique is based on k-means, in which the data is partitioned into K clusters. However, the k-means algorithm depends heavily on the initial state and may converge to a local optimum. This paper presents a new hybrid evolutionary algorithm to solve the nonlinear partitional clustering problem. The proposed hybrid evolutionary algorithm combines FAPSO (fuzzy adaptive particle swarm optimization), ACO (ant colony optimization) and k-means, and is called FAPSO-ACO–K; it can find better cluster partitions. The performance of the proposed algorithm is evaluated on several benchmark data sets. The simulation results show that the proposed algorithm performs better than other algorithms such as PSO, ACO, simulated annealing (SA), the combination of PSO and SA (PSO–SA), the combination of ACO and SA (ACO–SA), the combination of PSO and ACO (PSO–ACO), the genetic algorithm (GA), Tabu search (TS), honey bee mating optimization (HBMO) and k-means for the partitional clustering problem.
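The dependence of k-means on its initial state can be seen on a toy example. The sketch below is a plain Lloyd-style k-means written for illustration, not the FAPSO-ACO–K algorithm; the four-point dataset and both sets of starting centres are made up to show one run settling in a poor local optimum.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centers, iterations=20):
    """Lloyd's algorithm; returns the final centres and the sum of squared errors."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            distances = [squared_distance(p, c) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Recompute each centre as the mean of its cluster; keep empty clusters in place.
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return centers, sse


if __name__ == "__main__":
    corners = [(0, 0), (0, 1), (4, 0), (4, 1)]  # a wide rectangle
    print(kmeans(corners, [(0.0, 0.5), (4.0, 0.5)]))  # splits left/right
    print(kmeans(corners, [(2.0, 0.0), (2.0, 1.0)]))  # stuck splitting bottom/top
```

Both runs stop immediately because each start is already a fixed point, but the second has a sum of squared errors of 16.0 against 1.0 for the first. Hybrid schemes like the one in the abstract use a global search to choose better starting partitions.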

15.
This paper presents a new hybrid algorithm that executes a large neighbourhood search algorithm in combination with the solution construction mechanism of the ant colony optimization algorithm (LNS–ACO) for the capacitated vehicle routing problem (CVRP). The proposed hybrid LNS–ACO algorithm aims to enhance the performance of the large neighbourhood search algorithm by providing a satisfactory level of diversification via the solution construction mechanism of the ant colony optimization algorithm. The LNS–ACO algorithm therefore combines a solution improvement mechanism with a solution construction mechanism. The performance of the proposed algorithm is tested on a set of CVRP instances. The hybrid LNS–ACO algorithm is compared against two other LNS variants and some previously developed methods in terms of solution quality. Computational results indicate that the proposed hybrid LNS–ACO algorithm performs satisfactorily in solving CVRP instances.

16.
In this paper, a hybrid ant colony optimization algorithm is proposed by introducing the extremal optimization local-search algorithm into the ant colony optimization (ACO) algorithm, and is applied to multiuser detection in a direct sequence ultra wideband (DS-UWB) communication system. ACO algorithms have already been applied successfully to combinatorial optimization; however, as the pheromone accumulates, the search can become stuck in a local minimum, resulting in a poor steady state rather than a global optimum. Extremal optimization (EO) is a recently developed local-search heuristic that has been successfully applied to a wide variety of optimization problems. Hence, a hybrid ACO algorithm, named the ACO-EO algorithm, is proposed by introducing EO into ACO to improve the local-search ability of the algorithm. The ACO-EO algorithm is applied to multiuser detection in a DS-UWB communication system, and computer simulations show that the proposed hybrid ACO algorithm performs much better than other ACO algorithms and even matches the optimal multiuser detector.

17.
The significance of the preprocessing stage in any data mining task is well known. Before attempting medical data classification, characteristics of medical datasets, including noise, incompleteness, and the existence of multiple and possibly irrelevant features, need to be addressed. In this paper, we show that selecting the right combination of preprocessing methods has a considerable impact on the classification potential of a dataset. The preprocessing operations considered include the discretization of numeric attributes, the selection of attribute subset(s), and the handling of missing values. The classification is performed by an ant colony optimization algorithm as a case study. Experimental results on 25 real-world medical datasets show that a significant relative improvement in predictive accuracy, exceeding 60% in some cases, is obtained.

18.
With the growing trend toward remote security verification procedures for telephone banking, biometric security measures and similar applications, automatic speaker verification (ASV) has received a lot of attention in recent years. The complexity of an ASV system and its verification time depend on the number of feature vectors, their dimensionality, the complexity of the speaker models and the number of speakers. In this paper, we concentrate on optimizing the dimensionality of the feature space by selecting relevant features. At present there are several methods for feature selection in ASV systems. To improve the performance of the ASV system, we present another method based on the ant colony optimization (ACO) algorithm. After the feature reduction phase, the feature vectors are applied to a Gaussian mixture model universal background model (GMM-UBM), which is a text-independent speaker verification model. The performance of the proposed algorithm is compared with that of a genetic algorithm on the task of feature selection on the TIMIT corpora. The experimental results indicate that, with the optimized feature set, the performance of the ASV system is improved. Moreover, the speed of verification is significantly increased, since with ACO the number of features is reduced by over 80%, which consequently decreases the complexity of our ASV system.

19.
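Distributed unrelated parallel machine scheduling based on a novel artificial bee colony algorithm   Total citations: 1, self-citations: 0, citations by others: 1
For the distributed unrelated parallel machine scheduling problem with preventive maintenance, a novel artificial bee colony (ABC) algorithm is proposed to minimize the makespan. To obtain high-quality results, the algorithm divides the whole population into one employed bee swarm and three onlooker bee swarms. Onlooker bees have their own food sources and follow the employed bees in a new way. The four swarms use mutually different search strategies to generate new solutions, which enhances population diversity. A new strategy is proposed to handle the scout bees' search, and the whole population is updated using optimization data. Extensive simulation experiments verify the effectiveness and advantages of the novel ABC in solving the problem studied.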

20.
We have proposed a hybrid SVM based decision tree to speed up SVMs in the testing phase for binary classification tasks. While most existing methods for this task aim at reducing the number of support vectors, we have focused on reducing the number of test datapoints that need the SVM’s help to be classified. The central idea is to approximate the decision boundary of the SVM using decision trees. The resulting tree is a hybrid tree in the sense that it has both univariate and multivariate (SVM) nodes. The hybrid tree takes the SVM’s help only in classifying crucial datapoints lying near the decision boundary; the remaining, less crucial datapoints are classified by fast univariate nodes. The classification accuracy of the hybrid tree is guaranteed by tuning a threshold parameter. Extensive computational comparisons on 19 publicly available datasets indicate that the proposed method achieves significant speedup compared with SVMs, without any compromise in classification accuracy.
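The idea of routing only borderline points to the SVM can be illustrated with a deliberately simplified sketch. It collapses the tree to a single univariate test with two hypothetical cut-offs, `low` and `high`, and uses a stand-in linear function in place of a trained SVM; none of these names or values come from the paper.

```python
def hybrid_predict(x, feature, low, high, slow_decision):
    """Classify x as -1 or +1, calling the expensive model only near the boundary.

    feature       : index of the feature tested by the cheap univariate node
    low, high     : hypothetical cut-offs; values outside (low, high) are decided cheaply
    slow_decision : callable returning a signed score, standing in for an SVM
    """
    value = x[feature]
    if value <= low:
        return -1
    if value >= high:
        return 1
    return 1 if slow_decision(x) >= 0 else -1


if __name__ == "__main__":
    calls = []

    def stand_in_svm(x):
        calls.append(x)  # record how often the expensive path is taken
        return x[0] + x[1] - 1.0

    points = [(0.0, 0.0), (0.9, 0.0), (0.5, 0.7)]
    labels = [hybrid_predict(p, 0, 0.2, 0.8, stand_in_svm) for p in points]
    print(labels, len(calls))
```

Only the third point falls between the cut-offs, so the stand-in model is evaluated once for three predictions. How wide that band is corresponds loosely to the threshold parameter the abstract says is tuned to preserve accuracy.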
