Similar literature
 20 similar documents found (search time: 150 ms)
1.
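A Selective Ensemble Method for Multiple Classifiers   Cited by: 2 (self-citations: 0, citations by others: 2)
To address today's high demands on classification performance and the complexity of implementing multi-classifier ensembles, a new selective ensemble algorithm for multiple classifiers is proposed that considers both the accuracy of the base classifiers and the diversity among them. The algorithm first selects, from the generated base classifiers, those with higher classification accuracy, and then uses a classifier diversity measure to select high-performing base classifiers with large diversity, so that the classifier set is pruned into a new set before ensembling. Experiments on UCI datasets show that the method outperforms bagging and achieves good classification and recognition results.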
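The two-stage selection described in this abstract (accuracy first, then diversity) can be sketched roughly as follows. This is a minimal illustration, not the paper's algorithm: the abstract does not name its diversity measure, so the pairwise disagreement rate, the greedy selection order, and the names `select_ensemble`, `n_accurate`, and `n_final` are all assumptions.

```python
from collections import Counter


def accuracy(pred, y):
    """Fraction of validation samples a classifier labels correctly."""
    return sum(p == t for p, t in zip(pred, y)) / len(y)


def disagreement(a, b):
    """Fraction of samples on which two classifiers disagree (assumed diversity measure)."""
    return sum(x != z for x, z in zip(a, b)) / len(a)


def select_ensemble(preds, y, n_accurate, n_final):
    """preds[i] holds classifier i's predictions on a validation set with labels y."""
    # Stage 1: keep the n_accurate most accurate base classifiers.
    ranked = sorted(range(len(preds)), key=lambda i: accuracy(preds[i], y), reverse=True)
    pool = ranked[:n_accurate]
    # Stage 2 (assumed greedy rule): start from the most accurate classifier and
    # repeatedly add the one with the largest mean disagreement with those chosen.
    chosen = [pool[0]]
    while len(chosen) < min(n_final, len(pool)):
        rest = [i for i in pool if i not in chosen]
        best = max(
            rest,
            key=lambda i: sum(disagreement(preds[i], preds[j]) for j in chosen) / len(chosen),
        )
        chosen.append(best)
    return chosen


def majority_vote(preds, chosen):
    """Combine the selected classifiers by plain majority voting."""
    n_samples = len(preds[0])
    return [Counter(preds[i][k] for i in chosen).most_common(1)[0][0] for k in range(n_samples)]
```

Calling `select_ensemble(preds, y, n_accurate, n_final)` returns the indices of the retained classifiers, which can then be passed to `majority_vote`. Duplicated classifiers are dropped in the second stage because they add no disagreement.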

2.
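尹光, 朱玉全, 陈耿 《计算机工程》 (Computer Engineering) 2012, 38(8): 167-169
To improve the classification performance of ensemble classifier systems, a selective classifier ensemble algorithm, MCC-SCEN, is proposed. From the base classifier set, the algorithm picks the subset with the greatest mutual-information diversity and the subset with the greatest individual classification ability to determine the classifier set to be extended. Base classifiers with relatively high hybrid classification ability are then added to this set to form the ensemble system, which produces its result by weighted voting. Experimental results show that the method outperforms the classical AdaBoost and Bagging methods and achieves high classification accuracy.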

3.
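To address the high time cost and low accuracy of multi-classifier fusion, a parallel multi-classifier fusion method based on selective ensemble, PMCF-SE, is proposed, which uses an improved Bagging method combined with MapReduce. The method is built on the MapReduce parallel computing architecture. In the Map phase, base classifiers with good classification performance are selected; in the Reduce phase, base classifiers with large diversity are selected from those chosen, and the selected base classifiers are then fused using D-S evidence theory. Experimental results show that, in terms of execution efficiency, the method runs more efficiently in a cluster environment than on a single machine; in terms of classification accuracy, PMCF-SE achieves higher accuracy than Bagging for every number of base classifiers tested.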

4.
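To improve the prediction accuracy of credit evaluation, a bagging-based multi-classifier ensemble algorithm using gene expression programming (GEP) is proposed. The algorithm uses Bagging to combine multiple diverse base classifiers produced by GEP. Experiments and performance analysis on the real German credit dataset show that the algorithm improves prediction accuracy by about 2.7% over SVM, by about 7.93% over KNN (K=17), and by about 1.1% over a single GEP classifier.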

5.
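Traditional lightning prediction methods usually rely on a single best machine learning algorithm and seldom account for phenomena such as spatio-temporal variation in meteorological data. To address this, a short-term lightning forecasting algorithm is proposed that combines multiple machine learning models under an ensemble strategy. First, attribute reduction is applied to the meteorological data to reduce its dimensionality. Second, several heterogeneous machine learning classifiers are trained on the dataset, and the best base classifiers are selected by prediction quality. Finally, weights are trained for the selected base classifiers and combined with the ensemble strategy to produce the final classifier. Experiments show that the method outperforms the traditional single-best approach, raising average prediction accuracy by 9.5%.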

6.
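To improve facial expression classification and recognition, a Quadratic Optimization Choice (QOC) ensemble classification model is proposed based on ensemble learning theory. First, nine base classifiers are ranked by performance, and the top 30% are selected as candidate base classifiers for the ensemble model. Second, a family of ensemble models is generated according to combination rules. Finally, a second optimization-based selection is applied to this family, choosing the subset of ensemble classifiers with the smallest generalization error and thereby determining the optimal ensemble classification model. To verify the performance of the QOC model, ensemble models using the maximum, minimum, and mean rules are used as baselines. Experimental results show that the QOC model classifies better than the base classifiers; in particular, for the sad expression class, which has a poor recognition rate, the average recognition rate improves by 21.11%. Compared with non-selective ensemble models, the QOC model also significantly improves recognition performance.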
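The maximum, minimum, and mean rules used as baselines above are standard fixed combination rules. A minimal sketch for a single sample is given below; the function name `combine` and the assumption that each classifier outputs a per-class probability vector are illustrative and not taken from the paper.

```python
def combine(prob_rows, rule):
    """Fuse per-classifier class-probability vectors for one sample.

    prob_rows: one list of class probabilities per classifier (assumed input format).
    rule: "max", "min", or "mean".
    Returns the index of the winning class.
    """
    rules = {
        "max": max,
        "min": min,
        "mean": lambda values: sum(values) / len(values),
    }
    aggregate = rules[rule]
    # zip(*prob_rows) groups the scores that all classifiers gave to the same class.
    scores = [aggregate(column) for column in zip(*prob_rows)]
    return scores.index(max(scores))
```

The three rules can disagree on the same inputs, which is why they serve as distinct baselines: the max rule follows the single most confident classifier, the min rule follows the class no classifier strongly rejects, and the mean rule averages the evidence.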

7.
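How to construct highly diverse base classifiers is a central question in ensemble learning research. To this end, an iterative loop selection method is proposed. The optimal feature subset is extracted using maximization of regularized (正则) mutual information as the criterion, and a base classifier is trained on it. The number of misclassified samples is used as the diversity criterion to evaluate the resulting base classifier: if the condition is met the procedure stops, otherwise it iterates until termination. Finally, the recognition results of the selected base classifiers are fused by weighted voting. The algorithm is validated by simulation, using support vector machines as classifiers on public UCI datasets, in comparison with a single SVM, the classical Bagging ensemble, and the feature Bagging ensemble. The results show that the method achieves high classification accuracy.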

8.
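In multi-classifier ensembles, base classifiers differ in effectiveness; if all are given the same weight, they cannot contribute fully. On this basis, BCPSO, a weighted multi-classifier ensemble method based on an extension of PSO, is proposed. The method uses random subspaces to generate independent sub-classifiers, and their outputs are combined by a weighted voting rule. Experimental results show that the method is feasible and effective and achieves high classification accuracy.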
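The weighted voting rule at the end of this pipeline can be illustrated with a short sketch. Here the weights are simply passed in; in the method described above they would come from the PSO-based search, which is not reproduced. The function name `weighted_vote` is hypothetical.

```python
from collections import defaultdict


def weighted_vote(labels, weights):
    """Return the label with the largest total weight.

    labels: the label predicted by each classifier for one sample.
    weights: one non-negative weight per classifier (assumed to be given).
    """
    tally = defaultdict(float)
    for label, weight in zip(labels, weights):
        tally[label] += weight
    return max(tally, key=tally.get)
```

With equal weights this reduces to plain majority voting, which is the case the abstract argues against: a strong classifier can be outvoted by several weak ones unless its weight reflects its effectiveness.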

9.
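A Selective Ensemble Algorithm for Mining Data Streams with Concept Drift   Cited by: 1 (self-citations: 0, citations by others: 1)
A selective ensemble learning algorithm for mining data streams with concept drift is proposed. The algorithm selects the base classifiers that take part in the ensemble according to how far the direction of each base classifier's output vector on a validation set deviates from the direction of a reference vector. Experiments are carried out on the synthetic datasets SEA and Hyperplane, which exhibit abrupt and gradual concept drift, respectively. The results show that this selection method substantially improves the classification accuracy of the ensemble on data streams with concept drift. The error-ambiguity decomposition is used to analyze the performance of the naive Bayes ensembles built by the algorithm on classification problems. The results show that the main reason for the algorithm's success is that it significantly reduces the average generalization error.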
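One way to read the angle-based selection above is sketched below. The abstract does not define the output vector or the reference vector, so the encoding here is an assumption: each classifier gets a signature vector with +1 for a correct validation prediction and -1 for a wrong one, the reference is the all-ones vector, and classifiers whose angle to the reference exceeds a hypothetical threshold `max_angle` are dropped.

```python
import math


def angle_to_reference(vec, ref):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(vec, ref))
    norm = math.sqrt(sum(a * a for a in vec)) * math.sqrt(sum(b * b for b in ref))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def select_by_angle(preds, y, max_angle):
    """Keep classifiers whose signature vector stays close to the reference direction."""
    ref = [1] * len(y)  # assumed reference: every validation sample classified correctly
    chosen = []
    for index, pred in enumerate(preds):
        signature = [1 if p == t else -1 for p, t in zip(pred, y)]
        if angle_to_reference(signature, ref) <= max_angle:
            chosen.append(index)
    return chosen
```

Under this encoding a classifier that is correct on k of n validation samples has cosine (2k - n) / n to the reference, so the angle shrinks as accuracy on the current validation window grows. On a drifting stream, that favors classifiers fitted to the current concept.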

10.
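To address the limited generalization ability of traffic classifiers produced by multi-classifier ensemble methods, a selective ensemble framework for network traffic classification is proposed to meet the need for efficient classifiers. Based on this framework, MCSE (Multiple Classifiers Selective Ensemble network traffic classification method) is proposed to solve the problem of choosing among multiple classifiers. The method first uses semi-supervised learning to improve the accuracy of the base classifiers. It then improves how the disagreement measure quantifies classifier diversity, which reduces both the complexity of ensemble-based network traffic classification and the computational cost of selecting the optimal classifiers. Experiments show that, compared with the Bagging and GASEN algorithms, MCSE makes fuller use of the complementarity among base classifiers and classifies traffic more efficiently.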

11.
Yin Chuanlong, Zhu Yuefei, Liu Shengli, Fei Jinlong, Zhang Hetong 《The Journal of Supercomputing》 2020, 76(9): 6690-6719

The performance of classifiers has a direct impact on the effectiveness of intrusion detection systems, so most researchers aim to improve the detection performance of classifiers. However, classifiers can obtain only limited useful information from a limited number of labeled training samples, which usually hurts their generalization. To enhance network intrusion detection classifiers, we resort to adversarial training and propose a novel supervised learning framework that uses a generative adversarial network to improve classifier performance. The generative model in our framework continuously generates complementary labeled samples for adversarial training and assists the classifier, while the classifier identifies the different categories. In addition, the loss function is re-derived, and several empirical training strategies are proposed to improve the stability of the supervised learning framework. Experimental results show that the adversarially trained classifier improves the performance indicators of intrusion detection. The proposed framework provides a feasible way to enhance the performance and generalization of the classifier.


12.
In this paper, we propose a cascade classifier combining AdaBoost and support vector machine, and apply it to pedestrian detection. Pedestrian detection uses a fixed-size window to extract candidate regions from left to right and top to bottom of the image and performs feature extraction on each candidate region; the proposed cascade classifier then classifies the candidate region. The cascade-AdaBoost classifier has been used successfully in pedestrian detection. We improve the method for initializing the weights of the training samples in the AdaBoost classifier, so that the selected weak classifiers focus on a higher detection rate rather than on accuracy. The proposed cascade classifier automatically selects either the AdaBoost classifier or the SVM to construct the cascade according to the training samples, which improves classification performance and reduces training time. To verify the proposed method, we used our own extracted database of pedestrian training samples together with the PETs, INRIA, and MIT databases, and compared the pedestrian detection results with those of the cascade-AdaBoost classifier and the support vector machine. The experiments showed that in a simple environment (campus experimental images and the PETs database) both our cascade classifier and the other classifiers attain good results, while in a complicated environment (the INRIA and MIT databases) our cascade classifier performs better than the other classifiers.

13.
The effect of errors in ground truth on the estimated thematic accuracy of a classifier is considered. A relationship is derived between the true accuracy of a classifier relative to ground truth without errors, the actual accuracy of the ground truth used, and the measured accuracy of the classifier as a function of the number of classes. We show that if the accuracy of the ground truth is known or can be estimated, the true accuracy of a classifier can be estimated from the measured accuracy. In a series of simulations our method is shown to produce unbiased estimates of the true accuracy of the classifier with an uncertainty that depends on the number of samples and the accuracy of the ground truth. A method for determining the relative performance of two or more classifiers over the same area is then discussed. The results indicate that, as the number of samples increases, the performance of the classifiers can be effectively differentiated using inaccurate ground truth. It is argued that relative accuracies computed using a large number of inaccurate ground truth points are more representative of the true relative performance of the classifiers as they are being evaluated over a larger portion of the scene. An example is presented that uses this method to evaluate the relative performance of two Landsat classifiers.

14.
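Selective Support Vector Machine Ensemble Based on Rough Set Theory   Cited by: 1 (self-citations: 0, citations by others: 1)
The performance of an ensemble classifier depends largely on how the member classifiers are constructed and how they are combined. A selective support vector machine ensemble algorithm based on rough set theory is proposed. The algorithm first uses rough set techniques to generate a set of attribute reducts and then builds each member classifier using one reduct as the sample attribute space. Next, it computes the accuracy and diversity of the member classifiers and selects for the ensemble those that meet both the individual accuracy requirement and the diversity requirement. Finally, tests on a group of experimental datasets from UCI confirm that the method effectively improves the generalization performance of support vector machines.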

15.
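1 Introduction  In recent years, methods for combining multiple classifiers have become a hot topic in pattern recognition research and have achieved good results in many applications, such as character recognition, object recognition, and text classification. The basic assumption of multi-classifier combination is that, for a task requiring expert judgment, an effective combination of the individual judgments of k experts should be better than any individual judgment. By effectively combining multiple classifiers with different characteristics and performance, higher pattern recognition performance can be obtained.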

16.
Multiple classifier systems (MCSs) based on the combination of outputs of a set of different classifiers have been proposed in the field of pattern recognition as a method for developing high-performance classification systems. Previous work clearly showed that multiple classifier systems are effective only if the classifiers forming them are accurate and make different errors. Therefore, the fundamental need for methods aimed at designing “accurate and diverse” classifiers is currently acknowledged. In this paper, an approach to the automatic design of multiple classifier systems is proposed. Given an initial large set of classifiers, our approach aims to select the subset made up of the most accurate and diverse classifiers. A proof of the optimality of the proposed design approach is given. Reported results on the classification of multisensor remote sensing images show that this approach allows the design of effective multiple classifier systems.

17.
This paper proposes a method for combining multiple tree classifiers based on both classifier ensemble (bagging) and dynamic classifier selection (DCS) schemes. The proposed method consists of the following steps: (1) building individual tree classifiers from bootstrap samples; (2) calculating the distance between every pair of trees; (3) clustering the trees using single-linkage clustering; (4) selecting two clusters per local region in terms of accuracy and error diversity; and (5) voting over the results of the tree classifiers in the two selected clusters. Empirical evaluation using publicly available data sets confirms the superiority of the proposed approach over other classifier combining methods.

18.
Automatic text classification is one of the most important tools in Information Retrieval. This paper presents a novel text classifier using positive and unlabeled examples. The primary challenge of this problem, compared with the classical text classification problem, is that no labeled negative documents are available in the training set. Firstly, we identify many more reliable negative documents with a very low error rate using an improved 1-DNF algorithm. Secondly, we build a set of classifiers by iteratively applying the SVM algorithm to a training data set that is augmented during iteration. Thirdly, unlike previous PU-oriented text classification work, we construct the final classifier from the weighted vote of all classifiers generated in the iteration steps instead of choosing one of them. Finally, we discuss an approach based on PSO (Particle Swarm Optimization) for evaluating this weighted vote, which can discover the best combination of the weights. In addition, we built a focused crawler based on link-contexts guided by different classifiers to evaluate our method. Several comprehensive experiments have been conducted using the Reuters data set and thousands of web pages. Experimental results show that our method improves performance (F1-measure) compared with PEBL, and that a focused web crawler guided by our PSO-based classifier outperforms several other classifiers in both harvest rate and target recall.

19.
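Dynamic Ensemble of Multiple Classifiers Based on Minimum Cost   Cited by: 2 (self-citations: 0, citations by others: 2)
This paper proposes a dynamic classifier ensemble method based on a minimum-cost criterion. Unlike conventional methods, dynamic ensembling uses "performance prediction features" to dynamically select, for each sample, the most suitable group of classifiers to combine. The selection follows the criterion of minimizing misrecognition cost and time cost, and changing the definition of the cost function makes it easy to reach different trade-offs between recognition rate and recognition speed. Two dynamic ensemble methods are proposed, and their application to online handwritten Chinese character recognition is described. The experiments use three classifiers for dynamic ensembling, which yields seven classifier combinations. Under the predefined cost, the dynamic ensemble methods are compared with the seven fixed methods. The results demonstrate the high flexibility and practicality of dynamic ensembling and its ability to improve overall system performance.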
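The count of seven combinations follows from three classifiers having 2^3 - 1 = 7 non-empty subsets. The sketch below enumerates them and picks the cheapest one for a single sample. The cost form (estimated error plus `lam` times total running time) and the names `est_error`, `time_cost`, and `lam` are assumptions for illustration; the paper's actual cost function and its performance prediction features are not given in the abstract.

```python
from itertools import combinations


def nonempty_subsets(names):
    """All non-empty combinations of the given classifiers (7 for 3 classifiers)."""
    return [combo for size in range(1, len(names) + 1) for combo in combinations(names, size)]


def cheapest_subset(names, est_error, time_cost, lam):
    """Pick the combination minimizing an assumed cost: error + lam * time.

    est_error: estimated misrecognition cost per combination for the current sample
               (hypothetical; it would be predicted from the sample's features).
    time_cost: running time of each individual classifier.
    lam: trade-off factor between recognition rate and recognition speed.
    """
    return min(
        nonempty_subsets(names),
        key=lambda combo: est_error[combo] + lam * sum(time_cost[name] for name in combo),
    )
```

Raising `lam` shifts the choice toward small, fast combinations, while `lam = 0` always picks the combination with the lowest estimated error. This mirrors the trade-off between recognition rate and speed described above.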

20.
The primary concern of rating policies in the banking industry is to develop a more objective, accurate and competitive scoring model to avoid losses from potential bad debt. This study proposes an artificial immune classifier based on the artificial immune network (named the AINE-based classifier) to evaluate applicants’ credit scores. Two experimental credit datasets are used to show the accuracy rate of the artificial immune classifier. Ten-fold cross-validation is applied to evaluate the performance of the classifier, which is compared with other data mining techniques. Experimental results show that the AINE-based classifier is more competitive in credit scoring than the SVM and hybrid SVM-based classifiers, but not the BPN classifier. We further compare our classifier with three other AIS-based classifiers on the benchmark datasets, and show that the AINE-based classifier can rival the AIRS-based classifiers and outperforms the SAIS classifier when the number of attributes and classes increases. Our classifier can provide the credit card issuer with accurate and valuable credit scoring information to avoid incorrect decisions that result in losses from applicants’ bad debt.
