Similar Documents
20 similar documents found (search time: 78 ms)
1.
Introducing ensemble learning into incremental learning can markedly improve learning performance, but most recent work on ensemble-based incremental learning combines multiple homogeneous classifiers by weighted voting and does not adequately resolve the stability-plasticity dilemma of incremental learning. To address this, an incremental learning algorithm based on an ensemble of heterogeneous classifiers is proposed. During training, to keep the model stable, new data are used to train several base classifiers that are added to the heterogeneous ensemble, while a locality-sensitive hash table stores data sketches for later nearest-neighbor lookups of test samples. To adapt to changing data, newly acquired data are also used to update the vote weights of the base classifiers in the ensemble. When predicting the class of a test sample, the stored data similar to it in the locality-sensitive hash table serve as a bridge: each base classifier's dynamic weight for that sample is computed from them, and the base classifiers' vote weights and dynamic weights are combined to decide the sample's class. Comparative experiments show that the incremental algorithm has relatively high stability and generalization ability.
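The prediction step described above lends itself to a short sketch. The code below is a hypothetical reconstruction, not the paper's implementation: a brute-force nearest-neighbor scan stands in for the locality-sensitive hash table, and all names (`predict`, `vote_weights`, `stored`, `k`) are illustrative.

```python
# Hedged sketch: combine stored vote weights with dynamic weights computed
# from the test sample's stored neighbors (the LSH lookup is replaced by a
# brute-force scan for clarity).

def predict(x, base_clfs, vote_weights, stored, k=5):
    # stored: list of (features, label) sketches kept during training;
    # the k nearest ones stand in for an LSH bucket lookup
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    neighbors = sorted(stored, key=lambda s: dist(s[0], x))[:k]
    scores = {}
    for clf, w in zip(base_clfs, vote_weights):
        # dynamic weight: the classifier's accuracy on the neighbors
        dyn = sum(clf(f) == lab for f, lab in neighbors) / len(neighbors)
        label = clf(x)
        scores[label] = scores.get(label, 0.0) + w * dyn
    return max(scores, key=scores.get)
```

A classifier that is accurate near the test point thus contributes more to the vote than its global weight alone would suggest.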

2.
A selective ensemble method for multiple classifiers (citations: 2 total, 0 self, 2 other)
To meet today's demanding classification performance requirements and cope with the complexity of combining multiple classifiers, a new selective ensemble algorithm for multiple classifiers is proposed, built on two criteria: the accuracy of the base classifiers and the diversity among them. The algorithm first selects the base classifiers with higher classification accuracy from those generated, then uses a classifier diversity measure to pick the high-performing base classifiers with large diversity, so that a new classifier set is obtained by selection before ensembling. Experiments on UCI data sets show that the method outperforms bagging and achieves good classification results.
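The two-stage selection could be sketched as below. This is a minimal illustration assuming label predictions on a validation set and a pairwise disagreement measure (the abstract does not fix a specific diversity measure); all names are illustrative.

```python
# Hypothetical sketch of accuracy-then-diversity selection: keep the most
# accurate base classifiers, then greedily pick the most mutually diverse
# among them.

def accuracy(preds, y):
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def disagreement(a, b):
    # fraction of samples on which two classifiers differ
    return sum(x != y for x, y in zip(a, b)) / len(a)

def select_ensemble(all_preds, y, n_accurate, n_final):
    # Stage 1: keep the n_accurate most accurate classifiers.
    ranked = sorted(all_preds, key=lambda p: accuracy(p, y), reverse=True)
    pool = ranked[:n_accurate]
    # Stage 2: greedily add the classifier most different from those chosen.
    chosen = [pool[0]]
    while len(chosen) < n_final:
        best = max((p for p in pool if p not in chosen),
                   key=lambda p: min(disagreement(p, c) for c in chosen))
        chosen.append(best)
    return chosen
```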

3.
To address the demands for high classification performance and the complexity of the ensembling process, the strengths and weaknesses of conventional ensemble methods are analyzed and existing classifier diversity measures are studied; a hierarchical classifier ensemble system is proposed that screens for base classifiers with diversity as large as possible. Different base classifiers are constructed, the more accurate ones are shortlisted, their diversity is analyzed, and the most diverse classifiers are chosen as the system's base classifiers to form the ensemble. Experiments on UCI data sets yield good classification results and confirm the advantages of this ensemble system.

4.
尹光, 朱玉全, 陈耿. 《计算机工程》 (Computer Engineering), 2012, 38(8): 167-169
To improve the classification performance of an ensemble system, a classifier selection ensemble algorithm, MCC-SCEN, is proposed. From the pool of base classifiers, the algorithm selects the subset with the greatest mutual-information diversity and the subset with the greatest individual classification ability to determine the candidate set for extension, then adds base classifiers with high combined classification ability to this set to form the ensemble, whose result is produced by weighted voting. Experiments show that the method outperforms the classical AdaBoost and Bagging methods and attains higher classification accuracy.

5.
Classifier ensembles based on clustering-driven selection (citations: 1 total, 0 self, 1 other)
A classifier ensemble method based on clustering-driven selection is proposed. Clustering partitions the pattern feature space into disjoint regions; for an initial set of classifiers, each region assigns a deletion score to every classifier, each classifier's total score determines its deletion priority, and that priority is used to select the group of classifiers that forms the ensemble. Theoretical analysis and experimental results show that the method classifies patterns better.

6.
Diversity is a necessary condition for a classifier ensemble to generalize well, yet there is no unified treatment of diversity measures, their effectiveness, or the corresponding ensemble optimization techniques. This paper therefore surveys diversity-based classifier ensembles from three angles: diversity measures, analysis of their effectiveness, and the associated classifier ensemble optimization techniques; the effectiveness of diversity measures is also demonstrated intuitively via a vector space model. In addition, several representative diversity-based ensemble techniques (Bagging, boosting, GA-based, quadratic programming (QP), semi-definite programming (SDP), and regularized selective ensemble (RSE)) are compared experimentally on the UCI and USPS databases, and practical advice is given on how to choose diversity measures and concrete optimization techniques.
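Two of the standard pairwise diversity measures such surveys cover, the Q-statistic and the disagreement measure, are both computed from the joint correct/incorrect counts of two classifiers. The sketch below follows the textbook definitions; it is generic, not code from the survey.

```python
# Pairwise diversity from joint correctness counts:
# n11 both correct, n00 both wrong, n10/n01 exactly one correct.

def pairwise_counts(a, b, y):
    n11 = n10 = n01 = n00 = 0
    for pa, pb, t in zip(a, b, y):
        ca, cb = pa == t, pb == t
        if ca and cb:
            n11 += 1
        elif ca:
            n10 += 1
        elif cb:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00

def q_statistic(a, b, y):
    # Q in [-1, 1]; values near 0 indicate independent errors
    n11, n10, n01, n00 = pairwise_counts(a, b, y)
    return (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)

def disagreement(a, b, y):
    # fraction of samples on which exactly one classifier is correct
    n11, n10, n01, n00 = pairwise_counts(a, b, y)
    return (n10 + n01) / (n11 + n10 + n01 + n00)
```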

7.
As a typical form of big data, data streams are continuous, unbounded, fast-arriving, and subject to concept drift, so traditional classification techniques cannot be applied directly and effectively to stream mining. Building on the classical Accuracy Weighted Ensemble (AWE) algorithm, this paper proposes the Concept Very Fast Decision Tree Update Ensemble (CUE) algorithm, which improves the weight assignment of base classifiers and markedly alleviates sensitivity to data block size while increasing diversity among base classifiers. Experiments show that CUE achieves higher classification accuracy than AWE. Finally, the Dynamic Classifier Selection with Clustering (DCSC) algorithm is proposed; based on dynamic classifier selection, it dispenses with elaborate weighting machinery and is therefore time-efficient. Experimental results confirm that DCSC is effective and efficient and handles concept drift well.
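The accuracy-weighted idea that AWE contributes (and which CUE refines) weights each base classifier by how far its mean squared error on the newest data block falls below that of a random classifier predicting the class priors. A minimal sketch of that baseline weighting, with illustrative names, assuming each classifier reports the probability it assigned to the true class:

```python
# Sketch of AWE-style weighting: weight = MSE(random baseline) - MSE(classifier),
# clipped at zero so classifiers worse than random are dropped.

def mse_of_classifier(prob_of_true_class):
    # prob_of_true_class[i]: probability the classifier gave to the
    # true class of sample i in the evaluation block
    n = len(prob_of_true_class)
    return sum((1.0 - p) ** 2 for p in prob_of_true_class) / n

def awe_weight(prob_of_true_class, class_priors):
    # baseline: expected squared error of predicting the class priors
    mse_r = sum(p * (1.0 - p) ** 2 for p in class_priors)
    return max(0.0, mse_r - mse_of_classifier(prob_of_true_class))
```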

8.
The Internet holds vast amounts of text, and text classification systems automatically sort documents into given categories, helping people mine useful information. This paper describes a text classification algorithm based on an ensemble of term-frequency classifiers. The algorithm is computationally cheap and achieves high recall but low precision; the causes of the low precision are analyzed, and on that basis an improved ensemble algorithm is proposed that adjusts the parameters used when updating text weights, raising precision significantly. Experiments confirm the improved algorithm's performance: it not only classifies more accurately but is also more stable.

9.
Research on fast multi-classifier ensemble algorithms (citations: 1 total, 0 self, 1 other)
This paper studies fast multi-classifier ensemble algorithms. Ensembling requires choosing a number of weak classifiers and assigning each a weight. To select the weak classifiers, each one's error rate on the full training set is computed, the classifiers are ranked, and the best-performing ones are kept. For weight assignment, two methods are proposed: a Biased AdaBoost algorithm and a multi-classifier ensemble algorithm based on differential evolution. Experiments on face databases show that, compared with classical AdaBoost, the algorithm effectively reduces training time and improves recognition accuracy.
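The selection step above (rank weak classifiers by training error, keep the best, then combine by weighted vote) can be sketched as follows. The log-odds weight used here is the classic AdaBoost choice, shown only as a stand-in for the paper's two weighting schemes (Biased AdaBoost and differential evolution), which are not reproduced; all names are illustrative.

```python
import math

# Sketch: rank weak classifiers by error rate, keep the k best, weight
# them by the AdaBoost-style log-odds of their error, and vote.

def error_rate(preds, y):
    return sum(p != t for p, t in zip(preds, y)) / len(y)

def pick_and_weight(weak_preds, y, k):
    ranked = sorted(weak_preds, key=lambda p: error_rate(p, y))[:k]
    weights = [0.5 * math.log((1 - error_rate(p, y))
                              / max(error_rate(p, y), 1e-12))
               for p in ranked]
    return ranked, weights

def weighted_vote(sample_preds, weights):
    # sample_preds: each kept classifier's label for one test sample
    score = {}
    for label, w in zip(sample_preds, weights):
        score[label] = score.get(label, 0.0) + w
    return max(score, key=score.get)
```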

10.
To improve the ensemble classification accuracy of decision trees, this paper describes the Rotation Forest ensemble algorithm, which is based on feature transformation: the attribute set is split randomly, and principal component analysis is applied to subsamples drawn on each attribute subset to construct new training data, thereby increasing the diversity of the base classifiers and improving predictive accuracy. On the Weka platform, Bagging, AdaBoost, and Rotation Forest were compared as ensembles of the pruned and unpruned J48 decision tree, using the average accuracy of ten runs of 10-fold cross-validation as the yardstick. The results show that Rotation Forest predicts more accurately than the other two algorithms, confirming it as an effective decision tree ensemble method.
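One Rotation Forest rotation could be sketched as below, assuming numpy is available: split the features into random disjoint subsets, run PCA per subset, and assemble the component loadings into a block-diagonal rotation matrix applied to all features. This is an illustrative reconstruction of the published algorithm's core step, not the Weka implementation (the bootstrap and class subsampling steps are omitted), and all names are made up here.

```python
import random
import numpy as np

# Sketch of one Rotation Forest rotation matrix: per-subset PCA loadings
# placed on the block diagonal, giving an orthogonal rotation of the
# full feature space. Train the base tree on X @ R.

def rotation_matrix(X, n_subsets, rng):
    n_features = X.shape[1]
    idx = list(range(n_features))
    rng.shuffle(idx)
    subsets = [idx[i::n_subsets] for i in range(n_subsets)]
    R = np.zeros((n_features, n_features))
    for subset in subsets:
        Xs = X[:, subset]
        Xs = Xs - Xs.mean(axis=0)
        cov = np.atleast_2d(np.cov(Xs, rowvar=False))
        _, vecs = np.linalg.eigh(cov)  # orthonormal component loadings
        # place the loadings on the block diagonal for this subset
        for a, col in enumerate(subset):
            for b, row in enumerate(subset):
                R[row, col] = vecs[b, a]
    return R
```

Because the blocks are disjoint and each block is orthonormal, the assembled matrix is itself orthogonal, so the transform rotates rather than distorts the data.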

11.
Incremental construction of classifier and discriminant ensembles (citations: 2 total, 0 self, 2 other)
We discuss approaches to incrementally construct an ensemble. The first constructs an ensemble of classifiers choosing a subset from a larger set, and the second constructs an ensemble of discriminants, where a classifier is used for some classes only. We investigate criteria including accuracy, significant improvement, diversity, correlation, and the role of search direction. For discriminant ensembles, we test subset selection and trees. Fusion is by voting or by a linear model. Using 14 classifiers on 38 data sets, incremental search finds small, accurate ensembles in polynomial time. The discriminant ensemble uses a subset of discriminants and is simpler, interpretable, and accurate. We see that an incremental ensemble has higher accuracy than bagging and the random subspace method, and comparable accuracy to AdaBoost but with fewer classifiers.

12.
This paper proposes an automatic design method for neural network ensembles based on evolutionary programming. The method not only drives the individual networks in the ensemble toward different subtasks, it also lets them continually search for the best cooperative relationships during evolution, and the size and structure of the ensemble need not be fixed in advance. Simulation experiments show that the algorithm is effective.

13.
An expert system model based on neural network ensembles (citations: 9 total, 3 self, 9 other)
This paper proposes an expert system model based on neural network ensembles and gives a construction algorithm for the ensemble. In the model, the neural network ensemble serves as an embedded module of the expert system, used for knowledge acquisition, overcoming the "bottleneck" that knowledge acquisition poses for traditional expert systems. The model was applied to library weeding, producing an initial prototype of an ensemble-based library weeding expert system.

14.
In this paper we introduce a framework for making statistical inference on the asymptotic prediction of parallel classification ensembles. The validity of the analysis is fairly general. It only requires that the individual classifiers are generated in independent executions of some randomized learning algorithm, and that the final ensemble prediction is made via majority voting. Given an unlabeled test instance, the predictions of the classifiers in the ensemble are obtained sequentially. As the individual predictions become known, Bayes' theorem is used to update an estimate of the probability that the class predicted by the current ensemble coincides with the classification of the corresponding ensemble of infinite size. Using this estimate, the voting process can be halted when the confidence in the asymptotic prediction is sufficiently high. An empirical investigation in several benchmark classification problems shows that most of the test instances require querying only a small number of classifiers to converge to the infinite ensemble prediction with a high degree of confidence. For these instances, the difference between the generalization error of the finite ensemble and the infinite ensemble limit is very small, often negligible.
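The halting principle can be illustrated without the Bayesian machinery by a deterministic simplification: stop querying classifiers once the unqueried votes can no longer overturn the current majority. The sketch below shows only this simpler rule, not the paper's probabilistic estimator, and its names are illustrative.

```python
# Query classifiers one at a time; halt as soon as the leading class is
# ahead by more than the number of votes still outstanding.

def early_vote(classifiers, x):
    votes = {}
    remaining = len(classifiers)
    queried = 0
    for clf in classifiers:
        label = clf(x)
        votes[label] = votes.get(label, 0) + 1
        remaining -= 1
        queried += 1
        ranked = sorted(votes.values(), reverse=True)
        leader = ranked[0]
        runner_up = ranked[1] if len(ranked) > 1 else 0
        if leader > runner_up + remaining:
            # no outcome of the unqueried classifiers changes the winner
            break
    return max(votes, key=votes.get), queried
```

The Bayesian version halts earlier on average, since it accepts a small, controlled probability of disagreeing with the full vote instead of demanding certainty.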

15.
周朴雄. 《计算机应用研究》 (Application Research of Computers), 2008, 25(10): 2982-2983
Neural network ensembles are introduced into Web text classification, and a scheme for optimally weighted ensembling under a minimum-estimation-error criterion is proposed. Specifically, each network's posterior probability estimate is given a weight according to that network's classification performance and its correlation with the other networks; weighted averaging then improves the accuracy of the posterior probability estimate and hence the classification rate. Experiments on an English corpus show that, compared with the classical Bayes and kNN models, the model classifies both more accurately and faster.

16.
Software fault prediction based on neural network ensembles, with experimental analysis (citations: 1 total, 0 self, 1 other)
Software fault prediction is a key topic in software reliability research during testing. Fault data from the early phase of testing are used to build models that predict faults in later phases, so that later testing and verification resources can be allocated sensibly. Given the known sequence of fault times from testing, models are built using a non-homogeneous Poisson process, a neural network, and a neural network ensemble. Across three case studies, the mean relative prediction errors of the G-O model were 3.02%, 5.88%, and 6.58%, versus 0.19%, 1.88%, and 1.455% for the neural network ensemble model, showing that the ensemble model predicts more accurately.

17.
In this paper, we propose an approach for ensemble construction based on the use of supervised projections, both linear and non-linear, to achieve both accuracy and diversity of individual classifiers. The proposed approach uses the philosophy of boosting, putting more effort on difficult instances, but instead of learning the classifier on a biased distribution of the training set, it uses misclassified instances to find a supervised projection that favors their correct classification. We show that supervised projection algorithms can be used for this task. We try several known supervised projections, both linear and non-linear, in order to test their ability in the present framework. Additionally, the method is further improved by introducing concepts from oversampling for imbalanced data sets; this counteracts the negative effect of a low number of instances when constructing the supervised projections. The method is compared with AdaBoost, showing improved performance on a large set of 45 problems from the UCI Machine Learning Repository. The method is also more robust than AdaBoost in the presence of noise.

18.
Abstract: Neural network ensembles (sometimes referred to as committees or classifier ensembles) are effective techniques to improve the generalization of a neural network system. Combining a set of neural network classifiers whose error distributions are diverse can generate better results than any single classifier. In this paper, some methods for creating ensembles are reviewed, including the following approaches: methods of selecting diverse training data from the original source data set, constructing different neural network models, selecting ensemble nets from ensemble candidates and combining ensemble members' results. In addition, new results on ensemble combination methods are reported.

19.
Voting-based consensus clustering refers to a distinct class of consensus methods in which the cluster label mismatch problem is explicitly addressed. The voting problem is defined as the problem of finding the optimal relabeling of a given partition with respect to a reference partition. It is commonly formulated as a weighted bipartite matching problem. In this paper, we present a more general formulation of the voting problem as a regression problem with multiple-response and multiple-input variables. We show that a recently introduced cumulative voting scheme is a special case corresponding to a linear regression method. We use a randomized ensemble generation technique, where an overproduced number of clusters is randomly selected for each ensemble partition. We apply an information theoretic algorithm for extracting the consensus clustering from the aggregated ensemble representation and for estimating the number of clusters. We apply it in conjunction with bipartite matching and cumulative voting. We present empirical evidence showing substantial improvements in clustering accuracy, stability, and estimation of the true number of clusters based on cumulative voting. The improvements are achieved in comparison to consensus algorithms based on bipartite matching, which perform very poorly with the chosen ensemble generation technique, and also to other recent consensus algorithms.
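The voting (relabeling) problem itself is easy to state in code. For a handful of clusters, exhaustive search over label permutations is a transparent stand-in for the weighted bipartite matching formulation the paper generalizes; the function name and interface below are illustrative only.

```python
from itertools import permutations

# Relabel partition `p` so it agrees as much as possible with the
# reference partition `ref`. Brute force over label permutations is
# fine for small numbers of clusters; the matching formulation solves
# the same problem in polynomial time.

def best_relabeling(p, ref, labels):
    def agreement(mapping):
        return sum(mapping[a] == b for a, b in zip(p, ref))

    best = max(permutations(labels),
               key=lambda perm: agreement(dict(zip(labels, perm))))
    mapping = dict(zip(labels, best))
    return [mapping[a] for a in p]
```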

20.
An ensemble of multiple classifiers is widely considered to be an effective technique for improving the accuracy and stability of a single classifier. This paper proposes a framework of sparse ensembles and presents new linear weighted combination methods for them. A sparse ensemble sparsely combines the outputs of multiple classifiers using a sparse weight vector. When the continuous outputs of multiple classifiers are provided, solving for the sparse weight vector can be formulated in our methods as linear programming problems in which the hinge loss and/or the 1-norm regularization are exploited; both are sparsity-inducing techniques used in machine learning. Only classifiers with nonzero weight coefficients are ensembled. In these LP-based methods, the ensemble training error is minimized while the weight vector of ensemble learning is controlled, which can be thought of as implementing the structural risk minimization rule and naturally explains the good performance of these methods. Promising experimental results over UCI data sets and radar high-resolution range profile data are presented.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.). 京ICP备09084417号