Similar Literature
Found 20 similar documents; search took 797 ms.
1.
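The diversity among base classifiers and the accuracy of each individual base classifier are two key factors affecting the generalization performance of an ensemble system. To address the difficulty of balancing diversity and accuracy, a selective ensemble algorithm for gene expression data is proposed that uses a weighted harmonic average of diversity and accuracy (D-A-WHA) as its measure. Kernel extreme learning machines (KELM) serve as base classifiers; the D-A-WHA measure balances diversity against accuracy, and the base classifiers that are both relatively accurate and highly diverse with respect to the others are selected for the ensemble. Simulation experiments on UCI gene datasets show that, compared with conventional ensemble algorithms such as Bagging and AdaBoost, the D-A-WHA-based selective ensemble algorithm achieves significantly higher classification accuracy and stability, and can be applied effectively to the classification of cancer gene data.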
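The abstract above does not give the D-A-WHA formula, so the following is only a minimal sketch of how a weighted harmonic mean of accuracy and diversity could rank base classifiers. The function names, the `weight` parameter, the exact form of the mean, and the sample values are all assumptions made for illustration, not the authors' definitions.

```python
def weighted_harmonic_mean(accuracy, diversity, weight=0.5):
    """Weighted harmonic mean of accuracy and diversity, both in (0, 1].

    `weight` is the share given to accuracy. This form is an assumption;
    the paper's exact D-A-WHA definition is not stated in the abstract.
    """
    if accuracy <= 0 or diversity <= 0:
        return 0.0
    return 1.0 / (weight / accuracy + (1.0 - weight) / diversity)


def select_top(scores, k):
    """Return the indices of the k highest-scoring classifiers, in index order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical (accuracy, diversity) pairs for four base classifiers.
candidates = [(0.90, 0.10), (0.80, 0.40), (0.70, 0.60), (0.95, 0.05)]
scores = [weighted_harmonic_mean(a, d) for a, d in candidates]
chosen = select_top(scores, 2)
```

Because a harmonic mean is dominated by its smaller argument, the very accurate but barely diverse classifiers (indices 0 and 3) score poorly here, which is the balancing behaviour the abstract describes.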

2.
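In ensemble learning, averaging and voting as combination strategies cannot fully exploit the useful information in the base classifiers, and setting base classifier weights according to volatility is imprecise and inappropriate. These problems degrade ensemble performance. To improve it further, the evidential reasoning (ER) rule is proposed as the combination strategy, with base classifier weights set by a diversity-based weighting method. First, the basic ensemble structure is built with multiple deep learning models as base classifiers and the ER rule as the combination strategy. Next, a diversity measure is used to compute each base classifier's diversity relative to the others. Finally, the diversity values are normalized to obtain the base classifier weights. Classification experiments on several image datasets show that the proposed method is more accurate and more stable than the other methods selected for comparison, which indicates that it makes fuller use of the base classifiers' information and that diversity-based weighting is more precise.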
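The weighting step in the abstract above (measure each base classifier's diversity relative to the others, then normalize) might look roughly like the sketch below. The pairwise disagreement rate is an assumed choice of diversity measure, the function names and sample predictions are hypothetical, and the ER rule itself is not implemented here.

```python
def disagreement(pred_a, pred_b):
    """Fraction of samples on which two classifiers predict different labels."""
    return sum(a != b for a, b in zip(pred_a, pred_b)) / len(pred_a)


def diversity_weights(predictions):
    """Normalize each classifier's mean disagreement with the others into a weight."""
    n = len(predictions)
    mean_div = [
        sum(disagreement(predictions[i], predictions[j]) for j in range(n) if j != i)
        / (n - 1)
        for i in range(n)
    ]
    total = sum(mean_div)
    if total == 0:  # all classifiers agree everywhere: fall back to equal weights
        return [1.0 / n] * n
    return [d / total for d in mean_div]


# Hypothetical label predictions of three base classifiers on four samples.
preds = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]
weights = diversity_weights(preds)
```

The resulting weights sum to 1, so they could be passed directly to a weighted combination rule.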

3.
In this paper, a measure of competence based on random classification (MCR) for classifier ensembles is presented. The measure dynamically selects (i.e. for each test example) a subset of classifiers from the ensemble that perform better than a random classifier. Weak (incompetent) classifiers that would adversely affect the performance of a classification system are therefore eliminated. When all classifiers in the ensemble are evaluated as incompetent, the classification accuracy of the system can be increased by using the random classifier instead. Theoretical justification for using the measure with the majority voting rule is given. Two MCR-based systems were developed and their performance was compared against six multiple classifier systems using data sets taken from the UCI Machine Learning Repository and the Ludmila Kuncheva Collection. The systems developed typically had the highest classification accuracies regardless of the ensemble type used (homogeneous or heterogeneous).
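As a rough illustration of the selection rule described above, the sketch below keeps only classifiers whose competence on the current test example exceeds the chance level and falls back to a random guess when none qualifies. How MCR actually estimates competence is not given in the abstract, so the competence values are taken as inputs; the function name and the 1/`num_classes` threshold are assumptions.

```python
import random
from collections import Counter


def dynamic_select_and_vote(votes, competences, num_classes, rng=None):
    """Majority vote over the classifiers whose competence beats random guessing.

    `votes[i]` is classifier i's predicted class for one test example and
    `competences[i]` is its estimated probability of being correct there.
    """
    chance = 1.0 / num_classes
    kept = [v for v, c in zip(votes, competences) if c > chance]
    if not kept:  # every classifier judged incompetent: guess at random
        rng = rng or random.Random()
        return rng.randrange(num_classes)
    return Counter(kept).most_common(1)[0][0]


# Hypothetical three-class example: the first classifier is below chance and is dropped.
label = dynamic_select_and_vote(
    votes=[0, 1, 1, 2], competences=[0.2, 0.6, 0.5, 0.9], num_classes=3
)
```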

4.
Information Fusion, 2005, 6(1): 21-36
In the context of Multiple Classifier Systems, diversity among base classifiers is known to be a necessary condition for improvement in ensemble performance. In this paper the ability of several pair-wise diversity measures to predict generalisation error is compared. A new pair-wise measure, which is computed between pairs of patterns rather than pairs of classifiers, is also proposed for two-class problems. It is shown experimentally that the proposed measure is well correlated with base classifier test error as base classifier complexity is systematically varied. However, correlation with unity-weighted sum and vote is shown to be weaker, demonstrating the difficulty in choosing base classifier complexity for optimal fusion. An alternative strategy based on weighted combination is also investigated and shown to be less sensitive to number of training epochs.

5.
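To improve ensemble learning performance, an ensemble method combining Rotation Forest and MultiBoost is proposed. The rotation transformation from Rotation Forest is applied to the original dataset to increase diversity among classifiers, and MultiBoost is used to train base classifiers on the transformed dataset to improve their accuracy. Simple majority voting then fuses the base classifiers' decisions into the output of the ensemble classifier. To verify the method, experiments were carried out on public UCI datasets; the results show that it achieves high classification accuracy.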

6.
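To further improve the generalization ability of sea-surveillance radar target recognition in complex interference environments, a dynamic selection ensemble algorithm based on k-medoids clustering and the randomized reference classifier (RRC), called KMRRC, is proposed. Resampling is used to generate multiple base classifiers, which are then partitioned into clusters according to a pairwise diversity measure. An RRC model is built for each base classifier from a validation dataset, and the RRC is used to dynamically select the most competent base classifiers from each cluster for the ensemble decision. The parameter settings of KMRRC are determined through tuning experiments. Using the Weka API called from Java, KMRRC is then compared with nine commonly used ensemble and base classification algorithms on a self-built library of fully polarimetric high-resolution range profile (HRRP) target samples and on 17 UCI datasets, and the influence of the choice of diversity measure on KMRRC performance is studied further. The experiments verify the effectiveness of the algorithm for sea-surveillance radar target recognition.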

7.
Independent component analysis (ICA) has been widely used to tackle the microarray dataset classification problem, but one problem remains unsolved: the independent component (IC) sets may not be reproducible after different ICA transformations. Inspired by the idea of ensemble feature selection, we design an ICA-based ensemble learning system to fully utilize the difference among different IC sets. In this system, several IC sets are first generated by different ICA transformations. A multi-objective genetic algorithm (MOGA) is designed to select different biologically significant IC subsets from these IC sets, which are then used to build base classifiers. Three schemes are used to fuse these base classifiers. The first fusion scheme combines all individuals in the final generation of the MOGA. In addition, during the evolution, we design a global-recording technique to record the best IC subsets of each IC set in a global-recording list. The IC subsets in the list are then used to build base classifiers, which implements the second fusion scheme. Furthermore, by pruning about half of the less accurate base classifiers obtained by the second scheme, a compact and more accurate ensemble system is built, which is regarded as the third fusion scheme. Three microarray datasets are used to test the ensemble systems, and the corresponding results demonstrate that these ensemble schemes can further improve the performance of the ICA-based classification model, and that the third fusion scheme leads to the most accurate ensemble system with the smallest ensemble size.

8.
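Selective support vector machine ensemble based on rough set theory (total citations: 1; self-citations: 0; citations by others: 1)
The performance of an ensemble classifier depends largely on how its member classifiers are constructed and how they are combined. A selective support vector machine ensemble algorithm based on rough set theory is proposed. The algorithm first uses rough set techniques to generate a set of attribute reducts and builds each member classifier on the attribute space of one reduct. It then computes the accuracy and diversity of the member classifiers and selects for the ensemble those that satisfy both the individual accuracy requirement and the diversity requirement. Tests on a group of UCI datasets confirm that the method effectively improves the generalization performance of support vector machines.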

9.
Recent research in fault classification has shown the importance of accurately selecting the features that are used as inputs to the diagnostic model. In this work, a multi-objective genetic algorithm (MOGA) is considered for the feature selection phase. Then, two different techniques for using the selected features to develop the fault classification model are compared: a single classifier based on the feature subset with the best classification performance and an ensemble of classifiers working on different feature subsets. The motivation for developing ensembles of classifiers is that they can achieve higher accuracies than single classifiers. An important issue for an ensemble to be effective is the diversity in the predictions of the base classifiers which constitute it, i.e. their capability of erring on different sub-regions of the pattern space. In order to show the benefits of having diverse base classifiers in the ensemble, two different ensembles have been developed: in the first, the base classifiers are constructed on feature subsets found by MOGAs aimed at maximizing the fault classification performance and at minimizing the number of features of the subsets; in the second, diversity among classifiers is added to the MOGA search as the third objective function to maximize. In both cases, a voting technique is used to effectively combine the predictions of the base classifiers to construct the ensemble output. For verification, some numerical experiments are conducted on a case of multiple-fault classification in rotating machinery and the results achieved by the two ensembles are compared with those obtained by a single optimal classifier.

10.
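To improve classifier ensemble performance, an ensemble method combining clustering with ordered pruning is proposed. First, confusion matrices are used to quantify the diversity between base classifiers, and clustering partitions the classifiers into several subsets. Next, an ordered pruning algorithm is proposed: starting from the classifier closest to the cluster center, the diversity matrix is dynamically weighted according to the distances between classifiers, and the weighted diversity serves as the ranking criterion for pruning a fixed proportion of the classifiers in each subset. Finally, the selected base classifiers are combined by voting. Comparison with several ensemble methods on 10 datasets from the UCI repository shows that the classifier selection method based on clustering and ordered pruning effectively improves the classification ability of the ensemble system.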

11.
A classifier ensemble combines the predictions of a set of individual classifiers to produce more accurate results than any single classifier system. However, a classifier ensemble with too many classifiers may consume a large amount of computational time. This paper proposes a new ensemble subset evaluation method that integrates classifier diversity measures into a novel classifier ensemble reduction framework. The framework converts the ensemble reduction into an optimization problem and uses the harmony search algorithm to find the optimized classifier ensemble. Both pairwise and non-pairwise diversity measure algorithms are applied by the subset evaluation method. For the pairwise diversity measure, three conventional diversity algorithms and one new diversity measure method are used to calculate the diversity’s merits. For the non-pairwise diversity measure, three classical algorithms are used. The proposed subset evaluation methods are demonstrated by the experimental data. In comparison with other classifier ensemble methods, the method implemented by the measurement of the interrater agreement exhibits a high accuracy prediction rate against the current ensembles’ performance. In addition, the framework with the new diversity measure achieves relatively good performance with less computational time.

12.
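Cao Peng, Li Bo, Li Wei, Zhao Dazhe. Journal of Computer Applications (《计算机应用》), 2013, 33(2): 550-553
To address the low accuracy and reduced efficiency of classification on large-scale data, an adaptive random subspace ensemble classification algorithm combined with X-means clustering is proposed. X-means clustering first decomposes the complex data space automatically into multiple sample subspaces for divide-and-conquer learning while preserving the original data structure. The adaptive random subspace ensemble classifier increases the diversity of the base classifiers and determines their number automatically, which improves the robustness and accuracy of the ensemble. The algorithm was tested on synthetic and UCI datasets and compared with conventional single and ensemble classification algorithms. The experimental results show that, on large-scale datasets, the method achieves better classification accuracy and robustness and improves overall efficiency.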

13.
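Compared with ensemble learning, ensemble pruning searches for an optimal subset among multiple classifiers, improving generalization performance and simplifying the ensemble. Pareto ensemble pruning considers both classifier accuracy and ensemble size and treats both as optimization objectives. However, it considers only base classifier accuracy and ensemble size and ignores the diversity among classifiers, so the selected classifiers tend to be highly similar. This paper proposes a Pareto ensemble pruning algorithm that incorporates diversity: classifier diversity and accuracy are combined into the first optimization objective, and ensemble size is the second, yielding a multi-objective optimization. Experiments show that, at comparable ensemble sizes, the improved algorithm outperforms Pareto ensemble pruning owing to the inclusion of diversity.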
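The two-objective view in the abstract above can be illustrated with a generic Pareto-dominance filter. This is only a sketch of the dominance test, not the paper's search procedure; both objectives are assumed to be minimized, and the candidate values (a combined error/similarity score and an ensemble size) are hypothetical.

```python
def dominates(p, q):
    """True if p is no worse than q on every objective and strictly better on one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(points):
    """Return the points that no other point dominates, in their original order."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical candidate sub-ensembles: (combined error/similarity score, ensemble size).
candidates = [(0.20, 5), (0.15, 8), (0.25, 5), (0.15, 10), (0.30, 3)]
front = pareto_front(candidates)
```

Here `(0.25, 5)` is dominated by `(0.20, 5)` and `(0.15, 10)` by `(0.15, 8)`, so three trade-off candidates remain.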

14.
The primary effect of using a reduced number of classifiers is a reduction in the computational requirements during learning and classification time. In addition to this obvious result, research shows that the fusion of all available classifiers is not a guarantee of best performance but of good results on average. The much-researched issue of whether it is better to fuse or to select has attracted even more interest in recent years with the development of the Online Boosting theory, where a limited set of classifiers is continuously updated as new inputs are observed and classifications performed. The concept of online classification has recently received significant interest in the computer vision community. Classifiers can be trained on the visual features of a target, casting the tracking problem into a binary classification one: distinguishing the target from the background. Here we discuss how to optimize the performance of a classifier ensemble employed for target tracking in video sequences. In particular, we propose the F-score measure as a novel means to select the members of the ensemble in a dynamic fashion. For each frame, the ensemble is built as a subset of a larger pool of classifiers, selecting its members according to their F-score. We observed an overall increase in classification accuracy and a general tendency towards redundancy reduction among the members of an F-score-optimized ensemble. We carried out our experiments on both benchmark binary datasets and standard video sequences.
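A minimal sketch of F-score-based member selection, under the assumption that each classifier in the pool is scored on labelled samples from the current frame and the top `k` are kept. The function names, the value of `k`, and the sample labels are hypothetical; the abstract does not specify how the scores are obtained per frame.

```python
def f_score(y_true, y_pred, positive=1):
    """F1 score of binary predictions with respect to the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_by_f_score(y_true, pool_predictions, k):
    """Return the indices of the k classifiers with the highest F-score, plus all scores."""
    scores = [f_score(y_true, pred) for pred in pool_predictions]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


# Hypothetical target (1) vs background (0) labels for five patches in one frame.
y_true = [1, 1, 0, 0, 1]
pool = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
]
members, scores = select_by_f_score(y_true, pool, k=2)
```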

15.
Due to the wide variety of fusion techniques available for combining multiple classifiers into a more accurate classifier, a number of good studies have been devoted to determining in what situations some fusion methods should be preferred over other ones. However, the sample size behavior of the various fusion methods has hitherto received little attention in the literature of multiple classifier systems. The main contribution of this paper is thus to investigate the effect of training sample size on their relative performance and to gain more insight into the conditions for the superiority of some combination rules. A large experiment is conducted to study the performance of some fixed and trainable combination rules for executing one- and two-level classifier fusion for different training sample sizes. The experimental results yield the following conclusions: when implementing one-level fusion to combine homogeneous or heterogeneous base classifiers, fixed rules outperform trainable ones in nearly all cases, with only one exception of merging heterogeneous classifiers for large sample size. Moreover, the best classification for any considered sample size is generally achieved by a second level of combination (namely, utilizing one fusion rule to further combine a set of ensemble classifiers with each of them constructed by fusing base classifiers). Under these circumstances, it seems that adopting different types of fusion rules (fixed or trainable) as the combiners for two levels of fusion is appropriate.

16.
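Multiple classifier selection ensemble method (total citations: 2; self-citations: 0; citations by others: 2)
In view of the high demands placed on classification performance and the complexity of implementing multiple classifier ensembles, a new multiple classifier selection ensemble algorithm is proposed that considers both base classifier accuracy and the diversity among base classifiers. The algorithm first selects the more accurate classifiers from the generated base classifiers, then uses a classifier diversity measure to select high-performing base classifiers with large diversity, so that the classifier set is pruned into a new set before the ensemble is built. Experimental results on the UCI repository show that the method outperforms bagging and achieves good classification and recognition results.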

17.
Training set resampling based ensemble design techniques are successfully used to reduce the classification errors of the base classifiers. Boosting is one of the techniques used for this purpose, where each training set is obtained by drawing samples with replacement from the available training set according to a weighted distribution which is modified for each new classifier to be included in the ensemble. The weighted resampling results in a classifier set, each being accurate in different parts of the input space mainly specified by the sample weights. In this study, a dynamic integration of boosting based ensembles is proposed so as to take into account the heterogeneity of the input sets. An evidence-theoretic framework is developed for this purpose so as to take into account the weights and distances of the neighboring training samples in both training and testing boosting based ensembles. The effectiveness of the proposed technique is compared to the AdaBoost algorithm using three different base classifiers.

18.
Training neural networks to distinguish different emotions from physiological signals frequently involves fuzzy definitions of each affective state. In addition, manual design of classification tasks often uses sub-optimum classifier parameter settings, leading to average classification performance. In this study, a framework is proposed for multi-layered optimization of an ensemble of classifiers, to maximize the system's ability to learn and classify affect and to minimize human involvement in setting optimum parameters for the classification system. Using fuzzy adaptive resonance theory mapping (ARTMAP) as the classifier template, genetic algorithms (GAs) were employed to perform exhaustive search for the best combination of parameter settings for individual classifier performance. Speciation was implemented using subset selection of classification data attributes, as well as using an island model genetic algorithm method. Subsequently, the generated population of optimum classifier configurations was used as candidates to form an ensemble of classifiers. Another set of GAs was used to search for the combination of classifiers that would result in the best classification ensemble accuracy. The proposed methodology was tested using two affective data sets and was able to produce relatively small ensembles of fuzzy ARTMAPs with excellent affect recognition accuracy.

19.
Credit scoring aims to assess the risk associated with lending to individual consumers. Recently, ensemble classification methodology has become popular in this field. However, most studies use random sampling to generate training subsets for constructing the base classifiers. Their diversity is therefore not guaranteed, which may lead to a degradation of overall classification performance. In this paper, we propose an ensemble classification approach based on supervised clustering for credit scoring. In the proposed approach, supervised clustering is employed to partition the data samples of each class into a number of clusters. Clusters from different classes are then pairwise combined to form a number of training subsets. In each training subset, a specific base classifier is constructed. For a sample whose class label needs to be predicted, the outputs of these base classifiers are combined by weighted voting. The weight associated with a base classifier is determined by its classification performance in the neighborhood of the sample. In the experimental study, two benchmark credit data sets are adopted for performance evaluation, and an industrial case study is conducted. The results show that compared to other ensemble classification methods, the proposed approach is able to generate base classifiers with higher diversity and local accuracy, and improve the accuracy of credit scoring.
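The weighted voting step described above can be sketched as follows, assuming that a base classifier's weight is simply its accuracy on the validation samples nearest to the query. How the neighborhood is found is left out; the function names, the correctness flags, and the neighbor indices are hypothetical.

```python
from collections import defaultdict


def local_accuracy(correct_flags, neighbor_idx):
    """Share of the neighboring validation samples that a classifier labelled correctly."""
    return sum(correct_flags[i] for i in neighbor_idx) / len(neighbor_idx)


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical data: 1 marks a validation sample the classifier got right.
correct = [
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 1, 1],
]
neighbors = [0, 2, 3]  # assumed indices of the query sample's nearest validation samples
weights = [local_accuracy(flags, neighbors) for flags in correct]
label = weighted_vote(["good", "bad", "bad"], weights)
```

In this example a plain majority vote would return "bad", whereas the locally accurate first classifier outweighs the other two.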

20.
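To improve the classification performance of facial expression recognition, a Quadratic Optimization Choice (QOC) ensemble classification model is proposed on the basis of ensemble learning theory. First, nine base classifiers are ranked by performance and the top 30% are selected as candidate base classifiers for the ensemble. Next, a family of ensemble models is generated according to combination rules. Finally, a second round of optimized selection is applied to this family: the subset of ensemble classifiers with the smallest generalization error is chosen, which determines the optimal ensemble classification model. To evaluate the QOC model, ensemble models using the maximum, minimum, and mean rules were chosen as baselines. The experimental results show that the QOC ensemble model classifies better than the base classifiers; in particular, for the sad expression class, which had a poor recognition rate, the average recognition rate improved by 21.11%. Compared with non-selective ensemble models, the QOC model also shows a significant improvement in recognition performance.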
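The maximum, minimum, and mean rules used as baselines above are standard fixed combination rules over per-class scores. The sketch below shows one common formulation; the `combine` helper and the probability values are hypothetical, and this is not the QOC model itself.

```python
def combine(prob_rows, rule):
    """Fuse per-classifier class-probability rows and return the winning class index.

    `prob_rows[i][c]` is classifier i's score for class c; `rule` is one of
    "max", "min" or "mean".
    """
    reducers = {"max": max, "min": min, "mean": lambda xs: sum(xs) / len(xs)}
    reduce_fn = reducers[rule]
    fused = [reduce_fn(column) for column in zip(*prob_rows)]
    return fused.index(max(fused))


# Hypothetical outputs of three classifiers over three expression classes.
probs = [
    [0.80, 0.15, 0.05],
    [0.10, 0.30, 0.60],
    [0.10, 0.30, 0.60],
]
by_max = combine(probs, "max")
by_min = combine(probs, "min")
by_mean = combine(probs, "mean")
```

With these values the three rules pick three different classes, which shows why the choice of rule matters when the base classifiers disagree.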
