Similar Documents
20 similar documents found (search time: 15 ms)
1.
《Information Fusion》2003,4(2):87-100
A popular method for creating an accurate classifier from a set of training data is to build several classifiers and then combine their predictions. Ensembles of simple Bayesian classifiers have traditionally not been a focus of research. One way to generate an ensemble of accurate and diverse simple Bayesian classifiers is to use different feature subsets generated with the random subspace method: the ensemble consists of multiple classifiers constructed on randomly selected feature subsets, that is, in randomly chosen subspaces. In this paper, we present an algorithm for building ensembles of simple Bayesian classifiers in random subspaces. The EFS_SBC algorithm includes a hill-climbing-based refinement cycle, which tries to improve the accuracy and diversity of the base classifiers built on random feature subsets. We conduct a number of experiments on a collection of 21 real-world and synthetic data sets, comparing the EFS_SBC ensembles with the single simple Bayes and with the boosted simple Bayes. In many cases the EFS_SBC ensembles have higher accuracy than both the single simple Bayesian classifier and the boosted Bayesian ensemble. We find that ensembles produced with a focus on diversity have lower generalization error, and that the importance of diversity in building the ensembles differs across data sets. We propose several methods for integrating the simple Bayesian classifiers in the ensembles. In a number of cases, techniques for dynamic integration of classifiers achieve significantly better classification accuracy than their simple static analogues. We suggest that this is because dynamic integration makes better use of the ensemble coverage than static integration.
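The random subspace construction described above can be sketched in a few lines. The following is a minimal illustration, not the EFS_SBC algorithm itself (it omits the hill-climbing refinement cycle and the dynamic integration strategies); the class and function names are ours, using a toy Gaussian naive Bayes as the base learner and static majority voting:

```python
import numpy as np

class NaiveBayes:
    """Minimal Gaussian naive Bayes, used here only as the base learner."""
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.var = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        self.log_prior = np.log([np.mean(y == c) for c in self.classes])
        return self

    def predict(self, X):
        # log-likelihood under per-class independent Gaussians
        ll = -0.5 * (((X[:, None, :] - self.mu) ** 2) / self.var
                     + np.log(2 * np.pi * self.var)).sum(axis=2)
        return self.classes[np.argmax(ll + self.log_prior, axis=1)]

def random_subspace_ensemble(X, y, n_members=15, subspace_frac=0.5, seed=0):
    """Train one naive Bayes per randomly chosen feature subset."""
    rng = np.random.default_rng(seed)
    k = max(1, int(subspace_frac * X.shape[1]))
    members = []
    for _ in range(n_members):
        feats = rng.choice(X.shape[1], size=k, replace=False)
        members.append((feats, NaiveBayes().fit(X[:, feats], y)))
    return members

def majority_vote(members, X):
    # static combination: each member votes with its own feature subset
    votes = np.stack([clf.predict(X[:, feats]) for feats, clf in members])
    return np.array([np.bincount(col).argmax() for col in votes.T])
```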

2.
The problem of object category classification by committees or ensembles of classifiers, each based on one diverse codebook, is addressed in this paper. Two methods of constructing visual codebook ensembles are proposed. The first introduces diverse individual visual codebooks by using different clustering algorithms. The second uses visual codebooks of different sizes to construct an ensemble with high diversity. Codebook ensembles are trained to capture and convey image properties from different aspects. Based on these codebook ensembles, different types of image representations can be acquired, and a classifier ensemble can be trained on the different representation datasets derived from the same training image set. Using such a classifier ensemble to categorize new images leads to improved performance. Detailed experimental analysis on a Pascal VOC challenge dataset reveals that the present ensemble approach performs well, consistently improves the performance of visual object classifiers, and achieves state-of-the-art categorization performance.

3.

Abstract  

In a neural network ensemble, the diversity of its component networks is a crucial factor for boosting generalization performance. In terms of how each ensemble system solves the problem, existing ensemble mechanisms can be roughly categorized into two groups: data-driven and model-driven ensembles. The former creates diversity among ensemble members by manipulating the data, while the latter achieves diversity by manipulating the component models themselves. Standard back-propagation (BP) networks are usually used as the base components of a neural network ensemble. In this article, however, we use our previously designed improved circular back-propagation (ICBP) neural network to build such an ensemble. ICBP differs from a BP network not only in that an extra anisotropic input node is added but, more importantly, in that this extra node gives it a property the BP network lacks: simply by assigning different sets of values 1 and −1 to the weights connecting the extra node to the hidden nodes, we can construct a set of heterogeneous ICBP networks with different hidden-layer activation functions. From these, we select four typical heterogeneous ICBPs to build a dynamic classifier selection ICBP system (DCS-ICBP), which falls into the category of model-driven ensembles. The aim of this article is to explore the relationship between the explicitly constructed ensemble and the diversity scale, and to verify the feasibility and effectiveness of the system on classification problems through empirical study. Experimental results on seven benchmark classification tasks show that DCS-ICBP outperforms each individual ICBP classifier and surpasses the combination of ICBPs using the majority voting technique, i.e., the majority voting ICBP system (MVICBP).
The successful simulation results validate that DCS-ICBP provides a new constructive method for enforcing diversity in ICBP ensemble systems.

4.
Rotation forest: A new classifier ensemble method
We propose a method for generating classifier ensembles based on feature extraction. To create the training data for a base classifier, the feature set is randomly split into K subsets (K is a parameter of the algorithm) and Principal Component Analysis (PCA) is applied to each subset. All principal components are retained in order to preserve the variability information in the data. Thus, K axis rotations take place to form the new features for a base classifier. The idea of the rotation approach is to encourage individual accuracy and diversity within the ensemble simultaneously. Diversity is promoted through the feature extraction for each base classifier. Decision trees were chosen here because they are sensitive to rotation of the feature axes, hence the name "forest". Accuracy is sought by keeping all principal components and also by using the whole data set to train each base classifier. Using WEKA, we examined the Rotation Forest ensemble on a random selection of 33 benchmark data sets from the UCI repository and compared it with Bagging, AdaBoost, and Random Forest. The results were favorable to Rotation Forest and prompted an investigation into the diversity-accuracy landscape of the ensemble models. Diversity-error diagrams revealed that Rotation Forest ensembles construct individual classifiers that are more accurate than those in AdaBoost and Random Forest, and more diverse than those in Bagging, sometimes more accurate as well.
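The rotation step at the heart of the method can be sketched as follows. This is a simplified illustration of the idea only (the full algorithm also draws bootstrap samples and class subsets before applying PCA); the function name is ours:

```python
import numpy as np

def rotation_matrix(X, K, rng):
    """One Rotation Forest-style rotation: randomly split the d features
    into K subsets, run PCA (via the covariance eigendecomposition) on
    each subset, and place each subset's components as a block of a
    d x d rotation matrix. Keeping all components makes each block,
    and hence the whole matrix, orthogonal."""
    d = X.shape[1]
    subsets = np.array_split(rng.permutation(d), K)
    R = np.zeros((d, d))
    for feats in subsets:
        Xs = X[:, feats] - X[:, feats].mean(axis=0)
        cov = np.atleast_2d(np.cov(Xs, rowvar=False))
        _, vecs = np.linalg.eigh(cov)      # orthonormal principal axes
        R[np.ix_(feats, feats)] = vecs
    return R

# A base classifier would then be trained on the rotated features X @ R,
# with a fresh rotation drawn for every ensemble member.
```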

5.
《Applied Soft Computing》2008,8(1):305-315
This paper presents a soft-computing-based bank performance prediction system. It is an ensemble system whose constituent models are a multi-layered feed-forward neural network trained with backpropagation (MLFF-BP), a probabilistic neural network (PNN), a radial basis function neural network (RBFN), a support vector machine (SVM), classification and regression trees (CART), and a fuzzy rule based classifier. Further, principal component analysis (PCA) based hybrid neural networks, viz. PCA-MLFF-BP, PCA-PNN and PCA-RBF, are also included as constituents of the ensemble. Moreover, GRNN and PNN were trained with a genetic algorithm to optimize the smoothing factors. Two ensembles, (i) simple majority voting based and (ii) weight based, are implemented. The system predicts the performance of a bank in the coming financial year based on its previous two years' financial data. Ten-fold cross-validation is performed in the training sessions and results are validated with an independent production set. It is demonstrated that the ensemble yields lower Type I and Type II errors than its constituent models. Further, the ensemble also outperformed an earlier study [P.G. Swicegood, Predicting poor bank profitability: a comparison of neural network, discriminant analysis and professional human judgement, Ph.D. Thesis, Department of Finance, Florida State University, 1998] that used multivariate discriminant analysis (MDA), MLFF-BP and human judgment.

6.
A classifier ensemble combines the predictions of a set of individual classifiers to produce more accurate results than those of any single classifier. However, an ensemble with too many classifiers may consume a large amount of computation time. This paper proposes a new ensemble subset evaluation method that integrates classifier diversity measures into a novel classifier ensemble reduction framework. The framework converts ensemble reduction into an optimization problem and uses the harmony search algorithm to find the optimized classifier ensemble. Both pairwise and non-pairwise diversity measures are applied by the subset evaluation method. For the pairwise case, three conventional diversity algorithms and one new diversity measure are used to calculate the diversity merits; for the non-pairwise case, three classical algorithms are used. The proposed subset evaluation methods are demonstrated on experimental data. In comparison with other classifier ensemble methods, the variant based on the interrater agreement measure achieves high prediction accuracy relative to current ensembles, and the framework with the new diversity measure achieves relatively good performance with less computation time.
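Two conventional pairwise diversity measures of the kind used by such frameworks, the disagreement measure and Yule's Q statistic, can be sketched as follows (a minimal illustration; the function names are ours):

```python
import numpy as np

def disagreement(pred_a, pred_b, y):
    """Pairwise disagreement: fraction of samples on which exactly one
    of the two classifiers is correct (higher means more diverse)."""
    a, b = pred_a == y, pred_b == y
    return np.mean(a != b)

def q_statistic(pred_a, pred_b, y):
    """Yule's Q statistic: close to +1 when the two classifiers err on
    the same samples, near 0 for independent errors, negative when
    they err on different samples (lower means more diverse)."""
    a, b = pred_a == y, pred_b == y
    n11, n00 = np.sum(a & b), np.sum(~a & ~b)
    n10, n01 = np.sum(a & ~b), np.sum(~a & b)
    denom = n11 * n00 + n01 * n10
    return (n11 * n00 - n01 * n10) / denom if denom else 0.0
```

Averaging either measure over all classifier pairs gives an ensemble-level diversity merit that a search procedure (such as harmony search) can optimize alongside accuracy.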

7.
A neural network ensemble method based on causal discovery
凌锦江, 周志华. 《软件学报》 (Journal of Software), 2004, 15(10): 1479-1484
Existing neural network ensemble methods mainly perturb the training data to produce accurate individual networks with large diversity, and thereby obtain strong generalization ability. Using causal discovery techniques, the ancestor attributes of the class attribute are identified in the sampled data, and individual networks are then generated from data containing only these attributes. This effectively combines perturbation of the training data with perturbation of the input attributes, producing individuals that are both accurate and diverse. Experimental results show that the generalization ability of this method is comparable to or better than that of several currently popular ensemble methods.

8.
Based on the idea of immune clustering, a neural network ensemble method is proposed. Roulette-wheel selection is used to repeatedly draw samples from each immune cluster to form the training set of each individual network in the ensemble, and the ensemble output is obtained by relative majority voting. The immune-clustering-based neural network ensemble is applied to tongue diagnosis in traditional Chinese medicine and simulated on liver-disease syndrome diagnosis. Experimental results show that the immune-clustering-based neural network ensemble improves generalization more effectively than a Bagging-based neural network ensemble, demonstrating that the proposed algorithm is feasible and effective.
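The roulette-wheel sampling step described above can be sketched as follows; this is an illustrative reading, not the authors' implementation, and the function name and per-cluster weighting scheme are our assumptions:

```python
import numpy as np

def roulette_training_set(clusters, weights, n_samples, rng):
    """Build one ensemble member's training set by roulette-wheel
    selection: each draw first picks a cluster with probability
    proportional to its weight, then picks a sample uniformly from that
    cluster (with replacement), so different members see differently
    mixed training data."""
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()                                  # normalize the wheel
    picks = []
    for _ in range(n_samples):
        c = rng.choice(len(clusters), p=p)           # spin the wheel
        picks.append(clusters[c][rng.integers(len(clusters[c]))])
    return np.array(picks)
```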

9.
Is Combining Classifiers with Stacking Better than Selecting the Best One?
Džeroski, Sašo; Ženko, Bernard. Machine Learning, 2004, 54(3): 255-273
We empirically evaluate several state-of-the-art methods for constructing ensembles of heterogeneous classifiers with stacking and show that they perform (at best) comparably to selecting the best classifier from the ensemble by cross-validation. Among state-of-the-art stacking methods, stacking with probability distributions and multi-response linear regression performs best. We propose two extensions of this method, one using an extended set of meta-level features and the other using multi-response model trees to learn at the meta-level. We show that the latter extension performs better than existing stacking approaches and better than selecting the best classifier by cross-validation.
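The best-performing meta-level scheme, multi-response linear regression over base-classifier probability distributions, can be sketched as follows, assuming the base classifiers' class-probability outputs have already been collected (e.g. by cross-validation) into a matrix P; this is an illustrative simplification, not the authors' exact implementation, and the function names are ours:

```python
import numpy as np

def fit_mlr_meta(P, y, n_classes):
    """Meta-level multi-response linear regression: fit one linear model
    per class against the 0/1 class-indicator targets, taking the stacked
    base-classifier class probabilities P (n_samples x n_meta_features)
    as meta-level features."""
    Y = np.eye(n_classes)[y]                   # one-hot indicator targets
    A = np.hstack([P, np.ones((len(P), 1))])   # append an intercept column
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)  # least-squares per class
    return W

def predict_meta(W, P):
    A = np.hstack([P, np.ones((len(P), 1))])
    return np.argmax(A @ W, axis=1)            # highest-response class wins
```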

10.
A theoretical analysis of bagging as a linear combination of classifiers
We apply an analytical framework for the analysis of linearly combined classifiers to ensembles generated by bagging. This provides an analytical model of bagging misclassification probability as a function of the ensemble size, which is a novel result in the literature. Experimental results on real data sets confirm the theoretical predictions. This allows us to derive a novel and theoretically grounded guideline for choosing bagging ensemble size. Furthermore, our results are consistent with explanations of bagging in terms of classifier instability and variance reduction, support the optimality of the simple average over the weighted average combining rule for ensembles generated by bagging, and apply to other randomization-based methods for constructing classifier ensembles. Although our results do not allow us to compare the bagging misclassification probability with that of an individual classifier trained on the original training set, we discuss how the considered theoretical framework could be exploited to this aim.

11.
Ensemble classification – combining the results of a set of base learners – has received much attention in the machine learning community and has demonstrated promising capabilities for improving classification accuracy. Compared with neural network or decision tree ensembles, there is no comprehensive empirical research on support vector machine (SVM) ensembles. To fill this void, this paper analyses and compares SVM ensembles built with four different construction techniques, namely bagging, AdaBoost, Arc-X4 and a modified AdaBoost. Twenty real-world data sets from the UCI repository are used as benchmarks to evaluate and compare the classification accuracy of these SVM ensemble classifiers. Different kernel functions and different numbers of base SVM learners are tested in the ensembles. The experimental results show that although SVM ensembles are not always better than a single SVM, the bagged SVM ensemble performs as well as or better than the other methods and generalizes relatively well, particularly with a polynomial kernel function. Finally, an industrial case study of gear defect detection is conducted to validate the empirical analysis results.

12.
Recent research in fault classification has shown the importance of accurately selecting the features to be used as inputs to the diagnostic model. In this work, a multi-objective genetic algorithm (MOGA) is considered for the feature selection phase. Then, two different techniques for using the selected features to develop the fault classification model are compared: a single classifier based on the feature subset with the best classification performance, and an ensemble of classifiers working on different feature subsets. The motivation for developing ensembles of classifiers is that they can achieve higher accuracy than single classifiers. An important issue for an ensemble to be effective is diversity in the predictions of its base classifiers, i.e. their capability of erring on different sub-regions of the pattern space. In order to show the benefits of having diverse base classifiers in the ensemble, two different ensembles have been developed: in the first, the base classifiers are constructed on feature subsets found by MOGAs aimed at maximizing the fault classification performance and minimizing the number of features in the subsets; in the second, diversity among classifiers is added to the MOGA search as a third objective to maximize. In both cases, a voting technique is used to combine the predictions of the base classifiers into the ensemble output. For verification, numerical experiments are conducted on a case of multiple-fault classification in rotating machinery, and the results achieved by the two ensembles are compared with those obtained by a single optimal classifier.

13.
Neural network ensembles and support vector machines are both popular methods in the field of machine learning. Ensemble methods have successfully improved the robustness and accuracy of neural networks; in particular, selective ensemble methods, which algorithmically choose individuals with large diversity, have achieved very good results. Support vector machines, in turn, overcome shortcomings of neural networks such as local optima and instability, and have also produced good results in many areas. This paper focuses on the performance of these two methods on small-sample, multi-class data sets. Results on four real-world data sets show that support vector machines perform slightly better than neural network ensembles.

14.
《Image and vision computing》2001,19(9-10):699-707
In the field of pattern recognition, combining an ensemble of neural networks has been proposed as an approach to developing high-performance image classification systems. However, previous work clearly showed that such systems are effective only if the neural networks forming them make different errors. Therefore, the fundamental need for methods aimed at designing ensembles of ‘error-independent’ networks is now acknowledged. In this paper, an approach to the automatic design of effective neural network ensembles is proposed. Given an initial large set of neural networks, our approach aims to select the subset formed by the most error-independent nets. Reported results on the classification of multisensor remote-sensing images show that this approach allows one to design effective neural network ensembles.
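One simple flavor of selecting error-independent members can be sketched as greedy forward selection on validation errors; this is only an illustrative proxy for the paper's approach, and the function name is ours:

```python
import numpy as np

def select_error_independent(preds, y, k):
    """Greedy forward selection: start from the most accurate member,
    then repeatedly add the member whose errors overlap least with the
    errors of the already-selected set (a crude error-independence proxy).
    preds is an (n_members, n_samples) array of predicted labels."""
    errors = preds != y                       # boolean error matrix
    chosen = [int(np.argmin(errors.mean(axis=1)))]
    while len(chosen) < k:
        covered = errors[chosen].any(axis=0)  # samples some member already misses
        overlap = [np.inf if i in chosen else float((errors[i] & covered).mean())
                   for i in range(len(preds))]
        chosen.append(int(np.argmin(overlap)))
    return chosen
```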

15.
In this paper, a generalized adaptive ensemble generation and aggregation (GAEGA) method for the design of multiple classifier systems (MCSs) is proposed. GAEGA adopts an "over-generation and selection" strategy to achieve a good bias-variance tradeoff. In the training phase, different ensembles of classifiers are adaptively generated by fitting the validation data globally with different degrees. The test data are then classified by each of the generated ensembles. The final decision is made by taking into consideration both each ensemble's ability to fit the validation data locally and the need to reduce the risk of overfitting. In this paper, the performance of GAEGA is assessed experimentally in comparison with other multiple classifier aggregation methods on 16 data sets. The experimental results demonstrate that GAEGA significantly outperforms the other methods in terms of average accuracy, by margins ranging from 2.6% to 17.6%.

16.
Considerable research effort has been expended to identify more accurate models for decision support systems in financial decision domains, including credit scoring and bankruptcy prediction. The focus of this earlier work has been to identify the "single best" prediction model from a collection that includes simple parametric models, nonparametric models that directly estimate data densities, and nonlinear pattern recognition models such as neural networks. Recent theories suggest this work may be misguided, in that ensembles of predictors provide more accurate generalization than reliance on a single model. This paper investigates three recent ensemble strategies: cross-validation, bagging, and boosting. We employ the multilayer perceptron neural network as a base classifier. The generalization ability of the neural network ensemble is found to be superior to the single best model for three real-world financial decision applications.

17.
Ensemble methods aim at combining multiple learning machines to improve the efficacy of a learning task in terms of prediction accuracy, scalability, and other measures. These methods have been applied to evolutionary machine learning techniques including learning classifier systems (LCSs). In this article, we first propose a conceptual framework that allows us to appropriately categorize ensemble-based methods for fair comparison and highlights the gaps in the corresponding literature. The framework is generic and consists of three sequential stages: a pre-gate stage concerned with data preparation; the member stage to account for the types of learning machines used to build the ensemble; and a post-gate stage concerned with the methods to combine ensemble output. A taxonomy of LCS-based ensembles is then presented using this framework. The article then focuses on comparing LCS ensembles that use feature selection in the pre-gate stage. An evaluation methodology is proposed to systematically analyze the performance of these methods. Specifically, random feature sampling and rough set feature selection-based LCS ensemble methods are compared. Experimental results show that the rough set-based approach performs significantly better than the random subspace method in terms of classification accuracy in problems with high numbers of irrelevant features. The performance of the two approaches is comparable in problems with high numbers of redundant features.

18.
Hybrid neural network classifiers for automatic target detection
Abstract: We describe a one-class classification approach to an automatic target detection problem, which involves distinguishing targets from clutter in diverse environments. We use only target statistics to construct the classifier. The classifier combines conventional and neural network methods. The classifier is a Parzen estimator, which requires storage and recall of all training points. To reduce the size of the training set, we apply two neural network learning algorithms: (1) we use a backpropagation network to approximate the Parzen estimator; (2) we apply the infomax learning principle to compress the size of the training set before constructing the Parzen estimator. We find that the results obtained with the infomax scheme approach those obtained with Parzen alone and are better than those obtained with backpropagation.

19.
This paper presents two new approaches for constructing an ensemble of neural networks (NN) using coevolution and the artificial immune system (AIS). These approaches are extensions of the CLONal Selection Algorithm for building ENSembles (CLONENS) algorithm. An explicit diversity promotion technique was added to CLONENS, and a novel coevolutionary approach to building neural ensembles is introduced, whereby two populations representing the gates and the individual NNs are coevolved. The former population is responsible for defining the ensemble size and selecting the members of the ensemble, and is evolved using the differential evolution algorithm. The latter population supplies the best individuals for building the ensemble and is evolved by the AIS. Results show that it is possible to automatically define the ensemble size, and that smaller ensembles with good generalization performance can be found on the tested benchmark regression problems. More interestingly, the use of the diversity measure during the evolutionary process did not necessarily improve generalization; in this case, diverse ensembles may be found using only implicit diversity promotion techniques.

20.
A classifier ensemble method based on class information, called Cagging, is proposed. Training sets for the individual base classifiers are generated by repeatedly selecting samples according to class information, which strengthens the diversity among the base classifiers. Each base classifier is then assigned a set of weights according to its ability to classify the different pattern classes, and the classifiers' outputs are combined by weighted voting, which makes good use of the differences among the base classifiers. Experiments on the ORL face image database verify the effectiveness of Cagging. Furthermore, the way Cagging generates its base classifiers is well suited to building the ensemble through incremental learning; an extension of Cagging, Cagging-I, is therefore designed for ensemble construction based on incremental learning, and experiments verify its effectiveness.
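Cagging's per-class weighting and weighted-voting decision step can be read roughly as follows; this is an illustrative sketch under our own assumptions (per-class recall on validation data as the competence weight), not the paper's exact formulation, and the function names are ours:

```python
import numpy as np

def class_weights(preds, y, n_classes):
    """Per-classifier, per-class weights: classifier i's recall on class c,
    estimated from its validation predictions (a simple proxy for its
    competence on that class)."""
    W = np.zeros((len(preds), n_classes))
    for i, p in enumerate(preds):
        for c in range(n_classes):
            m = y == c
            W[i, c] = np.mean(p[m] == c) if m.any() else 0.0
    return W

def weighted_vote(preds, W, n_classes):
    """Each classifier adds its class-specific weight to the class it
    voted for; the class with the highest total score wins."""
    n = preds.shape[1]
    scores = np.zeros((n, n_classes))
    for i, p in enumerate(preds):
        scores[np.arange(n), p] += W[i, p]
    return np.argmax(scores, axis=1)
```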


Copyright©北京勤云科技发展有限公司  京ICP备09084417号