Similar Documents
20 similar documents found (search took 15 ms)
1.
Limits on the majority vote accuracy in classifier fusion (cited 7 times: 0 self, 7 others)
We derive upper and lower limits on the majority vote accuracy with respect to the individual accuracy p, the number of classifiers in the pool L, and the pairwise dependence between classifiers, measured by Yule's Q statistic. Independence between individual classifiers is typically viewed as an asset in classifier fusion. We show that the majority vote with dependent classifiers can potentially offer a dramatic improvement both over independent classifiers and over an individual classifier with accuracy p. A functional relationship between the limits and the pairwise dependence Q is derived. Two patterns of the joint distribution of classifier outputs (correct/incorrect) are identified to derive the limits: the pattern of success and the pattern of failure. The results support the intuition that negative pairwise dependence is beneficial, although it is not straightforwardly related to accuracy. The pattern of success shows that for the highest improvement over p, all pairs of classifiers in the pool should have the same negative dependence. Correspondence and offprint requests to: L. I. Kuncheva, School of Informatics, University of Wales, Bangor LL57 1UT, Gwynedd, UK. Email: l.i.kuncheva@bangor.ac.uk
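The two quantities the abstract works with can be sketched in a few lines. The function names below are mine, not the paper's; the first computes the standard independent-classifier baseline the paper improves on (a binomial tail), and the second computes Yule's Q for a pair of classifiers from their joint correct/incorrect counts:

```python
from math import comb

def majority_vote_accuracy(p: float, L: int) -> float:
    """Probability that a majority of L independent classifiers,
    each correct with probability p, is correct (L odd)."""
    assert L % 2 == 1, "use an odd ensemble size to avoid ties"
    m = L // 2 + 1  # smallest winning number of correct votes
    return sum(comb(L, k) * p**k * (1 - p)**(L - k) for k in range(m, L + 1))

def yule_q(n11: int, n10: int, n01: int, n00: int) -> float:
    """Yule's Q statistic for a pair of classifiers, from counts of samples
    where both are correct (n11), only one is (n10, n01), or neither (n00).
    Q = 0 for independent classifiers; Q < 0 for negative dependence."""
    return (n11 * n00 - n01 * n10) / (n11 * n00 + n01 * n10)
```

For example, five independent classifiers with p = 0.6 give a majority vote accuracy of about 0.683; the paper's point is that suitably negatively dependent classifiers (Q < 0) can push well past this value.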

2.
Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In this article, we investigate a family of combiners based on order statistics, for robust handling of situations where there are large discrepancies in performance of individual classifiers. Based on a mathematical modelling of how the decision boundaries are affected by order statistic combiners, we derive expressions for the reductions in error expected when simple output combination methods based on the median, the maximum and in general, the ith order statistic, are used. Furthermore, we analyse the trim and spread combiners, both based on linear combinations of the ordered classifier outputs, and show that in the presence of uneven classifier performance, they often provide substantial gains over both linear and simple order statistics combiners. Experimental results on both real world data and standard public domain data sets corroborate these findings. Received: 17 November 2000, Received in revised form: 07 November 2001, Accepted: 22 November 2001
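A minimal sketch of the basic combiners the abstract names, applied to one class's support values collected across classifiers (the function names and the simple symmetric trim are my own choices, not the paper's notation):

```python
def order_statistic(outputs, i):
    """i-th order statistic (1-based, ascending) of one class's support
    values across classifiers.  i = 1 is the min, i = L the max, and
    i = (L + 1) // 2 the median for odd L."""
    return sorted(outputs)[i - 1]

def trimmed_mean(outputs, t):
    """Trim combiner (symmetric variant): drop the t smallest and the
    t largest outputs, then average the rest, limiting the influence
    of one badly performing classifier."""
    s = sorted(outputs)[t:len(outputs) - t]
    return sum(s) / len(s)
```

With supports [0.9, 0.1, 0.5] from three classifiers, the median combiner returns 0.5 and the max combiner 0.9; trimming one value from each end of five supports averages only the middle three.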

3.
Classifier combination methods have proved to be an effective tool for increasing the performance of classification techniques in a wide range of pattern recognition applications. Despite a significant number of publications describing successful classifier combination implementations, the theoretical basis is still not mature and the achieved improvements are inconsistent. In this paper, we propose a novel statistical validation technique, a correlation-based classifier combination technique, for combining classifiers in any pattern recognition problem. This validation has a significant influence on the performance of combinations, and its utilization is necessary for a complete theoretical understanding of combination algorithms. The analysis presented is statistical in nature but promises to lead to a class of algorithms for rank-based decision combination. The theoretical and practical issues in implementation are illustrated on two standard pattern recognition datasets, the handwritten digit recognition and letter image recognition datasets from the UCI Machine Learning Repository ( http://www.ics.uci.edu/_mlearn ). An empirical evaluation using eight well-known distinct classifiers confirms the validity of our approach compared with several other multiple classifier combination algorithms. Finally, we also suggest a methodology for determining the best mix of individual classifiers.
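The abstract does not give the paper's algorithm, but the family of rank-based decision combination it points to is well illustrated by the classic Borda count, sketched here under that assumption (names are mine):

```python
def borda_combine(rankings):
    """Borda count over per-classifier rankings (best class first):
    a class ranked r-th (0-based) out of C receives C - 1 - r points,
    and the class with the highest total wins."""
    scores = {}
    for ranking in rankings:
        C = len(ranking)
        for r, cls in enumerate(ranking):
            scores[cls] = scores.get(cls, 0) + (C - 1 - r)
    winner = max(scores, key=scores.get)
    return winner, scores
```

Three classifiers ranking classes a, b, c as [a, b, c], [b, a, c] and [a, c, b] give a the totals 2 + 1 + 2 = 5, so a wins even though the second classifier preferred b.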

4.
In the context of supervised statistical learning, we present a broad class of models named Generalised Additive Multi-Mixture Models (GAM-MM), based on a multiple combination of mixtures of classifiers to be used in both the regression and classification cases. In particular, we additively combine mixtures of different types of classifiers, defining an ensemble composed of nonparametric tools (tree-based methods), semiparametric tools (scatterplot smoothers) and parametric tools (linear regression). Within this approach, we define a classifier scoring criterion to be used jointly with the bagging procedure for estimation of the mixing parameters, and describe the GAM-MM estimation procedure, which works adaptively by iterating a backfitting-like algorithm and a local scoring procedure until convergence. The effectiveness of our approach in modelling complex data structures is evaluated by presenting the results of some applications on real and simulated data. Received: 24 November 2001, Received in revised form: 10 December 2001, Accepted: 5 April 2002. Correspondence and offprint requests to: C. Conversano, Dipartimento di Matematica e Statistica, Università degli Studi di Napoli Federico II, Monte Sant'Angelo, Napoli I-80126, Italy. Email: conversa@unina.it

5.
6.
The robustness of combining diverse classifiers by majority voting has recently been illustrated in the pattern recognition literature. Furthermore, negatively correlated classifiers have turned out to offer further improvement in majority voting performance, even compared with the idealised model of independent classifiers. However, negatively correlated classifiers represent a very unlikely situation in real-world classification problems, and their benefits usually remain out of reach. Nevertheless, it is theoretically possible to obtain a 0% majority voting error using a finite number of classifiers with error rates below 50%. We attempt to show that structuring classifiers into relevant multistage organisations can widen this boundary, as well as the limits of majority voting error, even further. Introducing discrete error distributions for the analysis, we show how majority voting errors and their limits depend upon the parameters of a multiple classifier system with hardened binary outputs (correct/incorrect). Moreover, we investigate the sensitivity of boundary distributions of classifier outputs to small discrepancies, modelled by random changes of votes, and propose new, more stable patterns of boundary distributions. Finally, we show how organising classifiers into different structures can be used to widen the limits of majority voting errors, and how this phenomenon can be effectively exploited. Received: 17 November 2000, Received in revised form: 27 November 2001, Accepted: 29 November 2001. Correspondence and offprint requests to: D. Ruta, Applied Computing Research Unit, Division of Computer and Information Systems, University of Paisley, High Street, Paisley PA1 2BE, UK. Email: ruta-ci0@paisley.ac.uk
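The effect of multistage organisation is easy to demonstrate with hardened binary outputs. In this sketch (my own construction, not the paper's distributions), nine classifiers produce only four correct votes, so a flat majority vote fails, yet a two-level majority over three groups of three succeeds:

```python
def majority(votes):
    """Majority of binary correct(1)/incorrect(0) votes."""
    return int(sum(votes) * 2 > len(votes))

def multistage_majority(groups):
    """Two-level majority: vote within each group of classifiers,
    then take the majority of the group decisions."""
    return majority([majority(g) for g in groups])
```

With groups [1,1,0], [1,1,0], [0,0,0], only 4 of 9 votes are correct, so the flat majority is wrong, but two of the three group majorities are correct, so the multistage decision is right. This is the sense in which structuring can widen the limits of majority voting error.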

7.
Using a number of measures for characterising the complexity of classification problems, we studied the comparative advantages of two methods for constructing decision forests, bootstrapping and random subspaces. We investigated a collection of 392 two-class problems from the UCI repository, and observed that there are strong correlations between the classifier accuracies and measures of length of class boundaries, thickness of the class manifolds, and nonlinearities of decision boundaries. We found characteristics of both difficult and easy cases where combination methods are no better than single classifiers. Also, we observed that the bootstrapping method is better when the training samples are sparse, and the subspace method is better when the classes are compact and the boundaries are smooth. Received: 03 November 2000, Received in revised form: 25 October 2001, Accepted: 04 January 2002

8.
The simultaneous use of multiple classifiers has been shown to provide performance improvements in classification problems. The selection of an optimal set of classifiers is an important part of multiple classifier systems, and the independence of classifier outputs is generally considered an advantage for obtaining better multiple classifier systems. In this paper, the need for classifier independence is examined from the classification performance point of view. The performance achieved with classifiers having independent joint distributions is compared with that of classifiers defined to have the best and worst joint distributions, which are obtained by formulating the combination operation as an optimization problem. The analysis reveals several important observations about classifier selection, which are then used to analyze the problem of selecting an additional classifier to be used with an available multiple classifier system.

9.
樊康新 (Fan Kangxin), 《计算机工程》 (Computer Engineering), 2009, 35(24): 191-193
To address defects of the naive Bayes (NB) classifier in the classification process, such as the sensitivity of the classification model to the training samples and the difficulty of improving classification accuracy, a combined NB text classifier based on multiple feature selection methods is proposed. Following the Boosting algorithm, several different feature selection methods are used to build the feature-word sets of the texts, and NB classifiers trained on them serve as the base classifiers in the Boosting iterations; the final combined NB text classifier is produced by weighted voting over the base classifiers. Experimental results show that the combined classifier achieves better classification performance than a single NB text classifier.

10.
The performance of a multiple classifier system combining the soft outputs of k-Nearest Neighbour (k-NN) classifiers by the product rule can be degraded by the veto effect. This phenomenon is caused by k-NN classifiers estimating the class a posteriori probabilities using the maximum likelihood method. We show that the problem can be minimised by marginalising the k-NN estimates using the Bayesian prior. A formula for the resulting moderated k-NN estimate is derived. The merits of moderation are examined on real data sets. Tests with different bagging procedures indicate that the proposed moderation method improves the performance of the multiple classifier system significantly. Received: 21 March 2001, Received in revised form: 04 September 2001, Accepted: 20 September 2001
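The veto effect and its cure are easy to see numerically. The paper derives its own moderated estimate; the sketch below uses a standard uniform-prior (Laplace-style) moderation to illustrate the idea, and the function names are mine:

```python
def ml_knn_posterior(class_counts):
    """Maximum-likelihood k-NN posterior: the fraction of the k nearest
    neighbours falling in each class.  Can output exact zeros, and a
    single zero vetoes that class under the product rule."""
    k = sum(class_counts)
    return [n / k for n in class_counts]

def moderated_knn_posterior(class_counts):
    """Moderated estimate with a uniform Bayesian prior:
    (n_c + 1) / (k + C) for neighbour count n_c, k neighbours and
    C classes.  Never outputs exact zeros, so no veto is possible."""
    k, C = sum(class_counts), len(class_counts)
    return [(n + 1) / (k + C) for n in class_counts]
```

With k = 3 neighbours all from class 1 in a 3-class problem, the maximum-likelihood estimate is [1, 0, 0], so the product rule would veto classes 2 and 3 regardless of the other classifiers; the moderated estimate [4/6, 1/6, 1/6] keeps them in play.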

11.
Using an ensemble of classifiers instead of a single classifier has been shown to improve generalization performance in many pattern recognition problems. However, the extent of such improvement depends greatly on the amount of correlation among the errors of the base classifiers. Therefore, reducing those correlations while keeping the classifiers' performance levels high is an important area of research. In this article, we explore Input Decimation (ID), a method which selects feature subsets for their ability to discriminate among the classes and uses these subsets to decouple the base classifiers. We provide a summary of the theoretical benefits of correlation reduction, along with results of our method on two underwater sonar data sets, three benchmarks from the Proben1/UCI repositories, and two synthetic data sets. The results indicate that input decimated ensembles outperform ensembles whose base classifiers use all the input features; randomly selected subsets of features; and features created using principal components analysis, on a wide range of domains. Correspondence and offprint requests to: Kagan Tumer, NASA Ames Research Center, Moffett Field, CA, USA

12.
In classifier combination, the relative values of the beliefs assigned to different hypotheses are more important than an accurate estimate of the combined belief function representing the joint observation space. Because of this, the independence requirement in Dempster's rule should be examined from the classifier combination point of view. In this study, we investigate whether there is a set of dependent classifiers that provides better combined accuracy than independent classifiers when Dempster's rule of combination is used. The analysis, carried out for three different representations of statistical evidence, shows that combining dependent classifiers using Dempster's rule may provide much better combined accuracy than combining independent classifiers.
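For readers unfamiliar with the rule under discussion, here is a minimal implementation of Dempster's rule of combination for two mass functions, with focal elements encoded as frozensets over the frame of discernment (this is the standard rule, not the paper's analysis):

```python
def dempster_combine(m1, m2):
    """Dempster's rule: multiply masses of every pair of focal elements,
    assign the product to their intersection, accumulate the mass of
    empty intersections as the conflict K, and renormalise by 1 - K."""
    combined, conflict = {}, 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb
    return {s: v / (1.0 - conflict) for s, v in combined.items()}
```

For a two-class frame {A, B}, combining m1 = {A: 0.6, B: 0.3, {A,B}: 0.1} with m2 = {A: 0.5, B: 0.4, {A,B}: 0.1} gives a conflict of K = 0.39 and a combined mass on A of 0.41 / 0.61, about 0.672.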

13.
This article deals with the combination of pattern classifiers with two reject options. Such classifiers operate in two steps and differ in how they manage ambiguity and distance rejection (independently or not). We propose to combine the first steps of these classifiers using concepts from the theory of evidence, making suitable basic probability assignments to the reject classes before applying the combination rule. After combination, a decision rule is proposed for classifying a pattern or rejecting it, either for distance or for ambiguity. We emphasize that rejection is not related to a lack of consensus between the classifiers, but to the initial reject options. In the case of ambiguity rejection, a class-selective approach is used. Some illustrative results on artificial and real data are given. Received: 21 November 2000, Received in revised form: 25 October 2001, Accepted: 26 November 2001

14.
Aircraft type recognition by multiple classifier fusion (cited 2 times: 0 self, 2 others)
To address the problem of aircraft type recognition in air-combat target identification, a recognition method based on multiple classifier fusion is proposed. The method takes tactical performance parameters as its input, which helps it meet the real-time requirements of air combat. Classification features for type recognition are obtained from extensively collected data; subsets of these features are used as the inputs of the individual classifiers, each designed as a BP (back-propagation) network, and a well-performing sum rule is then used to fuse the classifiers into the final decision. Experimental results show that the recognition performance of the fused system is clearly better than that of the participating classifiers, and also better than that of a single classifier with the same input. A further advantage of the method is its ability to perform default reasoning with missing inputs, giving it strong robustness to interference and making it suitable for real battlefield environments.

15.
Dynamic selection and circulating combination of classifiers (cited 1 time: 0 self, 1 other)
To address the low efficiency of optimal-subset selection and the inflexibility of combination methods in multiple classifier system design, a dynamic selection and circulating combination method (DSCC) is proposed. Exploiting the complementarity between different classifier models, the method dynamically selects a combination of classifiers with high recognition rates for the target, so that the number of classifiers taking part in the combination adapts to the complexity of the recognition target, and circulating combination is carried out according to confidence. In handwritten digit recognition experiments, the proposed method is more flexible and efficient, and achieves higher recognition rates, than other commonly used classifier selection methods.

16.
Generalized rules for combination and joint training of classifiers (cited 1 time: 0 self, 1 other)
Classifier combination has repeatedly been shown to provide significant improvements in performance for a wide range of classification tasks. In this paper, we focus on the problem of combining probability distributions generated by different classifiers. Specifically, we present a set of new combination rules that generalize the most commonly used combination functions, such as the mean, product, min, and max operations. These new rules have continuous and differentiable forms, and can thus not only be used for combination of independently trained classifiers, but also as objective functions in a joint classifier training scheme. We evaluate both of these schemes by applying them to the combination of phone classifiers in a speech recognition system. We find a significant performance improvement over previously used combination schemes when jointly training and combining multiple systems using a generalization of the product rule.
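One classic continuous, differentiable family that interpolates between the mean, product, min, and max rules is the generalised (power) mean; the paper's own rules are not reproduced in the abstract, so the sketch below uses this standard family purely as an illustration:

```python
def generalised_mean(ps, alpha):
    """Generalised (power) mean of classifier scores for one class:
    alpha = 1 gives the arithmetic mean, alpha -> +inf approaches max,
    alpha -> -inf approaches min, and alpha -> 0 gives the geometric
    mean, which ranks classes identically to the product rule."""
    if alpha == 0:
        prod = 1.0
        for p in ps:
            prod *= p
        return prod ** (1.0 / len(ps))
    return (sum(p ** alpha for p in ps) / len(ps)) ** (1.0 / alpha)
```

Because the rule is differentiable in both the scores and alpha, it can serve as an objective in joint training, which is the property the abstract emphasises.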

17.
Terrain classification based on fusion of visible and infrared data (cited 1 time: 0 self, 1 other)
顾迎节 (Gu Yingjie), 金忠 (Jin Zhong), 《计算机工程》 (Computer Engineering), 2013, 39(2): 187-191
To address the poor performance of terrain classification with a single sensor, a terrain classification method based on the fusion of visible and infrared data is proposed. Features are extracted separately from the visible and infrared images, and nearest-neighbour and minimum-distance classifiers are used for posterior probability estimation; the posterior probabilities from the different features and classifiers are then combined with weights, where the feature weights are obtained by divergence computation and the classifier weights are determined experimentally. In the posterior estimation of the minimum-distance classifier, the Mahalanobis distance is used in place of the Euclidean distance. Experimental results show that the method achieves recognition rates of 99.33% on cement roads and 96.67% on sand roads, both higher than comparable methods.

18.
How to effectively predict financial distress is an important problem in corporate financial management. Although much attention has been paid to financial distress prediction methods based on a single classifier, such methods suffer from uncertainty, and the benefits of combining multiple classifiers for financial distress prediction have largely been neglected. This paper puts forward a financial distress prediction method based on weighted majority voting combination of multiple classifiers. The framework of the multiple classifier combination system, the weighted majority voting combination model, the voting weight model for the base classifiers, and the principles for selecting base classifiers are discussed in detail. An empirical experiment with real-world data from Chinese listed companies indicates that this method can greatly improve the average prediction accuracy and stability, and that it is more suitable for financial distress prediction than single classifiers.
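The abstract does not specify its voting weight model; a common choice, shown here as an illustrative sketch (the log-odds weighting and the labels are my assumptions), weights each base classifier by the log-odds of its estimated accuracy:

```python
from math import log

def weighted_majority_vote(predictions, accuracies):
    """Weighted majority vote over base classifier predictions; each
    classifier's vote is weighted by log(a / (1 - a)), so a highly
    accurate classifier can outvote several mediocre ones."""
    scores = {}
    for label, acc in zip(predictions, accuracies):
        scores[label] = scores.get(label, 0.0) + log(acc / (1 - acc))
    return max(scores, key=scores.get)
```

For example, one classifier with 90% accuracy predicting 'distress' outweighs two 60%-accurate classifiers predicting 'healthy' (weight 2.20 vs. 0.41 + 0.41), while with equal accuracies the scheme reduces to a plain majority vote.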

19.
Hierarchical Fusion of Multiple Classifiers for Hyperspectral Data Analysis (cited 3 times: 0 self, 3 others)
Many classification problems involve high dimensional inputs and a large number of classes. Multiclassifier fusion approaches to such difficult problems typically centre around smart feature extraction, input resampling methods, or input space partitioning to exploit modular learning. In this paper, we investigate how partitioning of the output space (i.e. the set of class labels) can be exploited in a multiclassifier fusion framework to simplify such problems and to yield better solutions. Specifically, we introduce a hierarchical technique to recursively decompose a C-class problem into C-1 two-(meta)class problems. A generalised modular learning framework is used to partition a set of classes into two disjoint groups called meta-classes. The coupled problems of finding a good partition and of searching for a linear feature extractor that best discriminates the resulting two meta-classes are solved simultaneously at each stage of the recursive algorithm. This results in a binary tree whose leaf nodes represent the original C classes. The proposed hierarchical multiclassifier framework is particularly effective for difficult classification problems involving a moderately large number of classes. The proposed method is illustrated on a problem related to classification of landcover using hyperspectral data: a 12-class AVIRIS subset with 180 bands. For this problem, the classification accuracies obtained were superior to most other techniques developed for hyperspectral classification. Moreover, the class hierarchies that were automatically discovered conformed very well with human domain experts' opinions, which demonstrates the potential of using such a modular learning approach for discovering domain knowledge automatically from data. Received: 21 November 2000, Received in revised form: 02 November 2001, Accepted: 13 December 2001
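The structural claim that a C-class problem decomposes into C-1 two-meta-class problems follows from the binary tree: any binary tree with C leaves has exactly C-1 internal nodes. The sketch below demonstrates this with a placeholder half-split in place of the learned partition (the paper learns the partition jointly with a discriminating linear feature extractor, which is not reproduced here):

```python
def build_meta_class_tree(classes):
    """Recursively split a list of class labels into two meta-classes
    (a simple half split standing in for the learned partition),
    producing a binary tree whose leaves are the original classes."""
    if len(classes) == 1:
        return classes[0]  # leaf node: an original class
    mid = len(classes) // 2
    return (build_meta_class_tree(classes[:mid]),
            build_meta_class_tree(classes[mid:]))

def count_internal_nodes(tree):
    """Count internal nodes, i.e. two-(meta)class decision problems."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + sum(count_internal_nodes(t) for t in tree)
```

For the 12-class AVIRIS problem mentioned in the abstract, this yields exactly 11 two-meta-class classifiers regardless of how the splits fall.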

20.
Multi-Agent based classifier fusion (cited 14 times: 0 self, 14 others)
For the fusion of classifiers with decision-level outputs, this paper proposes a fusion algorithm based on the Multi-Agent paradigm. The algorithm models classifier fusion as an origin-tracing ("cradle of mankind") problem: by introducing a decision co-occurrence matrix and exchanging information among the agents, it exploits the decision correlation between classifiers. Guided by statistical parameters obtained on the fusion training set, each agent traces back towards a different class, and the information exchange among agents modifies the tracing probabilities until a group decision, the final class, is reached. The algorithm is evaluated experimentally on standard data sets. Compared with other fusion methods, it achieves a lower classification error rate when the number of fused classifiers is small, and its space complexity is smaller than that of the BKS method; the experiments also confirm that the algorithm converges.
