Similar documents (20 results)
1.
A common way to model multi-class classification problems is by means of Error-Correcting Output Codes (ECOC). Given a multi-class problem, the ECOC technique designs a code word for each class, where each position of the code identifies the membership of the class for a given binary problem. A classification decision is obtained by assigning the label of the class with the closest code. One of the main requirements of the ECOC design is that the base classifier is capable of splitting each sub-group of classes from each binary problem. However, we cannot guarantee that a linear classifier can model convex regions. Furthermore, non-linear classifiers also fail to manage some types of surfaces. In this paper, we present a novel strategy to model multi-class classification problems using sub-class information in the ECOC framework. Complex problems are solved by splitting the original set of classes into sub-classes and embedding the binary problems in a problem-dependent ECOC design. Experimental results show that the proposed splitting procedure yields better performance when the class overlap or the distribution of the training objects conceals the decision boundaries for the base classifier. The results are even more significant when the training set is sufficiently large.
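The decoding step described in this abstract (assign the class with the closest code word) can be sketched as follows. This is a minimal illustration, not the paper's sub-class design: the code matrix, the class labels, and the function names are all hypothetical, and Hamming distance is assumed as the closeness measure.

```python
# Hypothetical 4-class code matrix: one code word per class,
# one position per binary problem. Values are illustrative only.
CODE_MATRIX = {
    "A": (1, 1, 1, 0, 0, 0),
    "B": (1, 0, 0, 1, 1, 0),
    "C": (0, 1, 0, 1, 0, 1),
    "D": (0, 0, 1, 0, 1, 1),
}


def hamming(a, b):
    """Number of positions in which two code words differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(outputs, code_matrix=CODE_MATRIX):
    """Return the class whose code word is closest to the binary outputs."""
    return min(code_matrix, key=lambda label: hamming(code_matrix[label], outputs))


# The third binary classifier disagrees with the code word of "A",
# but "A" is still the closest code word.
prediction = ecoc_decode((1, 1, 0, 0, 0, 0))
```

Because every pair of rows in this toy matrix differs in four positions, a single wrong binary output still leaves the correct class strictly closest.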

2.
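To address the multi-class imbalance problem, a new method based on the one-versus-one (OVO) decomposition strategy is proposed. First, the multi-class imbalance problem is decomposed into multiple binary classification problems using the OVO strategy. Binary classifiers are then built with algorithms designed for imbalanced binary classification. Next, the SMOTE oversampling technique is applied to the original dataset, and redundant classifiers are handled with a distance-based relative competence weighting method. Finally, the output is obtained by weighted voting. Extensive experiments on the KEEL imbalanced datasets show that the proposed algorithm has significant advantages over other classical methods.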
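The first and last steps of this pipeline (OVO decomposition and weighted voting) might look roughly like the sketch below. The function names and the example weights are hypothetical placeholders; in particular, the weights here do not implement the distance-based relative competence weighting named in the abstract.

```python
from collections import defaultdict
from itertools import combinations


def ovo_pairs(classes):
    """All unordered class pairs; each pair defines one binary problem."""
    return list(combinations(classes, 2))


def weighted_vote(pair_outputs, weights=None):
    """Combine pairwise decisions into one class label.

    pair_outputs maps (class_i, class_j) to the confidence in [0, 1] that
    the sample belongs to class_i; the remainder goes to class_j.
    weights maps pairs to a non-negative classifier weight (assumed 1.0
    for any pair that is not listed).
    """
    scores = defaultdict(float)
    for (ci, cj), conf in pair_outputs.items():
        w = 1.0 if weights is None else weights.get((ci, cj), 1.0)
        scores[ci] += w * conf
        scores[cj] += w * (1.0 - conf)
    return max(scores, key=scores.get)


pairs = ovo_pairs(["a", "b", "c"])
outputs = {("a", "b"): 0.9, ("a", "c"): 0.6, ("b", "c"): 0.2}
winner = weighted_vote(outputs)
```

Down-weighting a classifier changes the outcome: with a weight of 0.1 on the ("a", "b") classifier, class "c" collects the highest score instead of "a".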

3.
Multi-Class Learning by Smoothed Boosting (total citations: 1; self-citations: 0; citations by others: 1)
AdaBoost.OC has been shown to be an effective method for boosting “weak” binary classifiers for multi-class learning. It employs the Error-Correcting Output Code (ECOC) method to convert a multi-class learning problem into a set of binary classification problems, and applies the AdaBoost algorithm to solve them efficiently. One of the main drawbacks of the AdaBoost.OC algorithm is that it is sensitive to noisy examples and tends to overfit the training examples when they are noisy. In this paper, we propose a new boosting algorithm, named “MSmoothBoost”, which introduces a smoothing mechanism into the boosting procedure to explicitly address the overfitting problem of AdaBoost.OC. We prove bounds on both the empirical training error and the marginal training error of the proposed boosting algorithm. Empirical studies with seven UCI datasets and one real-world application indicate that the proposed boosting algorithm is more robust and effective than the AdaBoost.OC algorithm for multi-class learning. Editor: Nicolo Cesa-Bianchi

4.
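Multi-class classification is a key problem that must be faced in target recognition. Most existing classifiers are binary and cannot classify multi-class targets directly. To address this, error-correcting output codes are used to decompose the multi-class problem, that is, to convert it into two-class problems. A strategy for fusing the outputs of the binary classifiers based on least squares is also discussed. Experiments on UCI datasets and three one-dimensional range profile datasets show that the proposed multi-class strategy achieves higher classification accuracy than classical multi-class classifiers.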

5.
The paper shows how two-class classification can be generalized to multi-class classification by means of a fuzzy inference system. The fuzzy combiner uses the support values from the classifiers to produce the final response, with no other restrictions on their structure. We compare the proposed combination methods with ECOC and with two variations of decision templates, based on Euclidean and symmetric distance. The effectiveness of the proposed combination method based on fuzzy logic theory is also evaluated via computer experiments carried out on benchmark datasets.

6.
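Ensemble learning completes a learning task by constructing multiple classifiers with complementary functions in order to reduce classification error. However, current research fails to consider the local effectiveness of classifiers. To address this, a hierarchical multi-class classification algorithm is proposed within the ensemble learning framework. The algorithm decomposes the problem by predicted class and, on the basis of this hierarchy, combines multiple classifiers to improve classification accuracy. Experiments on a real-world admissions dataset from a university in the United States and on three UCI datasets verify the effectiveness of the algorithm.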

7.
Supervised classification based on error-correcting output codes (ECOC) is an efficient method for multi-class classification, and how to obtain accurate probability estimates via ECOC is also an attractive research direction. This paper proposes three kinds of ECOC that yield unbiased probability estimates and investigates the corresponding classification performance in depth. Two evaluation criteria for ECOC designs with better classification performance are identified: Bayes consistency and unbiasedness of the probability estimates. Experimental results on artificial data sets and UCI data sets validate our conclusions.

8.
The evaluation of feature selection methods for text classification with small sample datasets must consider classification performance, stability, and efficiency. It is, thus, a multiple criteria decision-making (MCDM) problem. Yet there has been little research on feature selection evaluation using MCDM methods that consider multiple criteria. Therefore, we use MCDM-based methods to evaluate feature selection methods for text classification with small sample datasets. An experimental study is designed to compare five MCDM methods and validate the proposed approach with 10 feature selection methods, nine evaluation measures for binary classification, seven evaluation measures for multi-class classification, and three classifiers on 10 small datasets. Based on the ranked results of the five MCDM methods, we make recommendations concerning feature selection methods. The results demonstrate the effectiveness of the MCDM-based method in evaluating feature selection methods.

9.
We present a simple yet effective approach for human action recognition. Most existing solutions based on multi-class action classification aim to assign a class label to the input video. However, the variety and complexity of real-life videos make it very challenging to achieve high classification accuracy. To address this problem, we propose to partition the input video into small clips and formulate action recognition as a joint decision-making task. First, we partition all videos into two equal segments that are processed in the same manner. We repeat this procedure to obtain three layers of video subsegments, which are then organized in a binary tree structure. We train separate classifiers for each layer. By applying the corresponding classifiers to the video subsegments, we obtain a decision value matrix (DVM). Then, we construct an aggregated representation for the original full-length video by integrating the elements of the DVM. Finally, we train a new action recognition classifier based on the DVM representation. Our extensive experimental evaluations demonstrate that the proposed method achieves a significant performance improvement over several competing methods on two benchmark datasets.

10.
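Hierarchical error-correcting output codes algorithm based on the KNN model (total citations: 2; self-citations: 0; citations by others: 2)
Xin Yi, Guo Gongde, Chen Lifei, Huang Jie. Journal of Computer Applications (《计算机应用》), 2009, 29(11): 3051-3055
Error-correcting output coding is an effective method for multi-class classification, but its coding matrix encodes only the classes and takes a uniform, pre-constructed form, so it adapts poorly. To address this, a novel hierarchical error-correcting output codes algorithm is proposed. In the training phase, the algorithm first uses the KNN model algorithm to build multiple same-class clusters on the dataset, selects the most representative clusters of each class to form a hierarchical coding matrix, and then trains the individual classifiers according to that matrix. In the testing phase, the algorithm uses model fusion to further exploit the respective strengths of the KNN model and error-correcting output codes. Experimental results on public UCI datasets show that the new method outperforms both the KNN model algorithm and the error-correcting output codes algorithm.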

11.
Information Fusion, 2003, 4(1): 11-21
It is known that the error correcting output code (ECOC) technique, when applied to multi-class learning problems, can improve generalisation performance. One reason for the improvement is its ability to decompose the original problem into complementary two-class problems. Binary classifiers trained on the sub-problems are diverse and can benefit from being combined using a simple distance-based strategy. However, there is some discussion about why ECOC performs as well as it does, particularly with respect to the significance of the coding/decoding strategy. In this paper we consider the binary (0,1) code matrix conditions necessary for reduction of error in the ECOC framework, and demonstrate the desirability of equidistant codes. It is shown that equidistant codes can be generated by using properties related to the number of 1’s in each row and between any pair of rows. Experimental results on synthetic data and a few popular benchmark problems show how performance deteriorates as code length is reduced for six decoding strategies.
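The link between equidistance and counts of 1's can be made concrete with a small check. For binary rows a and b, the Hamming distance equals (number of 1's in a) + (number of 1's in b) - 2 × (number of positions where both are 1). The sketch below verifies this identity and tests whether a code matrix is equidistant; the matrix is a hypothetical example and the function names are illustrative, not taken from the paper.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions in which two binary rows differ."""
    return sum(x != y for x, y in zip(a, b))


def distance_from_ones(a, b):
    """Hamming distance computed from counts of 1's only."""
    shared = sum(x & y for x, y in zip(a, b))  # positions where both rows are 1
    return sum(a) + sum(b) - 2 * shared


def pairwise_distances(rows):
    """Hamming distances for every unordered pair of rows."""
    return [hamming(a, b) for a, b in combinations(rows, 2)]


def is_equidistant(rows):
    """True when all pairs of rows are separated by the same distance."""
    return len(set(pairwise_distances(rows))) <= 1


# Hypothetical 4 x 6 code matrix: three 1's per row, one shared 1 per pair.
rows = [
    (1, 1, 1, 0, 0, 0),
    (1, 0, 0, 1, 1, 0),
    (0, 1, 0, 1, 0, 1),
    (0, 0, 1, 0, 1, 1),
]
equidistant = is_equidistant(rows)
```

In this example each row has three 1's and each pair of rows shares exactly one, so every pairwise distance is 3 + 3 - 2 = 4.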

12.
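Error-correcting output codes based on evidence theory for multi-class classification problems (total citations: 1; self-citations: 0; citations by others: 1)
For multi-class classification problems, error-correcting output codes are used as the decomposition framework to convert the multi-class problem into multiple two-class problems. A decoding strategy based on evidence theory is also proposed, in which the output of each binary classifier is treated as one piece of evidence to be fused, and different evidence fusion strategies are discussed for two types of coding (binary and ternary coding matrices). Experiments on UCI datasets and three one-dimensional range profile datasets, with comparisons against several classical decoding methods, verify that the proposed method effectively improves the classification accuracy of error-correcting output codes, especially with ternary coding matrices.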

13.
This paper presents a new study on a method of designing a multi-class classifier: Data-driven Error Correcting Output Coding (DECOC). DECOC is based on the principle of Error Correcting Output Coding (ECOC), which uses a code matrix to decompose a multi-class problem into multiple binary problems. ECOC for multi-class classification hinges on the design of the code matrix. We propose to explore the distribution of data classes and optimize both the composition and the number of base learners to design an effective and compact code matrix. Two real-world applications are studied: (1) the holistic recognition (i.e., recognition without segmentation) of touching handwritten numeral pairs and (2) the classification of cancer tissue types based on microarray gene expression data. The results show that the proposed DECOC delivers accuracy competitive with other ECOC methods while using fewer base learners than the pairwise coupling (one-vs-one) decomposition scheme. With a rejection scheme defined by a simple robustness measure, high reliabilities of around 98% are achieved in both applications.

14.
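With the development of support vector machines, they have gradually been extended from the original two-class classification problem to multi-class classification problems, with a wide variety of ideas and algorithms, each with its own merits. This work studies DDAG, one of the currently popular algorithms that build a multi-class classifier by combining multiple two-class classifiers. It identifies the advantages and shortcomings of this algorithm in multi-class support vector machine classification and, to address the shortcomings, proposes an improved algorithm.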
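The abstract does not spell out how DDAG works. Assuming it refers to the decision directed acyclic graph scheme, evaluation is commonly described as list elimination: compare the first and last remaining classes with their pairwise classifier, discard the loser, and repeat until one class is left. The sketch below follows that description; the nearest-center rule standing in for the trained pairwise SVMs is a hypothetical placeholder, and none of this reflects the improved algorithm the paper proposes.

```python
def ddag_predict(classes, prefers_first):
    """Evaluate a decision DAG by list elimination.

    prefers_first(ci, cj) returns True when the pairwise classifier for
    (ci, cj) votes for ci. Returns the surviving class and the number of
    pairwise evaluations used.
    """
    remaining = list(classes)
    evaluations = 0
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if prefers_first(first, last):
            remaining.pop()       # 'last' is ruled out
        else:
            remaining.pop(0)      # 'first' is ruled out
        evaluations += 1
    return remaining[0], evaluations


def nearest_center_rule(x, centers):
    """Toy stand-in for trained pairwise classifiers on a 1-D sample x."""
    return lambda ci, cj: abs(x - centers[ci]) <= abs(x - centers[cj])


centers = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 3.0}
label, used = ddag_predict(["a", "b", "c", "d"], nearest_center_rule(1.2, centers))
```

For N classes this path uses N - 1 pairwise evaluations, compared with N(N - 1)/2 when every pairwise classifier votes.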

15.
Traffic sign classification represents a classical application of multi-object recognition processing in uncontrolled adverse environments. Lack of visibility, illumination changes, and partial occlusions are just a few of the problems. In this paper, we introduce a novel system for multi-class classification of traffic signs based on error correcting output codes (ECOC). ECOC is based on an ensemble of binary classifiers that are trained on bipartitions of the classes. We classify a wide set of traffic sign types using robust error correcting codings. Moreover, we introduce the novel β-correction decoding strategy, which outperforms state-of-the-art decoding techniques and classifies a high number of classes with great success.

16.
New results on error correcting output codes of kernel machines (total citations: 1; self-citations: 0; citations by others: 1)
We study the problem of multiclass classification within the framework of error correcting output codes (ECOC) using margin-based binary classifiers. Specifically, we address two important open problems in this context: decoding and model selection. The decoding problem concerns how to map the outputs of the classifiers into class codewords. In this paper we introduce a new decoding function that combines the margins through an estimate of their class conditional probabilities. Concerning model selection, we present new theoretical results bounding the leave-one-out (LOO) error of ECOC of kernel machines, which can be used to tune kernel hyperparameters. We report experiments using support vector machines as the base binary classifiers, showing the advantage of the proposed decoding function over other functions of the margin commonly used in practice. Moreover, our empirical evaluations on model selection indicate that the bound leads to good estimates of kernel parameters.

17.
Image classification is a multi-class problem that is usually tackled with ensembles of binary classifiers. Furthermore, one of the most important challenges in this field is to find a set of highly discriminative image features that gives good classification performance. In this work we propose to use weighted ensembles as a method for feature combination. First, a set of binary classifiers is trained with one set of features; then, the scores are weighted with distances obtained from another set of feature vectors. We present two different approaches to weighting the score vector: (1) directly multiplying each score by the weights and (2) fusing the score values and the distances through a neural network. The experiments show that the proposed methodology improves the classification accuracy of simple ensembles and, moreover, obtains classification accuracy similar to that of state-of-the-art methods while using far fewer parameters.

18.
Error-correcting output coding (ECOC) is a strategy for creating classifier ensembles that reduces a multi-class problem to a set of binary sub-problems. A key issue in designing any ECOC classifier is defining an optimal code matrix with maximum discrimination power and a minimum number of columns. This paper proposes a heuristic method for application-dependent design of an optimal ECOC matrix based on a thinning algorithm. The main idea of the proposed Thinned-ECOC method is to successively remove redundant and unnecessary columns from any initial code matrix based on a metric defined for each column. As a result, the computational cost of the ensemble is reduced while its accuracy is preserved. The proposed method has been validated using the UCI machine learning database and further applied to two real-world pattern recognition problems (face recognition and gene-expression-based cancer classification). Experimental results emphasize the robustness of Thinned-ECOC in comparison with existing state-of-the-art code generation methods.

19.
Hierarchical classification can be seen as a multidimensional classification problem where the objective is to predict a class, or set of classes, according to a taxonomy. There have been different proposals for hierarchical classification, including local and global approaches. Local approaches can suffer from the inconsistency problem, that is, if a local classifier makes a wrong prediction, the error propagates down the hierarchy. Global approaches tend to produce more complex models. In this paper, we propose an alternative approach inspired by multidimensional classification. It starts by building a multi-class classifier for each parent node in the hierarchy. In the classification phase, all the local classifiers are applied simultaneously to each instance, providing a probability for each class in the taxonomy. Then the probability of the subset of classes, for each path in the hierarchy, is obtained by combining the results of the local classifiers. The path with the highest probability is returned as the result for all the levels in the hierarchy. As an extension of the proposed method, we also developed a new technique, based on information gain, to classify at different levels in the hierarchy. The proposed method was tested on different hierarchical classification data sets and compared against state-of-the-art methods, resulting in superior predictive performance and/or efficiency relative to the other approaches on all the datasets.

20.
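The AUC (area under the ROC curve) criterion has been widely used to measure the performance of classification algorithms on two-class datasets in machine learning. This paper first introduces multi-class classification methods for SVMs (support vector machines), then gives a systematic introduction to the AUC method, and finally compares, through experiments, the AUC values of the various SVM multi-class methods on multi-class datasets. The experimental results show that the AUC value is closely related to the choice of both the kernel function and the multi-class conversion method.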
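For the two-class case, AUC can be computed directly from its rank interpretation: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half. The sketch below shows only this two-class computation; the function name and the sample scores are illustrative, and it does not reproduce how the paper extends AUC to multi-class datasets.

```python
def auc(scores, labels):
    """Two-class AUC via pairwise comparison of positive and negative scores.

    labels use 1 for the positive class and 0 for the negative class.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Three of the four (positive, negative) pairs are ordered correctly.
value = auc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

This quadratic-time form is easy to check by hand; for large datasets a sort-based rank computation gives the same value more efficiently.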
