Similar literature
20 similar articles found (search time: 31 ms)
1.
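倪黄晶 (Ni Huangjing), 王蔚 (Wang Wei), 《计算机工程》 (Computer Engineering), 2011, 37(10): 160-161
Different base classifiers differ considerably in how well they adapt to multi-class imbalanced data with different distribution types. To address the question of which classifier to choose, this paper analyzes and compares accuracy (ACC) and area under the curve (AUC) as evaluation criteria, adopts the AUC-based classifier evaluation method, applies support vector machines, decision trees, and Bayesian classifiers to standard datasets, and evaluates the results with AUC. The conclusion is that on multi-class imbalanced data the Bayesian classifier is the best base classifier, and that the SVM classifier leaves some room for improvement.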
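The contrast between ACC and AUC on imbalanced data can be illustrated with a small sketch. It is not the paper's experiment: the toy labels and scores are hypothetical, and only the binary AUC (probability that a random positive outscores a random negative) is shown, not a multi-class extension.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """Binary AUC: probability that a random positive outscores a random
    negative, with ties counted as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced toy data: 9 negatives, 1 positive.
y_true = [0] * 9 + [1]

# A degenerate classifier that always predicts the majority class.
constant_scores = [0.0] * 10
constant_pred = [0] * 10

# A classifier whose scores rank the single positive above every negative.
ranking_scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.2, 0.3, 0.9]
ranking_pred = [1 if s >= 0.5 else 0 for s in ranking_scores]

print(accuracy(y_true, constant_pred), auc(y_true, constant_scores))
print(accuracy(y_true, ranking_pred), auc(y_true, ranking_scores))
```

The majority-class predictor looks strong under ACC but is no better than chance under AUC, which is one reason an AUC-based comparison may be preferred when classes are imbalanced.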

2.
The One-vs-One strategy is one of the most commonly used decomposition techniques for multi-class classification problems: the multi-class problem is divided into easier-to-solve binary classification problems, one per pair of classes from the original problem, each learned by an independent base classifier. This way of dividing the problem produces the so-called non-competence problem. It occurs whenever an instance is classified, since the instance is submitted to all the base classifiers although the outputs of some of them are not meaningful (they were not trained on instances from the class of the instance to be classified). This issue may lead to erroneous classifications because, in spite of their incompetence, all classifiers' decisions are usually considered in the aggregation phase. In this paper, we propose a dynamic classifier selection strategy for the One-vs-One scheme that tries to avoid the non-competent classifiers when their output is probably not of interest. We consider the neighborhood of each instance to decide whether a classifier may be competent or not. To verify the validity of the proposed method, we carry out a thorough experimental study considering different base classifiers and comparing our proposal with the best-performing state-of-the-art aggregation within each base classifier from the five Machine Learning paradigms selected. The findings drawn from the empirical analysis are supported by the appropriate statistical analysis.
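A minimal sketch of the idea follows. The abstract does not give the exact selection rule, so this assumes one plausible reading: only pairwise classifiers whose two classes both appear among the query's k nearest training neighbours are allowed to vote. The function names, the toy data, and the hard-coded pairwise outputs are all hypothetical.

```python
from collections import Counter


def ovo_vote(pair_outputs, allowed_classes=None):
    """Majority vote over pairwise outputs.

    pair_outputs maps a class pair (a, b) to the class predicted by the
    binary classifier trained on a-vs-b. If allowed_classes is given,
    classifiers whose pair is not fully inside it are skipped as
    presumably non-competent.
    """
    votes = Counter()
    for (a, b), winner in pair_outputs.items():
        if allowed_classes is not None and not {a, b} <= allowed_classes:
            continue
        votes[winner] += 1
    if not votes:
        # No classifier survived the filter: fall back to the plain vote.
        return ovo_vote(pair_outputs)
    # Highest vote count wins; ties are broken by class label.
    return min(votes, key=lambda c: (-votes[c], c))


def neighbour_classes(query, train, k):
    """Classes among the k training points nearest to the query."""
    ranked = sorted(
        train,
        key=lambda item: sum((q - x) ** 2 for q, x in zip(query, item[0])),
    )
    return {label for _, label in ranked[:k]}


# Hypothetical training set: (point, class label).
train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "B"),
         ((1, 1), "C"), ((10, 10), "D"), ((9, 9), "D")]
query = (0.2, 0.3)  # assumed to belong to class A

# Hypothetical outputs of the six pairwise classifiers for this query.
pair_outputs = {("A", "B"): "A", ("A", "C"): "A", ("A", "D"): "D",
                ("B", "C"): "B", ("B", "D"): "D", ("C", "D"): "D"}

plain = ovo_vote(pair_outputs)
nearby = neighbour_classes(query, train, k=4)
dynamic = ovo_vote(pair_outputs, allowed_classes=nearby)
print(plain, sorted(nearby), dynamic)
```

In this toy case the plain vote is swayed by classifiers involving the distant class D, while the neighbourhood filter removes them before aggregation.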

3.
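Several existing multi-class SVM methods are discussed and compared. On this basis, a multi-class SVM decision method that combines the results of multiple two-class classifiers is proposed. The method defines a new decision function whose value is the traditional voting decision value multiplied by the weights of the different classifiers. The new multi-class SVM resolves, to some extent, the unclassifiable-region problem of the traditional voting decision method and therefore achieves better classification performance. Finally, the new method is applied as the key technique in a fault diagnosis case, and the actual diagnosis results demonstrate the superiority of the proposed multi-class SVM decision method.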

4.
In this work, we formalise and evaluate an ensemble of classifiers that is designed for the resolution of multi-class problems. To achieve a good accuracy rate, the base learners are built with pairwise coupled binary and multi-class classifiers. Moreover, to reduce the computational cost of the ensemble and to improve its performance, these classifiers are trained using a specific attribute subset. This proposal offers the opportunity to capture the advantages provided by binary decomposition methods, by attribute partitioning methods, and by cooperative characteristics associated with a combination of redundant base learners. To analyse the quality of this architecture, its performance has been tested on different domains, and the results have been compared to other well-known classification methods. This experimental evaluation indicates that our model is, in most cases, as accurate as these methods, but it is much more efficient.

5.
Feature selection for ensembles has been shown to be an effective strategy for ensemble creation because it can produce good subsets of features, which make the classifiers of the ensemble disagree on difficult cases. In this paper we present an ensemble feature selection approach based on a hierarchical multi-objective genetic algorithm. The underlying paradigm is “overproduce and choose”. The algorithm operates on two levels: first it performs feature selection to generate a set of classifiers, and then it chooses the best team of classifiers. To show its robustness, the method is evaluated in two different contexts: supervised and unsupervised feature selection. In the former, we considered the problem of handwritten digit recognition and used three different feature sets and multi-layer perceptron neural networks as classifiers. In the latter, we considered the problem of handwritten month word recognition and used three different feature sets and hidden Markov models as classifiers. Experiments and comparisons with classical methods, such as Bagging and Boosting, demonstrated that the proposed methodology brings compelling improvements when classifiers have to work with very low error rates. Comparisons were made by considering the recognition rates only.

6.
In this paper, we tackle the problem of model selection when misclassification costs are unknown and/or may evolve. Unlike traditional approaches based on a scalar optimization, we propose a generic multi-model selection framework based on a multi-objective approach. The idea is to automatically train a pool of classifiers instead of one single classifier, each classifier in the pool optimizing a particular trade-off between the objectives. Within the context of two-class classification problems, we introduce the “ROC front concept” as an alternative to the ROC curve representation. This strategy is applied to the multi-model selection of SVM classifiers using an evolutionary multi-objective optimization algorithm. The comparison with a traditional scalar optimization technique based on an AUC criterion shows promising results on UCI datasets as well as on a real-world classification problem.

7.
Multi-class classification is one of the major challenges in real-world applications. Classification algorithms are generally binary in nature and must be extended for multi-class problems. Therefore, in this paper, we propose an enhanced Genetically Optimized Neural Network (GONN) algorithm for solving multi-class classification problems. We use a multi-tree GONN representation that integrates multiple GONN trees; each individual is a single GONN classifier. The enhanced classifier is thus an integrated version of the individual GONN classifiers for all classes. The integrated classifier is evolved genetically to optimize its architecture for multi-class classification. To demonstrate our results, we took seven datasets from the UCI Machine Learning repository and compared the classification accuracy and training time of the enhanced GONN with the classical Koza’s model and the classical back-propagation model. Our algorithm improves classification accuracy by almost 5% and 8% over Koza’s model and the back-propagation model, respectively, even for complex and real multi-class data, and does so in less time. The enhanced GONN algorithm produces better results than popular classification algorithms such as the Genetic Algorithm, Support Vector Machine, and Neural Network, which makes it a good alternative to the well-known machine learning methods for solving multi-class classification problems. Even for datasets containing noise and complex features, the results produced by the enhanced GONN are much better than those of other machine learning algorithms. The proposed enhanced GONN can be applied to expert and intelligent systems for effectively classifying large, complex, and noisy real-time multi-class data.

8.
In this work we present the first efficient algorithm for unsupervised training of multi-class regularized least-squares classifiers. The approach is closely related to the unsupervised extension of the support vector machine classifier known as maximum margin clustering, which has recently received considerable attention, though mostly for the binary classification case. We present a combinatorial search scheme that combines steepest descent strategies with powerful meta-heuristics for avoiding bad local optima. The regularized least-squares formulation of the problem allows us to use matrix algebraic optimization, enabling constant-time checks of the intermediate candidate solutions during the search. Our experimental evaluation indicates the potential of the novel method and demonstrates its superior clustering performance over a variety of competing methods on real-world datasets. Both time complexity analysis and experimental comparisons show that the method can scale well to practical-sized problems.

9.
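As support vector machines have developed, they have gradually been extended from the original two-class classification problem to multi-class classification, with a wide variety of ideas and algorithms, each with its own merits. This paper studies DDAG, one of the currently popular algorithms that build a multi-class classifier by combining multiple two-class classifiers. It identifies the strengths and weaknesses of this algorithm when applied to multi-class support vector machine classification and, to address the weaknesses, proposes an improved algorithmic idea.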
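For orientation, the standard DDAG evaluation procedure (not the improved variant this paper proposes, which the abstract does not describe) can be sketched as a list-elimination loop: the first and last remaining classes are compared, the loser is dropped, and N classes need N-1 pairwise evaluations. The `decide` callback and the nearest-centre toy decider below are hypothetical stand-ins for trained two-class SVMs.

```python
def ddag_predict(classes, decide):
    """Evaluate a DDAG by repeated elimination.

    decide(a, b) returns whichever of the two classes the a-vs-b binary
    classifier prefers for the current query instance.
    Returns the surviving class and the number of classifiers evaluated.
    """
    remaining = list(classes)
    evaluations = 0
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        winner = decide(first, last)
        evaluations += 1
        if winner == first:
            remaining.pop()    # the last class is eliminated
        else:
            remaining.pop(0)   # the first class is eliminated
    return remaining[0], evaluations


def nearest_centre_decider(centres, query):
    """Toy stand-in for pairwise SVMs: prefer the class with the nearer centre."""
    def decide(a, b):
        return a if abs(query - centres[a]) <= abs(query - centres[b]) else b
    return decide


# Hypothetical one-dimensional class centres and query point.
centres = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}
decide = nearest_centre_decider(centres, query=1.2)
label, used = ddag_predict(["A", "B", "C", "D"], decide)
print(label, used)
```

Because each evaluation removes exactly one class, the result is always a single label, at the cost of depending on the order in which classes are listed.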

10.
Fisher kernels combine the powers of discriminative and generative classifiers by mapping variable-length sequences to a new fixed-length feature space, called the Fisher score space. The mapping is based on a single generative model and the classifier is intrinsically binary. We propose a multi-class classification strategy that applies a multi-class classification on each Fisher score space and combines the decisions of the multi-class classifiers. We experimentally show that the Fisher scores of one class provide discriminative information for the other classes as well. We compare several multi-class classification strategies for Fisher scores generated from the hidden Markov models of sign sequences. The proposed multi-class classification strategy increases the classification accuracy in comparison with the state-of-the-art strategies based on combining binary classifiers. To reduce the computational complexity of the Fisher score extraction and the training phases, we also propose a score space selection method and show that similar or even higher accuracies can be obtained by using only a subset of the score spaces. Based on the proposed score space selection method, a signer adaptation technique is also presented that does not require any re-training.

11.
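Feature selection is an important text preprocessing technique in text classification that can effectively improve classifier accuracy and efficiency. The key to feature selection in text classification is finding effective feature evaluation metrics. In general, the same feature evaluation metric performs differently with different classifiers, so a good metric should take the characteristics of the classifier into account. Because the naive Bayes classifier is simple, efficient, and very sensitive to feature selection, research on feature selection methods for this classifier is of great significance. Accordingly, an effective multi-class text feature evaluation metric for Bayesian classifiers, CDM, is proposed. Experiments were carried out with a Bayesian classifier on two multi-class text datasets. The results show that the proposed CDM metric yields better feature selection than other feature evaluation metrics.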

12.
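To address the multi-class imbalance problem, a new method based on the one-versus-one (OVO) decomposition strategy is proposed. First, the multi-class imbalanced problem is decomposed into multiple binary classification problems using the OVO strategy; next, binary classifiers are built with algorithms designed for imbalanced binary classification; then the original dataset is processed with the SMOTE oversampling technique; after that, redundant classifiers are handled with a distance-based relative competence weighting method; finally, the output is obtained by weighted voting. Extensive experiments on the KEEL imbalanced datasets show that the proposed algorithm has a significant advantage over other classical methods.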

13.
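A new method for designing hierarchical support vector machine multi-class classifiers   Total citations: 15, self-citations: 2, citations by others: 13
The design of the hierarchical structure is the key issue in applying hierarchical support vector machine multi-class classification, and between-class separability is an important basis for designing that structure. A method for measuring the degree of similarity between classes based on linear support vector machines is proposed, together with a new method for designing hierarchical SVM multi-class classifiers based on between-class separability. Experiments show that the new method effectively improves the classification accuracy and speed of hierarchical SVM multi-class classifiers.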

14.
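Research on SVM-based image classification   Total citations: 1, self-citations: 0, citations by others: 1
Image classification technology has important application prospects and will actively promote the development of content-based image retrieval. Multi-class image classification is a difficult part of image classification. SVM-based multi-class image classification methods are studied, and a method for constructing a multi-class classifier on the basis of two-class support vector machines is proposed. Experimental results show that classification accuracy is considerably higher than with traditional methods.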

15.
In this paper, a new methodology based on invariant moments and a multi-class support vector machine (MCSVM) is proposed for the classification of human parasite eggs in microscopic images. The MCSVM is one of the most widely used classifiers, but it has not been used for the classification of human parasite eggs to date. The method comprises four stages: pre-processing, feature extraction, classification, and testing. In the pre-processing stage, digital image processing methods are applied: noise reduction, contrast enhancement, thresholding, and morphological and logical processes. In the feature extraction stage, the invariant moments of the pre-processed parasite images are calculated. In the classification stage, the MCSVM classifier is used to classify the features extracted in the feature extraction stage. We used MATLAB software to estimate the classification success rate of the proposed approach, which was tested on test data. At the end of the test, an overall success rate of 97.70% was obtained.

16.
We consider the problem of risk estimation for large-margin multi-class classifiers. We propose a novel risk bound for the multi-class classification problem. The bound involves the marginal distribution of the classifier and the Rademacher complexity of the hypothesis class. We prove that our bound is tight in the number of classes. Finally, we compare our bound with related ones and provide a simplified version of the bound for multi-class classification with kernel-based hypotheses.

17.
This paper proposes two gradient-based methods to fit a Probit regression model by maximizing the sample log-likelihood function. Using the property of the Hessian of the objective function, the first method performs weighted least squares regression in each iteration of the Newton–Raphson framework, resulting in ProbitBoost, a boosting-like algorithm. Motivated by the gradient boosting algorithm [10], the second approach maximizes the sample log-likelihood function by updating the fitted function a small step in the gradient direction, performing gradient ascent in functional space, resulting in Gradient ProbitBoost. We also generalize the algorithms to multi-class problems by two strategies: the first uses gradient ascent to maximize the multi-class sample log-likelihood function, fitting all the classifiers simultaneously, and the second uses the one-versus-all scheme to reduce the multi-class problem to a series of binary classification problems. The proposed algorithms are tested on typical classification problems including face detection, cancer classification, and handwritten digit recognition. The results show that, compared to the alternative methods, the proposed algorithms perform similarly or better in terms of testing error rates.

18.
The One-vs-One strategy is a common and established technique in Machine Learning for dealing with multi-class classification problems. It consists of dividing the original multi-class problem into easier-to-solve binary subproblems, one for each possible pair of classes. Since several classifiers are learned, their combination becomes crucial for predicting the class of new instances. The division procedure gives rise to a series of difficulties at this stage, such as the non-competence problem. Each classifier is learned using only the instances of its corresponding pair of classes and hence is not competent to classify instances belonging to the rest of the classes; nevertheless, at classification time all the outputs of the classifiers are taken into account because the competence cannot be known a priori (otherwise the classification problem would already be solved). On this account, we develop a distance-based combination strategy, which weights the competence of the outputs of the base classifiers depending on the closeness of the query instance to each one of the classes. Our aim is to reduce the effect of the non-competent classifiers, enhancing the results obtained by the state-of-the-art combinations for the One-vs-One strategy. We carry out a thorough experimental study, supported by the proper statistical analysis, showing that the results obtained by the proposed method outperform, in terms of both accuracy and kappa measures, the previous combinations for the One-vs-One strategy.
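A distance-based weighting of this kind might look like the sketch below. The abstract does not state the weighting formula, so the relative weight used here, d_j² / (d_i² + d_j²), is an assumption chosen for illustration, as are the function name, the score matrix, and the per-class distances (which could, for example, be the mean distance from the query to its nearest neighbours in each class).

```python
def weighted_ovo_predict(scores, distances):
    """Aggregate pairwise confidences with distance-based weights.

    scores[i][j]: confidence of the i-vs-j classifier in favour of class i.
    distances[i]: distance from the query instance to class i (assumed
    to be computed elsewhere).
    Each confidence is scaled by d_j^2 / (d_i^2 + d_j^2), which is close
    to 1 when the query is much nearer to class i than to class j.
    """
    totals = {}
    for i, row in scores.items():
        total = 0.0
        for j, r_ij in row.items():
            d_i2, d_j2 = distances[i] ** 2, distances[j] ** 2
            weight = d_j2 / (d_i2 + d_j2) if d_i2 + d_j2 > 0 else 0.5
            total += r_ij * weight
        totals[i] = total
    # Highest weighted total wins; ties resolve to the smallest label.
    return max(sorted(totals), key=totals.get), totals


# Hypothetical score matrix for three classes.
scores = {
    "A": {"B": 0.6, "C": 0.4},
    "B": {"A": 0.4, "C": 0.1},
    "C": {"A": 0.6, "B": 0.9},
}
# Hypothetical distances from the query to each class.
distances = {"A": 1.0, "B": 2.0, "C": 3.0}

unweighted = {i: sum(row.values()) for i, row in scores.items()}
unweighted_label = max(sorted(unweighted), key=unweighted.get)
weighted_label, weighted_totals = weighted_ovo_predict(scores, distances)
print(unweighted_label, weighted_label)
```

Here the unweighted sum favours the distant class C, largely through the B-vs-C classifier, whereas the weighting discounts outputs in favour of classes that are far from the query.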

19.
Physical activity recognition using wearable sensors has gained significant interest from researchers working in the fields of ambient intelligence and human behavior analysis. Multi-class classification is an important issue in applications that naturally have more than two classes. A well-known strategy for converting a multi-class classification problem into binary sub-problems is the error-correcting output coding (ECOC) method. Since existing methods use a single classifier with ECOC without considering the dependency among multiple classifiers, they often fail to generalize in performance and parameters to real-life applications, where different numbers of devices, sensors, and sampling rates are used. To address this problem, we propose a hierarchical classification model based on the combination of two base binary classifiers, using selective learning of a slacked hierarchy and integrating the training of the binary classifiers into a unified objective function. Our method maps the multi-class classification problem to multi-level classification. A multi-tier voting scheme is introduced to provide a final classification label at each level of the model. The proposed method is evaluated on two publicly available datasets and compared with independent base classifiers. It has also been tested on real-life sensor readings from three different subjects to recognize four activities, i.e. Walking, Standing, Jogging and Sitting. The method uses the same hierarchical levels and parameters to achieve better performance on all three datasets, which differ in the number of devices, sensors, and sampling rates. The average accuracies on the publicly available datasets and the real-life sensor readings were 95% and 85%, respectively. The experimental results validate the effectiveness and generality of the proposed method in terms of performance and parameters.
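The ECOC baseline mentioned in this abstract (not the proposed hierarchical model) can be sketched as nearest-codeword decoding. The 7-bit codebook below is an illustrative assumption: an exhaustive code for four classes with pairwise Hamming distance 4, assigned arbitrarily to the four activity names.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(codebook, bits):
    """Return the class whose codeword is nearest to the predicted bits.

    Each bit position corresponds to one binary classifier; ties are
    resolved in favour of the alphabetically first class.
    """
    return min(sorted(codebook), key=lambda c: hamming(codebook[c], bits))


# Hypothetical codebook: one 7-bit codeword per activity.
codebook = {
    "Walking":  "1111111",
    "Standing": "0000111",
    "Jogging":  "0011001",
    "Sitting":  "0101010",
}

# Outputs of the seven binary classifiers for one sample; the sixth bit
# is assumed to be wrong relative to the Jogging codeword.
predicted_bits = "0011011"
decoded = ecoc_decode(codebook, predicted_bits)
print(decoded)
```

With a minimum distance of 4 between codewords, a single erroneous binary classifier still leaves the correct codeword strictly closest, which is the error-correcting property the method is named for.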

20.
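Multi-class support vector machine based on a probability voting strategy and its application   Total citations: 5, self-citations: 1, citations by others: 4
王晓红 (Wang Xiaohong), 《计算机工程》 (Computer Engineering), 2009, 35(2): 180-183
The traditional support vector machine was proposed for two-class problems, and how to extend it effectively to multi-class classification remains an active research topic. After analyzing and comparing the problems and shortcomings of the existing one-versus-one (OVO) method for multi-class SVM classification, this paper proposes a new multi-class classification method based on a probability voting strategy. The strategy takes full account of the differences among the individual two-class SVM classifiers in the OVO method and reflects those differences in the voting scores. The proposed multi-class SVM method not only has good classification performance but also effectively resolves the rejection-region problem of the traditional voting strategy. The probability-voting multi-class SVM is applied as the key technique in a real gearbox fault diagnosis, and a comparison with the results of the traditional voting strategy demonstrates these advantages.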
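One plausible reading of probability voting, sketched here under assumptions since the abstract does not give the scoring formula: each pairwise classifier contributes its estimated probability to both classes instead of a single hard vote. The function names and the pairwise probabilities are hypothetical.

```python
def hard_votes(pair_probs):
    """Traditional OVO voting: each classifier gives one vote to its winner.

    pair_probs maps (a, b) to the estimated probability that the query
    belongs to class a rather than class b.
    """
    votes = {}
    for (a, b), p in pair_probs.items():
        votes.setdefault(a, 0)
        votes.setdefault(b, 0)
        votes[a if p >= 0.5 else b] += 1
    return votes


def probability_votes(pair_probs):
    """Soft voting: class a receives p and class b receives 1 - p."""
    votes = {}
    for (a, b), p in pair_probs.items():
        votes[a] = votes.get(a, 0.0) + p
        votes[b] = votes.get(b, 0.0) + (1.0 - p)
    return votes


# Hypothetical pairwise probabilities forming a cycle:
# A beats B, B beats C, C narrowly beats A.
pair_probs = {("A", "B"): 0.9, ("B", "C"): 0.6, ("A", "C"): 0.45}

hard = hard_votes(pair_probs)
soft = probability_votes(pair_probs)
soft_label = max(sorted(soft), key=soft.get)
print(hard, soft_label)
```

Hard voting leaves the three classes tied, which corresponds to a region where no decision can be made, while the probability-weighted scores separate them.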
