Similar literature
1.
Many supervised machine learning applications are naturally represented as multi-class problems, but distinguishing several classes is harder than distinguishing just two. In contrast to the one-against-all and all-pairs approaches, which transform a multi-class problem into a set of binary problems, Dichotomy Transformation (DT) converts a multi-class problem into a different problem in which the goal is to verify whether a pair of documents belongs to the same class or not. To perform this task, DT generates a dichotomy set by combining pairs of documents; each pair belongs either to the positive class (both documents have the same class) or to the negative class (the documents come from different classes). The definition of this dichotomy set plays an important role in the overall accuracy of the system. An alternative to searching for the best dichotomy set is to use a multiple classifier system: instead of a single dichotomy set, many different sets can be generated, each used to train one binary classifier. Herein we propose Combined Dichotomy Transformations (CoDiT), a Text Categorization system that combines binary classifiers trained with different dichotomy sets using DT. By using DT, the number of training examples increases exponentially when compared with the original training set. This is a desirable property because each classifier can be trained with different data without reducing the number of examples or features. Therefore, it is possible to compose an ensemble of diverse and strong classifiers. Experiments using 14 databases show that CoDiT achieves statistically better results than SVM, Bagging, Random Subspace, BoosTexter, and Random Forest.
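The pairing step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `dichotomy_set` and the choice of the element-wise absolute difference as the pair representation are assumptions, since the abstract does not say how a pair of documents is encoded.

```python
from itertools import combinations

def dichotomy_set(vectors, labels):
    """Turn labelled vectors into labelled pairs.

    Assumption: a pair is represented by the element-wise absolute
    difference of its two feature vectors. The target is 1 (positive)
    when both documents share a class and 0 (negative) otherwise.
    """
    pairs = []
    for i, j in combinations(range(len(vectors)), 2):
        diff = [abs(a - b) for a, b in zip(vectors[i], vectors[j])]
        target = 1 if labels[i] == labels[j] else 0
        pairs.append((diff, target))
    return pairs

# Hypothetical toy data: three documents, two classes.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
labels = ["sports", "sports", "politics"]
pairs = dichotomy_set(docs, labels)
targets = [t for _, t in pairs]
```

In this sketch every unordered pair is used, so n documents give n(n-1)/2 pairs; drawing different subsets of those pairs is one way to obtain the different dichotomy sets that an ensemble would be trained on.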

2.
Electrical borehole wall images represent micro-resistivity measurements at the borehole wall. Lithology reconstruction is often based on visual interpretation by geologists. This analysis is very time-consuming and subjective: different geologists may interpret the data differently. In this work, linear discriminant analysis (LDA) in combination with texture features is used for an automated lithology reconstruction of ODP (Ocean Drilling Program) borehole 1203A, drilled during Leg 197. Six rock groups are identified by their textural properties in resistivity data obtained by a Formation MicroScanner (FMS). Although discriminant analysis can be used for multi-class classification, non-optimal decision criteria could emerge for certain groups. For this reason, we use a combination of 2-class (binary) classifiers to increase the overall classification accuracy. The generalization ability of the combined classifiers is evaluated and optimized on a testing dataset, where a classification rate of more than 80% is achieved for each of the six rock groups. The combined, trained classifiers are then applied to the whole dataset, yielding a statistical reconstruction of the logged formation. Compared to a single multi-class classifier, the combined binary classifiers show better classification results for certain rock groups and more stable results in larger intervals of equal rock type.

3.
The One-vs-One strategy is one of the most commonly used decomposition techniques for multi-class classification problems: the multi-class problem is divided into easier-to-solve binary classification problems, each considering a pair of classes from the original problem, which are then learned by independent base classifiers. This way of performing the division produces the so-called non-competence problem. It occurs whenever an instance is classified, since the instance is submitted to all the base classifiers although the outputs of some of them are not meaningful (they were not trained using instances from the class of the instance to be classified). This issue may lead to erroneous classifications because, in spite of their incompetence, all classifiers' decisions are usually considered in the aggregation phase. In this paper, we propose a dynamic classifier selection strategy for the One-vs-One scheme that tries to avoid the non-competent classifiers when their output is probably not of interest. We consider the neighborhood of each instance to decide whether a classifier may be competent or not. To verify the validity of the proposed method, we carry out a thorough experimental study considering different base classifiers and comparing our proposal with the best-performing state-of-the-art aggregation for each base classifier from the five Machine Learning paradigms selected. The findings drawn from the empirical analysis are supported by the appropriate statistical analysis.
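One plausible reading of the neighborhood-based selection described above is sketched here. The abstract does not give the exact competence rule, so the rule used below (a pairwise classifier votes only if both of its classes occur among the k nearest training instances) is an assumption, as are all function names and the toy nearest-centroid base classifiers.

```python
from collections import Counter
from itertools import combinations

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def neighbor_classes(x, train_X, train_y, k):
    """Classes present among the k nearest training instances."""
    order = sorted(range(len(train_X)), key=lambda i: sq_dist(x, train_X[i]))
    return {train_y[i] for i in order[:k]}

def ovo_predict_dynamic(x, pair_classifiers, train_X, train_y, k=3):
    """One-vs-One voting restricted to classifiers deemed competent for x."""
    nearby = neighbor_classes(x, train_X, train_y, k)
    votes = Counter()
    for (a, b), clf in pair_classifiers.items():
        if a in nearby and b in nearby:  # assumed competence rule
            votes[clf(x)] += 1
    if not votes:  # fewer than two classes nearby: fall back to plain voting
        for clf in pair_classifiers.values():
            votes[clf(x)] += 1
    return votes.most_common(1)[0][0]

# Hypothetical base classifiers: nearest centroid between two classes.
centroids = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [10.0, 10.0]}

def make_clf(a, b):
    return lambda x: a if sq_dist(x, centroids[a]) <= sq_dist(x, centroids[b]) else b

pair_classifiers = {(a, b): make_clf(a, b)
                    for a, b in combinations(sorted(centroids), 2)}
train_X = [[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [0.9, 0.0], [10.0, 10.0], [10.0, 9.0]]
train_y = ["a", "a", "b", "b", "c", "c"]

label = ovo_predict_dynamic([0.4, 0.0], pair_classifiers, train_X, train_y, k=3)
```

For the query point above, only classes "a" and "b" occur in the neighborhood, so the two classifiers involving class "c" are left out of the vote.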

4.
Acoustic events produced in controlled environments may carry information useful for perceptually aware interfaces. In this paper we focus on the problem of classifying 16 types of meeting-room acoustic events. First of all, we have defined the events and gathered a sound database. Then, several classifiers based on support vector machines (SVM) are developed using confusion matrix based clustering schemes to deal with the multi-class problem. Also, several sets of acoustic features are defined and used in the classification tests. In the experiments, the developed SVM-based classifiers are compared with an already reported binary tree scheme and with their correlative Gaussian mixture model (GMM) classifiers. The best results are obtained with a tree SVM-based classifier that may use a different feature set at each node. With it, a 31.5% relative average error reduction is obtained with respect to the best result from a conventional binary tree scheme.

5.
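Cost-sensitive AdaBoost algorithm for multi-class classification problems
付忠良 (Fu Zhongliang), 《自动化学报》 (Acta Automatica Sinica), 2011, 37(8): 973-983
To address the cost-merging problem that arises when a multi-class cost-sensitive classification problem is converted into binary cost-sensitive classification problems, we study and construct a cost-sensitive AdaBoost algorithm that can be applied directly to multi-class problems. The algorithm has a procedure and an error estimate similar to those of the continuous (real-valued) AdaBoost algorithm. When all costs are equal, the algorithm reduces to a new multi-class continuous AdaBoost algorithm, which guarantees that the training error rate decreases as the number of trained classifiers increases without directly requiring the classifiers to be mutually independent; in other words, the independence condition can be ensured by the rules of the algorithm, whereas the derivation of existing multi-class continuous AdaBoost algorithms must assume that the classifiers are mutually independent. Experimental data show that the algorithm does bias classification results toward the classes with smaller misclassification cost. In particular, when the costs of misclassifying each class as the other classes are unbalanced but the average costs are equal, existing multi-class cost-sensitive learning algorithms fail, while the new method still achieves the minimum misclassification cost. The approach offers a new line of thought for further research on ensemble learning algorithms, and yields an easy-to-use AdaBoost algorithm for multi-label classification problems that approximately minimizes the classification error rate.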
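The idea of biasing a multi-class decision toward the class with the smaller misclassification cost can be illustrated with a minimum-expected-cost rule. This sketch is not the AdaBoost algorithm of the paper; it only shows how a per-class cost matrix changes the decision, and the probabilities, the cost values and the name `min_cost_class` are hypothetical.

```python
def min_cost_class(probs, cost):
    """Pick the prediction with the smallest expected cost.

    probs[j]   : estimated probability that the true class is j
    cost[j][k] : cost of predicting class k when the true class is j
    """
    n = len(probs)
    expected = [sum(probs[j] * cost[j][k] for j in range(n)) for k in range(n)]
    best = min(range(n), key=expected.__getitem__)
    return best, expected

# Hypothetical estimates: class 0 is the most probable one ...
probs = [0.5, 0.3, 0.2]
# ... but missing class 2 is ten times as costly as any other error.
cost = [[0, 1, 1],
        [1, 0, 1],
        [10, 10, 0]]

best, expected = min_cost_class(probs, cost)
most_probable = max(range(len(probs)), key=probs.__getitem__)
```

With these numbers the most probable class is 0, yet the minimum-cost decision is class 2, because misclassifying a true class-2 example dominates the expected cost of the other two predictions.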

6.
Fisher kernels combine the powers of discriminative and generative classifiers by mapping variable-length sequences to a new fixed-length feature space, called the Fisher score space. The mapping is based on a single generative model and the classifier is intrinsically binary. We propose a multi-class classification strategy that applies a multi-class classifier to each Fisher score space and combines the decisions of these multi-class classifiers. We experimentally show that the Fisher scores of one class provide discriminative information for the other classes as well. We compare several multi-class classification strategies for Fisher scores generated from the hidden Markov models of sign sequences. The proposed multi-class classification strategy increases the classification accuracy in comparison with the state-of-the-art strategies based on combining binary classifiers. To reduce the computational complexity of the Fisher score extraction and the training phases, we also propose a score space selection method and show that similar or even higher accuracies can be obtained by using only a subset of the score spaces. Based on the proposed score space selection method, a signer adaptation technique is also presented that does not require any re-training.

7.
Multi-class classification is one of the major challenges in real-world applications. Classification algorithms are generally binary in nature and must be extended for multi-class problems. Therefore, in this paper, we propose an enhanced Genetically Optimized Neural Network (GONN) algorithm for solving multi-class classification problems. We use a multi-tree GONN representation that integrates multiple GONN trees; each individual is a single GONN classifier. The enhanced classifier is thus an integrated version of individual GONN classifiers for all classes. The integrated classifier is evolved genetically to optimize its architecture for multi-class classification. To demonstrate our results, we took seven datasets from the UCI Machine Learning repository and compared the classification accuracy and training time of the enhanced GONN with the classical Koza's model and the classical back-propagation model. Our algorithm gives classification accuracy almost 5% and 8% better than Koza's model and the back-propagation model, respectively, even for complex and real multi-class data, and in less time. The enhanced GONN algorithm produces better results than popular classification algorithms such as the Genetic Algorithm, Support Vector Machine and Neural Network, which makes it a good alternative to the well-known machine learning methods for solving multi-class classification problems. Even for datasets containing noise and complex features, the results produced by the enhanced GONN are much better than those of other machine learning algorithms. The proposed enhanced GONN can be applied to expert and intelligent systems for effectively classifying large, complex and noisy real-time multi-class data.

8.
9.
Based on the principle of the one-against-one support vector machine (SVM) multi-class classification algorithm, this paper proposes an extended SVM method that couples an adaptive resonance theory (ART) network to reconstruct a multi-class classifier. Different coupling strategies for reconstructing a multi-class classifier from binary SVM classifiers are compared, with application to fault diagnosis of a transmission line. Majority voting, a mixture matrix and a self-organizing map (SOM) network are compared in reconstructing the global classification decision. To evaluate the method's efficiency, SVMs based on the one-against-all, decision directed acyclic graph (DDAG) and decision-tree (DT) algorithms are also compared. The comparison is done with simulations, and the best method is validated with experimental data.

10.
In this paper the adaptive binary classifier is applied to the classification of tensotremorogram (TTG) time series. The idea is to reveal pathological states of the human motor control system. The adaptive binary classifier, a new type of trained classifier, can be trained on data from healthy subjects. The trained classifier can then be used to divide the examinees into healthy and sick patients. It is shown that the trained adaptive binary classifier is able to classify the patients with acceptable accuracy. Another classification method, Neural Clouds, has also been used, and the two methods have been compared.

11.
Visual perception of English letters involves different underlying brain processes, including alteration of brain activity in multiple frequency bands. However, shape-analogous letters elicit brain activities that are not obviously distinct, and it is therefore difficult to differentiate those activities. To address the discriminative feasibility and classification performance of the perception of shape-analogous letters, we performed an experiment in which EEG signals were obtained from 20 subjects while they were perceiving shape-analogous letters (i.e., 'p', 'q', 'b', and 'd'). Spectral power densities from five typical frequency bands (i.e., delta, theta, alpha, beta and gamma) were extracted as features, which were then classified either by individual widely used classifiers, namely k-Nearest Neighbors (kNN), Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Random Forest (RF) and AdaBoost (ADA), or by an ensemble of some of them. The F-score was employed to select the most discriminative features so that the dimension of the features was reduced. The results showed that RF achieved the highest accuracy, 74.1%, in the case of multi-class classification. In the case of binary classification, the best performance (accuracy 86.39%) was achieved by the RF classifier in terms of average accuracy across all possible pairs of the letters. In addition, we employed a decision fusion strategy to exploit the complementary strengths of different classifiers. The results demonstrated that the performance rose from 74.10% to 76.63% for multi-class classification and from 86.39% to 88.08% for binary classification.

12.
The subprime mortgage crisis has triggered a significant economic decline around the world. Credit rating forecasting has been a critical issue in global banking systems. This study trained a Gaussian process based multi-class classifier (GPC), a highly flexible probabilistic kernel machine, using variational Bayesian methods. GPC provides full predictive distributions and model selection simultaneously. During the training process, the input features are automatically weighted by their relevance to the output labels. Benefiting from this inherent feature scaling scheme, GPCs outperformed conventional multi-class classifiers and support vector machines (SVMs). In the second stage, conventional SVMs enhanced by feature selection and dimensionality reduction schemes were also compared with GPCs. Empirical results indicated that GPCs still performed the best.

13.
Image classification is a multi-class problem that is usually tackled with ensembles of binary classifiers. Furthermore, one of the most important challenges in this field is to find a set of highly discriminative image features that gives good classification performance. In this work we propose to use weighted ensembles as a method for feature combination. First, a set of binary classifiers is trained with one set of features; then, the scores are weighted with distances obtained from another set of feature vectors. We present two different approaches to weighting the score vector: (1) directly multiplying each score by the weights and (2) fusing the score values and the distances through a Neural Network. The experiments show that the proposed methodology improves the classification accuracy of simple ensembles and, moreover, obtains classification accuracy similar to that of state-of-the-art methods while using far fewer parameters.
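Approach (1), multiplying each classifier score by a weight derived from a distance, can be sketched in a few lines. The abstract does not say how distances are turned into weights, so the mapping `1 / (1 + distance)` used here is an assumption, and the scores and distances are made-up numbers.

```python
def weight_scores(scores, distances):
    """Scale each per-class score by a weight that shrinks with distance.

    Assumption: weight = 1 / (1 + distance), so a class whose reference
    features are far from the sample contributes less to the decision.
    """
    return [s / (1.0 + d) for s, d in zip(scores, distances)]

# Hypothetical per-class scores from the binary classifiers (first feature set)
scores = [0.6, 0.7, 0.2]
# Hypothetical per-class distances computed on the second feature set
distances = [0.0, 3.0, 1.0]

weighted = weight_scores(scores, distances)
unweighted_choice = max(range(len(scores)), key=scores.__getitem__)
weighted_choice = max(range(len(weighted)), key=weighted.__getitem__)
```

In this toy case the raw scores favor class 1, while the distance weighting moves the decision to class 0, which is how the second feature set can correct the first.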

14.
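This paper studies multi-class classification with support vector machines and proposes a multi-class classification method based on kernel clustering. The kernel clustering method maps the original sample features into a high-dimensional feature space, where they are clustered into groups; one binary SVM classifier is used for each group, and these binary classifiers serve as the nodes of a decision tree, forming a decision classification tree. A generation algorithm for the decision tree is given, and an overlap coefficient is proposed to control overlap, thereby overcoming the accumulation of misclassifications and improving classification accuracy. Experimental results show that with this method both the speed and the accuracy of handwritten Chinese character recognition meet the requirements of practical use.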

15.
Single pass text classification by direct feature weighting
The Feature Weighting Classifier (FWC) is an efficient multi-class classification algorithm for text data that uses Information Gain to directly estimate per-class feature weights in the classifier. This classifier requires only a single pass over the dataset to compute the feature frequencies per class, is easy to implement, and has memory usage that is linear in the number of features. Results of experiments performed on 128 binary and multi-class text and web datasets show that FWC's performance is at least comparable to, and often better than, that of Naive Bayes, TWCNB, Winnow, Balanced Winnow and linear SVM. On a large-scale web dataset with 12,294 classes and 135,973 training instances, FWC trained in 13 s and yielded classification performance comparable to a state-of-the-art multi-class SVM implementation, which took over 15 min to train.

16.
Physical activity recognition using wearable sensors has gained significant interest from researchers working in the fields of ambient intelligence and human behavior analysis. Multi-class classification is an important issue in applications that naturally have more than two classes. A well-known strategy for converting a multi-class classification problem into binary sub-problems is the error-correcting output coding (ECOC) method. Since existing methods use a single classifier with ECOC without considering the dependency among multiple classifiers, they often fail to generalize in performance and parameters to real-life applications, where different numbers of devices, sensors and sampling rates are used. To address this problem, we propose a hierarchical classification model based on the combination of two base binary classifiers, using selective learning of a slacked hierarchy and integrating the training of the binary classifiers into a unified objective function. Our method maps the multi-class classification problem to multi-level classification. A multi-tier voting scheme is introduced to provide a final classification label at each level of the model. The proposed method is evaluated on two publicly available datasets and compared with independent base classifiers. Furthermore, it has also been tested on real-life sensor readings from 3 different subjects to recognize four activities, i.e., Walking, Standing, Jogging and Sitting. The presented method uses the same hierarchical levels and parameters to achieve better performance on all three datasets, which have different numbers of devices, sensors and sampling rates. The average accuracies on the publicly available datasets and the real-life sensor readings were 95% and 85%, respectively. The experimental results validate the effectiveness and generality of the proposed method in terms of performance and parameters.

17.
A common way to model multi-class classification problems is by means of Error-Correcting Output Codes (ECOC). Given a multi-class problem, the ECOC technique designs a code word for each class, where each position of the code identifies the membership of the class for a given binary problem. A classification decision is obtained by assigning the label of the class with the closest code. One of the main requirements of the ECOC design is that the base classifier is capable of splitting each sub-group of classes from each binary problem. However, we cannot guarantee that a linear classifier models convex regions. Furthermore, non-linear classifiers also fail to manage some types of surfaces. In this paper, we present a novel strategy to model multi-class classification problems using sub-class information in the ECOC framework. Complex problems are solved by splitting the original set of classes into sub-classes and embedding the binary problems in a problem-dependent ECOC design. Experimental results show that the proposed splitting procedure yields better performance when the class overlap or the distribution of the training objects conceals the decision boundaries for the base classifier. The results are even more significant when the training set is sufficiently large.
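The decoding step described above, assigning the class whose code word is closest to the vector of binary classifier outputs, can be sketched as follows. The code matrix, the class names and the use of Hamming distance as the closeness measure are illustrative assumptions; the sub-class splitting proposed in the paper is not shown.

```python
def hamming(a, b):
    """Number of positions where two code words differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def ecoc_decode(outputs, code_matrix):
    """Return the class whose code word is closest to the classifier outputs."""
    return min(code_matrix, key=lambda c: hamming(outputs, code_matrix[c]))

# Hypothetical 3-class code: one row per class, one column per binary problem.
# Every pair of rows differs in 4 positions, so one wrong output is tolerated.
code_matrix = {
    "A": (+1, +1, +1, -1, -1, -1),
    "B": (-1, -1, +1, +1, +1, -1),
    "C": (+1, -1, -1, -1, +1, +1),
}

# Outputs of the six binary classifiers for a sample of class "B",
# with the first classifier making a mistake (+1 instead of -1).
outputs = (+1, -1, +1, +1, +1, -1)
decoded = ecoc_decode(outputs, code_matrix)
```

The sample is still decoded as "B" despite the single wrong output, which is the error-correcting property the code word design is meant to provide.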

18.
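To address the shortcomings of current gesture interaction methods for service robots in the naturalness of the input mode and the reliability of the recognition method, this paper proposes using combined face and hand postures as the input mode and implements a gesture recognition system based on an optimal directed acyclic graph support vector machine (DAGSVM). The system uses a stepwise refined feature detection process: skin color is first detected coarsely, and then the face and the hands are detected using Gabor features of the eyes and wavelet moment features of the hand edges, respectively. This overcomes interference from skin-colored regions in the background and markedly improves the reliability of feature extraction. Invariant moments of the face and hand regions are combined with hand position information to form a hybrid feature vector, and multiple binary support vector machines (SVMs) are organized with an optimized topological sorting strategy to form an optimal DAGSVM multi-class classifier, which achieves higher multi-class accuracy than an ordinary DAGSVM. Experiments verify the effectiveness and reliability of the method, which is used to realize a natural and friendly mode of human-robot interaction.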

19.
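Facial expression recognition based on Gabor histogram features and MVBoost
We propose extracting expression features by combining the Gabor transform with hierarchical histogram statistics, so as to capture texture variation within local regions at multiple levels. This gives a stronger feature representation than one-dimensional Gabor coefficients alone. Using the histogram features, we also design a weak classifier with vector input and multi-class continuous output and embed it in the multi-class continuous AdaBoost framework, yielding MVBoost, a method with vector input and multi-class output. The method makes multi-class decisions directly on the features, so there is no need to train multiple binary AdaBoost classifiers, which simplifies both the training process and the classification process.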

20.
A multi-class classifier based on the Bradley-Terry model predicts the multi-class label of an input by combining the outputs from multiple binary classifiers, where the combination should be designed a priori as a code word matrix. The code word matrix was originally designed to consist of +1 and -1 codes, and was later extended to deal with the ternary code {+1, 0, -1}, that is, allowing 0 codes. This extension has seemed to work effectively but, in fact, contains a problem: a binary classifier forcibly categorizes examples with 0 codes into either +1 or -1, and this forcible decision makes the prediction of the multi-class label obscure. In this article, we propose a Boosting algorithm that deals with three categories by allowing a "don't care" category corresponding to 0 codes, and present a modified decoding method called a "ternary" Bradley-Terry model. In addition, we propose a couple of fast decoding schemes that reduce the heavy computation required by the existing Bradley-Terry model-based decoding.
