Similar Literature
20 similar documents found.
1.
By using multi-class support vector machines as classifiers and fusing their outputs with information-fusion methods such as Dempster-Shafer theory, small-sample data can be classified. Three fusion schemes are adopted: summing the multi-class SVM outputs and taking the maximum; applying Dempster-Shafer theory; and applying an SVM a second time after Dempster-Shafer fusion. Since the SVM is itself a machine-learning algorithm suited to small samples and Dempster-Shafer theory handles uncertainty well, their combination deals well with small-sample classification and raises the final accuracy. Experimental results show that the proposed fusion strategies do improve classification accuracy on small-sample problems.
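Dempster's rule of combination, the fusion step described above, can be sketched in a few lines. The two mass functions below stand in for the soft outputs of two SVM classifiers over classes c1 and c2; the numbers are invented for illustration.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dicts: frozenset of classes -> mass)
    by Dempster's rule, renormalising away the conflicting mass."""
    combined = {}
    conflict = 0.0
    for A, a in m1.items():
        for B, b in m2.items():
            inter = A & B
            if inter:
                combined[inter] = combined.get(inter, 0.0) + a * b
            else:
                conflict += a * b  # incompatible hypotheses
    k = 1.0 - conflict
    return {s: v / k for s, v in combined.items()}

# Hypothetical soft outputs of two classifiers as basic probability assignments
m1 = {frozenset({"c1"}): 0.7, frozenset({"c1", "c2"}): 0.3}
m2 = {frozenset({"c1"}): 0.6, frozenset({"c2"}): 0.1, frozenset({"c1", "c2"}): 0.3}
fused = dempster_combine(m1, m2)
```

After fusion the mass on {"c1"} dominates, so the final decision would pick c1.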

2.

In multi-label classification problems, every instance is associated with multiple labels at the same time. Binary classification, multi-class classification and ordinal regression can be seen as special cases of multi-label classification in which each instance is assigned only one label. Text classification is the main application area of multi-label techniques, but relevant work is also found in bioinformatics, medical diagnosis, scene classification and music categorization. There are two approaches to multi-label classification: the first is the algorithm-independent, or problem-transformation, approach, in which the multi-label problem is converted into a set of single-label problems; the second is algorithm adaptation, in which specific algorithms are designed to solve the multi-label problem directly. In this work we not only survey research conducted under the algorithm-adaptation approach but also perform a comparative study of two proposed algorithms. The first, fuzzy PSO-based ML-RBF, hybridizes fuzzy PSO with ML-RBF; the second, FSVD-MLRBF, hybridizes fuzzy c-means clustering with singular value decomposition. Both are applied to real-world datasets, namely the yeast and scene datasets. The experimental results show that both proposed algorithms meet or beat ML-RBF and ML-KNN on the test datasets.
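The problem-transformation approach mentioned in the abstract can be sketched in its simplest form, binary relevance: each label becomes an ordinary binary target. The toy corpus and label names below are invented for illustration.

```python
# Hypothetical multi-label targets: each document carries a set of labels
Y = [{"sports"}, {"politics", "economy"}, {"sports", "economy"}]
labels = ["sports", "politics", "economy"]

def binary_relevance_targets(Y, labels):
    """Problem transformation: one independent binary target vector per label;
    any single-label (binary) learner can then be trained on each vector."""
    return {lab: [int(lab in y) for y in Y] for lab in labels}

targets = binary_relevance_targets(Y, labels)
```

Each entry of `targets` is now a plain binary classification problem; algorithm-adaptation methods such as ML-RBF avoid this decomposition and handle the label sets directly.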


3.
4.
In the framework of functional gradient descent/ascent, this paper proposes Quantile Boost (QBoost) algorithms that predict quantiles of the response of interest for regression and binary classification. Quantile Boost Regression performs gradient descent in function space to minimize the objective used by quantile regression (QReg). In the classification scenario, the class label is defined via a hidden variable, and the quantiles of the class label are estimated by fitting the corresponding quantiles of the hidden variable. An equivalent form of the definition of a quantile is introduced, whose smoothed version is employed as the objective function and maximized by functional gradient ascent to obtain the Quantile Boost Classification algorithm. Extensive experiments and detailed analysis show that QBoost performs better than the original QReg and other alternatives for regression and binary classification. Furthermore, QBoost is capable of solving problems in high-dimensional space and is more robust to noisy predictors.
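The functional-gradient idea behind QBoost can be illustrated with the simplest possible base learner, a constant: subgradient descent on the pinball (quantile) loss then drives the fit toward the empirical τ-quantile. This is a sketch of the principle only, not the boosting algorithm itself; the step size and iteration count are arbitrary.

```python
def pinball_subgrad(y, f, tau):
    """Subgradient of the pinball loss w.r.t. the predictions f:
    -tau where the residual is positive, (1 - tau) where it is not."""
    return [-tau if yi > fi else (1.0 - tau) for yi, fi in zip(y, f)]

def fit_constant_quantile(y, tau, lr=0.1, steps=5000):
    """Functional gradient descent with a constant model: the iterate
    settles at the empirical tau-quantile of y."""
    f = sum(y) / len(y)  # start from the mean
    for _ in range(steps):
        g = pinball_subgrad(y, [f] * len(y), tau)
        f -= lr * sum(g) / len(g)
    return f

# On the integers 1..100 the 0.25-quantile is about 25
q25 = fit_constant_quantile(list(range(1, 101)), 0.25)
```

QBoost replaces the constant with a sequence of weak learners fitted to this same subgradient.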

5.
For multi-class imbalanced problems, a new method based on the one-versus-one (OVO) decomposition strategy is proposed. First, the multi-class imbalanced problem is decomposed into several binary classification problems via the OVO strategy; binary classifiers are then built with algorithms designed for imbalanced binary classification; next, the original data set is processed with SMOTE oversampling; redundant classifiers are then handled with a distance-based relative-competitiveness weighting method; finally, the output is obtained by weighted voting. Extensive experiments on the KEEL imbalanced data sets show that the proposed algorithm has a significant advantage over other classical methods.
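The SMOTE oversampling step used inside each pairwise subproblem can be sketched as follows. This is a simplified SMOTE variant (brute-force neighbour search); the parameter names and toy minority class are illustrative.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling: each synthetic point is interpolated between
    a random minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # brute-force k nearest neighbours of x within the minority class
        neighbours = sorted(
            (z for z in minority if z is not x),
            key=lambda z: sum((a - b) ** 2 for a, b in zip(x, z)),
        )[:k]
        z = rng.choice(neighbours)
        lam = rng.random()  # interpolation coefficient in [0, 1)
        out.append([a + lam * (b - a) for a, b in zip(x, z)])
    return out

minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_like(minority, 5)
```

Synthetic points lie on segments between minority samples, so they stay inside the minority region rather than duplicating existing points.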

6.
Feature extraction is an important step before actual learning. Although many feature extraction methods have been proposed for clustering, classification and regression, very little work has addressed multi-class classification problems. This paper proposes a novel feature extraction method, called orientation distance-based discriminative (ODD) feature extraction, designed particularly for multi-class classification. The proposed method works in two steps. In the first step, we extend the Fisher discriminant idea to determine an appropriate kernel function and map the input data of all classes into a feature space where the classes are well separated. In the second step, we put forward two variants of ODD features, namely one-vs-all-based and one-vs-one-based ODD features: we first construct hyper-planes (SVMs) in the feature space under the one-vs-all or one-vs-one scheme, and then extract the corresponding ODD features between a sample and each hyper-plane. These newly extracted ODD features are treated as the representative features and are used in the subsequent classification phase. Extensive experiments investigate the performance of both variants for multi-class classification. The statistical results show that classification accuracy based on ODD features outperforms that of state-of-the-art feature extraction methods.
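The second step, re-describing a sample by its distances to the separating hyper-planes, can be sketched as follows. The hyper-plane coefficients below are hypothetical stand-ins for trained one-vs-all SVMs.

```python
def odd_features(x, hyperplanes):
    """Map a sample x to its signed distances from each hyperplane (w, b);
    this distance vector is the new discriminative feature representation."""
    feats = []
    for w, b in hyperplanes:
        norm = sum(wi * wi for wi in w) ** 0.5
        feats.append((sum(wi * xi for wi, xi in zip(w, x)) + b) / norm)
    return feats

# Hypothetical one-vs-all hyperplanes for three classes in 2-D
planes = [([1.0, 0.0], -0.5), ([0.0, 1.0], -0.5), ([1.0, 1.0], -1.5)]
f = odd_features([1.0, 0.0], planes)
```

The sign of each component says which side of the corresponding class boundary the sample falls on, and the magnitude says how confidently.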

7.
A multi-label cost-sensitive classification ensemble learning algorithm
FU Zhong-liang, Acta Automatica Sinica, 2014, 40(6): 1075-1085
Although a multi-label classification problem can be converted into an ordinary multi-class problem, a multi-label cost-sensitive classification problem is hard to convert into a multi-class cost-sensitive one. After analyzing the difficulties that arise when extending multi-class cost-sensitive learning algorithms to the multi-label setting, a multi-label cost-sensitive ensemble learning algorithm is proposed. Its average misclassification cost is the sum of the cost of missed labels and the cost of falsely detected labels, and its workflow resembles the Adaptive Boosting (AdaBoost) algorithm: it automatically learns multiple weak classifiers and combines them into a strong classifier whose average misclassification cost decreases as weak classifiers are added. The differences between the proposed algorithm and multi-class cost-sensitive AdaBoost are analyzed in detail, including the criterion for the output labels and the meaning of the misclassification costs. Unlike ordinary multi-class cost-sensitive problems, the misclassification costs in the multi-label setting must satisfy certain constraints, which are analyzed and stated explicitly. Simplifying the algorithm yields a multi-label AdaBoost algorithm and a multi-class cost-sensitive AdaBoost algorithm. Both theoretical analysis and experimental results show that the proposed ensemble algorithm is effective and minimizes the average misclassification cost; in particular, for multi-class problems whose per-class costs differ widely, it clearly outperforms existing multi-class cost-sensitive AdaBoost algorithms.
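The average misclassification cost described above, a missed-label term plus a false-label term, can be computed as follows. The cost values and label sets are hypothetical.

```python
def avg_multilabel_cost(true_sets, pred_sets, c_miss, c_false):
    """Average cost over instances: c_miss for every true label that was
    missed, plus c_false for every label predicted but not present."""
    total = sum(
        c_miss * len(t - p) + c_false * len(p - t)
        for t, p in zip(true_sets, pred_sets)
    )
    return total / len(true_sets)

# One missed label (cost 2.0) and one false label (cost 1.0) over 2 instances
cost = avg_multilabel_cost([{1, 2}, {3}], [{2}, {3, 4}], c_miss=2.0, c_false=1.0)
```

A boosting round would then pick the weak classifier that most reduces this quantity on the weighted training set.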

8.
Multi-class classification is one of the major challenges in real-world applications. Classification algorithms are generally binary in nature and must be extended for multi-class problems. In this paper we therefore propose an enhanced Genetically Optimized Neural Network (GONN) algorithm for solving multi-class classification problems. We use a multi-tree GONN representation that integrates multiple GONN trees; each individual is a single GONN classifier, so the enhanced classifier is an integrated version of the individual GONN classifiers for all classes. The integrated classifier is evolved genetically to optimize its architecture for multi-class classification. To demonstrate our results, we took seven datasets from the UCI Machine Learning repository and compared the classification accuracy and training time of enhanced GONN with the classical Koza model and the classical back-propagation model. Our algorithm achieves classification accuracy almost 5% and 8% higher than the Koza and back-propagation models respectively, even for complex and real multi-class data, in less time. The enhanced GONN also produces better results than popular classification algorithms such as Genetic Algorithm, Support Vector Machine and Neural Network, making it a good alternative to well-known machine learning methods for multi-class classification. Even for datasets containing noise and complex features, the results produced by enhanced GONN are much better than those of other machine learning algorithms. The proposed enhanced GONN can be applied in expert and intelligent systems for effectively classifying large, complex and noisy real-time multi-class data.

9.
In this paper we propose two variational models for semi-supervised clustering of high-dimensional data. The new models produce substantial improvements in classification accuracy over the corresponding models without the region force when the labeled-sample rate is relatively low. In the proposed models, the data points are modeled as vertices of a weighted graph, and the labeling function defined on each vertex takes values from the unit simplex, which can be interpreted as the probability of belonging to each class. The algorithm is posed as minimization of a convex functional of the labeling function. The first model combines the Rayleigh quotient of the graph Laplacian with a region-force term; the second replaces the Rayleigh quotient with the total variation of the labeling function. The region-force term is calculated from the affinity between each vertex and the training samples, characterizing the conditional probability of each vertex belonging to each class. Numerical methods for solving the two versions of the algorithm are presented, and both are tested on several benchmark data sets such as handwritten digits (MNIST) and the moons data. Experiments indicate that the classification accuracy and computational speed are competitive with state-of-the-art multi-class semi-supervised clustering algorithms. The numerical experiments also confirm that the total variation model outperforms its Laplacian counterpart in most of the tests.
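The graph construction both models share can be sketched as follows: vertices connected with Gaussian affinities and the unnormalised graph Laplacian L = D - W. The points and bandwidth are toy values.

```python
import math

def gaussian_laplacian(points, sigma=1.0):
    """Dense Gaussian-affinity weight matrix W and the unnormalised
    graph Laplacian L = D - W (D = diagonal degree matrix)."""
    n = len(points)
    W = [
        [
            0.0 if i == j else math.exp(
                -sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                / (2.0 * sigma ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]
    L = [
        [(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
        for i in range(n)
    ]
    return W, L

W, L = gaussian_laplacian([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

L is symmetric with zero row sums, which is exactly what makes the Rayleigh quotient u^T L u a smoothness measure for a labeling function u on the graph.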

10.
We consider the problem of minimizing the sum of two convex functions, one smooth and the other possibly nonsmooth. Many high-dimensional learning problems (classification/regression) can be cast in this framework and solved efficiently with first-order proximal-based methods. Because traditional proximal methods converge slowly, a recent trend is to introduce acceleration, which increases the speed of convergence. Such proximal gradient methods belong to the wider class of forward-backward algorithms, which can be interpreted mathematically as fixed-point iterative schemes. In this paper, we design several new proximal gradient methods corresponding to state-of-the-art fixed-point iterative schemes and compare their performance on the regression problem. In addition, we propose a new accelerated proximal gradient algorithm that outperforms the earlier traditional methods in convergence speed and regression error. To demonstrate the applicability of our method, we conducted regression experiments on several publicly available high-dimensional real datasets from different application domains. Empirical results show that the proposed method outperforms the previous methods in terms of convergence, accuracy, and objective function values.
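The accelerated proximal gradient family discussed above can be illustrated with a plain-Python FISTA for the lasso objective 0.5‖Ax − b‖² + λ‖x‖₁, whose proximal map is soft-thresholding. The data below are toy values; with A = I the exact minimizer is just the soft-thresholded b, which makes the sketch easy to check.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def fista(A, b, lam, L, steps=500):
    """Accelerated proximal gradient (FISTA) for 0.5*||Ax - b||^2 + lam*||x||_1;
    L is a Lipschitz constant of the smooth gradient (max eigenvalue of A^T A)."""
    n = len(A[0])
    x, y, t = [0.0] * n, [0.0] * n, 1.0
    At = [list(col) for col in zip(*A)]
    for _ in range(steps):
        residual = [ri - bi for ri, bi in zip(matvec(A, y), b)]
        grad = matvec(At, residual)          # gradient of the smooth part at y
        x_new = soft_threshold([yi - gi / L for yi, gi in zip(y, grad)], lam / L)
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        # momentum (extrapolation) step -- this is the acceleration
        y = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x

# With A = I the solution is soft_threshold(b, lam) = [0.5, 0.0]
x = fista([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.1], lam=0.5, L=1.0)
```

Dropping the momentum step (y = x_new) recovers plain ISTA, the unaccelerated forward-backward iteration.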

11.
To address the shortcomings of the traditional clonal selection algorithm, a new clonal selection algorithm based on sphere crossover is proposed. In each iteration, the mutation probability of every antibody is computed dynamically, the antibody population is dynamically divided into a memory unit and a general antibody unit according to antibody affinity, and the population is adjusted by sphere crossover, thereby speeding up the algorithm's global search. Examples verify the effectiveness and feasibility of the proposed algorithm.
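One clone-and-mutate step of a generic clonal selection algorithm might look like the sketch below: higher-affinity antibodies receive more clones, and clones mutate with a rate that shrinks as affinity grows (a dynamic mutation probability, as in the abstract). The affinity function and mutation schedule are illustrative, and the sphere-crossover operator itself is not shown.

```python
import math
import random

def clone_and_mutate(pop, affinity, beta=2.0, seed=0):
    """One clonal-selection step: clone counts scale with affinity rank,
    and the Gaussian mutation rate decays with the antibody's affinity."""
    rng = random.Random(seed)
    ranked = sorted(pop, key=affinity, reverse=True)
    clones = []
    n = len(ranked)
    for rank, antibody in enumerate(ranked):
        n_clones = max(1, round(beta * n / (rank + 1)))
        rate = math.exp(-affinity(antibody))  # hypothetical dynamic mutation rate
        for _ in range(n_clones):
            clones.append([g + rng.gauss(0.0, rate) for g in antibody])
    return clones

pop = [[0.0, 0.0], [2.0, 2.0], [5.0, 5.0]]
target = [1.0, 1.0]
aff = lambda ab: 1.0 / (1.0 + sum((a - t) ** 2 for a, t in zip(ab, target)))
clones = clone_and_mutate(pop, aff)
```

A full iteration would then re-evaluate affinities, keep the best clones in the memory unit, and apply crossover to the rest.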

12.
Starting from linear mutual-information feature extraction, and exploiting the linear invariance of the mutual-information gradient in kernel space, a fast and efficient nonlinear feature extraction method is proposed. The method uses a fast quadratic-entropy estimate of mutual information and a gradient-ascent search strategy to extract discriminative nonlinear higher-order statistics; it avoids the eigendecomposition required by traditional nonlinear feature extraction, effectively reducing the computational load. Projection and classification experiments on UCI data show that the method clearly outperforms traditional algorithms both in the separability of the projected space and in time complexity.

13.
For texts with special structure, traditional text classification algorithms no longer suffice, so a text classification algorithm based on the multi-instance learning framework is proposed. Each text is treated as a bag, with the title and the body regarded as the bag's two instances; a multi-class SVM built on one-class classification maps the bags into a high-dimensional feature space; and a Gaussian kernel is introduced to train the classifier, which then predicts labels for unlabeled texts. Experimental results show that the algorithm achieves higher classification accuracy than traditional machine-learning classifiers and offers a new angle for text mining research on texts with special structure.
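The bag construction described above, with title and body as the two instances of a document bag, can be sketched as follows. For brevity this sketch pairs it with the standard multi-instance "max" rule rather than the paper's one-class SVM mapping, and the scoring function is hypothetical.

```python
def doc_to_bag(title_vec, body_vec):
    """Multi-instance view of a structured text: the document is a bag whose
    two instances are its title and body feature vectors."""
    return [title_vec, body_vec]

def classify_bag(bag, instance_score, threshold=0.0):
    """Standard MI assumption: the bag is positive iff at least one of its
    instances scores above the threshold."""
    return max(instance_score(inst) for inst in bag) > threshold

# Hypothetical linear relevance score on 2-D term features
score = lambda v: 1.5 * v[0] - 1.0 * v[1]
bag = doc_to_bag([0.9, 0.1], [0.2, 0.8])  # strong title signal, weak body signal
label = classify_bag(bag, score)
```

Here the title instance alone carries the positive signal, which is exactly the case the bag representation is meant to capture.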

14.
ZHENG Xian-hua, LUO Yan-min, Journal of Computer Applications, 2012, 32(11): 3201-3205
Since the traditional clonal selection algorithm (CSA) performs supervised learning on only one class of samples at a time, its classification efficiency and accuracy are limited; a multi-class supervised classification algorithm based on an improved clonal selection algorithm is therefore proposed. Through evolutionary learning the algorithm obtains the optimal cluster centers of all classes simultaneously, and the antibody fitness computed during evolution combines within-class similarity with between-class separation, ensuring that the resulting cluster centers are more representative. In subsequent classification experiments, the algorithm is validated on four common UCI data sets and on a mangrove multi-spectral TM remote-sensing image. The overall accuracy on the remote-sensing image reaches 92% with a Kappa coefficient of 0.91, and the UCI results are also good, showing that the algorithm is an effective multi-class data classifier.

15.
Semi-supervised learning has attracted much attention in pattern recognition and machine learning. Most semi-supervised learning algorithms are proposed for binary classification and then extended to multi-class cases via approaches such as one-against-the-rest. In this work, we propose a semi-supervised learning method based on multi-class boosting, which classifies multi-class data directly and achieves high classification accuracy by exploiting the unlabeled data. Our approach has two distinctive features: (1) it handles multi-class cases directly without reducing them to multiple two-class problems, and (2) each base classifier needs accuracy only at or above 1/K, where K is the number of classes. Experimental results on 21 UCI benchmark data sets show that the proposed method is effective.

16.
A new SVDD-based multi-class classification algorithm

17.
An improved one-versus-one SVM multi-class classification algorithm
The one-versus-one multi-class SVM algorithm performs well, but it leaves unclassifiable regions, which limits its application. A method combining one-versus-one classification with a tightness-based decision is therefore proposed: the one-versus-one algorithm is used for classification, and the unclassifiable region is resolved by a tightness-based decision whose discriminant function combines the distance from a sample to each class center with the local sample distribution given by kNN (k nearest neighbors) to determine class membership. Tests on UCI (University of California, Irvine) data sets show that the algorithm effectively resolves the unclassifiable-region problem and performs better than other algorithms.
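A simplified version of this tie-breaking idea, using only the distance-to-class-center part of the discriminant (without the kNN tightness term), might look like the following; the centroids and sample are invented.

```python
def resolve_by_center(x, tied_classes, centroids):
    """Break an OVO voting tie: assign x to the tied class whose center is
    closest (a stand-in for the full distance + kNN tightness decision)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[c]))
    return min(tied_classes, key=sq_dist)

centroids = {"A": [0.0, 0.0], "B": [4.0, 0.0], "C": [0.0, 4.0]}
winner = resolve_by_center([1.0, 0.5], ["A", "B"], centroids)
```

Only the classes that tied in the pairwise vote compete here, so samples outside the unclassifiable region are unaffected.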

18.
When the Newton-Raphson algorithm or the Fisher scoring algorithm does not work and EM-type algorithms are not available, the quadratic lower-bound (QLB) algorithm may be a useful optimization tool. However, like all EM-type algorithms, the QLB algorithm may suffer from slow convergence, which can be viewed as the cost of having the ascent property. This paper proposes a novel 'shrinkage parameter' approach to accelerate the QLB algorithm while maintaining its simplicity and stability (i.e., monotonic increase in log-likelihood). The strategy is first to construct a class of quadratic surrogate functions Q_r(θ | θ^(t)) that induces a class of QLB algorithms indexed by a 'shrinkage parameter' r (r ∈ R), and then to optimize r over R under some criterion of convergence. For three commonly used criteria (the smallest eigenvalue, the trace and the determinant), we derive a uniformly optimal shrinkage parameter and find an optimal QLB algorithm. Some theoretical justifications are also presented. Next, we generalize the optimal QLB algorithm to problems with a penalizing function and investigate the associated convergence properties. The optimal QLB algorithm is applied to fit a logistic regression model and a Cox proportional hazards model. Two real datasets are analyzed to illustrate the proposed methods.
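A minimal sketch of the QLB idea on the simplest possible case, intercept-only logistic regression: the observed information is bounded above by n/4 (the classical Böhning-Lindsay bound), so a quadratic surrogate with that fixed curvature yields a monotone update with no matrix recomputation. This illustrates the plain quadratic lower bound, not the paper's shrinkage-parameter acceleration.

```python
import math

def qlb_logistic_intercept(y, steps=200):
    """QLB/MM update for intercept-only logistic regression: since the Hessian
    of the negative log-likelihood is at most n/4, the fixed-curvature step
    theta += (n/4)^-1 * score increases the log-likelihood at every step."""
    n = len(y)
    theta = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-theta))          # current fitted probability
        theta += (4.0 / n) * sum(yi - p for yi in y)  # score = sum(y_i - p)
    return theta

# With 3 successes out of 4 trials, the MLE of the intercept is log(3/1)
theta_hat = qlb_logistic_intercept([1, 1, 1, 0])
```

Newton-Raphson would use the exact curvature n·p(1−p) and converge faster; QLB trades that speed for guaranteed ascent, which is exactly the trade-off the shrinkage parameter is designed to soften.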

19.
As an extension of multi-class classification, machine learning algorithms have been proposed that are able to deal with situations in which the class labels are defined in a non-crisp way. Objects exhibit in that sense a degree of membership to several classes. In a similar setting, models are developed here for classification problems where an order relation is specified on the classes (i.e., non-crisp ordinal regression problems). As for traditional (crisp) ordinal regression problems, it is argued that the order relation on the classes should be reflected by the model structure as well as the performance measure used to evaluate the model. These arguments lead to a natural extension of the well-known proportional odds model for non-crisp ordinal regression problems, in which the underlying latent variable is not necessarily restricted to the class of linear models (by using kernel methods).

20.
Applied Soft Computing, 2007, 7(3): 1102-1111
Classification and association rule discovery are important data mining tasks. Using association rule discovery to construct classification systems, also known as associative classification, is a promising approach. In this paper, a new associative classification technique, the Ranked Multilabel Rule (RMR) algorithm, is introduced, which generates rules with multiple labels. Rules derived by current associative classification algorithms overlap in their training objects, resulting in many redundant and useless rules. The proposed algorithm resolves this overlap by generating rules that do not share training objects during the training phase, yielding a more accurate classifier. Results from experiments on 20 binary, multi-class and multi-label data sets show that the proposed technique produces classifiers containing rules associated with multiple classes. Furthermore, the results reveal that removing the overlap of training objects between the derived rules produces classifiers that are highly competitive, with respect to error rate, with those extracted by decision trees and other associative classification techniques.

