Similar Documents
1.
This paper presents a fast adaptive iterative algorithm to solve linearly separable classification problems in R^n. In each iteration, a subset of the sample data (n points, where n is the number of features) is adaptively chosen and a hyperplane is constructed such that it separates the chosen n points at a margin and best classifies the remaining points. The classification problem is formulated and the details of the algorithm are presented. Further, the algorithm is extended to solving quadratically separable classification problems. The basic idea is to map the physical space to a larger one in which the problem becomes linearly separable. Numerical illustrations show that few iteration steps are sufficient for convergence when the classes are linearly separable. For nonlinearly separable data, given a specified maximum number of iteration steps, the algorithm returns the best hyperplane that minimizes the number of misclassified points occurring through these steps. Comparisons with other machine learning algorithms on practical and benchmark datasets are also presented, showing the performance of the proposed algorithm.
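The abstract does not spell out its iterative construction; as a loosely related illustration of iterative linear separation (the classic perceptron update, not the paper's algorithm; data and names below are invented for the sketch), a few passes suffice on linearly separable data:

```python
import numpy as np

def perceptron(X, y, max_iter=1000):
    # y in {-1, +1}; returns weights w and bias b with sign(X @ w + b) == y
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_iter):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:   # misclassified: nudge hyperplane toward xi
                w += yi * xi
                b += yi
                errors += 1
        if errors == 0:                  # converged: all points correctly classified
            break
    return w, b

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -1.0], [-3.0, -2.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron(X, y)
print(all(np.sign(X @ w + b) == y))
```

On this toy set a single corrective update already separates the classes; for nonseparable data the loop simply exhausts `max_iter`, mirroring the abstract's fallback behaviour.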

2.
In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known. SVMs (support vector machines) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier. In pattern recognition problems, the selection of the features used to characterize an object to be classified is important. Kernel methods are algorithms that, by replacing the inner product with an appropriate positive definite function, implicitly perform a nonlinear mapping Φ of the input data in R^n into a high-dimensional feature space H. Cover's theorem states that if the transformation is nonlinear and the dimensionality of the feature space is high enough, then the input space may be transformed into a new feature space where the patterns are linearly separable with high probability.
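Cover's theorem can be made concrete with an explicit feature map: points inside versus outside the unit circle are not linearly separable in R^2, but the quadratic map φ(x) = (x₁², √2·x₁x₂, x₂²) makes them separable by a hyperplane (the weights below are chosen by hand for illustration, not learned):

```python
import numpy as np

# Explicit quadratic feature map: a hand-rolled stand-in for the implicit
# mapping a polynomial kernel performs.
def phi(x):
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

X = np.array([[0.2, 0.1], [-0.3, 0.2], [1.5, 0.0], [0.0, -1.4], [1.0, 1.0]])
y = np.array([-1, -1, 1, 1, 1])          # inside circle: -1, outside: +1

# In feature space, w = (1, 0, 1), b = -1 recovers the rule x1^2 + x2^2 > 1,
# which is a linear decision in H but a circle in the input space.
w, b = np.array([1.0, 0.0, 1.0]), -1.0
pred = np.sign(np.array([phi(x) @ w + b for x in X]))
print(all(pred == y))
```

No linear classifier in the original 2-D space can produce this labeling, which is exactly the point of mapping to a higher-dimensional feature space.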

3.
The maximum diameter color-spanning set problem (MaxDCS) is defined as follows: given n points with m colors, select m points with m distinct colors such that the diameter of the set of chosen points is maximized. In this paper, we design an optimal O(n log n) time algorithm using rotating calipers for MaxDCS in the plane. Our algorithm can also be used to solve the maximum diameter problem of imprecise points modeled as polygons. We also give an optimal algorithm for the all farthest foreign neighbor problem (AFFN) in the plane, and propose algorithms to answer the farthest foreign neighbor query (FFNQ) of colored sets in two- and three-dimensional space. Furthermore, we study the problem of computing the closest pair of color-spanning set (CPCS) in d-dimensional space, and remove the log m factor in the best known time bound if d is a constant.
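Since every point in a color-spanning set carries a distinct color, the set's diameter is always realized by a pair of differently colored points, so the optimum equals the farthest "foreign" (bichromatic) pair distance. A brute-force O(n²) sketch (not the paper's optimal O(n log n) rotating-calipers algorithm; the toy data are invented) makes this reduction concrete:

```python
from itertools import combinations
import math

def max_color_spanning_diameter(points, colors):
    # The chosen set has one point per color, so its diameter is realized
    # by a pair of differently colored points; maximizing over all such
    # pairs therefore gives the MaxDCS optimum.
    best = 0.0
    for (p, cp), (q, cq) in combinations(zip(points, colors), 2):
        if cp != cq:                      # skip monochromatic pairs
            best = max(best, math.dist(p, q))
    return best

pts = [(0, 0), (1, 0), (5, 0), (0, 3)]
cols = ['r', 'r', 'g', 'b']
print(max_color_spanning_diameter(pts, cols))   # farthest bichromatic pair
```

Here the optimum is the green-blue pair (5, 0)-(0, 3) at distance √34; the remaining color (red) can then be represented by any of its points without shrinking the diameter.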

4.
Indefinite kernel support vector machines (IKSVM) have recently attracted increasing attention in machine learning. Since IKSVM is essentially a non-convex problem, existing algorithms either change the spectrum of the indefinite kernel directly, at the risk of losing valuable information, or solve the dual form of IKSVM, which suffers from a duality gap problem. In this paper, we propose a primal perspective for solving the problem. That is, we directly focus on the primal form of IKSVM and present a novel algorithm, termed IKSVM-DC, for binary and multi-class classification. Concretely, according to the characteristics of the spectrum of the indefinite kernel matrix, IKSVM-DC decomposes the primal function into the subtraction of two convex functions and formulates it as a difference of convex functions (DC) program. To accelerate the convergence rate, IKSVM-DC combines the classical DC algorithm with a line search step along the descent direction at each iteration. Furthermore, we construct a multi-class IKSVM model which can classify multiple classes in a unified form. A theoretical analysis is then presented to validate that IKSVM-DC converges to a local minimum. Finally, we conduct experiments on both binary and multi-class datasets, and the results show that IKSVM-DC is superior to other state-of-the-art IKSVM algorithms.

5.
This paper presents a methodology based on computational intelligence techniques for classifying hydrological cycles, able to infer changes in the physico-chemical parameters and metals in the water of a reservoir in the Amazon. The methodology first pre-processes the data to select the most relevant variables of the samples. After that, we compare two different machine learning classifiers, namely the SVM (support vector machine) and the ANN (artificial neural network). Automatic model selection is used to choose the parameters of the classifiers. The results indicate that the SVM classifier using a radial basis function or polynomial kernel outperforms the ANN in terms of overall accuracy and robustness. The SVM classifier's accuracy (89.1%) can be considered satisfactory, given the great variability of physico-chemical parameters and metals across the hydrological cycles and the different ecosystems where the sampling stations are located.

6.
Acquiring a set of features that emphasize the differences between normal data points and outliers can drastically facilitate the task of identifying outliers. In our work, we present a novel non-parametric evaluation criterion for filter-based feature selection which has an eye towards the final goal of outlier detection. The proposed method seeks the subset of features that represent the inherent characteristics of the normal dataset while forcing outliers to stand out, making them more easily distinguished by outlier detection algorithms. Experimental results on real datasets show the advantage of our feature selection algorithm compared with popular and state-of-the-art methods. We also show that the proposed algorithm is able to overcome the small sample space problem and perform well on highly imbalanced datasets. Furthermore, due to the highly parallelizable nature of the feature selection, we implement the algorithm on a graphics processing unit (GPU) to gain significant speedup over the serial version. The benefits of the GPU implementation are two-fold, as its performance scales very well in terms of the number of features, as well as the number of data points.

7.
For classifying large data sets, we propose a discriminant kernel that introduces a nonlinear mapping from the joint space of input data and output labels to a discriminant space. Our method differs from traditional ones, which map nonlinearly from the input space to a feature space. The induced distance of our discriminant kernel is Euclidean and Fisher separable, as it is defined on distance vectors from the feature space to distance vectors on the discriminant space. Unlike support vector machines or kernel Fisher discriminant analysis, the classifier does not need to solve a quadratic programming problem or eigen-decomposition problems. Therefore, it is especially appropriate for processing large data sets. The classifier can be applied to face recognition, shape comparison, and image classification benchmark data sets. The method is significantly faster than other methods and yet delivers comparable classification accuracy.

8.
A least squares support vector fuzzy regression model (LS-SVFR) is proposed to estimate uncertain and imprecise data by applying the fuzzy set principle to weight vectors. This model only requires solving a set of linear equations to obtain the weight vector and the bias term, in contrast to the complicated quadratic programming problem solved in existing support vector fuzzy regression models. Besides, the proposed LS-SVFR is a model-free method in which the underlying model function does not need to be predefined. Numerical examples and a fault detection application demonstrate the effectiveness and applicability of the proposed model.
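The key computational point, that training reduces to one linear system instead of a quadratic program, can be sketched without the paper's fuzzy weighting. A minimal least-squares SVR with a linear kernel (names and toy data below are illustrative, not from the paper) solves the standard bordered LS-SVM system for the bias and dual coefficients:

```python
import numpy as np

# Plain least-squares SVR sketch (fuzzy weighting omitted). Training
# solves one linear system:
#   [ 0    1^T     ] [ b     ]   [ 0 ]
#   [ 1  K + I/g   ] [ alpha ] = [ y ]
def lssvr_fit(X, y, gamma=1e4):
    K = X @ X.T                               # linear kernel matrix
    n = len(y)
    A = np.block([[np.zeros((1, 1)), np.ones((1, n))],
                  [np.ones((n, 1)), K + np.eye(n) / gamma]])
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    b, alpha = sol[0], sol[1:]
    return lambda Xq: Xq @ X.T @ alpha + b    # f(x) = sum_i alpha_i k(x, x_i) + b

X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = 2.0 * X[:, 0] + 1.0                       # noiseless linear target
predict = lssvr_fit(X, y)
print(np.max(np.abs(predict(X) - y)) < 0.1)
```

By the second block row of the system, the training residual at point i is exactly alpha_i / gamma, so a large gamma drives the fit close to interpolation on this noiseless data.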

9.
Kernel selection is one of the key issues both in recent research and in the application of kernel methods. This is usually done by minimizing either an estimate of the generalization error or some other related performance measure. Using notions of stability to estimate the generalization error has attracted much attention in recent years. Unfortunately, the existing notions of stability, proposed to derive theoretical generalization error bounds, are difficult to use for kernel selection in practice. It is well known that the kernel matrix contains most of the information needed by kernel methods, and the eigenvalues play an important role in the kernel matrix. Therefore, we introduce a new notion of stability, called spectral perturbation stability, to study the kernel selection problem. This proposed stability quantifies the spectral perturbation of the kernel matrix with respect to changes in the training set. We establish the connection between spectral perturbation stability and the generalization error. By minimizing the derived generalization error bound, we propose a new kernel selection criterion that can guarantee good generalization properties. In our criterion, the perturbation of the eigenvalues of the kernel matrix is efficiently computed by solving the derivative of a newly defined generalized kernel matrix. Both theoretical analysis and experimental results demonstrate that our criterion is sound and effective.
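As a loose illustration of spectral perturbation under training set changes (not the paper's actual criterion; the function and data are invented), one can measure how the kernel spectrum shifts when a single training point is removed. For a unit-diagonal RBF kernel, Cauchy interlacing plus a trace argument bounds every such eigenvalue shift between 0 and 1:

```python
import numpy as np

def spectral_shift(X, sigma=1.0):
    # RBF kernel matrix (unit diagonal, positive semidefinite).
    sq = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2 * sigma ** 2))
    ev_full = np.sort(np.linalg.eigvalsh(K))[::-1]
    shifts = []
    for i in range(len(X)):
        keep = np.delete(np.arange(len(X)), i)
        ev = np.sort(np.linalg.eigvalsh(K[np.ix_(keep, keep)]))[::-1]
        # Interlacing: lambda_k(K) >= lambda_k(K_sub) >= lambda_{k+1}(K),
        # and the shifts sum to 1 - lambda_min(K), so each lies in [0, 1].
        shifts.append(np.abs(ev_full[:len(ev)] - ev).max())
    return max(shifts)

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 2))
print(0.0 <= spectral_shift(X) <= 1.0 + 1e-9)
```

A kernel whose spectrum barely moves under such deletions is "spectrally stable" in the informal sense the abstract formalizes.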

10.
11.
Chinese Place-Name Recognition Using Support Vector Machines
This paper explores automatic recognition of Chinese place names using support vector machines (SVM), classifying candidate strings containing place-name feature words into place names and non-place names. Drawing on the characteristics of Chinese place names, the confidence that characters form a place name and the parts of speech of the preceding and following words are extracted as attributes of the feature vectors. A training set of a certain scale was built, and by testing different kernel functions, a machine learning model for place-name classification was obtained. Experiments show that the method performs well on correctly segmented place names.

12.
NN-SVM: An Improved Support Vector Machine
The support vector machine (SVM) is a relatively recent machine learning method that constructs an optimal separating hyperplane from a small number of vectors near the class boundary. When training a classifier, SVM focuses on the region where the two classes meet; points mixed into the other class seldom help improve the classifier's performance, yet they greatly increase the computational burden of training, and their presence may cause overfitting and weaken generalization. To improve the generalization ability of the SVM, this paper proposes an improved variant, NN-SVM: it first prunes the training set, keeping or discarding each sample according to whether its label agrees with that of its nearest neighbour, and then trains an SVM on the pruned set to obtain the classifier. Experiments show that NN-SVM outperforms the plain SVM in classification accuracy, classification speed, and the scale of training sets it can handle.
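The pruning rule described above, discarding a sample whose nearest neighbour carries a different label, can be sketched directly (toy data and function names are illustrative):

```python
import numpy as np

# NN-SVM-style pruning: keep a sample only if its nearest neighbour
# (excluding itself) has the same label; points surrounded by the other
# class are treated as overlap/noise and dropped before SVM training.
def nn_prune(X, y):
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the point itself
        if y[np.argmin(d)] == y[i]:
            keep.append(i)
    return X[keep], y[keep]

X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0], [0.5, 0.5]])
y = np.array([0, 0, 1, 1, 1])              # the (0.5, 0.5) point sits among class 0's side
Xp, yp = nn_prune(X, y)
print(len(Xp))
```

The mislabeled-looking midpoint is dropped because its nearest neighbour belongs to the other class, leaving four clean samples for the subsequent SVM training step.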

13.
When an SVM's training set is massive and the classes overlap heavily, the support vectors are hard to compute and their number becomes huge; these two problems have become bottlenecks limiting SVM's application. By analyzing the geometric meaning of support vectors, this paper first studies their distribution properties and then proposes a fast SVM algorithm based on geometric analysis. The algorithm first selects a subset of near-neighbour vectors from the training samples; then, based on an analysis of the degree of class overlap, it selects the subspace of true boundary-vector samples to replace the full training set. This greatly reduces the number of training samples, removes the influence of severely overlapping outlier samples, and greatly reduces the number of support vectors. Experimental results show that the algorithm speeds up both SVM training and classification without degrading classification performance.

14.
Selecting training points for one-class support vector machines
This paper proposes a training point selection method for one-class support vector machines. It exploits a feature of a trained one-class SVM: only points residing on the exterior region of the data distribution are used as support vectors. The proposed training set reduction method therefore selects the so-called extreme points, which sit on the boundary of the data distribution, through local geometry and k-nearest neighbours. Experimental results demonstrate that the proposed method can reduce the training set considerably, while the obtained model maintains generalization capability at the level of a model trained on the full training set, yet uses fewer support vectors and trains faster.
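A simplified version of the selection idea, ranking points by mean distance to their k nearest neighbours and keeping the most peripheral fraction as candidate extreme points (the paper's actual local-geometry test is more refined; names and parameters here are invented), can be sketched as:

```python
import numpy as np

def select_extreme(X, k=3, frac=0.3):
    # Score each point by the mean distance to its k nearest neighbours;
    # boundary/exterior points score high, interior points score low.
    n = len(X)
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    d.sort(axis=1)                        # row i: sorted distances from point i
    score = d[:, 1:k + 1].mean(axis=1)    # skip the zero self-distance
    m = max(1, int(frac * n))
    return np.argsort(score)[-m:]         # indices of the most peripheral points

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
X = np.vstack([X, [[5.0, 5.0]]])          # one clearly peripheral point, index 50
idx = select_extreme(X)
print(50 in idx)
```

Only the selected fraction would then be fed to the one-class SVM, shrinking the training set while preserving the exterior points that become support vectors.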

15.
A Reduced Support Vector Machine Based on the Morlet Wavelet Kernel
To address the problem that support vector machine (SVM) training is practical only for relatively small sample sets, this paper combines the Morlet wavelet kernel function with a reduced support vector machine, MWRSVM-DC. The core of the algorithm is to find, via density-based clustering, the edge points of each cluster to form a reduced set, and to use this reduced set to find the support vectors. Experiments show that with the wavelet kernel the algorithm improves not only classification accuracy but also overall classification efficiency.

16.
Pre-Selection of Support Vectors Based on Vector Projection
The support vector machine is a pattern recognition method that has emerged in recent years and shows outstanding advantages in solving small-sample, nonlinear, and high-dimensional pattern recognition problems. However, selecting the support vectors is quite difficult, which has become a bottleneck limiting its application. Through a careful analysis of the mechanism of support vector machines, this paper studies the distribution properties of support vectors and, without degrading classification performance, proposes a vector-projection-based method for pre-selecting support vectors: boundary vectors with certain characteristics are pre-selected from the training samples to replace the full training set during training. This reduces the number of training samples and greatly speeds up SVM training.

17.
The choice of the kernel function is crucial to most applications of support vector machines. In this paper, however, we show that in the case of text classification, term-frequency transformations have a larger impact on the performance of SVM than the kernel itself. We discuss the role of importance-weights (e.g. document frequency and redundancy), which is not yet fully understood in the light of model complexity and calculation cost, and we show that time consuming lemmatization or stemming can be avoided even when classifying a highly inflectional language like German.

18.
A Maximum-Margin Linear Classifier Based on Closed Convex Hull Shrinking
The SVM (support vector machine) is a classification technique based on the structural risk minimization principle. This paper gives an alternative way to realize structural risk minimization (maximum margin). For the linearly separable case, a maximum-margin algorithm that is exact in a precise sense is proposed, and through the notion of closed convex hull shrinking, the linearly non-separable case is reduced to the linearly separable one. The algorithm reaches the same goal as the SVM algorithm and Cortes' soft-margin algorithm by a different route, but its theoretical framework is simple and rigorous, and the geometric meaning of the underlying optimization problem is clear and explicit.

19.
The twin support vector machine is a recent non-parallel binary classification algorithm that is much faster than the traditional SVM. However, before training, the twin SVM must perform a large amount of complicated inverse matrix computation; in the nonlinear case, it cannot apply the kernel trick directly to the dual optimization problem as the traditional SVM does; and it ignores the fact that different input samples affect the optimal separating hyperplane differently. To address these issues, a fuzzy reduced twin support vector machine is proposed. By improving the quadratic programming and Lagrangian functions, it avoids most of the inverse matrix computation, and the kernel trick can be applied directly in the nonlinear case. In the hybrid fuzzy membership function, the membership of each sample is affected not only by its distance to the class centre but also by the density of its neighbourhood. Experimental results show that, compared with the SVM, the standard twin SVM, the twin bounded SVM, and the fuzzy twin SVM, the reduced twin SVM with this hybrid fuzzy membership function achieves shorter classification time, simpler computation, and higher classification accuracy.

20.
To address the long training time of the v-support vector machine on large sample sets, a hybrid classification algorithm based on rough set boundaries is proposed. Exploiting the advantages of the boundary region in rough set theory, the algorithm generates a boundary set of the classification data that contains all of the support vectors, and uses this boundary vector set in place of the original samples as the training set. Reducing the size of the training set in this way significantly shortens the training time of the v-SVM without affecting classification accuracy or generalization performance. Simulation results demonstrate the effectiveness of the algorithm.
