Similar Literature
20 similar documents found (search time: 265 ms)
1.
Support vector machines (SVMs) are a relatively new data-mining method that has been widely adopted for its excellent learning ability. However, the traditional SVM algorithm suffers from excessive training time and memory requirements on large-scale problems. Combining multiple SVMs into a multi-classifier system that learns in parallel is currently an effective way to handle large-scale data in text classification. Building on an analysis of traditional parallel algorithms, this paper proposes an improved parallel learning algorithm based on multiple SVMs. Experimental results show that the algorithm improves classification efficiency considerably; although its classification accuracy is slightly lower than that of traditional methods, the loss remains within an acceptable range.
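The chunk-and-vote idea behind this kind of parallel multi-SVM learning can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the Pegasos-style subgradient trainer, the unregularized bias update, and the toy data are all assumptions of the sketch.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style subgradient training of a linear SVM (labels in {-1,+1}).
    The bias update is a common unregularized heuristic."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b, t = np.zeros(d), 0.0, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)
            if y[i] * (X[i] @ w + b) < 1:          # margin violated: move toward x_i
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
                b += eta * y[i]
            else:                                   # only shrink (regularization)
                w = (1 - eta * lam) * w
    return w, b

def parallel_vote_predict(models, X):
    """Majority vote over sub-models trained independently on separate chunks."""
    votes = np.sign(np.stack([X @ w + b for w, b in models]))
    return np.sign(votes.sum(axis=0))

# toy separable data split into 3 chunks; each chunk could be trained in parallel
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (90, 2)), rng.normal(2, 1, (90, 2))])
y = np.hstack([-np.ones(90), np.ones(90)])
idx = rng.permutation(180)
X, y = X[idx], y[idx]
models = [train_linear_svm(X[i::3], y[i::3]) for i in range(3)]
acc = np.mean(parallel_vote_predict(models, X) == y)
```

Each sub-model only ever sees a third of the data, which is where the memory and time savings come from; the vote recovers most of the accuracy lost by the split.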

2.
Training an SVM on large-scale data sets is very time-consuming. This paper proposes a data-preprocessing method that clusters the training samples and, on that basis, derives a fuzzy support vector machine. Computer simulations show that, compared with the traditional SVM training algorithm, the proposed algorithm greatly shortens SVM training time without reducing classification accuracy.
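The clustering preprocessing step can be sketched as below: each class is compressed to a handful of representatives, and the (much cheaper) SVM is then trained on those. Plain k-means and a nearest-centre sanity check are stand-ins here, not the paper's fuzzy-SVM construction; all sizes and data are made up.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means clustering; returns the k cluster centres."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each point to its nearest centre, then recompute the centres
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return centres

# 1000 raw samples are reduced to 10 representatives per class; an SVM
# would then be trained on the 20-point set instead of the full 1000.
rng = np.random.default_rng(2)
X_neg = rng.normal(-3.0, 1.0, size=(500, 2))
X_pos = rng.normal(3.0, 1.0, size=(500, 2))
C = np.vstack([kmeans(X_neg, 10), kmeans(X_pos, 10)])
c_lab = np.hstack([-np.ones(10), np.ones(10)])

# sanity check: a nearest-centre rule on the reduced set still separates the data
X_all = np.vstack([X_neg, X_pos])
y_all = np.hstack([-np.ones(500), np.ones(500)])
pred = c_lab[np.linalg.norm(X_all[:, None, :] - C[None, :, :], axis=2).argmin(axis=1)]
acc = np.mean(pred == y_all)
```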

3.
梁鸿  葛宇飞  陈林  王雯娇 《计算机应用》2015,35(11):3087-3091
To address the high false-alarm rate, slow training, and poor real-time performance of intrusion detection on large-scale data, this paper proposes an intrusion detection strategy combining the dendritic cell algorithm with the twin support vector machine (DCTWSVM). The dendritic cell algorithm (DCA) performs an initial detection of threat data, and the twin support vector machine (TWSVM) then refines the detection results. Comparative experiments show that, relative to DCA, the support vector machine (SVM), and the back-propagation (BP) neural network, the DCTWSVM strategy improves detection accuracy by 2.02%, 2.30%, and 5.44% and lowers the false-alarm rate by 0.26%, 0.46%, and 0.90%, respectively; it trains roughly twice as fast as the SVM while needing very little training time, making it better suited to real-time intrusion detection on large-scale data.

4.
This paper studies large-scale SVM training in depth and proposes an SMO-like block-incremental algorithm. The algorithm learns each incoming data block in turn through an increase process and a decrease process, avoiding the computational cost of traditional SVM training, which grows sharply on large data sets. Theoretical analysis shows that the new algorithm converges to an approximately optimal solution. Experiments on the KDD data set show that it achieves a near-linear training rate, with generalization performance and support-vector counts close to those of LIBSVM.
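A rough sketch of block-incremental learning: blocks are processed one at a time, and only margin-active points are carried forward to the next block. The regularized least-squares solve stands in for the SMO-like inner solver, and the "increase"/"decrease" comments are loose analogies to the paper's two processes, not its actual rules.

```python
import numpy as np

def fit_ls(X, y, lam=1e-2):
    """Regularized least-squares classifier (stand-in for the per-block SVM solve)."""
    H = np.hstack([X, np.ones((len(X), 1))])          # absorb the bias term
    wb = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)
    return wb[:-1], wb[-1]

def incremental_blocks(blocks, keep_margin=1.0):
    """Learn blocks sequentially; carry only near-margin points forward."""
    Xk, yk = np.empty((0, blocks[0][0].shape[1])), np.empty(0)
    for Xb, yb in blocks:
        Xw, yw = np.vstack([Xk, Xb]), np.hstack([yk, yb])   # working set
        w, b = fit_ls(Xw, yw)
        margins = yw * (Xw @ w + b)
        mask = margins <= keep_margin       # "increase": retain margin-active points
        Xk, yk = Xw[mask], yw[mask]         # "decrease": drop clearly safe points
    return w, b, len(yk)

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-2.5, 1, (300, 2)), rng.normal(2.5, 1, (300, 2))])
y = np.hstack([-np.ones(300), np.ones(300)])
idx = rng.permutation(600)
X, y = X[idx], y[idx]
blocks = [(X[i:i + 200], y[i:i + 200]) for i in range(0, 600, 200)]
w, b, n_kept = incremental_blocks(blocks)
acc = np.mean(np.sign(X @ w + b) == y)
```

The point of the pattern is that the working set stays small even as the total number of processed samples grows.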

5.
This paper studies SVM parameter optimization. Because the algorithm requires SVM parameters to be chosen accurately, SVMs applied to large sample sets suffer from long training times, high memory use, and a tendency to fall into local optima when determining the optimal model parameters. To remedy these shortcomings, a depth-first search algorithm is used to improve the parameter-optimization mechanism: parameter selection is cast as a combinatorial optimization problem with the SVM's classification error as the objective function, which is then solved by depth-first search. The model is applied to three standard classification data sets. Simulation results show that the SVM with optimized parameters trains faster and classifies more accurately, effectively solving the SVM parameter-optimization problem.
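The search itself can be made concrete with a small sketch: a depth-first traversal of a parameter tree that keeps the assignment with the lowest validation error. The DFS is generic; the ridge-classifier error below is an assumed stand-in objective for the SVM's classification error over parameters such as (C, gamma).

```python
import numpy as np

def dfs_search(levels, evaluate):
    """Depth-first search over a tree of parameter choices (one value per level);
    returns the best complete assignment found."""
    best = {"params": None, "score": np.inf}
    def recurse(prefix):
        if len(prefix) == len(levels):          # leaf: a full parameter assignment
            s = evaluate(prefix)
            if s < best["score"]:
                best["params"], best["score"] = tuple(prefix), s
            return
        for v in levels[len(prefix)]:           # descend into each child branch
            recurse(prefix + [v])
    recurse([])
    return best["params"], best["score"]

# toy objective: validation error of a ridge classifier as a function of
# (lam, scale) -- a stand-in for the SVM's parameter/error surface
rng = np.random.default_rng(4)
Xtr = np.vstack([rng.normal(-1.5, 1, (100, 2)), rng.normal(1.5, 1, (100, 2))])
ytr = np.hstack([-np.ones(100), np.ones(100)])
Xva = np.vstack([rng.normal(-1.5, 1, (100, 2)), rng.normal(1.5, 1, (100, 2))])
yva = np.hstack([-np.ones(100), np.ones(100)])

def val_error(params):
    lam, scale = params
    H = np.hstack([Xtr * scale, np.ones((200, 1))])
    wb = np.linalg.solve(H.T @ H + lam * np.eye(3), H.T @ ytr)
    pred = np.sign(np.hstack([Xva * scale, np.ones((200, 1))]) @ wb)
    return np.mean(pred != yva)

levels = [[1e-3, 1e-1, 10.0, 1000.0], [0.1, 1.0, 10.0]]
(best_lam, best_scale), err = dfs_search(levels, val_error)
```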

6.
To address the inefficiency and sample-imbalance problems of multi-class SVM algorithms, a multi-class method named DAG-TWSVM (directed acyclic graph and twin support vector machine) is proposed. The algorithm combines the strengths of the twin SVM and the directed-acyclic-graph SVM, achieving good classification accuracy while greatly reducing training time; its speed advantage is especially pronounced on larger multi-class data sets. Validation on the UCI (University of California, Irvine) machine learning repository and the Statlog database shows that DAG-TWSVM trains much faster than other multi-class SVMs, outperforms them when the samples are imbalanced, and eliminates the indeterminate regions that can arise in the classical one-versus-one multi-class SVM.
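The DAG part of the method has a simple shape: with k classes, a test point walks down a list of candidates, and each pairwise decision eliminates one class, so only k-1 classifiers are evaluated. Only that traversal is shown below; nearest-centroid pairwise rules stand in for trained twin SVMs, and all data are invented.

```python
import numpy as np

def dag_predict(x, classes, pairwise):
    """DAGSVM-style decision: evaluate k-1 pairwise classifiers, eliminating
    one candidate class per step until a single class remains."""
    cand = list(classes)
    while len(cand) > 1:
        a, b = cand[0], cand[-1]
        # pairwise[(a, b)](x) > 0 means "class a wins", else "class b wins"
        if pairwise[(a, b)](x) > 0:
            cand.pop()       # b eliminated
        else:
            cand.pop(0)      # a eliminated
    return cand[0]

# toy pairwise classifiers built from class centroids (stand-ins for TWSVMs)
rng = np.random.default_rng(5)
centroids = {0: np.array([0.0, 4.0]),
             1: np.array([-4.0, -2.0]),
             2: np.array([4.0, -2.0])}

def make_pair(a, b):
    # positive score when x is closer to centroid a than to centroid b
    return lambda x: np.linalg.norm(x - centroids[b]) - np.linalg.norm(x - centroids[a])

pairwise = {(a, b): make_pair(a, b) for a in range(3) for b in range(3) if a < b}

X = np.vstack([rng.normal(centroids[c], 0.5, (50, 2)) for c in range(3)])
y = np.repeat([0, 1, 2], 50)
pred = np.array([dag_predict(x, [0, 1, 2], pairwise) for x in X])
acc = np.mean(pred == y)
```

Because every path through the DAG ends at exactly one leaf, the indeterminate regions of plain one-versus-one voting cannot occur.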

7.
Classification and prediction are shared concerns of data mining, machine learning, pattern recognition, and many other fields, and many effective classification algorithms already exist, but none of them solves every problem. As a newer classification and prediction tool, the support vector machine balances model complexity against learning ability given limited sample information and achieves better generalization. SMO is the most widely used SVM training algorithm: it embodies the strengths of SVMs while also scaling to large training sets.
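The widely taught "simplified SMO" variant (which picks the second multiplier at random) conveys the flavor of the algorithm in a few dozen lines; production solvers add heuristic working-set selection, caching, and shrinking. Everything below, including the toy data and iteration caps, is an illustrative assumption.

```python
import numpy as np

def smo_train(X, y, C=1.0, tol=1e-3, max_passes=10, seed=0):
    """Simplified SMO for a linear-kernel SVM: repeatedly pick a KKT-violating
    multiplier alpha_i, pair it with a random alpha_j, and solve the 2-variable
    subproblem analytically."""
    rng = np.random.default_rng(seed)
    n = len(y)
    K = X @ X.T                                  # linear kernel matrix
    alpha, b, passes, total = np.zeros(n), 0.0, 0, 0
    while passes < max_passes and total < 500:   # hard cap for safety
        total += 1
        changed = 0
        for i in range(n):
            Ei = (alpha * y) @ K[:, i] + b - y[i]
            if (y[i] * Ei < -tol and alpha[i] < C) or (y[i] * Ei > tol and alpha[i] > 0):
                j = int(rng.integers(n - 1))
                j += (j >= i)                    # uniform over indices != i
                Ej = (alpha * y) @ K[:, j] + b - y[j]
                ai_old, aj_old = alpha[i], alpha[j]
                if y[i] != y[j]:                 # box constraints for the pair
                    L, H = max(0, aj_old - ai_old), min(C, C + aj_old - ai_old)
                else:
                    L, H = max(0, ai_old + aj_old - C), min(C, ai_old + aj_old)
                eta = 2 * K[i, j] - K[i, i] - K[j, j]
                if L == H or eta >= 0:
                    continue
                alpha[j] = np.clip(aj_old - y[j] * (Ei - Ej) / eta, L, H)
                if abs(alpha[j] - aj_old) < 1e-5:
                    continue
                alpha[i] = ai_old + y[i] * y[j] * (aj_old - alpha[j])
                b1 = b - Ei - y[i] * (alpha[i] - ai_old) * K[i, i] \
                     - y[j] * (alpha[j] - aj_old) * K[i, j]
                b2 = b - Ej - y[i] * (alpha[i] - ai_old) * K[i, j] \
                     - y[j] * (alpha[j] - aj_old) * K[j, j]
                b = b1 if 0 < alpha[i] < C else (b2 if 0 < alpha[j] < C else (b1 + b2) / 2)
                changed += 1
        passes = passes + 1 if changed == 0 else 0
    w = (alpha * y) @ X                          # recover w for the linear kernel
    return w, b

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(-2, 0.8, (40, 2)), rng.normal(2, 0.8, (40, 2))])
y = np.hstack([-np.ones(40), np.ones(40)])
w, b = smo_train(X, y)
acc = np.mean(np.sign(X @ w + b) == y)
```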

8.
To address the slow training of the SVM SMO algorithm on large-scale problems, a parallel algorithm with an improved working-set selection model is proposed. Based on the characteristics of the SVM training process, the algorithm limits the number of working-set selections and skips stable samples during selection. The resulting SMO algorithm is trained in parallel, and experiments on three well-known data sets show that the model further improves training speed while preserving accuracy.

9.
To reduce the computational complexity and training time of support vector regression (SVR), the proximal support vector machine (PSVM) for classification is extended to regression. The original optimization problem is solved directly rather than through its dual, yielding linear and nonlinear proximal support vector regression (PSVR) algorithms. Compared with least squares support vector regression (LSSVR), which is likewise based on equality constraints, tests on one- and two-dimensional function regression and on general data sets of various sizes show that PSVR is simple and fast to train, with a particular advantage on large-scale data sets.
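The appeal of the direct primal solve is that, once inequality constraints are replaced by equalities, the problem reduces to a regularized linear system with a closed-form solution — no dual, no QP solver. The sketch below follows that general pattern for the linear case; the objective weighting and all constants are assumptions, not the paper's exact formulation.

```python
import numpy as np

def psvr_fit(X, y, C=10.0):
    """Linear proximal-style SVR: minimize 0.5*||z||^2 + 0.5*C*||y - Hz||^2
    with H = [X, 1], giving the closed form z = (I/C + H^T H)^{-1} H^T y."""
    H = np.hstack([X, np.ones((len(X), 1))])      # append a bias column
    d = H.shape[1]
    z = np.linalg.solve(np.eye(d) / C + H.T @ H, H.T @ y)
    return z[:-1], z[-1]                          # (w, b)

# recover a noisy linear function y = 3x + 0.5
rng = np.random.default_rng(7)
X = rng.uniform(-1, 1, size=(200, 1))
y = 3.0 * X[:, 0] + 0.5 + 0.05 * rng.normal(size=200)
w, b = psvr_fit(X, y)
rmse = np.sqrt(np.mean((X @ w + b - y) ** 2))
```

One d+1 by d+1 linear solve replaces the iterative QP of standard SVR, which is the source of the training-time advantage on large data sets.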

10.
Multi-class support vector machine methods for text classification   (Cited by 8: 3 self-citations, 5 by others)
Text classification is fundamental and central to data mining, and the support vector machine (SVM) is one of the best algorithms for solving text classification problems. The traditional SVM addresses two-class problems; extending it effectively to multi-class problems remains an open research topic. This paper introduces the basic principles of SVMs, discusses and compares the main existing multi-class SVM text classification algorithms, and outlines open problems and future directions for multi-class SVM text classification.

11.
An online incremental learning support vector machine for large-scale data   (Cited by 1: 1 self-citation, 0 by others)
Support Vector Machines (SVMs) have shown outstanding generalization in many fields. However, the standard SVM and most modified SVMs are in essence batch learners, which makes them unable to handle incremental or online learning well. Such SVMs also cannot handle large-scale data effectively because they are costly in memory and computation, and in some situations they produce many Support Vectors (SVs), which generally means a long testing time. In this paper, we propose an online incremental learning SVM for large data sets. The proposed method consists of two components: learning prototypes (LPs) and learning Support Vectors (LSVs). LPs learn prototypes and continuously adjust them to the data concept; LSVs obtain a new SVM by combining the learned prototypes with the trained SVs. The proposed method has been compared with other popular SVM algorithms, and experimental results demonstrate that it is effective for incremental learning problems and large-scale problems.

12.
The support vector machine (SVM) is among the most popular classification tools, but on large-scale data sets it demands large amounts of memory and training time and is usually only feasible in large parallel cluster environments. This paper proposes a new parallel SVM algorithm, RF-CCASVM, which solves large-scale SVMs on limited computing resources. Through random Fourier mapping, a low-dimensional explicit feature map uniformly approximates the infinite-dimensional implicit feature map of the Gaussian kernel, so that a linear SVM uniformly approximates the Gaussian-kernel SVM. A consensus-center parallelization method is then proposed: the data set is partitioned into subsets, and multiple processes independently train SVMs on their own subsets in parallel. As the optimal hyperplane on each subset is about to be reached, the current solution is replaced by the consensus-center solution aggregated from all subsets, and training continues until the consensus-center solution is optimal on every subset. Comparative experiments on standard data sets verify the correctness and effectiveness of RF-CCASVM.
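The random-Fourier-mapping step can be sketched independently of the consensus training: sample spectral frequencies from a Gaussian, map each point to a D-dimensional cosine feature vector, and check that inner products in the new space approximate the Gaussian kernel. The dimensions, bandwidth, and data below are illustrative choices.

```python
import numpy as np

def rff(X, D=2000, sigma=1.0, seed=0):
    """Random Fourier features: an explicit D-dim map z(x) whose inner products
    approximate the Gaussian kernel exp(-||x-y||^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(0.0, 1.0 / sigma, size=(d, D))   # samples from the kernel's spectral density
    u = rng.uniform(0.0, 2 * np.pi, size=D)         # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + u)

rng = np.random.default_rng(8)
X = rng.normal(size=(50, 3))
Z = rff(X)

# compare the exact Gaussian kernel matrix with the explicit-feature approximation
K_exact = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1) / 2.0)
K_approx = Z @ Z.T
mean_err = np.abs(K_exact - K_approx).mean()
max_err = np.abs(K_exact - K_approx).max()
```

After this map, training reduces to a linear SVM on Z, which is what makes the parallel consensus scheme tractable on limited resources.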

13.
Traditional support vector machines focus on samples at the margins of the data distribution, and support vectors are typically drawn from those marginal samples. This paper proposes a new support vector algorithm whose support vectors are drawn from the global data distribution; its sparsity is far better than that of the classical SVM on most data sets. On multi-class problems, its time complexity is only that of the original SVM on a binary problem, avoiding the explosion in the number of variables, or in the number of binary sub-classifiers, that plagues multi-class algorithm design.

14.
Training a support vector machine in the primal   (Cited by 3: 0 self-citations, 3 by others)
Chapelle O 《Neural computation》2007,19(5):1155-1178
Most literature on support vector machines (SVMs) concentrates on the dual optimization problem. In this letter, we point out that the primal problem can also be solved efficiently for both linear and nonlinear SVMs and that there is no reason for ignoring this possibility. On the contrary, from the primal point of view, new families of algorithms for large-scale SVM training can be investigated.
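A primal view can be made concrete with a few lines of gradient descent on the squared hinge loss. This is a generic sketch in the spirit of the letter, not Chapelle's Newton-based method; the step size, loss weighting, and toy data are all assumptions.

```python
import numpy as np

def primal_svm(X, y, lam=0.01, lr=0.05, steps=1000):
    """Primal training with the squared hinge loss: plain gradient descent on
    0.5 * lam * ||w||^2 + mean(max(0, 1 - y * f(x))^2)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        m = 1.0 - y * (X @ w + b)            # margin slack per sample
        act = np.maximum(m, 0.0)             # only margin violators contribute
        gw = lam * w - 2.0 / n * (act * y) @ X
        gb = -2.0 / n * (act * y).sum()
        w -= lr * gw
        b -= lr * gb
    return w, b

rng = np.random.default_rng(9)
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.hstack([-np.ones(100), np.ones(100)])
w, b = primal_svm(X, y)
acc = np.mean(np.sign(X @ w + b) == y)
```

The squared hinge is differentiable, so no dual variables, kernel cache, or working-set bookkeeping is needed for the linear case.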

15.
The discrete wavelet transform (DWT) provides a multiresolution decomposition of hyperspectral data. Wavelet features of each level are downsampled from the band features. Fine-scale and large-scale information from hyperspectral signals can be separated and this method might provide specific discriminant capability compared to using band features alone. This article proposes using a combination of band and wavelet features (BWFs) in the stacked support vector machine (SSVM), where each feature set is solved independently by level-0 support vector machines (SVMs), and level-1 SVMs are used to correct the errors of level-0 SVMs and obtain the final classification result. The effectiveness of the proposed method was examined using two benchmark hyperspectral data sets collected over forest and urban areas, respectively. For both data sets, the proposed method significantly outperformed SVMs using band features, wavelet energy features (WEFs), wavelet concatenated features (WFs concatenated), and both BWFs and the SSVM using only WFs.
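The multiresolution split the abstract builds on can be illustrated with one level of the Haar DWT, the simplest wavelet: the signal is separated into a downsampled coarse (large-scale) part and a detail (fine-scale) part, with perfect reconstruction. The stacked-SVM part is omitted, and the signal values are made up.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: pairwise sums give the downsampled
    approximation (large-scale) band, pairwise differences the detail band."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)     # large-scale information
    detail = (even - odd) / np.sqrt(2)     # fine-scale information
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of one Haar level: exact reconstruction of the signal."""
    even = (approx + detail) / np.sqrt(2)
    odd = (approx - detail) / np.sqrt(2)
    out = np.empty(2 * len(approx))
    out[0::2], out[1::2] = even, odd
    return out

x = np.array([4.0, 6.0, 10.0, 12.0, 14.0, 14.0, 2.0, 0.0])
a, d = haar_dwt(x)
x_rec = haar_idwt(a, d)
```

Applying `haar_dwt` again to `a` yields the next coarser level, which is how the per-level wavelet features in the abstract are obtained.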

16.
Texture classification using the support vector machines   (Cited by 12: 0 self-citations, 12 by others)
Shutao  James T.  Hailong  Yaonan 《Pattern recognition》2003,36(12):2883-2893
In recent years, support vector machines (SVMs) have demonstrated excellent performance in a variety of pattern recognition problems. In this paper, we apply SVMs for texture classification, using translation-invariant features generated from the discrete wavelet frame transform. To alleviate the problem of selecting the right kernel parameter in the SVM, we use a fusion scheme based on multiple SVMs, each with a different setting of the kernel parameter. Compared to the traditional Bayes classifier and the learning vector quantization algorithm, SVMs, and, in particular, the fused output from multiple SVMs, produce more accurate classification results on the Brodatz texture album.

17.
Support vector machines for texture classification   (Cited by 18: 0 self-citations, 18 by others)
This paper investigates the application of support vector machines (SVMs) in texture classification. Instead of relying on an external feature extractor, the SVM receives the gray-level values of the raw pixels, as SVMs can generalize well even in high-dimensional spaces. Furthermore, it is shown that SVMs can incorporate conventional texture feature extraction methods within their own architecture, while also providing solutions to problems inherent in these methods. One-against-others decomposition is adopted to apply binary SVMs to multi-texture classification, and a neural network is used as an arbitrator to make the final classification from the several one-against-others SVM outputs. Experimental results demonstrate the effectiveness of SVMs in texture classification.

18.
Support vector machines (SVMs), initially proposed for two-class classification problems, have been very successful in pattern recognition problems. For multi-class classification problems, the standard hyperplane-based SVMs are made by constructing and combining several maximal-margin hyperplanes, and each class of data is confined into a certain area constructed by those hyperplanes. Instead of using hyperplanes, hyperspheres that tightly enclose the data of each class can be used. Since the class-specific hyperspheres are constructed for each class separately, the spherical-structured SVMs can be used to deal with the multi-class classification problem easily. In addition, the center and radius of the class-specific hypersphere characterize the distribution of examples from that class, and may be useful for dealing with imbalance problems. In this paper, we incorporate the concept of maximal margin into the spherical-structured SVMs. Besides, the proposed approach has the advantage of using a new parameter for controlling the number of support vectors. Experimental results show that the proposed method performs well on both artificial and benchmark datasets.
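The spherical decision rule is easy to sketch: fit one sphere per class and assign a point to the class whose sphere boundary it lies deepest inside. Centre = class mean and radius = maximum distance to the mean are crude stand-ins here; the paper's spheres are optimized with a maximal-margin criterion, and the data below are invented.

```python
import numpy as np

def fit_spheres(X, y):
    """Per-class enclosing spheres (centre = class mean, radius = max distance
    to the mean) -- a simple stand-in for maximal-margin hyperspheres."""
    spheres = {}
    for c in np.unique(y):
        Xc = X[y == c]
        centre = Xc.mean(axis=0)
        radius = np.linalg.norm(Xc - centre, axis=1).max()
        spheres[c] = (centre, radius)
    return spheres

def sphere_predict(spheres, X):
    """Assign each point to the class whose sphere boundary it is deepest
    inside (smallest distance-to-centre minus radius)."""
    classes = sorted(spheres)
    scores = np.stack([np.linalg.norm(X - spheres[c][0], axis=1) - spheres[c][1]
                       for c in classes], axis=1)
    return np.array(classes)[scores.argmin(axis=1)]

rng = np.random.default_rng(10)
X = np.vstack([rng.normal(m, 0.6, (60, 2)) for m in ([-3, 0], [3, 0], [0, 4])])
y = np.repeat([0, 1, 2], 60)
spheres = fit_spheres(X, y)
acc = np.mean(sphere_predict(spheres, y=None if False else X) == y) if False else \
      np.mean(sphere_predict(spheres, X) == y)
```

Because each sphere is fit from one class alone, adding a class never requires retraining the others, which is what makes the spherical structure convenient for multi-class and imbalanced problems.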

19.
Choosing the number of data blocks is a basic model-selection problem in parallel/distributed machine learning, directly affecting both the generalization and the running efficiency of the learning algorithm. Existing parallel/distributed methods usually choose the number of blocks by experience or by processor count, with no explicit selection criterion. This paper proposes a parallel-efficiency-aware criterion for choosing the number of data blocks that improves computational efficiency while preserving the test accuracy of the parallel/distributed model. First, the relationship between the model's generalization error and the number of blocks is derived. On that basis, a selection criterion trading off generalization against parallel efficiency is proposed. Finally, a large-scale SVM implementation using this criterion is given in a random Fourier feature space under the ADMM framework, and the criterion's effectiveness is verified experimentally on a high-performance computing cluster with large standard data sets.

20.
We describe the use of support vector machines (SVMs) for continuous speech recognition by incorporating them in segmental minimum Bayes risk decoding. Lattice cutting is used to convert the Automatic Speech Recognition search space into sequences of smaller recognition problems. SVMs are then trained as discriminative models over each of these problems and used in a rescoring framework. We pose the estimation of a posterior distribution over hypotheses in these regions of acoustic confusion as a logistic regression problem. We also show that GiniSVMs can be used as an approximation technique to estimate the parameters of the logistic regression problem. On a small vocabulary recognition task we show that the use of GiniSVMs can improve the performance of a well trained hidden Markov model system trained under the Maximum Mutual Information criterion. We also find that it is possible to derive reliable confidence scores over the GiniSVM hypotheses and that these can be used to good effect in hypothesis combination. We discuss the problems that we expect to encounter in extending this approach to large vocabulary continuous speech recognition and describe initial investigation of constrained estimation techniques to derive feature spaces for SVMs.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号