Similar literature
 20 similar documents found (search time: 15 ms)
1.
In many pattern classification applications, data are represented by high-dimensional feature vectors, which incur high computational cost and reduce classification speed for support vector machines (SVMs). To reduce the dimensionality of the pattern representation, this study develops a discriminative function pruning analysis (DFPA) feature subset selection method. The basic idea of DFPA is to first learn the SVM discriminative function from the training data using all available input variables, and then to select a feature subset through pruning analysis. Here the pruning is implemented using a forward selection procedure combined with a linear least-squares estimation algorithm, taking advantage of the linear-in-the-parameters structure of the SVM discriminative function. The strength of DFPA is that it combines the good characteristics of both filter and wrapper methods. First, it retains the simplicity of the filter method by avoiding the training of a large number of SVM classifiers. Second, it inherits the good performance of the wrapper method by taking the SVM classification algorithm into account.
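
The forward selection step combined with least-squares estimation can be illustrated with a small sketch. This is not the DFPA algorithm itself: it is generic greedy forward selection by orthogonal least squares, and it assumes the values of an already trained discriminant function are available as `target`. The names `forward_select`, `columns`, `target`, and `n_keep` are hypothetical.

```python
def forward_select(columns, target, n_keep):
    """Greedy forward selection by orthogonal least squares (illustrative only).

    columns: list of feature columns, each a list of floats
    target:  values of the function to approximate (assumed given)
    Returns the indices of the selected columns in selection order.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Working copies; unselected columns are orthogonalized as we go.
    work = [list(c) for c in columns]
    selected = []
    for _ in range(n_keep):
        best, best_gain = None, 0.0
        for j, w in enumerate(work):
            if j in selected:
                continue
            ww = dot(w, w)
            if ww < 1e-12:  # column already explained by the selected ones
                continue
            # Reduction in squared error if column j is added next.
            gain = dot(w, target) ** 2 / ww
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:  # nothing left that reduces the error
            break
        selected.append(best)
        q = work[best]
        qq = dot(q, q)
        # Remove the chosen direction from every remaining candidate.
        for j, w in enumerate(work):
            if j not in selected:
                coef = dot(w, q) / qq
                work[j] = [a - coef * b for a, b in zip(w, q)]
    return selected
```

Because every remaining candidate is kept orthogonal to the columns already chosen, each step needs only inner products and no new least-squares fit.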

2.
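Sample-reduction support vector machine   Cited by: 1 (self-citations: 0; citations by others: 1)
A support vector machine (SVM) constructs the optimal separating hyperplane using only the support vectors near the class boundary, yet solving the SVM requires the entire training set. When the training set is large, solving the SVM consumes a great deal of memory and the optimization is very slow. To address this problem, a method called sample reduction is proposed for finding candidate support vectors. Because support vectors mostly lie near the class boundary, the method uses the tolerance rough set technique to select the samples in the boundary region as candidate support vectors, and then uses the selected samples as the training set to solve the SVM. Experimental results confirm the effectiveness of the method; for large databases in particular, it effectively reduces storage space and execution time.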

3.
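To address the long training time and degraded generalization of support vector machines (SVMs) on large-scale data sets, an SVM acceleration algorithm based on boundary sample selection is proposed. First, unsupervised K-means clustering is performed. Then, within each cluster, the K-nearest-neighbor algorithm is applied according to the cluster's degree of mixing and support to remove non-boundary samples, yielding the final class-boundary samples that take part in SVM model training. Experimental results on standard data sets show that the algorithm significantly reduces model training time while maintaining the classification generalization ability of the traditional SVM.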
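
The idea of discarding non-boundary samples with a nearest-neighbor test can be sketched as follows. This is a simplified illustration, not the published algorithm: it skips the K-means step and the mixing and support criteria, and simply keeps a sample when at least one of its `k` nearest neighbors carries a different label. The function name `boundary_samples` is hypothetical.

```python
def boundary_samples(points, labels, k=3):
    """Return indices of samples that have at least one differently
    labelled sample among their k nearest neighbours (Euclidean)."""
    keep = []
    for i, p in enumerate(points):
        # Squared distances from sample i to every other sample.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        if any(labels[j] != labels[i] for j in neighbours):
            keep.append(i)
    return keep
```

Only the returned samples would then be passed to the SVM trainer. The brute-force neighbor search is quadratic in the number of samples, which is one reason the abstract's approach clusters the data first.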

4.
The central problem in training a radial basis function neural network (RBFNN) is the selection of hidden-layer neurons, which includes choosing the center and width of each neuron. In this paper, we propose an enhanced swarm intelligence clustering (ESIC) method to select the hidden-layer neurons, and then train a cosine RBFNN using gradient descent learning. We also apply the new method to the classification of deep Web sources. Experimental results show that our ESIC-based RBFNN classifier achieves higher average precision, recall, and F than BP, support vector machines (SVM), and OLS RBF on our deep Web source classification problems.

5.
In many pattern recognition applications, high-dimensional feature vectors impose a high computational cost as well as the risk of “overfitting”. Feature selection addresses the dimensionality reduction problem by determining the subset of available features that is most essential for classification. This paper presents a novel feature selection method named filtered and supported sequential forward search (FS_SFS) in the context of support vector machines (SVMs). Compared with conventional wrapper methods that employ the SFS strategy, FS_SFS has two important properties that reduce computation time. First, it dynamically maintains a subset of samples for training the SVM. Because not all available samples participate in training, the computational cost of obtaining a single SVM classifier decreases. Second, a new criterion, which takes into account both the discriminant ability of individual features and the correlation between them, is proposed to filter out nonessential features effectively. As a result, the total number of training runs is significantly reduced and the overfitting problem is alleviated. The proposed approach is tested on both synthetic and real data to demonstrate its effectiveness and efficiency.

6.
The central problem in training a radial basis function neural network (RBFNN) is the selection of hidden-layer neurons, which includes choosing the center and width of each neuron. In this paper, we propose an enhanced swarm intelligence clustering (ESIC) method to select the hidden-layer neurons, and then train a cosine RBFNN using gradient descent learning. We also apply the new method to the classification of deep Web sources. Experimental results show that our ESIC-based RBFNN classifier achieves higher average precision, recall, and F than BP, support vector machines (SVM), and OLS RBF on our deep Web source classification problems.

7.
With advances in computer technology, virtual screening of small molecules has come into use in drug discovery. Because thousands of compounds are considered in the early phase of drug discovery, a fast classification method that can distinguish between active and inactive molecules can be used to screen large compound collections. In this study, we used support vector machines (SVMs) for this type of classification task. The SVM is a powerful classification tool that is becoming increasingly popular in various machine-learning applications. The data consist of 631 compounds in the training set and 216 compounds in a separate test set. In the data pre-processing step, Pearson's correlation coefficient was used as a filter to eliminate redundant features. After the correlation filter was applied, a single SVM was trained on the reduced data set. We also investigated the performance of SVM with different feature selection strategies, including SVM–Recursive Feature Elimination, Wrapper Method, and Subset Selection. All feature selection methods generally performed better than the single SVM, and Subset Selection outperformed the other feature selection methods. We tested SVM as a classification tool on a real-life drug discovery problem, and our results indicate that it can be a useful method for classification tasks in the early phase of drug discovery.
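
A correlation filter of the kind mentioned above can be sketched in a few lines. The abstract does not give the threshold or the order in which features are examined, so the `threshold` value and the keep-first-seen rule below are assumptions, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # a constant column has no defined correlation
    return cov / (sx * sy)


def correlation_filter(columns, threshold=0.95):
    """Keep a feature only if its absolute correlation with every
    feature kept so far is below the (assumed) threshold."""
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[i])) < threshold for i in kept):
            kept.append(j)
    return kept
```

The filter looks only at the features, not at the class labels, so it removes redundancy rather than irrelevance; the later feature selection strategies in the abstract address the latter.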

8.
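Texture classification based on support vector machines and distance measures   Cited by: 9 (self-citations: 1; citations by others: 9)
For the problem of image texture classification, a method is proposed that combines a support vector machine (SVM) with a distance measure to form a two-stage combined classifier. The first stage classifies with the distance measure: based on the statistical texture features of the images, the Euclidean distance is used to measure the similarity between images. If the acceptance condition is met, the classification result is output; otherwise the sample is rejected and passed to the second-stage classifier, which uses the SVM, a new pattern classification method. The combined method exploits both the high recognition rate of the SVM and the speed of the distance measure, and it also uses the results of the distance measure to guide the training and testing of the SVM. Experiments on texture image classification show that the algorithm achieves high efficiency and recognition accuracy, and the work also helps promote the practical application of the SVM as a new pattern classification method.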
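
The two-stage flow with a reject option can be sketched as follows. The abstract does not state the acceptance condition, so the rule used here (accept when the nearest class prototype lies within `reject_threshold`) is an assumption, and `fallback` is a hypothetical stand-in for a trained SVM.

```python
import math


def two_stage_classify(x, prototypes, fallback, reject_threshold):
    """First stage: nearest class prototype by Euclidean distance.
    If the nearest prototype is too far away, reject the sample and
    defer to the second-stage classifier (any callable).

    Returns (label, stage), where stage is "distance" or "svm".
    """
    best_label, best_dist = None, float("inf")
    for label, proto in prototypes.items():
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, proto)))
        if d < best_dist:
            best_label, best_dist = label, d
    if best_dist <= reject_threshold:
        return best_label, "distance"
    return fallback(x), "svm"
```

The cheap distance test handles the easy samples, so the more expensive classifier is evaluated only on the rejected ones.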

9.
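An improved SMO algorithm based on a balancing strategy   Cited by: 2 (self-citations: 0; citations by others: 2)
韩冰, 冯博琴, 傅向华, 马兆丰. 《计算机工程》 (Computer Engineering), 2005, 31(12): 10-12, 107
The support vector machine (SVM) is an excellent machine learning technique, and solving a large-scale quadratic programming problem is the key to training it. This paper proposes an improved method that keeps a balance between computational cost and optimization step size, thereby accelerating convergence and shortening training time. Experimental results show that the method is very effective on large data sets.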

10.
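A Chinese web page classifier based on decision support vector machines   Cited by: 10 (self-citations: 0; citations by others: 10)
A Chinese web page classification algorithm based on decision support vector machines is proposed. The support vector machine (SVM) method is combined with the basic idea of a binary decision tree to build a multi-class classifier for Chinese web pages, which reduces the number of training samples needed by the SVM classifiers and improves training efficiency. Experiments show that the method greatly reduces the amount of training data and trains efficiently, while achieving good precision and recall.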

11.
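Existing active learning algorithms suffer from low accuracy and slow speed when applied to multi-class classifiers. To address this, an active learning algorithm based on affinity propagation (AP) clustering is introduced into the multi-class support vector machine (SVM). In each iteration, the N new samples most likely to improve the multi-class SVM classifier are actively selected and added to the training set, so that high classification performance is obtained at a small labeling cost. Experimental results on several data sets show that the new method effectively reduces the number of manually labeled samples needed to train the classifier while achieving high accuracy and good robustness.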

12.
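Chinese web page classification is an active research area in data mining, and the support vector machine (SVM) is an efficient classification method that shows many distinctive advantages in high-dimensional pattern recognition problems. A Chinese web page classification method based on SVM is proposed, including a description of the key techniques involved: web page text preprocessing, feature extraction, and multi-class classification algorithms. Experiments show that the method greatly reduces the amount of training data and trains efficiently, while achieving good precision and recall.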

13.
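For the problem of classifying images of the inner surface of pipelines, a method is proposed that combines a support vector machine (SVM) with a distance measure to form a combined classifier. The first stage classifies with the distance measure; if the acceptance condition is met, the classification result is output, and otherwise the sample is rejected and passed to the SVM classifier. The method exploits both the high recognition rate of the SVM and the speed of the distance measure, and it uses the results of the distance measure to guide the training and testing of the SVM. Experiments show that the method achieves high efficiency and recognition accuracy and further improves the recognition rate and noise tolerance of the system.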

14.
The standard support vector machine (SVM) is not well suited to classifying large data sets because of its high training complexity. A convex hull can simplify SVM training, but classification accuracy drops when inseparable points exist. This paper introduces a novel method for SVM classification, called convex–concave hull SVM (CCH-SVM). After grid processing, the convex hull is used to find extreme points. Then the Jarvis march method is used to determine the concave (non-convex) hull for the inseparable points. Finally, the vertices of the convex–concave hull are used for SVM training. The proposed CCH-SVM classifier has distinct advantages when dealing with large data sets. We apply the method to several benchmark problems. Experimental results demonstrate that the approach achieves good classification accuracy while training significantly faster than other SVM classifiers, and that its classification accuracy is higher than that of other convex hull SVM methods.
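
For orientation, the textbook Jarvis march (gift wrapping) for a 2-D convex hull is sketched below. It covers only the standard convex case; the concave-hull variant and the grid processing described in the abstract are not reproduced. Points are assumed to be tuples of two numbers.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); negative means b lies clockwise of a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def jarvis_march(points):
    """Return the convex hull vertices in counter-clockwise order,
    starting from the leftmost (then lowest) point."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    start = pts[0]  # the leftmost point is always a hull vertex
    hull = []
    current = start
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            # Prefer points further clockwise; on ties take the farther one
            # so that collinear points on an edge are skipped.
            if turn < 0 or (turn == 0 and dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:
            break
    return hull
```

Only the returned vertices would be candidates for SVM training under a convex hull approach, which is why the hull can shrink a large training set considerably.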

15.
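This paper proposes a multi-layer grams feature extraction method to improve spam filters based on the online support vector machine (SVM) model. Spam filters based on the online SVM model have achieved very good filtering results on large-scale spam data sets, but compared with the logistic regression model their computational cost is very high, which makes them hard to adopt in industry. The proposed multi-layer grams feature extraction method effectively reduces the number of features, extracts more precise and effective features, substantially lowers the running time of the model, and improves the filtering performance at the same time. Experiments show that the method reduces the running time of the online SVM model from 10337 s to 3784 s and halves the model's (1-ROCA)%.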

16.
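A cascaded face detection algorithm based on two-dimensional principal component analysis (2DPCA) and support vector machines (SVMs) is proposed for detecting faces in grayscale images with complex backgrounds. The algorithm first uses 2DPCA to filter out a large number of non-face windows, and then uses an SVM to examine the windows that pass. Because the SVM is trained in the subspace that has passed the 2DPCA stage, the classifier is easier to train. In addition, compared with the traditional PCA method, 2DPCA represents faces directly as two-dimensional image matrices for feature extraction, which improves computational efficiency. Comparative experimental data show that the algorithm greatly increases detection speed and lowers the false alarm rate.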

17.
Support vector machines (SVMs) are becoming a popular alternative to traditional image classification methods because they make accurate classification possible from small training samples. Nevertheless, concerns regarding SVM parameterization and computational effort have arisen. This Letter evaluates an automated SVM‐based method for image classification. The method is applied to a land‐cover classification experiment using a hyperspectral dataset. The results suggest that SVM can be parameterized to obtain accurate results while remaining computationally efficient. However, automating parameter tuning does not solve all SVM problems. Interestingly, the method produces fuzzy image‐regions whose contextual properties may be useful for improving the image classification process.

18.
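To improve training speed and classification accuracy on large-scale, high-dimensional data, a fast incremental SVM learning method based on locality-sensitive hashing (LSH) is proposed. The algorithm exploits the ability of LSH to find similar data quickly: building on the SVM algorithm, it screens the incremental data for samples that may become support vectors (SVs), and then uses these samples together with the existing SVs as the basis for subsequent training. The algorithm was validated on several data sets. Experiments show that, on large-scale incremental data, the proposed fast incremental SVM learning algorithm effectively speeds up training while maintaining good accuracy.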
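
The screening step can be illustrated with random-hyperplane hashing, one common LSH family. The abstract does not say which hash family or selection rule the authors use, so everything below is an assumption: new samples are kept when they fall into the same hash bucket as an existing support vector. All function names are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals (seeded for repeatability)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(x, planes):
    """One bit per hyperplane: which side of the hyperplane x lies on."""
    return tuple(
        1 if sum(a * b for a, b in zip(x, h)) >= 0 else 0 for h in planes
    )


def candidate_indices(new_samples, support_vectors, planes):
    """Indices of new samples sharing a hash bucket with any support vector."""
    sv_buckets = {signature(sv, planes) for sv in support_vectors}
    return [
        i for i, x in enumerate(new_samples)
        if signature(x, planes) in sv_buckets
    ]
```

Vectors separated by a small angle agree on most bits, so bucket collisions act as a cheap similarity test. In practice several hash tables are usually combined to reduce missed neighbors.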

19.
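The core of solving a support vector machine (SVM) is solving a large-scale convex quadratic programming problem. Starting from a modified SVM model, an equivalent complementarity problem is derived, and, using the Fischer-Burmeister complementarity function, a nonmonotone trust-region algorithm for solving the complementarity SVM is proposed from a new perspective. The new algorithm avoids computing the Hessian matrix and avoids matrix inversion, which reduces the workload and improves computational efficiency. Global convergence of the algorithm is proved without any assumptions. Numerical experiments show that, for large-scale nonlinear classification problems, the algorithm runs faster than the LSVM algorithm and the descent method, providing a new and feasible way to solve the SVM optimization problem.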
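
The Fischer-Burmeister function turns the complementarity conditions a >= 0, b >= 0, a*b = 0 into a single equation. A minimal sketch follows; the sign convention used here is one of two common ones and may differ from the paper's.

```python
import math


def fischer_burmeister(a, b):
    """phi(a, b) = sqrt(a^2 + b^2) - a - b.

    phi(a, b) == 0 exactly when a >= 0, b >= 0 and a * b == 0.
    """
    return math.sqrt(a * a + b * b) - a - b
```

Applying the function to each complementary pair of a problem yields a system of equations whose roots are the solutions of the complementarity problem, which is what allows equation-solving methods such as trust-region algorithms to be used.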

20.
Recurrent networks can generate spatio-temporal neural sequences with very long cycles that appear to behave randomly. Nonetheless, a proximity measure between these sequences may be defined by comparing the synaptic weight matrices that generate them. Following the dynamic neural filter (DNF) formalism, we demonstrate this concept by comparing teacher and student recurrent networks of binary neurons. We show that long sequences, which provide a training set well exceeding the Cover limit, allow the synaptic matrices to be determined well. Alternatively, if the matrices are assumed to be known, the biases can be determined very quickly. Thus, a spatio-temporal sequence may be regarded as a spatio-temporal encoding of the bias vector. We introduce a linear support vector machine (SVM) variant of the DNF in order to specify an optimal weight matrix. This approach allows us to deal with noise. Spatio-temporal sequences generated by different DNFs with the same number of neurons may be compared by calculating correlations of the synaptic matrices of the reconstructed DNFs. Other types of spatio-temporal sequences require the introduction of hidden neurons and/or the use of a kernel variant of the SVM approach. The latter is defined as a recurrent support vector network (RSVN).
