Similar literature
 Found 20 similar documents (search time: 203 ms)
1.
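The choice of kernel function and its parameters is an important problem in support vector machines and strongly affects the generalization ability of the model. When a large number of samples take part in training, the grid search algorithm used to find the optimal parameters takes too long. To address this problem, a strategy is proposed that discards the sample points that are not support vectors, thereby shrinking the training set. The search time can be cut in half while the original test accuracy is largely preserved.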

2.
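Choosing the number of data blocks is one of the basic problems in model selection for parallel/distributed machine learning, and it directly affects the generalization and running efficiency of the learning algorithm. Existing parallel/distributed machine learning methods usually choose the number of data blocks from experience or from the number of processors, without an explicit selection criterion. This paper proposes a parallel-efficiency-sensitive criterion for selecting the number of data blocks, which improves computational efficiency while maintaining the test accuracy of the parallel/distributed model. First, the relationship between the generalization error of a parallel/distributed machine learning model and the number of blocks is derived. On this basis, a selection criterion that trades off generalization against parallel efficiency is proposed. Finally, an implementation of large-scale support vector machines that uses this criterion is given in the random Fourier feature space under the ADMM framework, and the effectiveness of the criterion is verified experimentally on a high-performance computing cluster and large-scale benchmark data sets.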
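
As a rough illustration of two ingredients named in this abstract, the sketch below builds a random Fourier feature map that approximates an RBF kernel and splits a sample list into a chosen number of blocks. It is not the paper's implementation: the RBF kernel, the round-robin split, and the names make_rff_map, rbf_kernel and split_into_blocks are assumptions made for this example, and the ADMM solver and the block-number criterion itself are not shown.

```python
import math
import random


def make_rff_map(dim, n_features, gamma, seed=0):
    """Return a function that maps a length-dim vector to random Fourier features.

    The inner product of two mapped vectors approximates the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2).
    """
    rng = random.Random(seed)
    std = math.sqrt(2.0 * gamma)  # frequencies are drawn from N(0, 2 * gamma * I)
    weights = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def transform(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return transform


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel value, used here only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def split_into_blocks(samples, n_blocks):
    """Assign samples to n_blocks blocks in round-robin order (an assumed scheme)."""
    return [samples[i::n_blocks] for i in range(n_blocks)]
```

With more random features the inner product of the mapped vectors tends to track the exact kernel value more closely, at the price of a larger linear problem per block.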

3.
Support vector machines (SVMs) are a class of classification algorithms popular for their high generalization ability. However, training SVMs on a large set of learning samples is time-consuming, and improving learning efficiency is one of the most important research tasks on SVMs. It is known that although some learning tasks offer many candidate training samples, only the samples near the decision boundary, which are called support vectors, affect the optimal classification hyperplanes. Finding these samples and training SVMs with them greatly decreases training time and space complexity. Based on this observation, we introduce a neighborhood-based rough set model to search for boundary samples. Using the model, we first divide the sample space into three subsets: positive region, boundary and noise. Furthermore, we partition the input features into four subsets: strongly relevant features, weakly relevant and indispensable features, weakly relevant and superfluous features, and irrelevant features. We then train SVMs only with the boundary samples in the relevant and indispensable feature subspaces, so that feature and sample selection are conducted simultaneously with the proposed model. Experimental results show that the model can select very few features and samples for training while classification performance is preserved or even improved.
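
The idea of picking boundary samples from neighborhoods can be sketched as follows. This is a simplified reading, not the paper's model: it assumes Euclidean distance, a fixed radius delta, and treats a sample as a boundary sample when its neighborhood contains a sample of another class. The function name boundary_samples is hypothetical, and the separate noise subset and the feature partition described in the abstract are left out.

```python
import math


def boundary_samples(points, labels, delta):
    """Return indices of samples whose delta-neighborhood contains another class.

    Samples not returned have a class-pure neighborhood, which corresponds to
    the positive region in this simplified reading.
    """
    boundary = []
    for i, (p, label) in enumerate(zip(points, labels)):
        for j, (q, other) in enumerate(zip(points, labels)):
            if i != j and other != label and math.dist(p, q) <= delta:
                boundary.append(i)
                break
    return boundary
```

Only the returned indices would then be passed to the SVM trainer, which is where the reduction in training time would come from.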

4.
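Kernel selection is a key problem in support vector machine (SVM) modeling. Although SVMs generalize well, their performance is noticeably affected by the kernel function, and for a given problem it is usually difficult to choose a suitable kernel function and its parameters. A kernel selection method based on SVM ensembles is proposed: sub-SVM learners are constructed with different kernel functions, and the predictions of the sub-learners are then combined. The proposed method performs SVM ensemble learning and kernel selection at the same time, which not only avoids the effect that kernel selection for a single SVM has on generalization ability but also yields good generalization. Results on UCI benchmark data sets demonstrate the effectiveness of the proposed method.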

5.
Support vector machines (SVMs) are effective machine-learning methods based on the structural risk minimization (SRM) principle, an approach that minimizes an upper bound on the risk functional related to generalization performance. Parameter selection is an important factor affecting the performance of SVMs. Evolution Strategy with Covariance Matrix Adaptation (CMA-ES) is an evolutionary optimization strategy, which is used in this paper to optimize the parameters of SVMs. Compared with traditional SVMs, the SVMs optimized with CMA-ES are more accurate in predicting the Lorenz signal. An industrial case illustrates that the proposed method is very successful in forecasting short-term faults of large machinery.

6.
Type-2 fuzzy logic-based classifier fusion for support vector machines   Total citations: 1, self-citations: 0, citations by others: 1
As a machine-learning tool, support vector machines (SVMs) have been gaining popularity due to their promising performance. However, the generalization ability of SVMs often depends on whether the selected kernel functions suit the real classification data. To lessen the sensitivity of SVM classification to the choice of kernel and improve SVM generalization ability, this paper proposes a fuzzy fusion model that combines multiple SVM classifiers. To better handle the uncertainties in real classification data and in the membership functions (MFs) of the traditional type-1 fuzzy logic system (FLS), we apply interval type-2 fuzzy sets to construct a type-2 SVM fusion FLS. This type-2 fusion architecture takes into account the classification results from the individual SVM classifiers and generates the combined classification decision as its output. Besides the distances of data examples to the SVM hyperplanes, the type-2 fuzzy SVM fusion system also considers the accuracy information of the individual SVMs. Our experiments show that the type-2 based SVM fusion classifiers outperform individual SVM classifiers in most cases. The experiments also show that the type-2 fuzzy logic-based SVM fusion model is generally better than the type-1 based SVM fusion model.

7.
A parallel mixture of SVMs for very large scale problems   Total citations: 7, self-citations: 0, citations by others: 7
Support vector machines (SVMs) are the state-of-the-art models for many classification problems, but they suffer from the complexity of their training algorithm, which is at least quadratic with respect to the number of examples. Hence, it is hopeless to try to solve real-life problems having more than a few hundred thousand examples with SVMs. This article proposes a new mixture of SVMs that can be easily implemented in parallel and where each SVM is trained on a small subset of the whole data set. Experiments on a large benchmark data set (Forest) yielded significant time improvement (time complexity appears empirically to locally grow linearly with the number of examples). In addition, and surprisingly, a significant improvement in generalization was observed.

8.
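Zhang Geng, Zhang Guixin. Microcomputer Development (《微机发展》), 2007, 17(7): 24-27
The support vector machine (SVM) algorithm is the youngest branch of statistical learning theory. The structural risk minimization principle gives it good generalization in learning. In practical applications, however, slow training has long been one of the pressing open problems of SVM theory, and it becomes especially apparent when SVMs are extended to multi-class problems. Starting from two aspects, the sample distribution and the number of classes, this paper analyzes the training-time performance of the traditional OAO algorithm for multi-class SVM classification, introduces a hierarchical idea, and proposes H-OAO-SVMs, an improved model of the traditional OAO-SVMs algorithm. A comparison with the training times of other common multi-class SVMs shows that the improved H-OAO-SVMs model has better training-time performance.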
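
For context, the baseline that this abstract starts from can be sketched as below, assuming that OAO denotes the usual one-against-one scheme: one binary classifier per pair of classes, combined by voting. The hierarchical H-OAO variant is not described in enough detail in the abstract to reproduce, so it is not shown. The names oao_pairs and oao_predict, and the tie-breaking rule, are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def oao_pairs(classes):
    """List the class pairs that need a binary classifier: k * (k - 1) / 2 of them."""
    return list(combinations(sorted(classes), 2))


def oao_predict(x, pairwise_classifiers):
    """Predict by majority vote over pairwise classifiers.

    pairwise_classifiers maps a pair (a, b) to a function that returns a or b
    for the input x. Ties go to the smallest label so the result is deterministic.
    """
    votes = Counter(clf(x) for clf in pairwise_classifiers.values())
    best = max(votes.values())
    return min(label for label, count in votes.items() if count == best)
```

Because the number of pairs grows quadratically with the number of classes, training time in the plain scheme rises quickly as classes are added, which is the cost the abstract sets out to reduce.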

9.
Support vector machines (SVMs) are one of the most popular classification tools and show the most potential to address under-sampled noisy data (a large number of features and a relatively small number of samples). However, the computational cost is too expensive, even for modern-scale samples, and the performance largely depends on the proper setting of parameters. As the data scale increases, improving speed becomes increasingly challenging. As the dimension (number of features) grows while the sample size remains small, avoiding overfitting becomes a significant challenge. In this study, we propose a two-phase sequential minimal optimization (TSMO) to greatly reduce the training cost for large-scale data (tested with 3186–70,000-sample datasets) and a two-phased-in differential-learning particle swarm optimization (tDPSO) to ensure accuracy for under-sampled data (tested with 2000–24481-feature datasets). Because the purpose of training SVMs is to identify support vectors that denote a hyperplane, TSMO is developed to quickly select support vector candidates from the entire dataset and then identify support vectors from those candidates. In this manner, the computational burden is greatly reduced (a 29.4%–65.3% reduction rate). The proposed tDPSO uses topology variation and differential learning to solve PSO’s premature convergence issue. Population diversity is ensured through dynamic topology until a ring connection is achieved (topology-variation phases). Further, particles initiate chemo-type simulated-annealing operations, and the global-best particle takes a two-turn diversion in response to stagnation (event-induced phases). The proposed tDPSO-embedded SVMs were tested with several under-sampled noisy cancer datasets and showed superior performance over various methods, even those that use feature selection to preprocess the data.

10.
GA-based learning bias selection mechanism for real-time scheduling systems   Total citations: 1, self-citations: 0, citations by others: 1
The use of machine learning technologies to develop knowledge bases (KBs) for real-time scheduling (RTS) problems has produced encouraging results in recent research. However, few studies focus on how to select proper learning biases in the early development stage of the RTS system to enhance the generalization ability of the resulting KBs. The selected learning bias usually assumes a set of proper system features that are known in advance. Moreover, the machine learning algorithm for developing scheduling KBs is predetermined. The purpose of this study is to develop a genetic algorithm (GA)-based learning bias selection mechanism to determine an appropriate learning bias that includes the machine learning algorithm, feature subset, and learning parameters. Three machine learning algorithms are considered: the back propagation neural network (BPNN), C4.5 decision tree (DT) learning, and support vector machines (SVMs). The proposed GA-based learning bias selection mechanism can search for the best machine learning algorithm and simultaneously determine the optimal subset of features and the learning parameters used to build the RTS system KBs. In terms of the accuracy of prediction on unseen data under various performance criteria, it also offers better generalization ability than the case where the learning bias selection mechanism is not used. Furthermore, the proposed approach to building RTS system KBs can improve system performance compared with other classifier KBs under various performance criteria over a long period.

11.
Ikeda K, Murata N. Neural Computation, 2005, 17(11): 2508-2529
By employing the L1 or L∞ norms in maximizing margins, support vector machines (SVMs) result in a linear programming problem that requires a lower computational load compared to SVMs with the L2 norm. However, how the change of norm affects the generalization ability of SVMs has not been clarified so far except for numerical experiments. In this letter, the geometrical meaning of SVMs with the Lp norm is investigated, and the SVM solutions are shown to have rather little dependency on p.
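
One piece of geometry behind margins under different norms is that the Lp distance from a point x to the hyperplane w.z + b = 0 equals |w.x + b| divided by the dual norm of w, the Lq norm with 1/p + 1/q = 1. The sketch below computes that distance; it is a standard identity offered for illustration, not the letter's analysis, and the function names are hypothetical.

```python
import math


def dual_exponent(p):
    """Return q such that 1/p + 1/q = 1, covering the limiting cases p = 1 and p = inf."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    return p / (p - 1.0)


def norm(v, q):
    """Lq norm of the vector v, with q = math.inf meaning the max norm."""
    if q == math.inf:
        return max(abs(c) for c in v)
    return sum(abs(c) ** q for c in v) ** (1.0 / q)


def lp_distance_to_hyperplane(x, w, b, p):
    """Lp distance from x to the hyperplane {z : w.z + b = 0}."""
    q = dual_exponent(p)
    return abs(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b) / norm(w, q)
```

For w = (3, 4), b = 0 and x = (1, 1), the distance is 1.75 under L1, 1.4 under L2 and 1.0 under L∞, which shows how the measured margin of the same hyperplane shifts with p.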

12.
In this paper, we present a genetic fuzzy feature transformation method that lets support vector machines (SVMs) classify data more accurately. The given data are first transformed into a high-dimensional feature space by a fuzzy system; SVMs then map the data into a still higher-dimensional feature space and construct the hyperplane that makes the final decision. Genetic algorithms are used to optimize the fuzzy feature transformation so that the newly generated features help SVMs classify biomedical data more accurately under uncertainty. The experimental results show that the new genetic fuzzy SVMs have better generalization abilities than traditional SVMs in terms of prediction accuracy.

13.
Linear programming support vector machines   Total citations: 4, self-citations: 0, citations by others: 4
Weida  Li  Licheng. Pattern Recognition, 2002, 35(12): 2927-2936
Based on an analysis of the conclusions of statistical learning theory, especially the VC dimension of linear functions, linear programming support vector machines (SVMs) are presented, including linear and nonlinear linear programming SVMs. In linear programming SVMs, the bound on the VC dimension is loosened appropriately in order to speed up training. Simulation results for both artificial and real data show that the generalization performance of our method is a good approximation of that of SVMs and that our method greatly reduces the computational complexity.

14.
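Wen Yimin. Computer Engineering (《计算机工程》), 2006, 32(21): 177-179, 182
Because support vectors can represent the classification characteristics of a training set, this paper proposes a machine learning method that screens training samples hierarchically and in parallel on the basis of support vectors. Following a divide-and-conquer idea, the method decomposes the original classification problem into several sub-problems and decomposes the screening of training samples into two cascaded levels. Each level extracts the support vectors of its training sets in parallel; the extracted support vectors serve as the training samples of the next level, and the non-support vectors in each level's training sets are gradually filtered out through learning. To keep the problem consistent, a cross-merging rule is introduced. Simulation results show that the method shortens SVM training time and reduces the number of support vectors while preserving the generalization ability of the classifier.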

15.
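Application of kernel methods in face recognition   Total citations: 1, self-citations: 0, citations by others: 1
1 Introduction. Face recognition technology is widely used in identity verification, access control systems and personnel surveillance, and it has developed considerably over the past few years. Face recognition differs from ordinary pattern recognition mainly because a typical pattern recognition task has a few classes with many samples in each, so a large number of samples are available for training. In face recognition, by contrast, there are usually many different faces, each face represents one class, and each class has relatively few samples; in many cases there is only one image per person (for example, an ID card photo). Reference [4] proposes a way to handle face recognition when only one sample is available.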

16.
Model selection for support vector machines via uniform design   Total citations: 2, self-citations: 0, citations by others: 2
Choosing a good parameter setting that yields better generalization performance in a learning task is the so-called model selection problem. A nested uniform design (UD) methodology is proposed for efficient, robust and automatic model selection for support vector machines (SVMs). The proposed method selects the candidate set of parameter combinations and carries out a k-fold cross-validation to evaluate the generalization performance of each combination. In contrast to conventional exhaustive grid search, this method can be treated as a deterministic analog of random search. It can dramatically cut down the number of parameter trials and also provides the flexibility to adjust the candidate set size under a computational time constraint. The key theoretical advantage of UD model selection over grid search is that the UD points are “far more uniform” and “far more space filling” than lattice grid points. The better uniformity and space-filling properties make the UD selection scheme more efficient by avoiding wasteful function evaluations of close-by patterns. The proposed method is evaluated on different learning tasks, different data sets and different SVM algorithms.
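
The evaluation step described here, scoring each candidate parameter combination by k-fold cross-validation and keeping the best one, can be sketched as follows. The sketch does not construct uniform design points; the candidate list is simply passed in. The names k_fold_indices and select_parameters, and the caller-supplied train_and_score callback, are assumptions made for this example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds of nearly equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def select_parameters(candidates, n_samples, k, train_and_score):
    """Return the candidate with the highest mean score over the k folds.

    train_and_score(params, train_idx, test_idx) is supplied by the caller; it
    should train a model on train_idx and return its score on test_idx.
    """
    def mean_score(params):
        scores = [
            train_and_score(params, train_idx, test_idx)
            for train_idx, test_idx in k_fold_indices(n_samples, k)
        ]
        return sum(scores) / len(scores)

    return max(candidates, key=mean_score)
```

The cost is one training run per candidate per fold, so a smaller, well-spread candidate set translates directly into fewer training runs than an exhaustive grid.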

17.
In the past decade, support vector machines (SVMs) have gained the attention of many researchers. SVMs are non-parametric supervised learning schemes that rely on statistical learning theory, which enables learning machines to generalize well to unseen data. SVMs are kernel-based methods that were introduced as a robust approach to classification and regression problems and have lately been applied to nonlinear identification problems through the so-called support vector regression. In SVM designs for nonlinear identification, a nonlinear model is represented by an expansion in terms of nonlinear mappings of the model input. The nonlinear mappings define a feature space, which may have infinite dimension. In this context, a relevant identification approach is the least squares support vector machine (LS-SVM). Compared with other identification methods, LS-SVMs possess prominent advantages: their generalization performance (i.e. error rates on test sets) either matches or is significantly better than that of the competing methods, and more importantly, the performance does not depend on the dimensionality of the input data. Because LS-SVM training is posed as a constrained quadratic programming problem with a regularized cost function, the training process involves selecting the kernel parameters and the regularization parameter of the objective function. A good choice of these parameters is crucial for the performance of the estimator. In this paper, the proposed LS-SVM design combines LS-SVM with a new chaotic differential evolution optimization approach based on the Ikeda map (CDEK). The CDEK is adopted to tune the regularization parameter and the radial basis function bandwidth. Simulations using LS-SVMs on NARX (Nonlinear AutoRegressive with eXogenous inputs) models for the identification of a thermal process show the effectiveness and practicality of the proposed CDEK algorithm when compared with the classical DE approach.

18.
Random forest classifier for remote sensing classification   Total citations: 4, self-citations: 0, citations by others: 4
Growing an ensemble of decision trees and allowing them to vote for the most popular class produced a significant increase in classification accuracy for land cover classification. The objective of this study is to present results obtained with the random forest classifier and to compare its performance with that of support vector machines (SVMs) in terms of classification accuracy, training time and user-defined parameters. Landsat Enhanced Thematic Mapper Plus (ETM+) data of an area in the UK with seven different land covers were used. Results from this study suggest that the random forest classifier performs as well as SVMs in terms of classification accuracy and training time. This study also concludes that random forest classifiers require fewer user-defined parameters than SVMs and that these parameters are easier to define.

19.
Business failure prediction (BFP) is an effective tool that helps financial institutions and other interested parties make the right investment decisions, especially in the current competitive environment. The topic is a classification-type task, one of whose aims is to generate more accurate hit ratios. The support vector machine (SVM) is a statistical learning technique whose advantage is its high generalization performance. The objective of this work is threefold. Firstly, SVM is used to predict business failure with a straightforward wrapper approach that helps the model produce more accurate predictions. The wrapper approach employs a forward feature selection method composed of feature ranking and feature selection. Meanwhile, this work investigates the feasibility of using linear SVMs to select features for all SVMs in the wrapper, since non-linear SVMs tend to over-fit the data. Finally, a robust re-sampling approach is used to evaluate model performance for the task of BFP in China. In the empirical research, the performances of linear SVM, polynomial SVM, Gaussian SVM and sigmoid SVM with the best filter of stepwise MDA are compared with those of wrappers that use linear SVM and non-linear SVMs, respectively, as evaluating functions. The results indicate that the non-linear SVM with a radial basis function kernel and features selected by linear SVM is significantly superior to all the other SVMs. Meanwhile, all SVMs with features selected by linear SVM perform at least as well as SVMs with other optimal features.
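
A forward-selection wrapper of the kind mentioned here can be sketched as a greedy loop over features that have already been ranked, keeping a feature only when it improves the evaluation score. This is a generic sketch, not the paper's procedure: the function name forward_selection, the min_gain threshold, and the caller-supplied evaluate function (which in the paper's setting would train and score an SVM) are assumptions made for this example.

```python
def forward_selection(ranked_features, evaluate, min_gain=0.0):
    """Greedy forward selection over features given in ranked order.

    evaluate(subset) is supplied by the caller and returns a score where
    higher is better. A feature is kept only if adding it raises the best
    score seen so far by more than min_gain.
    """
    selected = []
    best_score = float("-inf")
    for feature in ranked_features:
        score = evaluate(selected + [feature])
        if score > best_score + min_gain:
            selected.append(feature)
            best_score = score
    return selected, best_score
```

Using a cheap evaluator, such as a linear model, inside this loop and reusing the selected subset for costlier models is the kind of shortcut whose feasibility the abstract investigates.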

20.
Antenna design is a complicated and time-consuming procedure. This work explores using support vector machines (SVMs), a statistical learning technique that is based on the structural risk minimization principle and has great generalization capability, as a fast and accurate tool in antenna design. As examples, SVMs are used to design a rectangular patch antenna and a rectangular patch antenna array. Results show that, after appropriate training, SVMs are able to design antennas effectively and with high accuracy. © 2010 Wiley Periodicals, Inc. Int J RF and Microwave CAE, 2011.
