首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Embedding feature selection in nonlinear support vector machines (SVMs) leads to a challenging non-convex minimization problem, which can be prone to suboptimal solutions. This paper develops an effective algorithm to directly solve the embedded feature selection primal problem. We use a trust-region method, which is better suited for non-convex optimization compared to line-search methods, and guarantees convergence to a minimizer. We devise an alternating optimization approach to tackle the problem efficiently, breaking it down into a convex subproblem, corresponding to standard SVM optimization, and a non-convex subproblem for feature selection. Importantly, we show that a straightforward alternating optimization approach can be susceptible to saddle point solutions. We propose a novel technique, which shares an explicit margin variable to overcome saddle point convergence and improve solution quality. Experiment results show our method outperforms the state-of-the-art embedded SVM feature selection method, as well as other leading filter and wrapper approaches.  相似文献   

2.
We introduce an embedded method that simultaneously selects relevant features during classifier construction by penalizing each feature’s use in the dual formulation of support vector machines (SVM). This approach called kernel-penalized SVM (KP-SVM) optimizes the shape of an anisotropic RBF Kernel eliminating features that have low relevance for the classifier. Additionally, KP-SVM employs an explicit stopping condition, avoiding the elimination of features that would negatively affect the classifier’s performance. We performed experiments on four real-world benchmark problems comparing our approach with well-known feature selection techniques. KP-SVM outperformed the alternative approaches and determined consistently fewer relevant features.  相似文献   

3.
In many pattern recognition applications, high-dimensional feature vectors impose a high computational cost as well as the risk of “overfitting”. Feature Selection addresses the dimensionality reduction problem by determining a subset of available features which is most essential for classification. This paper presents a novel feature selection method named filtered and supported sequential forward search (FS_SFS) in the context of support vector machines (SVM). In comparison with conventional wrapper methods that employ the SFS strategy, FS_SFS has two important properties to reduce the time of computation. First, it dynamically maintains a subset of samples for the training of SVM. Because not all the available samples participate in the training process, the computational cost to obtain a single SVM classifier is decreased. Secondly, a new criterion, which takes into consideration both the discriminant ability of individual features and the correlation between them, is proposed to effectively filter out nonessential features. As a result, the total number of training is significantly reduced and the overfitting problem is alleviated. The proposed approach is tested on both synthetic and real data to demonstrate its effectiveness and efficiency.  相似文献   

4.
5.
The efficiency of the intrusion detection is mainly depended on the dimension of data features. By using the gradually feature removal method, 19 critical features are chosen to represent for the various network visit. With the combination of clustering method, ant colony algorithm and support vector machine (SVM), an efficient and reliable classifier is developed to judge a network visit to be normal or not. Moreover, the accuracy achieves 98.6249% in 10-fold cross validation and the average Matthews correlation coefficient (MCC) achieves 0.861161.  相似文献   

6.
Early detection of ventricular fibrillation (VF) is crucial for the success of the defibrillation therapy in automatic devices. A high number of detectors have been proposed based on temporal, spectral, and time-frequency parameters extracted from the surface electrocardiogram (ECG), showing always a limited performance. The combination ECG parameters on different domain (time, frequency, and time-frequency) using machine learning algorithms has been used to improve detection efficiency. However, the potential utilization of a wide number of parameters benefiting machine learning schemes has raised the need of efficient feature selection (FS) procedures. In this study, we propose a novel FS algorithm based on support vector machines (SVM) classifiers and bootstrap resampling (BR) techniques. We define a backward FS procedure that relies on evaluating changes in SVM performance when removing features from the input space. This evaluation is achieved according to a nonparametric statistic based on BR. After simulation studies, we benchmark the performance of our FS algorithm in AHA and MIT-BIH ECG databases. Our results show that the proposed FS algorithm outperforms the recursive feature elimination method in synthetic examples, and that the VF detector performance improves with the reduced feature set.  相似文献   

7.
Support vector machine (SVM) is a novel pattern classification method that is valuable in many applications. Kernel parameter setting in the SVM training process, along with the feature selection, significantly affects classification accuracy. The objective of this study is to obtain the better parameter values while also finding a subset of features that does not degrade the SVM classification accuracy. This study develops a simulated annealing (SA) approach for parameter determination and feature selection in the SVM, termed SA-SVM.To measure the proposed SA-SVM approach, several datasets in UCI machine learning repository are adopted to calculate the classification accuracy rate. The proposed approach was compared with grid search which is a conventional method of performing parameter setting, and various other methods. Experimental results indicate that the classification accuracy rates of the proposed approach exceed those of grid search and other approaches. The SA-SVM is thus useful for parameter determination and feature selection in the SVM.  相似文献   

8.
基于网格模式搜索的支持向量机模型选择   总被引:2,自引:0,他引:2  
支持向量机的模型选择问题就是对于一个给定的核函数,调节核参数和惩罚因子C。分析了网格搜索算法和模式搜索算法,通过结合上述两种算法的优点提出了网格模式搜索算法。其核心原理是先用网格算法在全局范围内进行快速搜索,找到最优解的最小区间,再在这个最小区间内用模式搜索算法找到最优解。实验证明,网格模式搜索具有学习精度高和速度快的优点。  相似文献   

9.
Feature selection for text categorization is a well-studied problem and its goal is to improve the effectiveness of categorization, or the efficiency of computation, or both. The system of text categorization based on traditional term-matching is used to represent the vector space model as a document; however, it needs a high dimensional space to represent the document, and does not take into account the semantic relationship between terms, which leads to a poor categorization accuracy. The latent semantic indexing method can overcome this problem by using statistically derived conceptual indices to replace the individual terms. With the purpose of improving the accuracy and efficiency of categorization, in this paper we propose a two-stage feature selection method. Firstly, we apply a novel feature selection method to reduce the dimension of terms; and then we construct a new semantic space, between terms, based on the latent semantic indexing method. Through some applications involving the spam database categorization, we find that our two-stage feature selection method performs better.  相似文献   

10.
特征子集选择和训练参数的优化一直是SVM研究中的两个重要方面,选择合适的特征和合理的训练参数可以提高SVM分类器的性能,以往的研究是将两个问题分别进行解决。随着遗传优化等自然计算技术在人工智能领域的应用,开始出现特征选择及参数的同时优化研究。研究采用免疫遗传算法(IGA)对特征选择及SVM 参数的同时优化,提出了一种IGA-SVM 算法。实验表明,该方法可找出合适的特征子集及SVM 参数,并取得较好的分类效果,证明算法的有效性。  相似文献   

11.
We present a two-step method to speed-up object detection systems in computer vision that use support vector machines as classifiers. In the first step we build a hierarchy of classifiers. On the bottom level, a simple and fast linear classifier analyzes the whole image and rejects large parts of the background. On the top level, a slower but more accurate classifier performs the final detection. We propose a new method for automatically building and training a hierarchy of classifiers. In the second step we apply feature reduction to the top level classifier by choosing relevant image features according to a measure derived from statistical learning theory. Experiments with a face detection system show that combining feature reduction with hierarchical classification leads to a speed-up by a factor of 335 with similar classification performance.  相似文献   

12.
支持向量机和最小二乘支持向量机的比较及应用研究   总被引:56,自引:3,他引:56  
介绍和比较了支持向量机分类器和量小二乘支持向量机分类器的算法。并将支持向量机分类器和量小二乘支持向量机分类器应用于心脏病诊断,取得了较高的准确率。所用数据来自UCI bench—mark数据集。实验结果表明,支持向量机和量小二乘支持向量机在医疗诊断中有很大的应用潜力。  相似文献   

13.
基于启发式遗传算法的SVM模型自动选择   总被引:6,自引:0,他引:6  
支撑矢量机(SVM)模型的自动选择是其实际应用的关键.常用的基于穷举搜索的留一法(LOO)很繁杂且效率很低.到目前为止,大多数的算法并不能有效地实现模型自动选择.本文利用实值编码的启发式遗传算法实现基于高斯核函数的SVM模型自动选择.在重点分析了SVM超参数对其性能的影响和两种SVM性能估计的基础上,确定了合适的遗传算法适应度函数.人造数据及实际数据的仿真结果表明了所提方法的可行性和高效性.  相似文献   

14.
One of the most powerful, popular and accurate classification techniques is support vector machines (SVMs). In this work, we want to evaluate whether the accuracy of SVMs can be further improved using training set selection (TSS), where only a subset of training instances is used to build the SVM model. By contrast to existing approaches, we focus on wrapper TSS techniques, where candidate subsets of training instances are evaluated using the SVM training accuracy. We consider five wrapper TSS strategies and show that those based on evolutionary approaches can significantly improve the accuracy of SVMs.  相似文献   

15.
Support vector machines (SVMs) are a class of popular classification algorithms for their high generalization ability. However, it is time-consuming to train SVMs with a large set of learning samples. Improving learning efficiency is one of most important research tasks on SVMs. It is known that although there are many candidate training samples in some learning tasks, only the samples near decision boundary which are called support vectors have impact on the optimal classification hyper-planes. Finding these samples and training SVMs with them will greatly decrease training time and space complexity. Based on the observation, we introduce neighborhood based rough set model to search boundary samples. Using the model, we firstly divide sample spaces into three subsets: positive region, boundary and noise. Furthermore, we partition the input features into four subsets: strongly relevant features, weakly relevant and indispensable features, weakly relevant and superfluous features, and irrelevant features. Then we train SVMs only with the boundary samples in the relevant and indispensable feature subspaces, thus feature and sample selection is simultaneously conducted with the proposed model. A set of experimental results show the model can select very few features and samples for training; in the mean time the classification performances are preserved or even improved.  相似文献   

16.
Model selection for support vector machines via uniform design   总被引:2,自引:0,他引:2  
The problem of choosing a good parameter setting for a better generalization performance in a learning task is the so-called model selection. A nested uniform design (UD) methodology is proposed for efficient, robust and automatic model selection for support vector machines (SVMs). The proposed method is applied to select the candidate set of parameter combinations and carry out a k-fold cross-validation to evaluate the generalization performance of each parameter combination. In contrast to conventional exhaustive grid search, this method can be treated as a deterministic analog of random search. It can dramatically cut down the number of parameter trials and also provide the flexibility to adjust the candidate set size under computational time constraint. The key theoretic advantage of the UD model selection over the grid search is that the UD points are “far more uniform”and “far more space filling” than lattice grid points. The better uniformity and space-filling phenomena make the UD selection scheme more efficient by avoiding wasteful function evaluations of close-by patterns. The proposed method is evaluated on different learning tasks, different data sets as well as different SVM algorithms.  相似文献   

17.
An automated approach to degradation analysis is proposed that uses a rotating machine’s acoustic signal to determine Remaining Useful Life (RUL). High resolution spectral features are extracted from the acoustic data collected over the entire lifetime of the machine. A novel approach to the computation of Mutual Information based Feature Subset Selection is applied, to remove redundant and irrelevant features, that does not require class label boundaries of the dataset or spectral locations of developing defect to be known or pre-estimated. Using subsets of the feature space, multi-class linear and Radial Basis Function (RBF) Support Vector Machine (SVM) classifiers are developed and a comparison of their performance is provided. Performance of all classifiers is found to be very high, 85 to 98%, with RBF SVMs outperforming linear SVMs when a smaller number of features are used. As larger numbers of features are used for classification, the problem space becomes more linearly separable and the linear SVMs are shown to have comparable performance. A detailed analysis of the misclassifications is provided and an approach to better understand and interpret costly misclassifications is discussed. While defining class label boundaries using an automated k-means clustering algorithm improves performance with an accuracy of approximately 99%, further analysis shows that in 88% of all misclassifications the actual class of failure had the next highest probability of occurring. Thus, a system that incorporates probability distributions as a measure of confidence for the predicted RUL would provide additional valuable information for scheduling preventative maintenance. This work was supported by IDA Ireland.  相似文献   

18.
This paper presents a hybrid filter-wrapper feature subset selection algorithm based on particle swarm optimization (PSO) for support vector machine (SVM) classification. The filter model is based on the mutual information and is a composite measure of feature relevance and redundancy with respect to the feature subset selected. The wrapper model is a modified discrete PSO algorithm. This hybrid algorithm, called maximum relevance minimum redundancy PSO (mr2PSO), is novel in the sense that it uses the mutual information available from the filter model to weigh the bit selection probabilities in the discrete PSO. Hence, mr2PSO uniquely brings together the efficiency of filters and the greater accuracy of wrappers. The proposed algorithm is tested over several well-known benchmarking datasets. The performance of the proposed algorithm is also compared with a recent hybrid filter-wrapper algorithm based on a genetic algorithm and a wrapper algorithm based on PSO. The results show that the mr2PSO algorithm is competitive in terms of both classification accuracy and computational performance.  相似文献   

19.
A note on Platt’s probabilistic outputs for support vector machines   总被引:9,自引:0,他引:9  
Platt’s probabilistic outputs for Support Vector Machines (Platt, J. in Smola, A., et al. (eds.) Advances in large margin classifiers. Cambridge, 2000) has been popular for applications that require posterior class probabilities. In this note, we propose an improved algorithm that theoretically converges and avoids numerical difficulties. A simple and ready-to-use pseudo code is included. Editor: Dale Schuurmans.  相似文献   

20.
Pedestrians are the vulnerable participants in transportation system when crashes happen. It is important to detect pedestrian efficiently and accurately in many computer vision applications, such as intelligent transportation systems (ITSs) and safety driving assistant systems (SDASs). This paper proposes a two-stage pedestrian detection method based on machine vision. In the first stage, AdaBoost algorithm and cascading method are adopted to segment pedestrian candidates from image. To confirm whether each candidate is pedestrian or not, a second stage is needed to eliminate some false positives. In this stage, a pedestrian recognizing classifier is trained with support vector machine (SVM). The input features used for SVM training are extracted from both the sample gray images and edge images. Finally, the performance of the proposed pedestrian detection method is tested with real-world data. Results show that the performance is better than conventional single-stage classifier, such as AdaBoost based or SVM based classifier.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号