20 similar articles found.
1.
Simultaneous feature selection and classification using kernel-penalized support vector machines (cited by: 2; self-citations: 0; cited by others: 2)
We introduce an embedded method that simultaneously selects relevant features during classifier construction by penalizing each feature’s use in the dual formulation of support vector machines (SVM). This approach, called kernel-penalized SVM (KP-SVM), optimizes the shape of an anisotropic RBF kernel, eliminating features that have low relevance for the classifier. Additionally, KP-SVM employs an explicit stopping condition, avoiding the elimination of features that would negatively affect the classifier’s performance. We performed experiments on four real-world benchmark problems, comparing our approach with well-known feature selection techniques. KP-SVM outperformed the alternative approaches and consistently identified fewer relevant features.
2.
Minh Hoai Nguyen, Fernando de la Torre 《Pattern recognition》2010,43(3):584-591
Selecting relevant features for support vector machine (SVM) classifiers is important for a variety of reasons, such as generalization performance, computational efficiency, and feature interpretability. Traditional SVM approaches to feature selection typically extract features and learn SVM parameters independently. Performing these two steps independently might result in a loss of information related to the classification process. This paper proposes a convex energy-based framework to jointly perform feature selection and SVM parameter learning for linear and non-linear kernels. Experiments on various databases show a significant reduction in the number of features used while maintaining classification performance.
3.
Yi Liu 《Pattern recognition》2006,39(7):1333-1345
In many pattern recognition applications, high-dimensional feature vectors impose a high computational cost as well as the risk of “overfitting”. Feature selection addresses the dimensionality reduction problem by determining the subset of available features that is most essential for classification. This paper presents a novel feature selection method named filtered and supported sequential forward search (FS_SFS) in the context of support vector machines (SVM). In comparison with conventional wrapper methods that employ the SFS strategy, FS_SFS has two important properties that reduce computation time. First, it dynamically maintains a subset of samples for training the SVM. Because not all the available samples participate in the training process, the computational cost of obtaining a single SVM classifier is decreased. Second, a new criterion, which takes into consideration both the discriminant ability of individual features and the correlation between them, is proposed to effectively filter out nonessential features. As a result, the total number of training runs is significantly reduced and the overfitting problem is alleviated. The proposed approach is tested on both synthetic and real data to demonstrate its effectiveness and efficiency.
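For orientation, the plain sequential forward search (SFS) strategy that wrapper methods build on can be sketched as follows. This is a minimal illustration, not the FS_SFS method itself: the sample-subset maintenance and the filtering criterion from the abstract are not reproduced, and the names `sequential_forward_search`, `score`, and `toy_score` are hypothetical. In a wrapper setting, `score` would typically be a validation accuracy of an SVM trained on the candidate subset.

```python
from typing import Callable, FrozenSet, List


def sequential_forward_search(
    n_features: int,
    score: Callable[[FrozenSet[int]], float],
    max_features: int,
) -> List[int]:
    """Greedy SFS: at each step, add the feature that most improves `score`."""
    selected: List[int] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        # Evaluate every one-feature extension of the current subset.
        scored = [(score(frozenset(selected + [f])), f) for f in candidates]
        top_score, top_feature = max(scored, key=lambda t: t[0])
        if top_score <= best_score:
            break  # no candidate improves the criterion, so stop early
        selected.append(top_feature)
        best_score = top_score
    return selected


# Hypothetical toy criterion: features 0 and 2 are useful,
# and every selected feature costs 0.1.
def toy_score(subset: FrozenSet[int]) -> float:
    useful = {0: 1.0, 2: 0.5}
    return sum(useful.get(f, 0.0) for f in subset) - 0.1 * len(subset)


print(sequential_forward_search(4, toy_score, max_features=4))  # [0, 2]
```

Each step calls `score` once per remaining feature, which is why reducing the cost of a single evaluation, or the number of candidates evaluated, matters for wrapper methods.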
4.
Chien-Feng Huang 《Applied Soft Computing》2012,12(2):807-818
In the areas of investment research and applications, feasible quantitative models include methodologies stemming from soft computing for the prediction of financial time series, multi-objective optimization of investment return and risk reduction, and the selection of investment instruments for portfolio management based on asset ranking using a variety of input variables and historical data. Among all these, stock selection has long been identified as a challenging and important task. This line of research is highly contingent upon reliable stock ranking for successful portfolio construction. Recent advances in machine learning and data mining are leading to significant opportunities to solve these problems more effectively. In this study, we aim to develop a methodology for effective stock selection using support vector regression (SVR) as well as genetic algorithms (GAs). We first employ the SVR method to generate surrogates for actual stock returns that in turn serve to provide reliable rankings of stocks. Top-ranked stocks can thus be selected to form a portfolio. On top of this model, the GA is employed for the optimization of model parameters and for feature selection, to acquire optimal subsets of input variables for the SVR model. We show that the investment returns provided by our proposed methodology significantly outperform the benchmark. Based upon these promising results, we expect this hybrid GA-SVR methodology to advance research in soft computing for finance and to provide an effective solution to stock selection in practice.
5.
Felipe Alonso-Atienza, José Luis Rojo-Álvarez, Alfredo Rosado-Muñoz, Juan J. Vinagre, Arcadi García-Alberola, Gustavo Camps-Valls 《Expert systems with applications》2012,39(2):1956-1967
Early detection of ventricular fibrillation (VF) is crucial for the success of defibrillation therapy in automatic devices. A large number of detectors have been proposed based on temporal, spectral, and time-frequency parameters extracted from the surface electrocardiogram (ECG), but always with limited performance. Combining ECG parameters from different domains (time, frequency, and time-frequency) using machine learning algorithms has been used to improve detection efficiency. However, the potential use of a large number of parameters in machine learning schemes has raised the need for efficient feature selection (FS) procedures. In this study, we propose a novel FS algorithm based on support vector machine (SVM) classifiers and bootstrap resampling (BR) techniques. We define a backward FS procedure that relies on evaluating changes in SVM performance when removing features from the input space. This evaluation is carried out according to a nonparametric statistic based on BR. After simulation studies, we benchmark the performance of our FS algorithm on the AHA and MIT-BIH ECG databases. Our results show that the proposed FS algorithm outperforms the recursive feature elimination method in synthetic examples, and that the VF detector performance improves with the reduced feature set.
6.
This article proposes a new genetic algorithm (GA) methodology to obtain parsimonious support vector regression (SVR) models capable of predicting highly precise setpoints in a continuous annealing furnace (GA-PARSIMONY). The proposal combines feature selection, model tuning, and parsimonious model selection in order to achieve robust SVR models. To this end, a novel GA selection procedure is introduced based on separate cost and complexity evaluations. The best individuals are initially sorted by an error fitness function, and afterwards, models with similar costs are rearranged according to a model complexity measure so as to foster models of lesser complexity. Therefore, the user-supplied penalty parameter, utilized to balance cost and complexity in other fitness functions, is rendered unnecessary. GA-PARSIMONY performed similarly to classical GA on twenty benchmark datasets from public repositories, but used a lower number of features in a striking 65% of models. Moreover, our proposal also proved useful in a real industrial process for predicting three temperature setpoints for a continuous annealing furnace. The results demonstrated that GA-PARSIMONY was able to generate more robust SVR models with fewer input features, as compared to classical GA.
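The two-stage ranking described in this abstract (sort by error, then reorder models of similar cost by complexity) can be sketched roughly as below. This is one possible reading, not the published GA-PARSIMONY procedure: the similarity threshold `tol`, the `Individual` record, and the neighbour-swapping scheme are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Individual(NamedTuple):
    name: str
    cost: float        # validation error (lower is better)
    complexity: float  # e.g. number of selected features (lower is better)


def parsimonious_rank(pop: List[Individual], tol: float) -> List[Individual]:
    """Sort by cost, then promote simpler models among near-equal costs."""
    ranked = sorted(pop, key=lambda ind: ind.cost)
    # Repeatedly swap neighbours whose costs differ by less than `tol`
    # when the lower-ranked neighbour is strictly less complex.
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(ranked) - 1):
            a, b = ranked[i], ranked[i + 1]
            if abs(a.cost - b.cost) < tol and b.complexity < a.complexity:
                ranked[i], ranked[i + 1] = b, a
                swapped = True
    return ranked


population = [
    Individual("A", 0.100, 12),
    Individual("B", 0.101, 5),
    Individual("C", 0.150, 3),
    Individual("D", 0.102, 8),
]
print([ind.name for ind in parsimonious_rank(population, tol=0.005)])
# ['B', 'D', 'A', 'C']
```

Model C is the simplest but stays last because its error is clearly worse; A, B, and D have nearly identical errors, so they are reordered by complexity alone and no cost-versus-complexity penalty weight is needed.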
7.
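To address the shortcomings of traditional methods, a combined simulated annealing algorithm is proposed for parameter selection in support vector machines (SVM). The optimization goal is to maximize the generalization ability of the SVM, and an appropriate objective function is established accordingly. Borrowing the idea of cross-validation, the approach uses the data in the training set and the test set to select the model and to search for the optimal parameter combination, respectively. Finally, on the basis of simulation experiments, the method is compared with selection methods based on genetic algorithms and refined grid search. The results show that the combined algorithm has better global search performance and convergence speed, making it an effective and practical method for SVM parameter selection.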
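A generic simulated annealing search over two SVM hyperparameters might look like the sketch below. It is not the combined algorithm from this abstract: the search box, Gaussian step size, geometric cooling schedule, and the stand-in objective `toy_cv_error` are all assumptions. In practice the objective would be an estimate of generalization error, such as a cross-validated error for the given (C, gamma) pair.

```python
import math
import random
from typing import Callable, Tuple

Params = Tuple[float, float]  # (log10 C, log10 gamma)


def anneal(
    objective: Callable[[Params], float],
    start: Params,
    bounds: Tuple[float, float] = (-3.0, 3.0),
    steps: int = 2000,
    t0: float = 1.0,
    cooling: float = 0.995,
    seed: int = 0,
) -> Tuple[Params, float]:
    """Minimize `objective` by simulated annealing; return the best point seen."""
    rng = random.Random(seed)
    lo, hi = bounds
    current, current_val = start, objective(start)
    best, best_val = current, current_val
    temp = t0
    for _ in range(steps):
        # Propose a Gaussian perturbation, clipped to the search box.
        cand = (
            min(hi, max(lo, current[0] + rng.gauss(0.0, 0.3))),
            min(hi, max(lo, current[1] + rng.gauss(0.0, 0.3))),
        )
        cand_val = objective(cand)
        delta = cand_val - current_val
        # Always accept improvements; accept worse moves with prob. exp(-delta/T).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_val = cand, cand_val
            if current_val < best_val:
                best, best_val = current, current_val
        temp *= cooling  # geometric cooling schedule
    return best, best_val


# Hypothetical stand-in for a cross-validation error surface,
# with its minimum at log10 C = 1 and log10 gamma = -2.
def toy_cv_error(p: Params) -> float:
    log_c, log_gamma = p
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = anneal(toy_cv_error, start=(0.0, 0.0))
print(best_params, best_error)
```

Searching in log space is a common choice for C and gamma because useful values span several orders of magnitude; the early high temperature lets the search escape poor regions before it settles.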
8.
Past work on object detection has emphasized the issues of feature extraction and classification; however, relatively little attention has been given to the critical issue of feature selection. The main trend in feature extraction has been to represent the data in a lower-dimensional space, for example, using principal component analysis (PCA). Without an effective scheme to select an appropriate set of features in this space, however, these methods rely mostly on powerful classification algorithms to deal with redundant and irrelevant features. In this paper, we argue that feature selection is an important problem in object detection and demonstrate that genetic algorithms (GAs) provide a simple, general, and powerful framework for selecting good subsets of features, leading to improved detection rates. As a case study, we have considered PCA for feature extraction and support vector machines (SVMs) for classification. The goal is to search the PCA space using GAs to select a subset of eigenvectors encoding important information about the target concept of interest. This is in contrast to traditional methods, which select some percentage of the top eigenvectors to represent the target concept, independently of the classification task. We have tested the proposed framework on two challenging applications: vehicle detection and face detection. Our experimental results illustrate significant performance improvements in both cases.
9.
Optimization techniques for improving power quality data mining using wavelet packet based support vector machine (cited by: 3; self-citations: 0; cited by others: 3)
This paper aims at automatic classification of power quality events using the Wavelet Packet Transform (WPT) and Support Vector Machines (SVM). The features of the disturbance signals are extracted using WPT and given to the SVM for effective classification. Recent literature dealing with power quality establishes that support vector machine methods generally outperform traditional statistical and neural methods in classification problems involving power disturbance signals. However, two vital issues, namely the determination of the most appropriate feature subset and model selection, could, if suitably addressed, pave the way for further improvement in classification accuracy and computation time. This paper addresses these issues through a classification system using two optimization techniques, genetic algorithms and simulated annealing. The system detects the best discriminative features and estimates the best SVM kernel parameters in a fully automatic way. The effectiveness of the proposed method is shown through comparison with conventional parameter optimization methods discussed in the literature, such as grid search, and with classifiers such as the Probabilistic Neural Network (PNN) and the fuzzy k-nearest neighbor classifier (FkNN). The proposed method produces consistently better results, indicating that it is reliable.
10.
11.
S.P. Moustakidis 《Pattern recognition》2010,43(11):3712-3729
An efficient filter feature selection (FS) method, the SVM-FuzCoC approach, is proposed in this paper, achieving a satisfactory trade-off between classification accuracy and dimensionality reduction. Additionally, the method has reasonably low computational requirements, even in high-dimensional feature spaces. To assess the quality of features, we introduce a local fuzzy evaluation measure with respect to patterns that embraces the fuzzy membership degrees of every pattern in their classes. Accordingly, this measure reveals the adequacy of data coverage provided by each feature. The required membership grades are determined via a novel fuzzy output kernel-based support vector machine, applied to single features. Based on a fuzzy complementary criterion (FuzCoC), the FS procedure iteratively selects features with maximum additional contribution with regard to the information content provided by previously selected features. This search strategy leads to small subsets of powerful and complementary features, alleviating the feature redundancy problem. We also devise different SVM-FuzCoC variants by employing seven other methods to derive fuzzy degrees from SVM outputs, based on probabilistic or fuzzy criteria. Our method is compared with a set of existing FS methods, in terms of performance capability, dimensionality reduction, and computational speed, via a comprehensive experimental setup, including synthetic and real-world datasets.
12.
Chuan-Yu Chang, Ming-Fong Tsai 《Pattern recognition》2010,43(10):3494-3506
Most thyroid nodules are heterogeneous with various internal components, which confuse many radiologists and physicians with their various echo patterns in ultrasound images. Numerous textural feature extraction methods are used to characterize these patterns to reduce the misdiagnosis rate. Thyroid nodules can be classified using the corresponding textural features. In this paper, six support vector machines (SVMs) are adopted to select significant textural features and to classify the nodular lesions of a thyroid. Experimental results show that the proposed method can correctly and efficiently classify thyroid nodules. A comparison with existing methods shows that the feature-selection capability of the proposed method is similar to that of the sequential-floating-forward-selection (SFFS) method, while the execution time is about 3-37 times faster. In addition, the proposed criterion function achieves higher accuracy than those of the F-score, T-test, entropy, and Bhattacharyya distance methods.
13.
Method for prediction of protein-protein interactions in yeast using genomics/proteomics information and feature selection (cited by: 1; self-citations: 0; cited by others: 1)
J.M. Urquiza, I. Rojas, H. Pomares, L.J. Herrera, J. Ortega, A. Prieto 《Neurocomputing》2011,74(16):2683-2690
Protein-protein interaction (PPI) prediction is one of the main goals in current proteomics. This work presents a method for the prediction of protein-protein interactions through a classification technique known as support vector machines. The dataset considered is a set of positive and negative examples taken from a high-reliability source, from which we extracted a set of genomic features, proposing a similarity measure. From this dataset we extracted 26 proteomics/genomics features using well-known databases and datasets. Feature selection was performed to obtain the most relevant variables through a modified method derived from other feature selection methods for classification. Using the selected subset of features, we constructed a support vector classifier that obtains specificity and sensitivity values higher than 90% in the prediction of PPIs, and that also provides a confidence score for the predicted interaction of each pair of proteins.
14.
Embedding feature selection in nonlinear support vector machines (SVMs) leads to a challenging non-convex minimization problem, which can be prone to suboptimal solutions. This paper develops an effective algorithm to directly solve the embedded feature selection primal problem. We use a trust-region method, which is better suited for non-convex optimization compared to line-search methods, and guarantees convergence to a minimizer. We devise an alternating optimization approach to tackle the problem efficiently, breaking it down into a convex subproblem, corresponding to standard SVM optimization, and a non-convex subproblem for feature selection. Importantly, we show that a straightforward alternating optimization approach can be susceptible to saddle point solutions. We propose a novel technique, which shares an explicit margin variable to overcome saddle point convergence and improve solution quality. Experimental results show that our method outperforms the state-of-the-art embedded SVM feature selection method, as well as other leading filter and wrapper approaches.
15.
Support vector machines (SVMs) are among the most powerful, popular, and accurate classification techniques. In this work, we evaluate whether the accuracy of SVMs can be further improved using training set selection (TSS), where only a subset of training instances is used to build the SVM model. In contrast to existing approaches, we focus on wrapper TSS techniques, where candidate subsets of training instances are evaluated using the SVM training accuracy. We consider five wrapper TSS strategies and show that those based on evolutionary approaches can significantly improve the accuracy of SVMs.
16.
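Orthogonal design can identify the optimal combination of factors with relatively few experiments, while support vector machines can handle small samples, generalize well, and are not constrained by the dimensionality of the dataset. Combining the strengths of both, a feature selection method based on support vector machines and orthogonal design is proposed. Training and testing are arranged according to the number of features in the dataset and the structure of the corresponding orthogonal array, and the selected feature subset is then validated. Experimental results show that the method removes redundant features and achieves a higher classification rate than using the full feature set.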
17.
Automatic recognition of sleep spindles in EEG via radial basis support vector machine based on a modified feature selection algorithm (cited by: 1; self-citations: 0; cited by others: 1)
This paper presents an application of a radial basis support vector machine (RB-SVM) to the recognition of sleep spindles (SSs) in electroencephalographic (EEG) signals. The proposed system comprises two stages. In the first stage, for feature extraction, a set of raw amplitude values, a set of discrete cosine transform (DCT) coefficients, a set of discrete wavelet transform (DWT) approximation coefficients, and a set of adaptive autoregressive (AAR) parameters are calculated and extracted from the signals separately as four different sets of feature vectors. Thus, four different feature vectors for the same data are comparatively examined. In the second stage, these features are selected by a modified adaptive feature selection method based on sensitivity analysis, which mainly supports input dimension reduction by selecting the most significant feature elements. Then, the feature vectors are classified by a support vector machine (SVM) classifier, a relatively new and powerful technique for solving supervised binary classification problems owing to its generalization ability. Visual evaluation, by two electroencephalographers (EEGers), of 19-channel EEG records of six subjects showed that the best performance is obtained with an RB-SVM, providing an average sensitivity of 97.7%, an average specificity of 97.4%, and an average accuracy of 97.5%.
18.
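Intrusion detection system based on a simulated annealing support vector machine (cited by: 2; self-citations: 0; cited by others: 2)
To improve the detection efficiency of intrusion detection systems when only a small sample set is available, support vector machines are applied to network intrusion detection. The parameters of the support vector machine determine detection efficiency, yet suitable parameter values are difficult to choose. A simulated annealing algorithm is therefore proposed to optimize these parameters, and a parameter-optimized support vector machine is designed for intrusion detection. Experimental detection on samples from the sample dataset, together with a comparison against an intrusion detection system using the original support vector machine, shows that the simulated annealing support vector machine intrusion detection system achieves a high detection rate and a low false alarm rate while shortening training and detection time.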
19.
We present a two-step method to speed up object detection systems in computer vision that use support vector machines as classifiers. In the first step we build a hierarchy of classifiers. On the bottom level, a simple and fast linear classifier analyzes the whole image and rejects large parts of the background. On the top level, a slower but more accurate classifier performs the final detection. We propose a new method for automatically building and training a hierarchy of classifiers. In the second step we apply feature reduction to the top-level classifier by choosing relevant image features according to a measure derived from statistical learning theory. Experiments with a face detection system show that combining feature reduction with hierarchical classification leads to a speed-up by a factor of 335 with similar classification performance.
20.
An automated approach to degradation analysis is proposed that uses a rotating machine’s acoustic signal to determine Remaining Useful Life (RUL). High-resolution spectral features are extracted from the acoustic data collected over the entire lifetime of the machine. A novel approach to the computation of Mutual Information based Feature Subset Selection is applied to remove redundant and irrelevant features; it does not require class label boundaries of the dataset or spectral locations of developing defects to be known or pre-estimated. Using subsets of the feature space, multi-class linear and Radial Basis Function (RBF) Support Vector Machine (SVM) classifiers are developed and a comparison of their performance is provided. Performance of all classifiers is found to be very high, 85 to 98%, with RBF SVMs outperforming linear SVMs when a smaller number of features are used. As larger numbers of features are used for classification, the problem space becomes more linearly separable and the linear SVMs are shown to have comparable performance. A detailed analysis of the misclassifications is provided and an approach to better understand and interpret costly misclassifications is discussed. While defining class label boundaries using an automated k-means clustering algorithm improves performance, with an accuracy of approximately 99%, further analysis shows that in 88% of all misclassifications the actual class of failure had the next highest probability of occurring. Thus, a system that incorporates probability distributions as a measure of confidence for the predicted RUL would provide additional valuable information for scheduling preventative maintenance.
This work was supported by IDA Ireland.