
1.

Feature selection for support vector machine-based face-iris multimodal biometric system

Heng Fui Liau Dino Isa 《Expert systems with applications》2011,38(9):11105-11111

Multimodal biometrics can overcome the limitations of a single biometric trait and give better classification accuracy. This paper proposes a face-iris multimodal biometric system based on fusion at the matching score level using a support vector machine (SVM). The performance of face and iris recognition can be enhanced using a proposed feature selection method that selects an optimal subset of features. In addition, a simple computation speed-up method is proposed for the SVM. The results show that the proposed feature selection method is able to improve the classification accuracy in terms of total error rate. The support vector machine-based fusion method also gave very promising results.

2.

Speech recognition with improved support vector machine using dual classifiers and cross fitness validation

B. Kanisha S. Lokesh Priyan Malarvizhi Kumar P. Parthasarathy Gokulnath Chandra Babu 《Personal and Ubiquitous Computing》2018,22(5-6):1083-1091

In this research, a new speech recognition method based on improved feature extraction and an improved support vector machine (ISVM) is developed. A Gaussian filter is used to denoise the input speech signal. The feature extraction method extracts five features: peak values, Mel frequency cepstral coefficients (MFCC), tri-spectral features, the discrete wavelet transform (DWT), and the difference values between the input and the standard signal. Next, these features are scaled using the linear identical scaling (LIS) method, with the same scaling method and the same scaling factors applied to each set of features in both the training and testing phases. Following this, to accomplish the training process, an ISVM is developed with best fitness validation. The ISVM consists of two stages: (i) a linear dual classifier that finds the same-class attributes and different-class attributes simultaneously and (ii) a cross fitness validation (CFV) method to prevent overfitting. The proposed speech recognition method offers 98.2% accuracy.
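
The abstract stresses that the same scaling factors are reused in training and testing. A minimal sketch of that idea follows, assuming per-feature min-max scaling; the abstract does not define LIS in detail, so the formula and the function names `fit_linear_scaler` and `apply_linear_scaler` are illustrative assumptions.

```python
def fit_linear_scaler(train_rows):
    """Learn per-feature (min, max) factors from the training rows only."""
    columns = list(zip(*train_rows))
    return [(min(col), max(col)) for col in columns]


def apply_linear_scaler(rows, factors):
    """Scale rows with previously learned factors (assumed min-max form)."""
    scaled = []
    for row in rows:
        scaled.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, (lo, hi) in zip(row, factors)
        ])
    return scaled


# Toy data: two features, three training rows, one test row.
train = [[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]]
test = [[2.0, 40.0]]

factors = fit_linear_scaler(train)                 # learned once, on training data
train_scaled = apply_linear_scaler(train, factors)
test_scaled = apply_linear_scaler(test, factors)   # same factors reused for testing
```

Because the factors come from the training data alone, a test value can fall outside the training range, which is expected behaviour rather than an error.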

3.

A support vector machine-based model for detecting top management fraud

Ping-Feng Pai Ming-Fu Hsu Ming-Chieh Wang 《Knowledge》2011,24(2):314-321

Detecting fraudulent financial statements (FFS) is critical in order to protect the global financial market. In recent years, FFS have begun to appear and continue to grow rapidly, which has shaken the confidence of investors and threatened the economies of entire countries. While auditors are the last line of defense in detecting FFS, many auditors lack the experience and expertise to deal with the related risks. This study introduces a support vector machine-based fraud warning (SVMFW) model to reduce these risks. The model integrates sequential forward selection (SFS), a support vector machine (SVM), and a classification and regression tree (CART). SFS is employed to overcome information overload problems, and the SVM technique is then used to assess the likelihood of FFS. To select the parameters of the SVM models, particle swarm optimization (PSO) is applied. Finally, CART is employed to enable auditors to increase substantive testing during their audit procedures by adopting reliable, easy-to-grasp decision rules. The experimental results show that the SVMFW model can reduce unnecessary information, satisfactorily detect FFS, and provide directions for properly allocating audit resources in limited audits. The model is a promising alternative for detecting FFS caused by top management, and it can assist in both taxation and the banking system.
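
Sequential forward selection can be sketched as a greedy loop that adds one feature at a time while a score keeps improving. The sketch below is generic and not the authors' implementation; `toy_score` is a hypothetical stand-in for whatever criterion (for example, cross-validated SVM accuracy) would be used in practice.

```python
def sequential_forward_selection(n_features, score, max_features=None):
    """Greedily add the feature that most improves score(subset)."""
    selected = []
    best_score = float("-inf")
    remaining = set(range(n_features))
    limit = max_features if max_features is not None else n_features
    while remaining and len(selected) < limit:
        candidate, candidate_score = None, best_score
        for feature in sorted(remaining):
            trial_score = score(selected + [feature])
            if trial_score > candidate_score:
                candidate, candidate_score = feature, trial_score
        if candidate is None:  # no remaining feature improves the score
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


def toy_score(subset):
    """Hypothetical score: features 0 and 2 help, every feature costs 0.1."""
    useful = {0: 1.0, 2: 0.5}
    return sum(useful.get(f, 0.0) for f in subset) - 0.1 * len(subset)


selected, best = sequential_forward_selection(4, toy_score)
```

On this toy score the loop picks feature 0, then feature 2, and stops because neither remaining feature raises the score.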

4.

Leave one support vector out cross validation for fast estimation of generalization errors

K.W. Lau 《Pattern recognition》2004,37(9):1835-1840

A Support Vector Classifier (SVC) is formulated in terms of a kernel. The bandwidth of the kernel affects the generalization performance of the SVC. This paper presents a Leave One Support Vector Out Cross Validation (LOSVO-CV) algorithm for estimating the optimal bandwidth of the kernel for classification purposes. The proposed algorithm is based on the Leave One Out Cross Validation (LOO-CV) algorithm (Numer. Math. 31 (1979) 377), which was proposed to find the optimal bandwidth but is difficult to implement because of its large amount of computation. The properties of LOSVO-CV are analyzed in comparison with LOO-CV. The simulation study demonstrates that LOSVO-CV is a fast algorithm and that it achieves the same generalization performance as a bootstrap method (Neural Process. Lett. 11 (2000) 51) that can find an optimal bandwidth of the kernel of the SVC. The LOSVO-CV algorithm is able to provide consistent results with different sizes of a benchmark data set obtained from the University of California (UCI) repository.
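
To make the baseline concrete, here is a minimal sketch of plain LOO-CV for choosing a Gaussian kernel bandwidth. It is not LOSVO-CV and it does not train an SVC: a simple kernel-weighted vote (an assumption made to keep the example self-contained) stands in for the classifier. LOSVO-CV, as described above, would reduce the cost by leaving out only support vectors.

```python
import math


def gaussian_kernel(x, z, bandwidth):
    """Gaussian kernel with a single shared bandwidth."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kernel_vote(x, points, labels, bandwidth):
    """Stand-in classifier: sign of the kernel-weighted label sum."""
    score = sum(y * gaussian_kernel(x, p, bandwidth)
                for p, y in zip(points, labels))
    return 1 if score >= 0 else -1


def loo_error(points, labels, bandwidth):
    """Leave each point out once and count misclassifications."""
    mistakes = 0
    for i in range(len(points)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if kernel_vote(points[i], rest_points, rest_labels, bandwidth) != labels[i]:
            mistakes += 1
    return mistakes / len(points)


def select_bandwidth(points, labels, candidates):
    """Return the candidate bandwidth with the lowest LOO error."""
    return min(candidates, key=lambda h: loo_error(points, labels, h))


# Toy one-dimensional data with two well-separated classes.
points = [[0.0], [0.2], [0.4], [3.0], [3.2], [3.4]]
labels = [-1, -1, -1, 1, 1, 1]
chosen = select_bandwidth(points, labels, [0.5, 100.0])
```

With a very large bandwidth every point votes almost equally, so the held-out point is always outvoted by the other class; the small bandwidth is therefore selected.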

5.

Associated evolution of a support vector machine-based classifier for pedestrian detection

X.B. Cao Y.W. Xu D. Chen H. Qiao 《Information Sciences》2009,179(8):1070-4877

The support vector machine (SVM) has become a dominant classification technique used in pedestrian detection systems. In such systems, classifiers are used to detect pedestrians in input frames. The performance of an SVM classifier is mainly influenced by two factors: the selected features and the parameters of the kernel function. These two factors are highly related, and it is therefore desirable to analyze them simultaneously, which is usually not the case in previous work. In this paper, we propose an evolutionary method to simultaneously optimize the feature set and the parameters for the SVM classifier. Specifically, adaptive genetic operators were designed to be suitable for feature selection and parameter tuning. The proposed method is used to train an SVM classifier for pedestrian detection. Experiments in real city traffic scenes show that the proposed approach leads to higher detection accuracy and shorter detection time.

6.

On cross validation for model selection (Cited by: 8; self-citations: 0; citations by others: 8)

Rivals I Personnaz L 《Neural computation》1999,11(4):863-870

In response to Zhu and Rohwer (1996), a recent communication (Goutte, 1997) established that leave-one-out cross validation is not subject to the "no-free-lunch" criticism. Despite this optimistic conclusion, we show here that cross validation performs very poorly for the selection of linear models compared with classic statistical tests. We conclude that the statistical tests are preferable to cross validation for linear as well as for nonlinear model selection.

7.

Information criteria for support vector machines

Kobayashi K. Komaki F. 《Neural Networks, IEEE Transactions on》2006,17(3):571-577

This paper presents the kernel regularization information criterion (KRIC), which is a new criterion for tuning regularization parameters in kernel logistic regression (KLR) and support vector machines (SVMs). The main idea of the KRIC is based on the regularization information criterion (RIC). We derive an eigenvalue equation to calculate the KRIC and solve the problem. The computational cost of parameter tuning by the KRIC is reduced drastically by using the Nyström approximation. The test error rate of SVMs or KLR with the regularization parameter tuned by the KRIC is comparable with that obtained by cross validation or evaluation of the evidence. The computational cost of the KRIC is significantly lower than that of the other criteria.

8.

Simultaneous feature selection and classification using kernel-penalized support vector machines (Cited by: 2; self-citations: 0; citations by others: 2)

Sebastián Maldonado Jayanta Basak 《Information Sciences》2011,181(1):115-128

We introduce an embedded method that simultaneously selects relevant features during classifier construction by penalizing each feature’s use in the dual formulation of support vector machines (SVM). This approach, called kernel-penalized SVM (KP-SVM), optimizes the shape of an anisotropic RBF kernel, eliminating features that have low relevance for the classifier. Additionally, KP-SVM employs an explicit stopping condition, avoiding the elimination of features that would negatively affect the classifier’s performance. We performed experiments on four real-world benchmark problems comparing our approach with well-known feature selection techniques. KP-SVM outperformed the alternative approaches and consistently selected fewer relevant features.
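
To illustrate how the shape of an anisotropic RBF kernel can switch features off, the sketch below uses one width per feature; a width of zero removes that feature from the kernel entirely. The exact parameterization in the paper is not given in the abstract, so the form used here (per-feature scaling inside a squared distance) and the name `anisotropic_rbf` are assumptions.

```python
import math


def anisotropic_rbf(x, z, widths):
    """RBF kernel with one width per feature (assumed parameterization).

    A width of 0.0 makes the kernel ignore the corresponding feature.
    """
    sq_dist = sum((w * (a - b)) ** 2 for a, b, w in zip(x, z, widths))
    return math.exp(-sq_dist / 2.0)


x = [0.0, 5.0]
z = [0.0, -7.0]

# The second feature differs a lot, but its width is zero, so it is ignored.
k_ignored = anisotropic_rbf(x, z, [1.0, 0.0])
# With a non-zero width the same difference drives the kernel toward zero.
k_used = anisotropic_rbf(x, z, [1.0, 1.0])
```

A method in the spirit of KP-SVM would then learn the widths, with a penalty pushing the widths of irrelevant features to zero.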

9.

A hybrid stock selection model using genetic algorithms and support vector regression

Chien-Feng Huang 《Applied Soft Computing》2012,12(2):807-818

In the areas of investment research and applications, feasible quantitative models include methodologies stemming from soft computing for the prediction of financial time series, multi-objective optimization of investment return and risk reduction, and selection of investment instruments for portfolio management based on asset ranking using a variety of input variables and historical data. Among all these, stock selection has long been identified as a challenging and important task. This line of research is highly contingent upon reliable stock ranking for successful portfolio construction. Recent advances in machine learning and data mining are leading to significant opportunities to solve these problems more effectively. In this study, we aim to develop a methodology for effective stock selection using support vector regression (SVR) as well as genetic algorithms (GAs). We first employ the SVR method to generate surrogates for actual stock returns that in turn serve to provide reliable rankings of stocks. Top-ranked stocks can thus be selected to form a portfolio. On top of this model, the GA is employed for the optimization of model parameters and for feature selection to acquire optimal subsets of input variables to the SVR model. We show that the investment returns provided by our proposed methodology significantly outperform the benchmark. Based upon these promising results, we expect this hybrid GA-SVR methodology to advance the research in soft computing for finance and provide an effective solution to stock selection in practice.

10.

Credit scoring using support vector machines with direct search for parameters selection

Ligang Zhou Kin Keung Lai Lean Yu 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2009,13(2):149-155

The support vector machine (SVM) is an effective tool for building good credit scoring models. However, the performance of the model depends on its parameter settings. In this study, we use a direct search method to optimize the SVM-based credit scoring model and compare it with three other parameter optimization methods: grid search, a method based on design of experiment (DOE), and a genetic algorithm (GA). Two real-world credit datasets are selected to demonstrate the effectiveness and feasibility of the method. The results show that the direct search method can find an effective model with high classification accuracy and good robustness, with less dependency on the initial search space or starting point.
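
Direct search methods optimize using only objective values, with no gradients. The sketch below shows one simple member of that family, compass (coordinate pattern) search; the abstract does not say which direct search variant the authors use, so this choice is an assumption. `toy_cv_error` is a hypothetical stand-in for the cross-validated error of an SVM as a function of two log-scaled parameters.

```python
def compass_search(objective, start, step=1.0, tol=1e-3, max_iter=1000):
    """Minimize objective by probing +/- step along each coordinate."""
    point = list(start)
    best = objective(point)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = objective(trial)
                if value < best:  # accept any improving move immediately
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # shrink the pattern when no probe helps
    return point, best


def toy_cv_error(params):
    """Hypothetical error surface with its minimum at (2, -3)."""
    log_c, log_gamma = params
    return (log_c - 2.0) ** 2 + (log_gamma + 3.0) ** 2


best_point, best_value = compass_search(toy_cv_error, [0.0, 0.0])
```

Unlike grid search, the number of evaluations here depends on how far the start is from the optimum rather than on a fixed grid size.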

11.

Optimal feature selection for support vector machines

Minh Hoai Nguyen Fernando de la Torre 《Pattern recognition》2010,43(3):584-591

Selecting relevant features for support vector machine (SVM) classifiers is important for a variety of reasons such as generalization performance, computational efficiency, and feature interpretability. Traditional SVM approaches to feature selection typically extract features and learn SVM parameters independently. Performing these two steps independently might result in a loss of information related to the classification process. This paper proposes a convex energy-based framework to jointly perform feature selection and SVM parameter learning for linear and non-linear kernels. Experiments on various databases show a significant reduction in the number of features used while maintaining classification performance.

12.

Feature selection using support vector machines and bootstrap methods for ventricular fibrillation detection

Felipe Alonso-Atienza José Luis Rojo-Álvarez Alfredo Rosado-Muñoz Juan J. Vinagre Arcadi García-Alberola Gustavo Camps-Valls 《Expert systems with applications》2012,39(2):1956-1967

Early detection of ventricular fibrillation (VF) is crucial for the success of defibrillation therapy in automatic devices. A large number of detectors have been proposed based on temporal, spectral, and time-frequency parameters extracted from the surface electrocardiogram (ECG), all showing limited performance. The combination of ECG parameters from different domains (time, frequency, and time-frequency) using machine learning algorithms has been used to improve detection efficiency. However, the potential use of a large number of parameters in machine learning schemes has raised the need for efficient feature selection (FS) procedures. In this study, we propose a novel FS algorithm based on support vector machine (SVM) classifiers and bootstrap resampling (BR) techniques. We define a backward FS procedure that relies on evaluating changes in SVM performance when removing features from the input space. This evaluation is achieved according to a nonparametric statistic based on BR. After simulation studies, we benchmark the performance of our FS algorithm on the AHA and MIT-BIH ECG databases. Our results show that the proposed FS algorithm outperforms the recursive feature elimination method in synthetic examples, and that the VF detector performance improves with the reduced feature set.

13.

Parameter determination of support vector machine and feature selection using simulated annealing approach (Cited by: 1; self-citations: 0; citations by others: 1)

Shih-Wei Lin Zne-Jung Lee Shih-Chieh Chen Tsung-Yuan Tseng 《Applied Soft Computing》2008,8(4):1505-1512

The support vector machine (SVM) is a novel pattern classification method that is valuable in many applications. Kernel parameter setting in the SVM training process, along with feature selection, significantly affects classification accuracy. The objective of this study is to obtain better parameter values while also finding a subset of features that does not degrade the SVM classification accuracy. This study develops a simulated annealing (SA) approach for parameter determination and feature selection in the SVM, termed SA-SVM. To evaluate the proposed SA-SVM approach, several datasets in the UCI machine learning repository are adopted to calculate the classification accuracy rate. The proposed approach was compared with grid search, which is a conventional method of performing parameter setting, and various other methods. Experimental results indicate that the classification accuracy rates of the proposed approach exceed those of grid search and other approaches. The SA-SVM is thus useful for parameter determination and feature selection in the SVM.
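
A joint search over a parameter and a feature mask can be sketched with a generic simulated annealing loop. This is not the SA-SVM algorithm itself: the state encoding, the move set in `neighbor`, the cooling schedule, and `toy_cost` (a hypothetical stand-in for an SVM error estimate) are all illustrative assumptions.

```python
import math
import random


def simulated_annealing(cost, initial, neighbor, t_start=1.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=20, seed=0):
    """Generic SA loop with geometric cooling and Metropolis acceptance."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(steps_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept worse moves with prob exp(-delta/T).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost


def neighbor(state, rng):
    """Either perturb the parameter or flip one bit of the feature mask."""
    log_c, mask = state
    if rng.random() < 0.5:
        return (log_c + rng.uniform(-0.5, 0.5), mask)
    i = rng.randrange(len(mask))
    return (log_c, mask[:i] + (not mask[i],) + mask[i + 1:])


def toy_cost(state):
    """Hypothetical cost: best at log_c = 1 with features 0 and 2 selected."""
    log_c, mask = state
    useful = (True, False, True, False)
    mismatches = sum(m != u for m, u in zip(mask, useful))
    return (log_c - 1.0) ** 2 + 0.5 * mismatches


initial = (0.0, (True, True, True, True))
best_state, best_cost = simulated_annealing(toy_cost, initial, neighbor)
```

In a real setting each cost evaluation would train and validate an SVM, so the number of steps per temperature is the main knob for trading accuracy against run time.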

14.

Multisurface proximal support vector machine classification via generalized eigenvalues (Cited by: 6; self-citations: 0; citations by others: 6)

Mangasarian OL Wild EW 《IEEE transactions on pattern analysis and machine intelligence》2006,28(1):69-74

A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets is proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the other data set. Each of the two nonparallel proximal planes is obtained by a single MATLAB command as the eigenvector corresponding to a smallest eigenvalue of a generalized eigenvalue problem. Classification by proximity to two distinct nonlinear surfaces generated by a nonlinear kernel also leads to two simple generalized eigenvalue problems. The effectiveness of the proposed method is demonstrated by tests on simple examples as well as on a number of public data sets. These examples show the advantages of the proposed approach in both computation time and test set correctness.
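
The link between "closest to one set, farthest from the other" and a generalized eigenvalue problem can be written out as follows. The notation is an assumption, since the abstract gives none: the rows of $A$ and $B$ are the points of the two data sets, $e$ is a vector of ones of matching length, the first plane is $x^{\top}w - \gamma = 0$, and $\delta > 0$ is a small regularization constant.

```latex
\begin{align}
  % Plane close to A and far from B, written as a ratio of squared distances
  \min_{(w,\gamma) \neq 0}\;
    \frac{\lVert A w - e\gamma \rVert^{2}
          + \delta \,\lVert (w,\gamma) \rVert^{2}}
         {\lVert B w - e\gamma \rVert^{2}}
  \\[4pt]
  % Stack the unknowns and collect the quadratic forms
  z = \begin{bmatrix} w \\ \gamma \end{bmatrix}, \qquad
  G = \begin{bmatrix} A & -e \end{bmatrix}^{\top}
      \begin{bmatrix} A & -e \end{bmatrix} + \delta I, \qquad
  H = \begin{bmatrix} B & -e \end{bmatrix}^{\top}
      \begin{bmatrix} B & -e \end{bmatrix}
  \\[4pt]
  % The ratio is a Rayleigh quotient; its minimizer solves an eigenproblem
  \min_{z \neq 0}\; \frac{z^{\top} G z}{z^{\top} H z}
  \quad\Longrightarrow\quad
  G z = \lambda H z, \qquad z \neq 0
\end{align}
```

The minimizing $z$ is an eigenvector for the smallest eigenvalue $\lambda$, which is why a single generalized eigenvalue call is enough for each plane; the second plane follows by swapping the roles of $A$ and $B$.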

15.

Fuzzy multi-category proximal support vector classification via generalized eigenvalues (Cited by: 2; self-citations: 0; citations by others: 2)

Jayadeva Reshma Khemchandani Suresh Chandra 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2007,11(7):679-685

Given a dataset, where each point is labeled with one of M labels, we propose a technique for multi-category proximal support vector classification via generalized eigenvalues (MGEPSVMs). Unlike Support Vector Machines that classify points by assigning them to one of M disjoint half-spaces, here points are classified by assigning them to the closest of M non-parallel planes that are close to their respective classes. When the data contains samples belonging to several classes, classes often overlap, and classifiers that solve for several non-parallel planes may often be able to better resolve test samples. In multicategory classification tasks, a training point may have similarities with prototypes of more than one class. This information can be used in a fuzzy setting. We propose a fuzzy multi-category classifier that utilizes information about the membership of training samples, to improve the generalization ability of the classifier. The desired classifier is obtained by using one-from-rest (OFR) separation for each class, i.e. 1: M -1 classification. Experimental results demonstrate the efficacy of the proposed classifier over MGEPSVMs.

16.

Neighborhood property-based pattern selection for support vector machines

Shin H Cho S 《Neural computation》2007,19(3):816-855

The support vector machine (SVM) has been spotlighted in the machine learning community because of its theoretical soundness and practical performance. When applied to a large data set, however, it requires a large memory and a long time for training. To cope with the practical difficulty, we propose a pattern selection algorithm based on neighborhood properties. The idea is to select only the patterns that are likely to be located near the decision boundary. Those patterns are expected to be more informative than the randomly selected patterns. The experimental results provide promising evidence that it is possible to successfully employ the proposed algorithm ahead of SVM training.
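
One simple way to find patterns that are likely to lie near the decision boundary is to keep the points whose nearest neighbors carry mixed labels. The sketch below uses that heuristic; the paper's actual neighborhood criterion is not stated in the abstract, so this rule and the name `select_boundary_patterns` are assumptions.

```python
def select_boundary_patterns(points, labels, k=3):
    """Return indices of points whose k nearest neighbors have mixed labels."""
    selected = []
    for i, x in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort the other points by squared Euclidean distance to x.
        others.sort(key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(x, points[j])))
        neighborhood_labels = {labels[j] for j in others[:k]} | {labels[i]}
        if len(neighborhood_labels) > 1:  # more than one class nearby
            selected.append(i)
    return selected


# Toy one-dimensional data: the class boundary lies between 2.0 and 3.0.
points = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
labels = [-1, -1, -1, 1, 1, 1]
boundary = select_boundary_patterns(points, labels, k=2)
```

Only the two points adjacent to the boundary are kept here, so an SVM trained afterwards would see a third of the data. This brute-force version costs quadratic time in the number of points, which a practical implementation would need to avoid.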

17.

Feature selection in the Laplacian support vector machine

Sangjun Lee Ja-Yong Koo 《Computational statistics & data analysis》2011,55(1):567-577

Traditional classifiers, including support vector machines, use only labeled data in training. However, labeled instances are often difficult, costly, or time consuming to obtain, while unlabeled instances are relatively easy to collect. The goal of semi-supervised learning is to improve classification accuracy by using unlabeled data together with a few labeled data in training classifiers. Recently, the Laplacian support vector machine has been proposed as an extension of the support vector machine to semi-supervised learning. Like the support vector machine, the Laplacian support vector machine has drawbacks in interpretability. It also performs poorly when there are many non-informative features in the training data, because the final classifier is expressed as a linear combination of informative as well as non-informative features. We introduce a variant of the Laplacian support vector machine that is capable of feature selection based on functional analysis of variance decomposition. Through synthetic and benchmark data analysis, we illustrate that our method can be a useful tool in semi-supervised learning.

18.

Ultrasonographic feature selection and pattern classification for cervical lymph nodes using support vector machines

Zhang J Wang Y Dong Y Wang Y 《Computer methods and programs in biomedicine》2007,88(1):75-84

A rough margin based support vector machine (RMSVM) classifier was proposed to improve the accuracy of ultrasound diagnoses for cervical lymph nodes. Thirty-six features belonging to 10 kinds of ultrasonographic characteristics were extracted for each of 110 lymph nodes in ultrasonograms. Comparison studies were done for three classifiers: the classical support vector machine (SVM), the general regression neural network, and the proposed RMSVM, each with or without feature selection by the recursive feature elimination (RFE) algorithm, based respectively on SVMs and the mean square error discriminant. The experimental results indicated that all classifiers benefited from the feature selection. The best classification performance was obtained by the RMSVM using thirteen features selected by the RMSVM-based RFE, which yielded a normalized area under the receiver operating characteristic curve (A(z)) of 0.859. Compared with the radiologist's performance of A(z) of 0.787, the developed computer-aided diagnosis algorithm has the potential to improve the diagnostic accuracy.

19.

Feedforward neural network construction using cross validation.

R Setiono 《Neural computation》2001,13(12):2865-2877

This article presents an algorithm that constructs feedforward neural networks with a single hidden layer for pattern classification. The algorithm starts with a small number of hidden units in the network and adds more hidden units as needed to improve the network's predictive accuracy. To determine when to stop adding new hidden units, the algorithm makes use of a subset of the available training samples for cross validation. New hidden units are added to the network only if they improve the classification accuracy of the network on the training samples and on the cross-validation samples. Extensive experimental results show that the algorithm is effective in obtaining networks with predictive accuracy rates that are better than those obtained by state-of-the-art decision tree methods.

20.

A multiple criteria active learning method for support vector regression

Begüm Demir Lorenzo Bruzzone 《Pattern recognition》2014

This paper presents a novel active learning method developed in the framework of ε-insensitive support vector regression (SVR) for the solution of regression problems with small initial training sets. The proposed active learning method iteratively selects the most informative as well as representative unlabeled samples to be included in the training set by jointly evaluating three criteria: (i) relevancy, (ii) diversity, and (iii) density of samples. All three criteria are implemented according to the SVR properties and are applied in two clustering-based consecutive steps. In the first step, a novel measure is defined to select the most relevant samples, namely those that have a high probability of being located either outside or on the boundary of the ε-tube of the SVR. To this end, a clustering method is initially applied to all unlabeled samples together with the training samples that are inside the ε-tube (those that are not support vectors, i.e., non-SVs); then the clusters with non-SVs are eliminated. The unlabeled samples in the remaining clusters are considered the most relevant patterns. In the second step, a novel measure is defined to select diverse samples among the relevant patterns from the high density regions in the feature space, in order to better model the SVR learning function. To this end, clusters with the highest density of samples are initially chosen to identify the highest density regions in the feature space. Then, the sample from each selected cluster that is associated with the portion of feature space having the highest density (i.e., the most representative of the underlying distribution of samples contained in the related cluster) is selected to be included in the training set. In this way, diverse samples taken from high density regions are efficiently identified. Experimental results obtained on four different data sets show the robustness of the proposed technique, particularly when only a small initial training set is available.
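
The role of the ε-tube can be made concrete with the standard ε-insensitive loss: residuals smaller than ε cost nothing, and samples strictly inside the tube are not support vectors. The sketch below shows only this definition, not the clustering-based selection described above; the helper names are illustrative.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Standard epsilon-insensitive loss: zero inside the tube, linear outside."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def is_inside_tube(y_true, y_pred, epsilon):
    """True when the residual is strictly smaller than epsilon."""
    return abs(y_true - y_pred) < epsilon


# A small residual falls inside the tube and incurs no loss.
loss_inside = epsilon_insensitive_loss(1.0, 1.05, 0.1)
# A large residual is penalized only by the amount exceeding epsilon.
loss_outside = epsilon_insensitive_loss(2.0, 1.0, 0.25)
```

Samples on or outside the tube are the ones that shape the regression function, which is why the method above treats unlabeled samples likely to land there as the most relevant.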