Similar literature
1.
The objective of this study is to explore the performance of credit scoring using a two-stage hybrid modeling procedure that combines artificial neural networks with multivariate adaptive regression splines (MARS). The rationale is first to build the credit scoring model with MARS; the significant variables it identifies then serve as the input nodes of the neural network model. To demonstrate the effectiveness and feasibility of the proposed modeling procedure, credit scoring tasks are performed on one bank housing loan dataset using cross-validation. The results show that the proposed hybrid approach outperforms discriminant analysis, logistic regression, artificial neural networks, and MARS, and hence provides an alternative for handling credit scoring tasks.

2.
Least squares support vector machines ensemble models for credit scoring   Cited by: 1 (self-citations: 0, citations by others: 1)
Owing to the recent financial crisis and the regulatory concerns of Basel II, credit risk assessment is becoming one of the most important topics in financial risk management. Quantitative credit scoring models are widely used for credit risk assessment in financial institutions. Although single support vector machines (SVM) have demonstrated good classification performance, a single classifier with a fixed set of training samples and parameter settings may carry some inductive bias. One effective way to reduce this bias is to use an ensemble model. In this study, several ensemble models based on least squares support vector machines (LSSVM) are proposed for credit scoring. The models are tested on two real-world datasets, and the results show that ensemble strategies improve performance to some degree and are effective for building credit scoring models.
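
As a rough illustration of the ensemble idea, the sketch below combines the ±1 predictions of several base classifiers by majority voting. The base classifiers are hypothetical stand-in threshold rules rather than LSSVMs, and majority voting is only one possible combination strategy; the abstract does not say which strategies the study uses.

```python
from collections import Counter

def majority_vote(classifiers, x):
    """Return the label predicted by most classifiers for sample x."""
    votes = Counter(clf(x) for clf in classifiers)
    # most_common(1) returns [(label, count)] for the most frequent label
    return votes.most_common(1)[0][0]

# Hypothetical base learners (assumptions for illustration only):
# each one applies a simple threshold rule to the input features.
base = [
    lambda x: 1 if x[0] > 0.5 else -1,
    lambda x: 1 if x[1] > 0.5 else -1,
    lambda x: 1 if x[0] + x[1] > 1.2 else -1,
]

label = majority_vote(base, (0.9, 0.4))
```

With an odd number of binary classifiers there are no ties, which is one reason small ensembles often use three or five members.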

3.
Support vector machines (SVM) are an effective tool for building good credit scoring models. However, the performance of the model depends on its parameter settings. In this study, we use a direct search method to optimize the SVM-based credit scoring model and compare it with three other parameter optimization methods: grid search, a method based on design of experiments (DOE), and a genetic algorithm (GA). Two real-world credit datasets are selected to demonstrate the effectiveness and feasibility of the method. The results show that the direct search method finds an effective model with high classification accuracy and good robustness, and depends less on the initial search space or starting point.
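
To make "direct search" concrete, here is a minimal sketch of one member of that family, compass (coordinate pattern) search, which uses only objective values and no gradients. The abstract does not say which direct search variant the study uses, and `toy_accuracy` is a hypothetical stand-in for cross-validated accuracy as a function of two SVM parameters.

```python
def compass_search(f, start, step=1.0, tol=1e-3, max_iter=1000):
    """Maximize f by probing +/- step along each coordinate."""
    point, best = list(start), f(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = f(trial)
                if value > best:
                    point, best, improved = trial, value, True
        if not improved:
            step /= 2.0  # shrink the pattern when no probe helps
    return point, best

# Hypothetical objective (an assumption): stands in for accuracy over
# (log2 C, log2 gamma); a real objective would train and score an SVM.
def toy_accuracy(p):
    return 0.9 - 0.01 * ((p[0] - 3.0) ** 2 + (p[1] + 5.0) ** 2)

best_point, best_value = compass_search(toy_accuracy, [0.0, 0.0])
```

Unlike grid search, the number of evaluations here depends on the path taken from the starting point rather than on the size of a predefined grid.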

4.
Recently, credit scoring has become a very important task as credit cards are now widely used by customers. A method that can accurately predict credit scores is greatly needed. One powerful classifier, the support vector machine (SVM), has been successfully applied to a wide range of domains. In recent years, researchers have applied SVM-based methods to credit scoring, and the results have shown them to be effective. In this study, two real-world credit datasets from the University of California Irvine Machine Learning Repository were selected. SVM and a new classifier, clustering-launched classification (CLC), were employed for credit scoring and compared on prediction accuracy. The advantages of CLC are that it classifies data efficiently and needs only one parameter to be set. The results show that CLC performs better than SVM. Therefore, CLC is an effective tool for credit scoring.

5.
The commencement of the Basel II requirements, the popularization of consumer loans, and intense competition in the financial market have increased financial institutions' awareness of the critical delinquency issue in granting loans to potential applicants. In the past few decades, artificial neural networks have been successfully applied in the financial field. Recently, the Support Vector Machine (SVM) has emerged as the better neural network for classification and forecasting problems owing to its superior generalization performance and global optimum. This study develops a loan evaluation model using SVM to identify potential applicants for consumer loans. In addition to comparing performance via cross-validation and a paired t test, we analyze misclassification errors in terms of Type I and Type II errors and their effect on selecting the network parameters of SVM. The findings facilitate the development of a useful visual decision-support tool. Experimental results on a real-world data set reveal that SVM surpasses traditional neural network models in generalization performance and in visualization via the visual tool, which helps decision makers determine appropriate loan evaluation strategies.
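
The two misclassification rates can be computed as in the sketch below. It assumes one common convention, which the abstract does not spell out and which varies across the credit scoring literature: a Type I error is a good applicant classified as bad, and a Type II error is a bad applicant classified as good. The labels and data are made up for illustration.

```python
def error_rates(y_true, y_pred, bad=1):
    """Return (type_i, type_ii) rates under the convention stated above."""
    good_total = sum(1 for t in y_true if t != bad)
    bad_total = sum(1 for t in y_true if t == bad)
    # Type I: good applicant predicted bad (false alarm)
    false_alarms = sum(1 for t, p in zip(y_true, y_pred) if t != bad and p == bad)
    # Type II: bad applicant predicted good (missed default)
    misses = sum(1 for t, p in zip(y_true, y_pred) if t == bad and p != bad)
    type_i = false_alarms / good_total if good_total else 0.0
    type_ii = misses / bad_total if bad_total else 0.0
    return type_i, type_ii

# Illustrative labels (assumed): 0 = good applicant, 1 = bad applicant
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]
type_i, type_ii = error_rates(y_true, y_pred)
```

Reporting the two rates separately matters because the costs are usually asymmetric: accepting a defaulting applicant tends to cost a lender more than turning away a good one.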

6.
SVM (support vector machines) techniques have recently arrived to complete the wide range of classification methods for complex systems. These classification systems offer performance similar to that of other classifiers (such as neural networks or classic statistical classifiers), and they are becoming a valuable tool in industry for solving real problems. One of the fundamental elements of this type of classifier is the metric used to determine the distance between samples of the population to be classified. Although the Euclidean distance is the most natural metric, it has certain disadvantages when developing classification systems that must adapt as the characteristics of the sample space change. Our study proposes a means of avoiding this problem using multivariate normalization of the inputs (during both training and classification). Using experimental results from a significant number of populations, the study confirms the improvement achieved in classification. Lastly, the study demonstrates that multivariate normalization applied to a real SVM is equivalent to using an SVM with the Mahalanobis distance measure on non-normalized data.
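
The claimed equivalence can be seen at the level of the distance itself. The derivation below is a sketch under an assumption: "multivariate normalization" is taken to mean whitening with the sample mean \mu and a positive definite sample covariance \Sigma, which the abstract does not state explicitly. The symbols z, \mu, \Sigma, \gamma and d_M are introduced here for illustration.

```latex
\[
z = \Sigma^{-1/2}(x - \mu)
\quad\Longrightarrow\quad
\lVert z_i - z_j \rVert^2
  = (x_i - x_j)^{\top} \Sigma^{-1} (x_i - x_j)
  = d_M(x_i, x_j)^2
\]
\[
\exp\!\left(-\gamma \lVert z_i - z_j \rVert^2\right)
  = \exp\!\left(-\gamma\, d_M(x_i, x_j)^2\right)
\]
```

The mean cancels in the difference, and the symmetric square root satisfies (\Sigma^{-1/2})^{\top}\Sigma^{-1/2} = \Sigma^{-1}, so a distance-based kernel such as the Gaussian one gives the same value on whitened inputs with the Euclidean distance as on raw inputs with the Mahalanobis distance.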

7.
Support vector regression (SVR) is a powerful tool in modeling and prediction tasks with widespread application in many areas. The most representative algorithms to train SVR models are Shevade et al.'s Modification 2 and Lin's WSS1 and WSS2 methods in the LIBSVM library. Both are variants of standard SMO in which the updating pairs selected are those that most violate the Karush-Kuhn-Tucker optimality conditions, to which LIBSVM adds a heuristic to improve the decrease in the objective function. In this paper, after presenting a simple derivation of the updating procedure based on a greedy maximization of the gain in the objective function, we show how cycle-breaking techniques that accelerate the convergence of support vector machines (SVM) in classification can also be applied under this framework, resulting in significantly improved training times for SVR.

8.
Credit scoring with a data mining approach based on support vector machines   Cited by: 3 (self-citations: 0, citations by others: 3)
The credit card industry has been growing rapidly, and banks' credit departments therefore collect large amounts of consumer credit data. The credit scoring manager often evaluates a consumer’s credit by intuition and experience. With the support of a credit classification model, however, the manager can evaluate the applicant’s credit score accurately. Support Vector Machine (SVM) classification is currently an active research area and has successfully solved classification problems in many domains. This study used three strategies to construct hybrid SVM-based credit scoring models that evaluate the applicant’s credit score from the applicant’s input features. Two credit datasets from the UCI database are selected as experimental data to demonstrate the accuracy of the SVM classifier. Compared with neural networks, genetic programming, and decision tree classifiers, the SVM classifier achieved identical classification accuracy with relatively few input features. Additionally, by combining genetic algorithms with the SVM classifier, the proposed hybrid GA-SVM strategy can perform feature selection and model parameter optimization simultaneously. Experimental results show that SVM is a promising addition to the existing data mining methods.

9.
With the rapid growth of the credit industry, credit scoring models are of great significance for issuing credit cards to applicants with minimum risk, so credit scoring is very important in financial firms such as banks. A model is built from previous data, and from that model a decision is made on whether an applicant will be granted a loan or credit card or be rejected. There are several methodologies for constructing a credit scoring model, e.g., neural network models, statistical classification techniques, genetic programming, and support vector models. The computational time needed to run a model is of great importance in the 21st century: algorithms or models with less computational time are more efficient and thus give more profit to banks or firms. In this study, we propose a new strategy to reduce the computational time for credit scoring. In this approach, we use an SVM combined with feature reduction using the F score, and we build the credit scoring model from a sample instead of the whole dataset. We ran our method on two real datasets to assess its performance and compared its results with those obtained from other well-known methods. The new method is shown to be very competitive with the other methods in accuracy while requiring less computational time.
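
The abstract does not define its F score. The sketch below assumes the per-feature F-score commonly used for SVM feature selection: the squared deviations of the two class means from the overall mean, divided by the sum of the within-class sample variances. The feature names and values are hypothetical.

```python
def f_score(pos, neg):
    """F-score of one feature from its values in the positive and negative class."""
    n_pos, n_neg = len(pos), len(neg)
    mean_pos = sum(pos) / n_pos
    mean_neg = sum(neg) / n_neg
    mean_all = (sum(pos) + sum(neg)) / (n_pos + n_neg)
    # Numerator: spread of the class means around the overall mean
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    # Denominator: sum of the sample variances within each class
    var_pos = sum((v - mean_pos) ** 2 for v in pos) / (n_pos - 1)
    var_neg = sum((v - mean_neg) ** 2 for v in neg) / (n_neg - 1)
    return between / (var_pos + var_neg)

# Hypothetical features: "income" separates the classes, "noise" does not.
scores = {
    "income": f_score([4, 5, 6], [1, 2, 3]),
    "noise": f_score([1, 5, 3], [2, 4, 3]),
}
ranked = sorted(scores, key=scores.get, reverse=True)
```

Features would then be kept in ranked order up to some cutoff. Because each score is computed per feature from simple sums, this filter is cheap compared with wrapper methods that retrain the SVM for every candidate subset.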

10.
Consumer credit scoring is often considered a classification task where clients receive either a good or a bad credit status. Default probabilities provide more detailed information about the creditworthiness of consumers, and they are usually estimated by logistic regression. Here, we present a general framework for estimating individual consumer credit risks using machine learning methods. Since a probability is an expected value, all nonparametric regression approaches that are consistent for the mean are consistent for the probability estimation problem. Among others, random forests (RF), k-nearest neighbors (kNN), and bagged k-nearest neighbors (bNN) belong to this class of consistent nonparametric regression approaches. We apply the machine learning methods and an optimized logistic regression to a large dataset of complete payment histories of short-term installment credits. We demonstrate probability estimation in Random Jungle, an RF package written in C++ with a generalized framework for fast tree growing, probability estimation, and classification. We also describe an algorithm for tuning the terminal node size for probability estimation. We demonstrate that regression RF outperforms the optimized logistic regression model, kNN, and bNN on the test data of the short-term installment credits.
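
The observation that a probability is an expected value means a regression method applied to 0/1 labels estimates the default probability directly. A minimal sketch with plain kNN regression follows; it is not the Random Jungle package, and the one-dimensional data are made up for illustration.

```python
def knn_probability(train_x, train_y, query, k=3):
    """Estimate P(y = 1 | query) as the mean 0/1 label of the k nearest points."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    # Sort training indices by distance to the query and keep the k closest
    order = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

# Illustrative data (assumed): one feature, label 1 = default
train_x = [(0.0,), (1.0,), (2.0,), (8.0,), (9.0,), (10.0,)]
train_y = [0, 0, 1, 1, 1, 1]
p = knn_probability(train_x, train_y, (1.5,), k=3)
```

Averaging labels rather than taking a majority vote is the only difference from kNN classification, and it is what turns the output into a probability estimate instead of a class.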

11.
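By reformulating the least squares support vector machine, we obtain an extremely simple and fast classifier, the direct support vector machine. Compared with the least squares support vector machine, this classifier only needs to compute directly the inverse of a smaller matrix, which greatly reduces the computation without lowering classification accuracy. We prove theoretically that this matrix is invertible, which guarantees the existence and uniqueness of the separating surface. For the linear case, the Sherman-Morrison-Woodbury formula is used to reduce the dimension of the matrix to be inverted, further reducing computational complexity and making the method applicable to larger sample sets. Numerical experiments show that the new classifier is feasible and has the advantages described above.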
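
For reference, the Sherman-Morrison-Woodbury identity in its general form is given first below. The abstract does not state which matrix is inverted, so the second line is only an assumed illustration of how the identity is typically applied with a linear kernel: H is a hypothetical m x n data matrix with n much smaller than m, and \nu > 0 is a regularization parameter.

```latex
\[
(A + UCV)^{-1} = A^{-1} - A^{-1}U\left(C^{-1} + VA^{-1}U\right)^{-1}VA^{-1}
\]
\[
\left(\frac{I_m}{\nu} + HH^{\top}\right)^{-1}
  = \nu\left(I_m - H\left(\frac{I_n}{\nu} + H^{\top}H\right)^{-1}H^{\top}\right)
\]
```

The second line follows from the first with A = I_m/\nu, U = H, C = I_n and V = H^{\top}. It replaces the inversion of an m x m matrix (one row per sample) with the inversion of an n x n matrix (one row per feature), which is what makes larger sample sets tractable.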

12.
This paper introduces a cylindricity evaluation algorithm based on support vector machine learning with a specific kernel function, referred to as SVR, as a viable alternative to the traditional least squares method (LSQ) and the non-linear programming algorithm (NLP). Using the theory of support vector machine regression, the proposed algorithm provides more robust evaluation in terms of CPU time and accuracy than NLP, and this is supported by computational experiments. Interestingly, SVR significantly outperforms LSQ in accuracy, while it evaluates cylindricity more robustly than NLP when the variance of the data points increases. The robust nature of the proposed algorithm is expected because it converts the original nonlinear problem with nonlinear constraints into another nonlinear problem with linear constraints. In addition, the proposed algorithm is programmed using the Java Runtime Environment to provide users with a Web-based open source environment. In a real-world setting, this would provide manufacturers with an algorithm that can be trusted to give the correct answer rather than causing a good part to be rejected because of inaccurate computational results.

13.
In cancer classification based on gene expression data, it would be desirable to defer a decision for observations that are difficult to classify. For instance, an observation for which the conditional probability of cancer is around 1/2 would preferably require more advanced tests rather than an immediate decision. This motivates the use of a classifier with a reject option that reports a warning for observations that are difficult to classify. In this paper, we consider the problem of gene selection with a reject option. Typically, gene expression data comprise expression levels of several thousand candidate genes. In such cases, an effective gene selection procedure is necessary to provide a better understanding of the underlying biological system that generates the data and to improve prediction performance. We propose a machine learning approach in which we apply the l1 penalty to the SVM with a reject option. This method is referred to as the l1 SVM with a reject option. We develop a novel optimization algorithm for this SVM, which is sufficiently fast and stable to analyze gene expression data. The proposed algorithm computes an entire solution path with respect to the regularization parameter. Results of numerical studies show that, in comparison with the standard l1 SVM, the proposed method efficiently reduces prediction errors without hampering gene selectivity.
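
The idea of a reject option can be illustrated with a generic thresholding rule on an estimated probability. This is not the l1 SVM with a reject option from the paper; the function name and the width `delta` of the reject band are assumptions made for the sketch.

```python
def classify_with_reject(p_cancer, delta=0.2):
    """Return 'cancer', 'normal', or 'reject' given an estimated probability."""
    if abs(p_cancer - 0.5) < delta:
        return "reject"  # too close to 1/2: defer to further tests
    return "cancer" if p_cancer > 0.5 else "normal"

# Three illustrative probability estimates (made-up values)
decisions = [classify_with_reject(p) for p in (0.95, 0.55, 0.10)]
```

A wider band rejects more observations and lowers the error rate on the rest, so `delta` encodes how costly a deferred decision is relative to a wrong one.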

14.
A parallel randomized support vector machine (PRSVM) and a parallel randomized support vector regression (PRSVR) algorithm based on a randomized sampling technique are proposed in this paper. The proposed PRSVM and PRSVR have four major advantages over previous methods. (1) We prove that the proposed algorithms achieve an average convergence rate that is, to the best of our knowledge, the fastest bounded convergence rate so far among all SVM decomposition training algorithms. The fast average convergence bound is achieved by a unique priority-based sampling mechanism. (2) Unlike previous work (Provably fast training algorithm for support vector machines, 2001), the proposed algorithms work for general linearly nonseparable SVM and general non-linear SVR problems. This improvement is achieved by modeling new LP-type problems based on Karush–Kuhn–Tucker optimality conditions. (3) The proposed algorithms are the first parallel version of randomized sampling algorithms for SVM and SVR. Both the analytical convergence bound and the numerical results in a real application show that the proposed algorithm has good scalability. (4) We present demonstrations of the algorithms based on both synthetic data and data obtained from a real-world application. Performance comparisons with SVMlight show that the proposed algorithms may be efficiently implemented.

15.
Fuzzy functions with support vector machines   Cited by: 1 (self-citations: 0, citations by others: 1)
A new fuzzy system modeling (FSM) approach that identifies the fuzzy functions using support vector machines (SVM) is proposed. This new approach is structurally different from fuzzy rule base approaches and fuzzy regression methods. It is a new alternative version of the earlier FSM with fuzzy functions approaches. SVM is applied to determine the support vectors for each fuzzy cluster obtained by the fuzzy c-means (FCM) clustering algorithm. The original input variables, together with the membership values obtained from FCM and their transformations, form a new augmented set of input variables. The performance of the proposed system modeling approach is compared with previous fuzzy functions approaches, standard SVM, and LSE methods using an artificial sparse dataset and a real-life non-sparse dataset. The results indicate that the proposed fuzzy functions with support vector machines approach is a feasible and stable method for regression problems and achieves higher performance than the classical statistical methods.
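
The augmentation step can be sketched as follows, using the standard FCM membership formula for fixed cluster centers with fuzzifier m. The centers, the function names, and the choice to append only the raw memberships (leaving out the transformations the abstract mentions) are assumptions for illustration.

```python
def fcm_memberships(point, centers, m=2.0):
    """Membership of one point in each cluster, given fixed cluster centers."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, center)) ** 0.5
             for center in centers]
    # A point sitting exactly on a center belongs fully to that cluster
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1))
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]

def augment(point, centers, m=2.0):
    """Append the membership values to the original input variables."""
    return list(point) + fcm_memberships(point, centers, m)

# Hypothetical centers; in practice they would come from running FCM
row = augment((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
```

Each augmented row can then be passed to a separate regressor per cluster, which is where the SVM would enter in the approach described above.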

16.
Support vector machine (SVM) is a novel pattern classification method that is valuable in many applications. Kernel parameter setting in the SVM training process, along with feature selection, significantly affects classification accuracy. The objective of this study is to obtain better parameter values while also finding a subset of features that does not degrade the SVM classification accuracy. This study develops a simulated annealing (SA) approach for parameter determination and feature selection in the SVM, termed SA-SVM. To evaluate the proposed SA-SVM approach, several datasets from the UCI machine learning repository are used to calculate the classification accuracy rate. The proposed approach was compared with grid search, which is a conventional method of parameter setting, and with various other methods. Experimental results indicate that the classification accuracy rates of the proposed approach exceed those of grid search and the other approaches. The SA-SVM is thus useful for parameter determination and feature selection in the SVM.
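
A minimal sketch of simulated annealing over a joint state of two kernel parameters and a binary feature mask is shown below. It is a generic SA loop, not the SA-SVM procedure from the paper: `toy_score` is a hypothetical stand-in for cross-validated accuracy, and the neighborhood moves and cooling schedule are assumptions.

```python
import math
import random

def simulated_annealing(score, start, neighbor, temp=1.0, cooling=0.95,
                        steps=500, seed=0):
    """Maximize score(state); return the best state seen and its score."""
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = neighbor(current, rng)
        cand_score = score(candidate)
        # Always accept improvements; accept worse moves with Boltzmann probability
        if (cand_score >= current_score
                or rng.random() < math.exp((cand_score - current_score) / temp)):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current, current_score
        temp *= cooling
    return best, best_score

# Hypothetical objective: a state is (log2_C, log2_gamma, feature_mask),
# and features 0 and 2 are the "useful" ones in this toy setup.
def toy_score(state):
    log_c, log_g, mask = state
    fit = -0.01 * ((log_c - 2) ** 2 + (log_g + 3) ** 2)
    useful = 0.1 * (mask[0] + mask[2])
    penalty = 0.02 * sum(mask)  # prefer fewer features
    return 0.7 + fit + useful - penalty

def neighbor(state, rng):
    """Change one parameter by one step or flip one feature bit."""
    log_c, log_g, mask = state
    move = rng.randrange(3)
    if move == 0:
        return (log_c + rng.choice((-1, 1)), log_g, mask)
    if move == 1:
        return (log_c, log_g + rng.choice((-1, 1)), mask)
    flip = rng.randrange(len(mask))
    return (log_c, log_g, tuple(b ^ 1 if i == flip else b for i, b in enumerate(mask)))

start = (0, 0, (1, 1, 1, 1))
best_state, best_score = simulated_annealing(toy_score, start, neighbor)
```

Searching parameters and the feature mask in one state lets the two interact, which a grid search over parameters alone cannot capture.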

17.
We propose new support vector machines (SVMs) that incorporate the geometric distribution of an input data set by associating each data point with a possibilistic membership, which measures the relative strength of the self class membership. By using a possibilistic distance measure based on the possibilistic membership, we reformulate conventional SVMs in three ways. The proposed methods are shown to have better classification performance than conventional SVMs in various tests.

18.
Texture classification using the support vector machines   Cited by: 12 (self-citations: 0, citations by others: 12)
Shutao  James T.  Hailong  Yaonan 《Pattern recognition》2003,36(12):2883-2893
In recent years, support vector machines (SVMs) have demonstrated excellent performance in a variety of pattern recognition problems. In this paper, we apply SVMs to texture classification, using translation-invariant features generated from the discrete wavelet frame transform. To alleviate the problem of selecting the right kernel parameter in the SVM, we use a fusion scheme based on multiple SVMs, each with a different setting of the kernel parameter. Compared with the traditional Bayes classifier and the learning vector quantization algorithm, SVMs, and in particular the fused output from multiple SVMs, produce more accurate classification results on the Brodatz texture album.

19.
Recently, researchers have been focusing more on the study of the support vector machine (SVM) owing to its useful applications in a number of areas, such as pattern recognition, multimedia, image processing, and bioinformatics. One of the main research issues is how to improve the efficiency of the original SVM model while preventing any deterioration of its classification performance. In this paper, we propose a modified SVM based on the properties of support vectors and a pruning strategy that preserves support vectors while eliminating redundant training vectors. The experiments on real images show that (1) our proposed approach can reduce the number of input training vectors while preserving the support vectors, which leads to a significant reduction in computational cost while attaining similar levels of accuracy, and (2) the approach also works well when applied to image segmentation.

20.
Support vector clustering involves three steps: solving an optimization problem, identifying clusters, and tuning hyper-parameters. In this paper, we introduce a pre-processing step that eliminates data points from the training data that are not crucial for clustering. Pre-processing is efficiently implemented using the R*-tree data structure. Experiments on real-world and synthetic datasets show that pre-processing drastically decreases the run-time of the clustering algorithm. In many cases, a reduction in the number of support vectors is also achieved. Further, we suggest an improvement to the cluster identification step.
