1.
A data driven ensemble classifier for credit scoring analysis (cited by: 2; self-citations: 0; by others: 2)
This study focuses on predicting whether a credit applicant can be categorized as good, bad or borderline from information initially supplied. This is essentially a classification task for credit scoring. Given its importance, many researchers have recently worked on an ensemble of classifiers. However, to the best of our knowledge, unrepresentative samples drastically reduce the accuracy of the deployment classifier. Few have attempted to preprocess the input samples into more homogeneous cluster groups and then fit the ensemble classifier accordingly. For this reason, we introduce the concept of class-wise classification as a preprocessing step in order to obtain an efficient ensemble classifier. This strategy would work better than a direct ensemble of classifiers without the preprocessing step. The proposed ensemble classifier is constructed by incorporating several data mining techniques, mainly involving optimal associate binning to discretize continuous values; neural network, support vector machine, and Bayesian network are used to augment the ensemble classifier. In particular, the Markov blanket concept of Bayesian network allows for a natural form of feature selection, which provides a basis for mining association rules. The learned knowledge is represented in multiple forms, including causal diagram and constrained association rules. The data driven nature of the proposed system distinguishes it from existing hybrid/ensemble credit scoring systems.
2.
María-Dolores Cubiles-De-La-Vega Antonio Blanco-Oliver Rafael Pino-Mejías Juan Lara-Rubio 《Expert systems with applications》2013,40(17):6910-6917
A wide range of supervised classification algorithms have been successfully applied for credit scoring in non-microfinance environments according to recent literature. However, credit scoring in the microfinance industry is a relatively recent application, and current research is based, to the best of our knowledge, on classical statistical methods. This lack is surprising since the implementation of credit scoring based on supervised classification algorithms should contribute towards the efficiency of microfinance institutions, thereby improving their competitiveness in an increasingly constrained environment. This paper explores an extensive list of Statistical Learning techniques as microfinance credit scoring tools from an empirical viewpoint. A data set of microcredits belonging to a Peruvian Microfinance Institution is considered, and the following models are applied to decide between default and non-default credits: linear and quadratic discriminant analysis, logistic regression, multilayer perceptron, support vector machines, classification trees, and ensemble methods based on bagging and boosting algorithms. The obtained results suggest the use of a multilayer perceptron trained in the R statistical system with a second order algorithm. Moreover, our findings show that, with the implementation of this MLP-based model, the MFI's misclassification costs could be reduced to 13.7% with respect to the application of other classic models.
3.
In this paper, a generalized adaptive ensemble generation and aggregation (GAEGA) method for the design of multiple classifier systems (MCSs) is proposed. GAEGA adopts an “over-generation and selection” strategy to achieve a good bias-variance tradeoff. In the training phase, different ensembles of classifiers are adaptively generated by fitting the validation data globally with different degrees. The test data are then classified by each of the generated ensembles. The final decision is made by taking into consideration both the ability of each ensemble to fit the validation data locally and reducing the risk of overfitting. In this paper, the performance of GAEGA is assessed experimentally in comparison with other multiple classifier aggregation methods on 16 data sets. The experimental results demonstrate that GAEGA significantly outperforms the other methods in terms of average accuracy, ranging from 2.6% to 17.6%.
4.
Due to the recent financial crisis and the regulatory concerns of Basel II, credit risk assessment is becoming one of the most important topics in the field of financial risk management. Quantitative credit scoring models are widely used tools for credit risk assessment in financial institutions. Although single support vector machines (SVM) have demonstrated good performance in classification, a single classifier with a fixed group of training samples and parameter settings may have some kind of inductive bias. One effective way to reduce the bias is an ensemble model. In this study, several ensemble models based on least squares support vector machines (LSSVM) are brought forward for credit scoring. The models are tested on two real-world datasets, and the results show that ensemble strategies can help to improve performance to some degree and are effective for building credit scoring models.
5.
A number of credit scoring models have been developed to evaluate the credit risk of new loan applicants and existing loan customers, respectively. This study proposes a method to manage existing customers by using the misclassification patterns of a credit scoring model. We divide two groups of customers, the currently good and bad credit customers, into two subgroups each, according to whether or not their credit status is misclassified by the neural network model. In addition, we infer the characteristics of each subgroup and propose management strategies corresponding to each subgroup.
6.
Analyzing bank databases for customer behavior management is difficult since bank databases are multi-dimensional, comprising monthly account records and daily transaction records. This study proposes an integrated data mining and behavioral scoring model to manage existing credit card customers in a bank. A self-organizing map neural network was used to identify groups of customers based on repayment behavior and recency, frequency, monetary behavioral scoring predictors. It also classified bank customers into three major profitable groups of customers. The resulting groups of customers were then profiled by customers' feature attributes determined using an Apriori association rule inducer. This study demonstrates that identifying customers by a behavioral scoring model helps to characterize customers and facilitates marketing strategy development.
7.
The aim of bankruptcy prediction in the areas of data mining and machine learning is to develop an effective model that provides higher prediction accuracy. In the prior literature, various classification techniques have been developed and studied, among which classifier ensembles that combine multiple classifiers have been shown to outperform many single classifiers. However, in terms of constructing classifier ensembles, there are three critical issues which can affect their performance. The first is the classification technique adopted, and the other two are the method used to combine multiple classifiers and the number of classifiers to be combined, respectively. Since few relevant studies have examined these issues, this paper conducts a comprehensive comparison of classifier ensembles built with three widely used classification techniques, multilayer perceptron (MLP) neural networks, support vector machines (SVM), and decision trees (DT), based on two well-known combination methods, bagging and boosting, and different numbers of combined classifiers. Our experimental results on three public datasets show that DT ensembles composed of 80–100 classifiers using the boosting method perform best. The Wilcoxon signed-rank test also demonstrates that DT ensembles built by boosting perform significantly differently from the other classifier ensembles. Moreover, a further study of a real-world case using a Taiwan bankruptcy dataset was conducted, which also demonstrates the superiority of DT ensembles built by boosting over the others.
8.
The credit card industry has been growing rapidly in recent years, and thus huge amounts of consumer credit data are collected by the credit departments of banks. The credit scoring manager often evaluates the consumer’s credit with intuitive experience. However, with the support of a credit classification model, the manager can accurately evaluate the applicant’s credit score. Support Vector Machine (SVM) classification is currently an active research area and successfully solves classification problems in many domains. This study used three strategies to construct hybrid SVM-based credit scoring models to evaluate the applicant’s credit score from the applicant’s input features. Two credit datasets in the UCI database are selected as the experimental data to demonstrate the accuracy of the SVM classifier. Compared with neural networks, genetic programming, and decision tree classifiers, the SVM classifier achieved an identical classificatory accuracy with relatively few input features. Additionally, by combining genetic algorithms with the SVM classifier, the proposed hybrid GA-SVM strategy can simultaneously perform feature selection and model parameter optimization. Experimental results show that SVM is a promising addition to the existing data mining methods.
9.
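Credit scoring and rating based on data mining clustering techniques (cited by: 7; self-citations: 0; by others: 7)
This paper proposes a credit scoring and rating method based on data mining clustering techniques. The method uses a data mining clustering algorithm to improve the traditional credit scoring model. The paper gives a theoretical proof of the method, implements it in a credit card analysis system called DMCA, and reports detailed data tests. Both the theoretical proof and the experimental results show that clustering achieves good results on issues of the traditional credit scoring model such as DM/MTM, cutoff values, mean square deviation, and cross-validation.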
10.
Neural nets have become one of the most important tools used in credit scoring. Credit scoring has been regarded as a core appraisal tool of commercial banks during the last few decades. The purpose of this paper is to investigate the ability of neural nets, such as probabilistic neural nets and multi-layer feed-forward nets, and conventional techniques, such as discriminant analysis, probit analysis and logistic regression, in evaluating credit risk in Egyptian banks applying credit scoring models. The credit scoring task is performed on one bank’s personal loans data set. The results so far reveal that the neural net models gave a better average correct classification rate than the other techniques. A one-way analysis of variance and other tests have been applied, demonstrating that there are some significant differences amongst the means of the correct classification rates pertaining to different techniques.
11.
Bo-Wen Chi Chiun-Chieh Hsu 《Expert systems with applications》2012,39(3):2650-2661
A credit scoring model is an important tool for assessing risks in the financial industry; consequently, the majority of financial institutions actively develop credit scoring models for the credit approval assessment of new customers and the credit risk management of existing customers. Nonetheless, most past research has used a one-dimensional credit scoring model to measure customer risk. In this study, we select important variables with a genetic algorithm (GA) and combine the bank’s internal behavioral scoring model with the external credit bureau scoring model to construct a dual scoring model for credit risk management of mortgage accounts. The dual model supports more accurate risk judgment and segmentation, which helps identify the parts of the mortgage portfolio that require enhanced management or control. The results show that the predictive ability of the dual scoring model outperforms both the one-dimensional behavioral scoring model and the credit bureau scoring model. Moreover, this study proposes credit strategies such as on-lending retaining and collection actions for corresponding customers in order to contribute benefits to the practice of banking credit.
12.
In this paper the problem of finding piecewise linear boundaries between sets is considered and is applied to solving supervised data classification problems. An algorithm for the computation of piecewise linear boundaries, consisting of two main steps, is proposed. In the first step sets are approximated by hyperboxes to find so-called “indeterminate” regions between sets. In the second step sets are separated inside these “indeterminate” regions by piecewise linear functions. These functions are computed incrementally starting with a linear function. Results of numerical experiments are reported. These results demonstrate that the new algorithm requires a reasonable training time and produces consistently good test set accuracy on most data sets compared with mainstream classifiers.
13.
Jue Wang Abdel-Rahman Hedar Shouyang Wang Jian Ma 《Expert systems with applications》2012,39(6):6123-6128
As the credit industry has been growing rapidly, credit scoring models have been widely used by the financial industry to improve cash flow and credit collections. However, credit datasets contain a large amount of redundant information and features, which leads to lower accuracy and higher complexity of the credit scoring model. Effective feature selection methods are therefore necessary for credit datasets with a huge number of features. In this paper, a novel approach to feature selection based on rough sets and scatter search, called RSFS, is proposed. In RSFS, conditional entropy is used as the heuristic to search for optimal solutions. Two credit datasets in the UCI database are selected to demonstrate the competitive performance of RSFS with three credit models: a neural network model, a J48 decision tree, and logistic regression. The experimental results show that RSFS has superior performance in saving computational costs and improving classification accuracy compared with the base classification methods.
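The conditional-entropy heuristic mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the RSFS algorithm itself (the scatter search and rough-set reduct steps are not described in the abstract). It only shows how the conditional entropy of a class label given a discrete feature could be computed. The function names and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(feature_values, labels):
    """H(label | feature): label entropy within each feature value,
    weighted by how often that feature value occurs."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    return sum(len(g) / n * entropy(g) for g in groups.values())


# Hypothetical toy data: four applicants with a class label and two features.
labels = ["good", "good", "bad", "bad"]
informative = ["own", "own", "rent", "rent"]   # determines the label exactly
uninformative = ["a", "b", "a", "b"]           # tells us nothing about the label

h_informative = conditional_entropy(informative, labels)
h_uninformative = conditional_entropy(uninformative, labels)
```

A lower conditional entropy means the feature leaves less uncertainty about the class, so a search heuristic of this kind would prefer `informative` over `uninformative`. To score a subset of features rather than a single one, tuples of values can be passed as `feature_values`.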
14.
Recently, credit scoring has become a very important task as credit cards are now widely used by customers. A method that can accurately predict credit scores is greatly needed, and good prediction techniques can help to predict credit more accurately. One powerful classifier, the support vector machine (SVM), has been successfully applied to a wide range of domains. In recent years, researchers have applied SVM-based methods to the prediction of credit scoring, and the results have shown them to be effective. In this study, two real-world credit datasets from the University of California Irvine Machine Learning Repository were selected. SVM and a new classifier, clustering-launched classification (CLC), were employed to predict the accuracy of credit scoring. The advantages of CLC are that it can classify data efficiently and that only one parameter needs to be decided. In substance, the results show that CLC is better than SVM. Therefore, CLC is an effective tool for predicting credit scoring.
15.
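A resource-limited artificial immune classifier is used to study freeway incident detection. The algorithm of the artificial immune recognition system (AIRS) is described, and the influence of freeway incidents on traffic flow is then analyzed. Upstream volume, upstream occupancy, downstream volume, and downstream occupancy at multiple time instants are selected as the inputs to AIRS. Finally, simulation experiments are carried out with sample data provided by the freeway management authority. The experimental results show that the artificial immune classifier learns quickly and achieves high classification accuracy, providing a practical new approach to freeway incident detection.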
16.
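Analysis and research prospects of artificial immune data mining methods (cited by: 3; self-citations: 2; by others: 3)
At present, artificial immune systems (AIS), inspired by the biological immune system, are on the rise. As a new field of computational intelligence research, they provide a powerful paradigm for information processing and problem solving. This paper briefly introduces the structure and relevant mechanisms of the biological immune system, reviews applications of artificial immune systems in data mining, analyzes recent research results of AIS in data mining applications, and points out directions for further research.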
17.
《Expert systems with applications》2014,41(8):3825-3830
Previous studies about ensembles of classifiers for bankruptcy prediction and credit scoring have been presented. In these studies, different ensemble schemes for complex classifiers were applied, and the best results were obtained using the Random Subspace method. The Bagging scheme was one of the ensemble methods used in the comparison. However, it was not correctly used. It is very important to use this ensemble scheme on weak and unstable classifiers in order to produce diversity in the combination. In order to improve the comparison, the Bagging scheme on several decision tree models is applied to bankruptcy prediction and credit scoring. Decision trees encourage diversity in the combination of classifiers. Finally, an experimental study shows that the Bagging scheme on decision trees presents the best results for bankruptcy prediction and credit scoring.
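As a rough illustration of the idea in this abstract (bagging applied to weak, unstable tree learners), the following Python sketch trains one-split decision trees ("stumps") on bootstrap samples and combines them by majority vote. It is a simplified sketch, not the experimental setup of the paper: the stump learner, the function names, the number of estimators, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split decision tree: choose the feature/threshold pair
    with the fewest training errors and return it as a predict function."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # An empty right branch falls back to the left majority label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(label != left_label for label in left)
                      + sum(label != right_label for label in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    _, j, t, left_label, right_label = best
    return lambda row: left_label if row[j] <= t else right_label


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Combine the individual predictions by majority vote."""
    return Counter(model(row) for model in models).most_common(1)[0][0]


# Hypothetical toy data: one numeric feature, low values good, high values bad.
X = [[1.0], [2.0], [3.0], [7.0], [8.0], [9.0]]
y = ["good", "good", "good", "bad", "bad", "bad"]
models = fit_bagging(X, y)
low_prediction = predict_bagging(models, [1.0])
high_prediction = predict_bagging(models, [9.0])
```

Because each bootstrap sample differs, the stumps pick different thresholds, which is the source of the diversity the abstract refers to. The vote then averages out the errors of individual stumps.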
18.
An artificial immune system approach to CNC tool path generation (cited by: 2; self-citations: 0; by others: 2)
Erkan Ülker Mehmet Emin Turanalp H. Selçuk Halkaci 《Journal of Intelligent Manufacturing》2009,20(1):67-77
Reduced machining time and increased accuracy for a sculptured surface are both very important when producing complicated parts, so the step size and tool-path interval are essential components in high-speed and high-resolution machining. If they are too small, the machining time will increase, whereas if they are too large, rough surfaces will result. In particular, the machining time, which is a key factor in high-speed machining, is affected by the tool-path interval more than the step size. The present paper introduces a ‘system software’ developed to reduce machining time and increase accuracy for a sculptured surface with Non-Uniform Rational B-Spline (NURBS) patches. The system is mainly based on a new and powerful artificial intelligence (AI) tool called artificial immune systems (AIS). It is implemented in the C programming language on a PC. It can be used as a stand-alone system or as an integrated module of a CNC machine tool. With the use of AIS, the impact and power of AI techniques are reflected in the performance of the tool path optimization system. The methodology of the developed tool path optimization system is illustrated with practical examples in this paper.
19.
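Basic theory and applications of artificial immune systems (cited by: 2; self-citations: 0; by others: 2)
This paper introduces the working mechanisms and characteristics of the biological immune system and artificial immune algorithms, and compares artificial immune systems with other intelligent methods. It also summarizes engineering applications of artificial immune systems and gives an outlook on directions that require further research.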
20.
Akhil Bandhu Hens Manoj Kumar Tiwari 《Expert systems with applications》2012,39(8):6774-6781
With the rapid growth of the credit industry, credit scoring models are of great significance for issuing credit cards to applicants with minimum risk. Credit scoring is therefore very important in financial firms such as banks. A model is established from previous data, and from that model a decision is taken on whether an applicant will be granted a loan or credit card or will be rejected. There are several methodologies for constructing a credit scoring model, e.g. neural network models, statistical classification techniques, genetic programming, and support vector models. The computational time for running a model is of great importance in the 21st century. Algorithms or models with less computational time are more efficient and thus give more profit to banks or firms. In this study, we propose a new strategy to reduce the computational time for credit scoring. In this approach we use an SVM combined with feature reduction using the F score, and we take a sample instead of the whole dataset to create the credit scoring model. We run our method on two real datasets to assess its performance and compare its results with those obtained from other well-known methods. The new method is shown to be very competitive with the other methods in terms of accuracy while requiring less computational time.
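The feature-reduction step described in this abstract can be sketched in Python as follows. The abstract does not define the F score, so this sketch assumes the commonly used two-class definition: the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances. The function names and the toy data are hypothetical, and the SVM training and sampling steps are omitted.

```python
from statistics import mean, variance


def f_score(values, labels, positive=1):
    """F score of one feature for a two-class problem (assumed definition):
    between-class separation divided by within-class sample variance.
    Needs at least two samples per class and non-zero total variance."""
    pos = [v for v, label in zip(values, labels) if label == positive]
    neg = [v for v, label in zip(values, labels) if label != positive]
    overall = mean(values)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1)
    return between / within


def rank_features(columns, labels):
    """Return feature names sorted from highest to lowest F score."""
    return sorted(columns, key=lambda name: f_score(columns[name], labels),
                  reverse=True)


# Hypothetical toy data: six applicants, label 1 = default, 0 = non-default.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "separating": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],  # class means far apart
    "noise": [1.0, 9.0, 5.0, 2.0, 8.0, 5.0],       # class means identical
}
ranking = rank_features(columns, labels)
```

Features whose score falls below a chosen threshold would be dropped before training the classifier. The threshold itself is a tuning choice that the abstract does not specify.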