Similar literature
1.
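Cost-sensitive support vector machines   Total citations: 11, self-citations: 1, citations by others: 11
Traditional classification algorithms that target classification accuracy usually assume that every misclassification carries the same cost and that the classes contain roughly equal numbers of samples. When this assumption does not hold in real-world data mining, applying these algorithms directly does not give satisfactory classification and prediction results. To address this shortcoming, and building on the standard SVM, a design method for a cost-sensitive support vector machine (CS-SVM) is proposed that integrates the different misclassification costs of samples into the SVM design. Experimental results show that CS-SVM is effective.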

2.
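Cost-sensitive mining based on support vector machines   Total citations: 4, self-citations: 0, citations by others: 4
For data mining applications in which negative and positive samples have different misclassification costs, a cost-sensitive support vector machine algorithm, CS-SVM, is proposed. CS-SVM consists of three steps: first, a sigmoid function is introduced to estimate each sample's posterior probability from its distance to the separating hyperplane; next, the class labels of the training samples are reconstructed according to the principle of minimum misclassification cost; finally, a standard SVM is trained on the relabeled training set, which yields an optimal separating hyperplane that embeds the misclassification costs. Based on the idea behind CS-SVM, a general cost-sensitive classification algorithm that embeds misclassification costs, G-CSC, is also proposed. Experimental results show that, compared with SVM, CS-SVM greatly reduces the average misclassification cost on the test set.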
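
The first two of the three steps in this abstract (sigmoid posterior, then minimum-cost relabeling) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the fixed sigmoid parameters `a` and `b`, and the two cost arguments are all assumptions introduced here; in practice the sigmoid parameters would be fitted to the SVM outputs.

```python
import math

def posterior_positive(decision_value, a=-1.0, b=0.0):
    """Sigmoid map from a signed distance f to P(y=+1 | f) = 1 / (1 + exp(a*f + b)).

    a and b are placeholder values (assumption); they would normally be fitted.
    """
    return 1.0 / (1.0 + math.exp(a * decision_value + b))

def min_cost_label(p_pos, cost_fn, cost_fp):
    """Pick the label (+1 or -1) with the smaller expected misclassification cost.

    cost_fn: cost of labelling a true positive as -1 (assumed name)
    cost_fp: cost of labelling a true negative as +1 (assumed name)
    """
    risk_if_neg = p_pos * cost_fn          # expected cost of choosing -1
    risk_if_pos = (1.0 - p_pos) * cost_fp  # expected cost of choosing +1
    return 1 if risk_if_pos <= risk_if_neg else -1

def relabel(decision_values, cost_fn, cost_fp):
    """Reconstruct class labels for a list of signed distances to the hyperplane."""
    return [min_cost_label(posterior_positive(f), cost_fn, cost_fp)
            for f in decision_values]

# With a false negative five times as costly as a false positive,
# a sample slightly on the negative side is relabeled as positive.
print(relabel([-2.0, -0.5, 0.5, 2.0], cost_fn=5.0, cost_fp=1.0))
```

The third step, training a standard SVM on the relabeled set, is left out because it depends on the SVM library in use.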

3.
In this paper, we propose a normalized semi-supervised probabilistic expectation-maximization neural network (PEMNN) that minimizes Bayesian misclassification cost risk. Using simulated and real-world datasets, we compare the proposed PEMNN with a supervised cost-sensitive probabilistic neural network (PNN), discriminant analysis (DA), a mathematical integer programming (MIP) model and support vector machines (SVM) for different misclassification cost asymmetries and class biases. The results of our experiments indicate that the PEMNN performs better when class data distributions are normal or uniform. However, when the class data distribution is exponential, the performance of PEMNN deteriorates, giving a slight advantage to the competing MIP, DA, PNN and SVM techniques. For real-world data with non-parametric distributions and mixed decision-making attributes (continuous and categorical), the PEMNN outperforms the PNN.

4.
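Software defect prediction is a typical imbalanced learning problem. Building on CS-SVM and a clustering algorithm, this work improves the cost-sensitive support vector machine (SVM) algorithm and proposes the CCS-SVM software defect prediction model. In the CCS-SVM model, SVM is combined with class-level misclassification costs, an evaluation metric for imbalanced data serves as the objective function, and the misclassification cost factors are optimized to raise the recognition rate of minority-class samples. Clustering is used to find the center of each class; each sample's class confidence is defined from its distance to its class center, each sample is assigned its own misclassification cost coefficient, and the sample confidence is introduced into the cost-sensitive SVM optimization problem, which improves the robustness of the algorithm and the classification performance of the SVM. In addition, to improve the model's generalization ability, a genetic algorithm is used to optimize feature selection and model parameters. Experiments on the NASA MDP datasets show that the proposed method clearly improves the G-mean and F-measure scores.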
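
For readers unfamiliar with the two imbalanced-data metrics named in this abstract, the sketch below computes them from confusion-matrix counts using their usual definitions (G-mean as the geometric mean of recall and specificity, F-measure as the harmonic mean of precision and recall). The function name and the example counts are illustrative assumptions, not values from the paper.

```python
import math

def g_mean_and_f_measure(tp, fp, tn, fn):
    """Return (G-mean, F-measure) for a binary confusion matrix.

    The minority (defective) class is treated as the positive class.
    """
    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = math.sqrt(recall * specificity)
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return g_mean, f_measure

# Hypothetical counts: 10 defective modules, 100 clean modules.
g, f = g_mean_and_f_measure(tp=8, fp=10, tn=90, fn=2)
print(round(g, 4), round(f, 4))
```

Unlike plain accuracy, both scores drop sharply when the minority class is poorly recognized, which is why they suit an imbalanced problem such as defect prediction.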

5.
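Research on telecom customer churn prediction based on cost-sensitive SVM*   Total citations: 3, self-citations: 0, citations by others: 3
To address the class imbalance of customer churn datasets and the differences in misclassification costs, cost-sensitive learning is applied to the support vector machine with different penalty coefficients proposed by Veropoulos. A customer churn prediction model is built and validated on real telecom customer churn data. A comparison with conventional SVM, C4.5 and ANN shows improvements in accuracy, hit rate, coverage rate and lift, indicating that the method effectively handles dataset imbalance and unequal misclassification costs and is an effective approach to customer churn prediction.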
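
The idea of class-specific penalty coefficients can be illustrated by evaluating a soft-margin objective in which slack on positive samples is weighted by one coefficient and slack on negative samples by another. The sketch below only evaluates this objective for a given linear model; it does not train an SVM, and the names `c_pos` and `c_neg` as well as the toy data are assumptions made for illustration.

```python
def weighted_svm_objective(w, b, samples, labels, c_pos, c_neg):
    """Evaluate 0.5*||w||^2 + c_pos * (slack on +1 samples) + c_neg * (slack on -1 samples).

    Slack is the hinge loss max(0, 1 - y * (w.x + b)) of each sample.
    """
    margin_term = 0.5 * sum(wi * wi for wi in w)
    penalty = 0.0
    for x, y in zip(samples, labels):
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        slack = max(0.0, 1.0 - y * score)
        penalty += (c_pos if y == 1 else c_neg) * slack
    return margin_term + penalty

# Toy data: two churners (+1) and one non-churner (-1), all hypothetical.
samples = [[0.5, 0.0], [2.0, 0.0], [-0.5, 0.0]]
labels = [1, 1, -1]
print(weighted_svm_objective([1.0, 0.0], 0.0, samples, labels, c_pos=10.0, c_neg=1.0))
print(weighted_svm_objective([1.0, 0.0], 0.0, samples, labels, c_pos=1.0, c_neg=1.0))
```

Raising `c_pos` relative to `c_neg` makes margin violations on the minority class more expensive, so an optimizer minimizing this objective shifts the boundary away from that class.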

6.
The commencement of the Basel II requirement, the popularization of consumer loans and the intense competition in the financial market have increased awareness of the critical delinquency issue for financial institutions in granting loans to potential applicants. In the past few decades, artificial neural networks have been successfully applied to the financial field. Recently, the Support Vector Machine (SVM) has emerged as the better neural network in dealing with classification and forecasting problems due to its superior generalization performance and global optimum. This study develops a loan evaluation model using SVM to identify potential applicants for consumer loans. In addition to conducting experiments on performance comparison via cross-validation and a paired t test, we analyze misclassification errors in terms of Type I and Type II errors and their effect on selecting the network parameters of SVM. The analysis findings facilitate the development of a useful visual decision-support tool. The experimental results using a real-world data set reveal that SVM surpasses traditional neural network models in generalization performance and in visualization via the visual tool, which helps decision makers determine appropriate loan evaluation strategies.

7.
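Ke Konglin (柯孔林). Control Theory & Applications (《控制理论与应用》), 2009, 26(12): 1365-1370
An enterprise loan default discrimination model that integrates rough sets and support vector machines is built. The model first uses a self-organizing map (SOM) neural network to discretize financial data with continuous attribute values and applies a genetic algorithm to reduce the evaluation indicators. The minimal condition attribute set obtained by the reduction, together with the corresponding original data, is then fed into a support vector machine for training. Finally, default discrimination is performed on test samples of short-term enterprise loans. An empirical study with cross-validation on 558 manufacturing firms and 522 real estate firms from a loan enterprise database shows that the integrated rough set and SVM model predicts better than default discrimination models such as BP neural networks, multiple discriminant analysis and logistic regression.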

8.
Hourly energy prices in a competitive electricity market are volatile. A forecast of the energy price is key information that helps producers and purchasers in the electricity market prepare their bidding strategies so as to maximize their profits. It is difficult to forecast all the hourly prices with only one model because different hourly prices behave differently. Nor do 24 separate models, one for each hourly price, give excellent results, because there is never enough data to train the models, especially for the peak price in summer. This paper proposes a novel technique to forecast day-ahead electricity prices based on Self-Organizing Map neural network (SOM) and Support Vector Machine (SVM) models. SOM is used to cluster the data automatically according to their similarity, which resolves the problem of insufficient training data. SVM models for regression are built separately on the categories clustered by SOM. The parameters of the SVM models are chosen automatically by the Particle Swarm Optimization (PSO) algorithm to avoid arbitrary parameter choices by the tester, improving the forecasting accuracy. The comparison suggests that SOM–SVM–PSO has considerable value in forecasting the day-ahead price in the Pennsylvania–New Jersey–Maryland (PJM) market, especially for summer peak prices.

9.
We consider a feature selection problem where the decision-making objective is to minimize overall misclassification cost by selecting relevant features from a training dataset. We propose a two-stage solution approach for solving the misclassification-cost-minimizing feature selection (MCMFS) problem. Additionally, we propose a maximum-margin genetic algorithm (MMGA) that maximizes the margin of separation between classes by taking into account all examples, as opposed to maximizing the margin of separation using a few support vectors. Feature selection is carried out by either an exhaustive or a heuristic simulated annealing approach in the first stage, and cost-sensitive classification using either MMGA or cost-sensitive support vector machines (SVM) in the second stage. Using simulated and real-world data sets and different misclassification cost matrices, we test our two-stage approach for solving the MCMFS problem. Our results indicate that feature selection plays an important role when misclassification cost asymmetries increase and that the MMGA shows equal or better performance than the SVM.

10.
This paper proposes an artificial neural network (ANN) based software reliability model trained by a novel particle swarm optimization (PSO) algorithm for enhanced forecasting of software reliability. The proposed ANN is developed considering the fault generation phenomenon during software testing, with fault complexity at different levels. We demonstrate the proposed model considering three types of faults residing in the software. We propose a neighborhood-based fuzzy PSO algorithm for competent learning of the proposed ANN using software failure data. The fitting and prediction performance of the proposed neural network model trained with neighborhood fuzzy PSO is compared with that of the same model trained with standard PSO, and with existing ANN-based software reliability models in the literature, on three real software failure data sets. We also compare the performance of the proposed PSO algorithm with the standard PSO algorithm through learning of the proposed ANN. Statistical analysis shows that the neighborhood fuzzy PSO based model has comparatively better fitting and predictive ability than the standard PSO based model and the other ANN-based software reliability models. Faster release of software is achievable by applying the proposed PSO-based neural network model during the testing period.

11.
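Zhang Ling (张玲), Wang Ling (王玲), Wu Tong (吴桐). Journal of Computer Applications (《计算机应用》), 2014, 34(3): 775-779
Thermal comfort prediction is a complex nonlinear process, which makes it inconvenient for real-time air-conditioning control. To address this problem, a thermal comfort prediction model is proposed in which an improved particle swarm optimization (PSO) algorithm optimizes a back-propagation (BP) neural network. The model uses PSO to optimize the initial weights and thresholds of the BP neural network, which mitigates the slow convergence of the traditional BP algorithm and its sensitivity to the initial network values. In addition, improvement strategies are proposed for the tendency of standard PSO toward premature convergence and its weak local search ability, further strengthening the ability of PSO to optimize the BP neural network. Experimental results show that, compared with the traditional BP model and the standard PSO-BP model, the thermal comfort prediction model based on the improved PSO-BP algorithm has higher prediction accuracy and faster convergence.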

12.
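Shilling attack detection methods based on the standard support vector machine cannot reflect how differences in user misclassification costs affect classification results. A new shilling attack detection method based on a cost-sensitive support vector machine is proposed. Under a cost-sensitive learning mechanism, the method introduces the support vector machine as the classification tool, models the SVM output as a posterior probability, and builds a dynamic cost function based on class membership that reflects the classification cost of different samples more accurately. A cost-sensitive SVM classifier is designed on this basis. The classifier is applied to shilling attack detection in recommender systems and compared with the standard SVM method and the cost-sensitive SVM method. Experimental results show that the proposed method controls cost sensitivity more precisely, further improves the detection accuracy for attack users, and reduces the overall misclassification cost.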

13.
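BP neural networks are sensitive to their initial weights and easily become trapped in local optima. To address this problem, particle swarm optimization (PSO) is introduced to search the network weights globally, and the BP weight update rule is then applied to further update the weights and thresholds found by PSO. The resulting improved PSO-BP neural network model is used to predict the number of ordinary theft crimes. Using theft crime data for Chicago, USA, from 2015 to 2017, together with data on 11 influencing factors such as total population, median house price and the proportion of residents with a bachelor's degree, prediction experiments compare the model before and after the improvement. The results show that the improved PSO-BP neural network model overcomes the shortcomings of the BP model, reducing the relative error from 4.68% to 1.635%.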
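
The global search stage described in this abstract can be sketched with a basic PSO loop. This is a generic, textbook-style PSO under assumed settings (inertia 0.7, acceleration coefficients 1.5, a toy sphere objective); it is not the improved algorithm from the paper, and the subsequent BP refinement step is not shown. In a PSO-BP setting, each particle position would be the flattened vector of network weights and thresholds, and the objective would be the training error.

```python
import random

def pso_minimize(objective, dim, n_particles=20, n_iters=200,
                 inertia=0.7, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Minimize `objective` over `dim` real variables with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective standing in for a network's training error (assumption)."""
    return sum(v * v for v in x)

best, best_val = pso_minimize(sphere, dim=3)
print(best_val)
```

Because PSO needs only objective values and no gradients, it is insensitive to the starting point in the way gradient descent is not, which is the motivation for using it to choose the initial weights before BP takes over.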

14.
In this paper, we investigate the performance of statistical, mathematical programming and heuristic linear models for cost-sensitive classification. In particular, we use five cost-sensitive techniques: Fisher's discriminant analysis (DA), asymmetric misclassification cost mixed integer programming (AMC-MIP), cost-sensitive support vector machine (CS-SVM), a hybrid support vector machine and mixed integer programming (SVMIP) technique, and a heuristic cost-sensitive genetic algorithm (CGA). Using simulated datasets of varying group overlaps, data distributions and class biases, and real-world datasets from financial and medical domains, we compare the performance of the five techniques based on overall holdout sample misclassification cost. The results of our experiments on simulated datasets indicate that when group overlap is low and the data distribution is exponential, DA appears to provide superior performance. For all other situations with simulated datasets, CS-SVM provides superior performance. For real-world datasets from the financial domain, CGA and AMC-MIP hold a slight edge over the two SVM-based classifiers. However, for medical domains with mixed continuous and discrete attributes, the SVM classifiers perform better than the heuristic (CGA) and AMC-MIP classifiers. The SVMIP model is the most computationally inefficient model and performs poorly.
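
The comparison criterion in this abstract, overall misclassification cost on a holdout sample, can be computed along the following lines. The dictionary-based cost table, the label encoding and the example values are assumptions chosen for illustration, not the paper's data.

```python
def total_misclassification_cost(y_true, y_pred, cost):
    """Sum the cost of every wrong prediction.

    cost maps (true_label, predicted_label) pairs to a cost;
    correct predictions contribute nothing.
    """
    return sum(cost[(t, p)] for t, p in zip(y_true, y_pred) if t != p)

# Hypothetical asymmetric costs: missing a positive (1 -> 0) costs 5,
# a false alarm (0 -> 1) costs 1.
cost = {(1, 0): 5.0, (0, 1): 1.0}
y_true = [1, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 1]
print(total_misclassification_cost(y_true, y_pred, cost))
```

Two classifiers with the same error count can differ widely on this measure, which is why it is used instead of accuracy when costs are asymmetric.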

15.
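Prediction of oxygen consumption in converter steelmaking based on PSO-optimized SVM   Total citations: 1, self-citations: 0, citations by others: 1
Oxygen consumption is one of the main factors affecting molten steel quality. To improve the prediction accuracy of oxygen consumption models for converter steelmaking, a modeling and prediction method for the oxygen blowing volume based on PSO-optimized SVM is proposed. SVM structural parameters are usually chosen by experience, which gives the prediction model poor generalization ability. Building on the standard PSO algorithm, three SVM structural parameters are therefore optimized: the penalty coefficient, the insensitive loss coefficient and the Gaussian kernel width. An oxygen consumption prediction model for converter steelmaking is built with these parameters. The effectiveness of the method is then verified on the standard Auto-MPG data from the UCI repository. Finally, an oxygen blowing volume prediction model is built from actual production data of a 100 t converter at a steel plant. The results show that, compared with standard BP, RBF and SVM, the PSO-optimized SVM prediction model is more accurate and generalizes better.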

16.
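First, an improved particle swarm algorithm is used to optimize the selection of BP neural network weights. Then, in the setting of an integrated LAN/WLAN network, three methods (a BP neural network, a BP neural network optimized by the improved PSO algorithm, and SVM) are used to build prediction models for the reliability of the integrated LAN/WLAN network. Finally, an experimental comparison demonstrates that the improved neural network model is effective and superior for predicting the reliability of communication networks.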

17.
The paper deals with the problem of predicting the time to default in credit behavioural scoring. This area opens the possibility of including a dynamic component in behavioural scoring modelling, which enables decisions related to limit, collection and recovery strategies, retention and attrition, and provides insight into the profitability, pricing or term structure of the loan. In this paper, we compare survival analysis and neural networks in terms of modelling and results. The neural network architecture is designed such that its output is comparable to the survival analysis output. Six neural network models were created, one for each period of default. A radial basis neural network algorithm was used to test all six models. The survival model used a Cox modelling procedure. Further, different performance measures of all models were discussed, since misclassification patterns appear even in highly accurate scoring models. A systematic comparison ‘3 + 2 + 2’ procedure is suggested to find the most effective model for a bank. Additionally, the survival analysis model is compared to the neural network models according to the relative importance of different variables in predicting the time to default. Although different models can have very similar performance measures, they may consist of different variables. The dataset used for the research was collected from a Croatian bank, and credit customers were observed during a 12-month period. The paper emphasizes the importance of conducting a detailed comparison procedure when selecting the best model that satisfies the users’ interest.

18.
Reliability-based design optimization (RBDO) is concerned with designing an engineering system to minimize a cost function subject to the reliability requirement that the failure probability should not exceed a threshold. Conventional RBDO methods are less than satisfactory in dealing with discrete design parameters and complex limit state functions (nonlinear and non-differentiable). Methods that are flexible enough to address these concerns, however, come at a high computational cost. To enhance computational efficiency without sacrificing model flexibility, we propose a new RBDO framework, PS2, which combines Particle Swarm Optimization (PSO), Support Vector Machine (SVM) and Subset Simulation (SS). SS can efficiently estimate small failure probabilities, based on which SVM is adopted to evaluate the reliability of candidate solutions using binary classification. PSO is employed to solve the discrete optimization problem. Primary emphasis is placed upon the cooperation between SVM and PSO. The cooperation is mutually beneficial: the SVM classifier helps PSO evaluate the feasibility of solutions with high efficiency, while the optimal solutions obtained by PSO assist in retraining the SVM classifier to attain better accuracy. The PS2 framework is implemented to find the optimal design of a ten-bar truss whose component sizes are selected from a commercial standard. The reliability constraints are non-differentiable with two failure modes: yield stress and buckling stress. The interactive process between PSO and SVM contributes greatly to the success of the PS2 framework. It is shown that in various trials the PS2 framework consistently outperforms both the double-loop and single-loop approaches in terms of computational efficiency, solution quality and model flexibility.

19.
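The broad learning system (BLS), an alternative framework to deep neural networks, offers fast adaptive model structure selection and online incremental learning, and is regarded as a highly promising technique in knowledge discovery and data engineering. Traditional BLS is mainly applied to pattern classification tasks in which the data are balanced and misclassification costs are equal, but the data in most real applications are imbalanced, as in network intrusion detection, medical diagnosis and credit card fraud detection. A data distribution-based cost-sensitive BLS (DDbCs-BLS) is therefore proposed for pattern classification tasks with imbalanced data and unequal misclassification costs. DDbCs-BLS seeks the best classification boundary for the cost-sensitive BLS classifier while fully accounting for the statistical distribution of the data, so that information from minority-class samples is not lost, which improves the pattern classification performance of BLS across datasets. Extensive validation and comparison experiments on a variety of public datasets (both balanced and imbalanced) show that DDbCs-BLS effectively determines the best position of the classification boundary and achieves better classification performance on balanced and imbalanced datasets alike.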

20.
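Hyperspectral remote sensing image classification based on an adaptive ridgelet network   Total citations: 1, self-citations: 0, citations by others: 1
Neural networks are among the important tools for automatic classification of remote sensing ground objects. An adaptive ridgelet network model is built using the ridgelet basis functions from multiscale geometric analysis. Building on the traditional adaptive particle swarm algorithm, an adaptive particle swarm optimization algorithm that introduces a particle density factor is proposed as the network training algorithm. To verify its performance, mutual-information-based reduction is applied to the 220-band AVIRIS 92AV3C hyperspectral data, and the reduced bands are used as network inputs for automatic classification of hyperspectral remote sensing ground objects. Simulation experiments show that, compared with the traditional particle swarm algorithm, the particle swarm algorithm with a particle density factor is less prone to premature convergence and has some advantage on high-dimensional nonlinear combinatorial optimization problems. Owing to the ability of ridgelet functions to represent high-dimensional singularities, the ridgelet neural network classifier achieves higher accuracy than traditional RBF and SVM classifiers on ground objects with distinct boundary features, while the network is small and structurally simple.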
