Similar literature
1.
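Telecom customer churn prediction and analysis based on Bayesian networks   Total citations: 6, self-citations: 0, citations by others: 6
The data mining methods commonly used for telecom customer churn analysis are automatic clustering, decision trees, and artificial neural networks. These methods train the model on the data alone and make no use of prior knowledge. Telecom customer churn is caused by many complex factors, such as customer psychology, service quality, and competition from rivals, and using this existing prior knowledge can improve prediction accuracy. This paper selects the analysis variables according to prior knowledge, collects sample data, builds a customer churn model through structure learning and parameter learning of a Bayesian network, and uses it to predict churn trends. The results are more accurate than those obtained on a standard dataset and show a considerable advantage over the predictions of the decision tree method, indicating that Bayesian networks are an effective tool for analyzing uncertain problems such as customer churn.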

2.
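To improve the accuracy and efficiency of churn prediction for railway customers shipping scattered general cargo (零散白货), a customer churn identification method based on the CDL model is proposed according to the churn characteristics of these customers. On this basis, to handle the large data volume, a C4.5 decision tree churn prediction model built on the Hadoop parallel framework is proposed. Simulation experiments show that the model has good accuracy and predictive ability, and that as the number of samples grows the efficiency of the Hadoop parallel framework improves markedly without affecting the model's accuracy or predictive ability.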

3.
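WANG Lin, GUO Nana. 《计算机应用》 (Journal of Computer Applications), 2017, 37(4): 1032-1037
To address the insufficient ability of traditional classification techniques to identify churned customers in imbalanced telecom customer datasets, an Improved Dissimilarity-Based imbalanced data Classification (IDBC) algorithm is proposed. The algorithm improves the prototype selection strategy of the Dissimilarity-Based Classification (DBC) algorithm. In the prototype selection stage, an improved sample subset optimization method selects the most informative prototype set from the whole dataset, which avoids the uncertainty introduced by random selection. In the classification stage, the dissimilarities between the training set and the prototype set, and between the test set and the prototype set, are used to build the corresponding feature spaces, and a traditional classification algorithm then learns from the dissimilarity datasets mapped into those feature spaces. Finally, the telecom customer dataset from the UCI repository and six other ordinary imbalanced datasets were used to validate the algorithm. Compared with traditional feature-based imbalanced data classification algorithms, the DBC algorithm improved the recognition rate of the rare class by 8.3% on average, and the IDBC algorithm improved it by 11.3% on average. The experimental results show that the proposed IDBC algorithm is not affected by the class distribution and identifies the rare class in imbalanced datasets better than existing advanced classification techniques.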
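
The dissimilarity mapping described in the classification stage can be sketched as follows. This is a minimal illustration only: the Euclidean distance, the hand-picked prototypes, and the function names are assumptions, and the paper's prototype selection step is not reproduced here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dissimilarity_features(samples, prototypes, dist=euclidean):
    """Map each sample to its vector of distances to every prototype.

    The result has one column per prototype, so any ordinary classifier
    can be trained on it in place of the original features.
    """
    return [[dist(s, p) for p in prototypes] for s in samples]


# Hypothetical prototypes and samples, chosen by hand for illustration.
prototypes = [(0.0, 0.0), (3.0, 4.0)]
train = [(0.0, 0.0), (3.0, 0.0)]
features = dissimilarity_features(train, prototypes)
print(features)  # [[0.0, 5.0], [3.0, 4.0]]
```

The same function would be applied to the test set with the same prototypes, so that training and test samples live in the same dissimilarity space.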

4.
Several studies have demonstrated the superior performance of ensemble classification algorithms, whereby multiple member classifiers are combined into one aggregated and powerful classification model, over single models. In this paper, two rotation-based ensemble classifiers are proposed as modeling techniques for customer churn prediction. In Rotation Forests, feature extraction is applied to feature subsets in order to rotate the input data for training base classifiers, while RotBoost combines Rotation Forest with AdaBoost. In an experimental validation based on data sets from four real-life customer churn prediction projects, Rotation Forest and RotBoost are compared to a set of well-known benchmark classifiers. Moreover, variations of Rotation Forest and RotBoost are compared, implementing three alternative feature extraction algorithms: principal component analysis (PCA), independent component analysis (ICA) and sparse random projections (SRP). The performance of rotation-based ensemble classifiers is found to depend upon: (i) the performance criterion used to measure classification performance, and (ii) the implemented feature extraction algorithm. In terms of accuracy, RotBoost outperforms Rotation Forest, but none of the considered variations offers a clear advantage over the benchmark algorithms. However, in terms of AUC and top-decile lift, results clearly demonstrate the competitive performance of Rotation Forests compared to the benchmark algorithms. Moreover, ICA-based Rotation Forests outperform all other considered classifiers and are therefore recommended as a well-suited alternative classification technique for the prediction of customer churn that allows for improved marketing decision making.
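
Top-decile lift, one of the criteria mentioned above, is commonly defined as the churn rate among the 10% of customers with the highest predicted scores divided by the overall churn rate. The sketch below assumes that definition; the function name and the toy data are illustrative and do not come from the paper.

```python
def top_decile_lift(scores, labels):
    """Churn rate in the top 10% of scores divided by the overall churn rate.

    scores: predicted churn scores (higher means more likely to churn).
    labels: 1 for churner, 0 for non-churner, aligned with scores.
    """
    n = len(scores)
    if n == 0 or sum(labels) == 0:
        raise ValueError("need at least one customer and one churner")
    k = max(1, n // 10)  # size of the top decile
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(labels) / n
    return top_rate / base_rate


# Toy data: 20 customers, 4 churners, and the two highest scores are churners.
scores = [i / 20 for i in range(20)]
labels = [1 if i in (3, 10, 18, 19) else 0 for i in range(20)]
lift = top_decile_lift(scores, labels)
print(lift)  # about 5: top-decile churn rate 1.0 against a base rate of 0.2
```

A lift of 1 means the model ranks no better than random targeting, so values well above 1 are what a retention campaign looks for.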

5.
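The mobile communications field urgently needs to exchange standard data mining models among geographically distributed business analysis systems. Although the Predictive Model Markup Language (PMML) has become the industry standard format for exchanging data mining models, no usable framework has emerged to guide the production of standard exchange models. This paper proposes a decision tree algorithm framework that supports mining model exchange and mobile customer churn analysis. A churn early-warning system was built with the framework, and its effectiveness was verified with simulated customer data. The standard exchange model was extended appropriately to support more effective churn analysis of mobile communications data.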

6.
We describe CHAMP (CHurn Analysis, Modeling, and Prediction), an automated system for modeling cellular customer behavior on a large scale. Using historical data from GTE's data warehouse for cellular phone customers, every month CHAMP identifies churn factors for several geographic regions and updates models to generate churn scores predicting who is likely to churn within the near future. CHAMP is capable of developing customized monthly models and churn scores for over one hundred GTE cellular phone markets totaling over 5 million customers.

7.
8.
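To address the limitations of data mining methods in telecom customer churn prediction, this work combines information fusion with data mining and builds churn prediction models at the data level, the feature level, and the decision level. Churn prediction indicators are determined first. Customers are then partitioned according to differences in how the customer samples are distributed in the feature space, which yields customer groups with different characteristics. A different algorithm is used to build the churn prediction model for each group, an artificial ant colony algorithm determines the model fusion weights, and the models' predictions are combined by weighted sum into the final prediction. Experimental results show that the information-fusion-based churn prediction model outperforms traditional models.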
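
The final weighted combination of model outputs can be sketched as below. This is only an illustration under assumptions: the weights are fixed by hand rather than found by an ant colony search, they are normalized to sum to 1, and the function name is hypothetical.

```python
def fuse_predictions(model_probs, weights):
    """Weighted average of churn probabilities from several models.

    model_probs: one list of per-customer probabilities per model.
    weights: one non-negative weight per model (normalized internally).
    """
    if len(model_probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    norm = [w / total for w in weights]
    n_customers = len(model_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(norm, model_probs))
        for i in range(n_customers)
    ]


# Two hypothetical models scoring two customers; the first model is
# trusted three times as much as the second.
fused = fuse_predictions([[0.9, 0.2], [0.5, 0.4]], weights=[3, 1])
print(fused)  # approximately [0.8, 0.25]
```

In the approach described above, the weight vector would be the quantity the ant colony algorithm searches over, with validation performance of the fused prediction as the objective.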

9.
Customer churn has become a critical issue, especially in the competitive and mature credit card industry. From an economic and risk management perspective, it is important to understand customer characteristics in order to retain customers and differentiate high-quality credit customers from bad ones. However, studies have not yet adequately introduced rules based on customer characteristics and churn forms of original data. This study uses rough set theory, a rule-based decision-making technique, to extract rules related to customer churn; then uses a flow network graph, a path-dependent approach, to infer decision rules and variables; and finally presents the relationships between rules and different kinds of churn. An empirical case of credit card customer churn is also illustrated. In this study, we collect 21,000 customer samples, equally divided into three classes: survival, voluntary churn and involuntary churn. The data from these samples include demographic, psychographic and transactional variables for analyzing and segmenting customer characteristics. The results show that this combined model can fully predict customer churn and provide useful information for decision-makers in devising marketing strategy.

10.
To build a successful customer churn prediction model, a classification algorithm should be chosen that fulfills two requirements: strong classification performance and a high level of model interpretability. In recent literature, ensemble classifiers have demonstrated superior performance in a multitude of applications and data mining contests. However, due to their increased complexity they result in models that are often difficult to interpret. In this study, GAMensPlus, an ensemble classifier based upon generalized additive models (GAMs), in which both performance and interpretability are reconciled, is presented and evaluated in a context of churn prediction modeling. The recently proposed GAMens, based upon Bagging, the Random Subspace Method and semi-parametric GAMs as constituent classifiers, is extended to include two instruments for model interpretability: generalized feature importance scores, and bootstrap confidence bands for smoothing splines. In an experimental comparison on data sets of six real-life churn prediction projects, the competitive performance of the proposed algorithm over a set of well-known benchmark algorithms is demonstrated in terms of four evaluation metrics. Further, the ability of the technique to deliver valuable insight into the drivers of customer churn is illustrated in a case study on data from a European bank. Firstly, it is shown how the generalized feature importance scores allow the analyst to identify the relative importance of churn predictors as a function of the criterion that is used to measure the quality of the model predictions. Secondly, the ability of GAMensPlus to identify nonlinear relationships between predictors and churn probabilities is demonstrated.

11.
The recently proposed classification algorithm extreme SVM (ESVM) has been shown to provide very good generalization performance in a relatively short time. However, it is ill-suited to large-scale data sets because of its highly intensive computation. We therefore propose an efficient parallel ESVM (PESVM) built on the widely used parallel programming framework MapReduce. Furthermore, we observe that when new training data arrive, it is wasteful for ESVM to always retrain a new model on all training data (both old and newly arrived). Along this line, we develop an incremental learning algorithm for ESVM (IESVM), which meets the requirement of online learning by updating the existing model. We also provide a parallel version of IESVM (PIESVM), which can solve the large-scale problem and the online problem at the same time. The experimental results show that the proposed parallel algorithms not only can handle large-scale data sets, but also scale well in terms of the evaluation metrics of speedup, sizeup and scaleup. It is also worth mentioning that PESVM, IESVM and PIESVM are much more efficient than ESVM while obtaining exactly the same solutions as ESVM.

12.
Customer segmentation is an increasingly pressing issue in today’s highly competitive commercial environment. A growing body of literature has studied the application of data mining technology to customer segmentation and reported good results. However, most studies segment customers with a single data mining technique from one particular viewpoint rather than within a systematic framework. Furthermore, one of the key purposes of customer segmentation is customer retention. Although previous segmentation methods may identify which group needs more care, they cannot identify the customer churn trend so that different actions can be taken. This paper proposes a customer segmentation framework based on data mining and constructs a new customer segmentation method based on survival characteristics. The new method consists of two steps. First, K-means clustering groups customers into segments whose members have similar survival characteristics (churn trend). Second, survival analysis predicts each cluster’s survival/hazard function, the validity of the clustering is tested, and the customer churn trend is identified. The method has been applied to a dataset from China Telecom and yielded some useful management measures and suggestions. Some propositions for further research are also suggested.
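
The second step, estimating a survival function for the customers of one cluster, can be sketched with the Kaplan-Meier estimator. The abstract does not name the estimator used, so Kaplan-Meier is an assumption here, as are the function name and the toy tenure data.

```python
def kaplan_meier(durations, churned):
    """Kaplan-Meier survival curve for one customer segment.

    durations: observed tenure of each customer (e.g. in months).
    churned: True if the customer churned at that time, False if the
             observation was censored (customer still active).
    Returns a list of (time, survival probability) pairs, one per time
    at which at least one churn was observed.
    """
    events = sorted(zip(durations, churned))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        removed = 0
        # Group all customers whose observation ends at time t.
        while i < len(events) and events[i][0] == t:
            deaths += 1 if events[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical cluster of four customers: two churned, two still active.
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, False])
print(curve)  # [(1, 0.75), (3, 0.375)]
```

Running this once per K-means cluster and comparing the resulting curves is one way to check whether the clusters really differ in churn trend.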

13.
A wireless service subscriber calls a customer service representative (CSR) to complain about dropped calls. During the conversation with the customer, the CSR views a display that shows this customer's probability of churn (switching from this service provider to another), as well as the most probable reasons to churn and the best strategy to retain this customer. The CSR then quickly responds to the subscriber according to the system's recommendation. This is an intelligent customer-care system designed to predict customer behavior. Predicting customer churn is a component in the decision framework for retaining customers and maximizing profitability. Companies can use these probability and revenue estimates in a decision-theoretic framework to determine a churn intervention strategy and a profitability optimization strategy. Predicting customer behavior helps service providers build customer loyalty and maximize profitability. For the success of a project, data preparation is often a critical part of the predictive algorithm.
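
A very small decision-theoretic rule of the kind described above might look like the following sketch. Every quantity here is a hypothetical input: the abstract does not specify a formula, so the expected-gain expression, the parameter names, and the numbers are assumptions for illustration only.

```python
def expected_gain(p_churn, revenue, offer_cost, save_rate):
    """Expected net gain from making a retention offer to one customer.

    p_churn: predicted probability that the customer churns.
    revenue: revenue kept if the customer is retained.
    offer_cost: cost of the retention offer.
    save_rate: assumed probability that the offer stops a would-be churner.
    """
    return p_churn * save_rate * revenue - offer_cost


def should_intervene(p_churn, revenue, offer_cost, save_rate):
    """Intervene only when the expected gain is positive."""
    return expected_gain(p_churn, revenue, offer_cost, save_rate) > 0


# Hypothetical numbers: a 400-unit customer, a 50-unit offer, 50% save rate.
high_risk = expected_gain(0.5, revenue=400, offer_cost=50, save_rate=0.5)
low_risk = expected_gain(0.125, revenue=400, offer_cost=50, save_rate=0.5)
print(high_risk, low_risk)  # 50.0 -25.0
```

Under these assumed numbers the offer pays off for the high-risk customer but not for the low-risk one, which is why both the churn probability and the revenue estimate enter the decision.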

14.
The early detection of potential churners enables companies to target these customers using specific retention actions, and subsequently increase profits. This analytical CRM (Customer Relationship Management) approach is illustrated using real-life data of a European pay-TV company. Their very high churn rate has had a devastating effect on their customer base. This paper first develops different churn-prediction models: the introduction of Markov chains in churn prediction, and a random forest model are benchmarked to a basic logistic model. The most appropriate model is subsequently used to target those customers with a high churn probability in a field experiment. Three alternative courses of marketing action are applied: giving free incentives, organizing special customer events, obtaining feedback on customer satisfaction through questionnaires. The results of this field experiment show that profits can be doubled using our churn-prediction model. Moreover, profits vary enormously with respect to the selected retention action, indicating that a customer satisfaction questionnaire yields the best results, a phenomenon known in the psychological literature as the ‘mere-measurement effect’.

15.
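Research on telecom customer churn prediction based on cost-sensitive SVM   Total citations: 3, self-citations: 0, citations by others: 3
To address the class imbalance of customer churn datasets and the difference in misclassification costs, cost-sensitive learning is applied to the support vector machine with different penalty coefficients proposed by Veropoulos. A churn prediction model is built and validated on real telecom customer churn data. Comparison with traditional SVM, C4.5, and ANN shows that the method improves accuracy, hit rate, coverage rate, and lift, indicating that it effectively handles dataset imbalance and unequal misclassification costs and is an effective method for customer churn prediction.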
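
An SVM with class-dependent penalty coefficients is usually written as the soft-margin problem below, with one penalty per class. The notation is an assumption (the abstract gives no formula): churners are taken as the positive class, \(C^{+}\) and \(C^{-}\) are the two penalties, \(\xi_i\) are slack variables, and \(\phi\) is the kernel feature map.

```latex
\min_{w,\,b,\,\xi}\quad \frac{1}{2}\lVert w\rVert^{2}
  + C^{+}\sum_{i:\,y_i=+1}\xi_i
  + C^{-}\sum_{i:\,y_i=-1}\xi_i
\qquad\text{s.t.}\quad
y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ge 1-\xi_i,\quad \xi_i\ge 0 .
```

Choosing \(C^{+} > C^{-}\) makes a missed churner cost more than a false alarm, which pushes the decision boundary away from the rare class.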

16.
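XIA Guoen. 《计算机应用》 (Journal of Computer Applications), 2008, 28(1): 149-151
Kernel principal component analysis (KPCA) is introduced into customer churn prediction, and a corresponding feature extraction algorithm is proposed. KPCA is combined with logistic regression to design a prediction model. Experiments on churn prediction for a telecom company show that the method achieves a higher hit rate, coverage rate, accuracy, and lift coefficient than the original attribute set and than feature extraction with principal component analysis (PCA). This indicates that KPCA can extract nonlinear features from customer data and is an effective method for studying customer churn prediction.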
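
The four evaluation measures named above can be computed from a confusion matrix as sketched below. The definitions are the ones commonly used in churn studies and are assumed here, since the abstract does not spell them out: hit rate as the share of predicted churners who really churn, coverage rate as the share of real churners who are caught, and lift coefficient as hit rate divided by the overall churn rate.

```python
def churn_metrics(y_true, y_pred):
    """Hit rate, coverage rate, accuracy and lift for binary churn labels.

    y_true, y_pred: sequences of 0/1 values, where 1 means churn.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    n = len(pairs)
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("need at least one predicted and one actual churner")
    hit_rate = tp / (tp + fp)      # precision on the churn class
    coverage = tp / (tp + fn)      # recall on the churn class
    accuracy = (tp + tn) / n
    lift = hit_rate / ((tp + fn) / n)
    return {"hit_rate": hit_rate, "coverage": coverage,
            "accuracy": accuracy, "lift": lift}


# Toy example: 8 customers, 2 real churners, 2 predicted churners, 1 overlap.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0]
metrics = churn_metrics(y_true, y_pred)
print(metrics)
# {'hit_rate': 0.5, 'coverage': 0.5, 'accuracy': 0.75, 'lift': 2.0}
```

On imbalanced churn data, accuracy alone can look high even when few churners are caught, which is why hit rate, coverage rate, and lift are reported alongside it.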

17.
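This paper reviews the state of research on the definitions of customer churn and customer churn management; the research topics and application scenarios of the churn problem; churn prediction algorithms and feature selection methods; and the techniques and metrics commonly used for model evaluation. It points out shortcomings of current research and proposes directions for future work.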

18.
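CIAS: a data mining platform for customer intelligence analysis   Total citations: 3, self-citations: 0, citations by others: 3
CIAS is a customer intelligence analysis platform developed by applying data mining technology to the CRM domain. It divides data mining into three layers (the algorithm layer, the business logic layer, and the industry application layer) and so forms a new architecture for data mining systems. The business logic layer of CIAS contains five business models: cross-selling, customer response, customer segmentation, customer churn, and customer profit. By establishing a mapping between business models and mining algorithms, CIAS lets users solve problems directly with the business models instead of facing complex algorithms, which provides a friendly and easy-to-use environment for data mining applications.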

19.
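For churn prediction on large-sample data, a prediction model based on spectral regression feature reduction is proposed from the perspective of effective feature representation. Starting from the original customer features, the model uses spectral-regression-based manifold dimensionality reduction to build a discriminative low-dimensional feature space, on which a support vector machine performs binary classification of customer churn. Large-sample experiments on two different datasets, online customers and traditional telecom customers, together with comparisons against different classifiers and different feature reduction or selection methods, demonstrate the effectiveness of the method.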

20.
We describe CHAMP (CHurn Analysis, Modeling, and Prediction), an automated system for modeling cellular subscriber churn, that is, predicting which customers will discontinue cellular phone service. We describe various issues related to developing and deploying this system including automating data access from a remote data warehouse, preprocessing, feature selection, model validation, and optimization to reflect business tradeoffs. Using data from GTE's data warehouse for cellular phone customers, CHAMP is capable of developing churn models customized by region for over one hundred GTE cellular phone markets totaling over 5 million customers. Every month churn factors are identified for each geographic region and models are updated to generate churn scores predicting who is likely to churn in the short term. Learning methods such as decision trees and genetic algorithms are used for feature selection and a cascade neural network is used for predicting churn scores. In addition to producing churn scores, CHAMP also produces qualitative results in the form of rules and comparison of market trends that are disseminated through a web based interface.
