Similar literature
20 similar documents found
1.
We studied the problem of optimizing the performance of a decision support system (DSS) for churn prediction. In particular, we investigated the beneficial effect of adding the voice of customers through call center emails – i.e. textual information – to a churn-prediction system that only uses traditional marketing information. We found that adding unstructured, textual information to a conventional churn-prediction model resulted in a significant increase in predictive performance. From a managerial point of view, this integrated framework helps marketing decision makers to better identify the customers most prone to switch. Consequently, their customer retention campaigns can be targeted more effectively because the prediction method is better at detecting those customers who are likely to leave.

2.
Several studies have demonstrated the superior performance of ensemble classification algorithms, whereby multiple member classifiers are combined into one aggregated and powerful classification model, over single models. In this paper, two rotation-based ensemble classifiers are proposed as modeling techniques for customer churn prediction. In Rotation Forests, feature extraction is applied to feature subsets in order to rotate the input data for training base classifiers, while RotBoost combines Rotation Forest with AdaBoost. In an experimental validation based on data sets from four real-life customer churn prediction projects, Rotation Forest and RotBoost are compared to a set of well-known benchmark classifiers. Moreover, variations of Rotation Forest and RotBoost are compared, implementing three alternative feature extraction algorithms: principal component analysis (PCA), independent component analysis (ICA) and sparse random projections (SRP). The performance of rotation-based ensemble classifiers is found to depend upon: (i) the performance criterion used to measure classification performance, and (ii) the implemented feature extraction algorithm. In terms of accuracy, RotBoost outperforms Rotation Forest, but none of the considered variations offers a clear advantage over the benchmark algorithms. However, in terms of AUC and top-decile lift, results clearly demonstrate the competitive performance of Rotation Forests compared to the benchmark algorithms. Moreover, ICA-based Rotation Forests outperform all other considered classifiers and are therefore recommended as a well-suited alternative classification technique for the prediction of customer churn that allows for improved marketing decision making.
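
The block-wise rotation idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it draws disjoint random feature subsets, uses an Achlioptas-style sparse random matrix as the per-subset feature extractor (standing in for the SRP variant; PCA or ICA would instead be fitted on training data), and the names `build_rotation` and `rotate` are hypothetical.

```python
import random


def sparse_random_block(size, rng):
    """Sparse random matrix: entries are +sqrt(3), 0, -sqrt(3)
    with probabilities 1/6, 2/3, 1/6 (Achlioptas-style)."""
    scale = 3 ** 0.5
    values = [scale, 0.0, 0.0, 0.0, 0.0, -scale]
    return [[rng.choice(values) for _ in range(size)] for _ in range(size)]


def build_rotation(n_features, subset_size, seed=0):
    """Return a block-structured rotation matrix and the feature subsets.

    Features are shuffled and split into disjoint subsets; each subset
    gets its own extraction block, and entries linking features from
    different subsets stay zero.
    """
    rng = random.Random(seed)
    order = list(range(n_features))
    rng.shuffle(order)
    subsets = [order[i:i + subset_size]
               for i in range(0, n_features, subset_size)]
    rotation = [[0.0] * n_features for _ in range(n_features)]
    for subset in subsets:
        block = sparse_random_block(len(subset), rng)
        for a, row in enumerate(subset):
            for b, col in enumerate(subset):
                rotation[row][col] = block[a][b]
    return rotation, subsets


def rotate(rows, rotation):
    """Multiply each data row by the rotation matrix."""
    n = len(rotation)
    return [[sum(x[i] * rotation[i][j] for i in range(n)) for j in range(n)]
            for x in rows]
```

In an ensemble, each base classifier would be trained on data rotated with a different seed, which is what gives the members their diversity.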

3.
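With the continued development and adoption of customer relationship management systems, analyzing customers with advanced algorithms has become increasingly important. Customer analysis is especially necessary in customer-oriented industries such as banking. The support vector machine (SVM), a classification method grounded in statistical learning theory, is now fairly mature and has been applied successfully in many fields. The main problem addressed in this paper is classifying a bank's customers according to the attributes in their data, providing a reliable classification method for the bank's customer relationship management system. The paper describes the process and results of learning the customer classification, covering customer data cleaning, data preprocessing, data classification with SVM, multi-class handling, and customer attribute selection.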

4.
Random projections for linear SVM ensembles (cited 1 time: 1 self-citation, 0 by others)
This paper presents an experimental study using different projection strategies and techniques to improve the performance of Support Vector Machine (SVM) ensembles. The study was carried out on 62 UCI datasets using Principal Component Analysis (PCA) and three types of Random Projections (RP), taking into account the size of the projected space and using linear SVMs as base classifiers. Random Projections are also combined with the sparse matrix strategy used by Rotation Forests, a method that is also based on projections. Experiments show that for SVM ensembles (i) the sparse matrix strategy leads to the best results, (ii) results improve when the dimension of the projected space is larger than that of the original one, and (iii) Random Projections also improve the results when used instead of PCA. Finally, randomly projected SVMs are tested as base classifiers of some state-of-the-art ensembles, improving their performance.

5.
6.
Retaining customers has been considered one of the most critical challenges among those included in Customer Relationship Management (CRM), particularly in the grocery retail sector. In this context, an accurate prediction of whether or not a customer will leave the company, i.e. churn prediction, is crucial for companies to conduct effective retention campaigns. This paper proposes to include in partial churn detection models the sequence of the first product categories purchased as a proxy of the state of trust and demand maturity of a customer towards a company in grocery retailing. Motivated by the influence of first impressions and of recently experienced risks on the current state of the relationship, we model the first purchase sequence in chronological order as well as in reverse order. Because the relevance of the first customer–company interactions and of the most recent interactions varies, these two variables are modeled by considering a variable length of the sequence. In this study we use logistic regression as the classification technique. A real sample of approximately 75,000 new customers taken from the data warehouse of a European retail company is used to test the proposed models. The area under the receiver operating characteristic curve and the lift at the 1%, 5% and 10% percentiles are used to assess the performance of the partial-churn prediction models. The empirical results reveal that both proposed models outperform the standard RFM model.
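
The percentile lift used here as an evaluation measure can be sketched as below. This assumes the common definition (churn rate among the top-scored fraction of customers divided by the overall churn rate); the abstract does not spell out its exact formula, and the function name `lift_at` is hypothetical.

```python
def lift_at(scores, labels, fraction):
    """Lift at the top `fraction` of customers ranked by predicted score.

    scores: predicted churn scores, higher means more likely to churn.
    labels: 1 for churners, 0 for non-churners.
    """
    if not labels or sum(labels) == 0:
        raise ValueError("lift is undefined without at least one churner")
    n_top = max(1, int(len(scores) * fraction))
    # Rank customers from highest to lowest predicted churn score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate
```

A lift of 1 means the model ranks no better than random selection; calling the function with `fraction` set to 0.01, 0.05 and 0.10 gives the three percentiles named in the abstract.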

7.
Choosing a suitable classifier for a given dataset is an important part of developing a pattern recognition system. Since a large variety of classification algorithms have been proposed in the literature, non-experts do not know which method should be used in order to obtain good classification results on their data. Meta-learning tries to address this problem by recommending promising classifiers based on meta-features computed from a given dataset. In this paper, we empirically evaluate five different categories of state-of-the-art meta-features for their suitability in predicting classification accuracies of several widely used classifiers (including Support Vector Machines, Neural Networks, Random Forests, Decision Trees, and Logistic Regression). Based on the evaluation results, we have developed the first open source meta-learning system that is capable of accurately predicting accuracies of target classifiers. The user provides a dataset as input and gets an automatically created high-performance ready-to-use pattern recognition system in a few simple steps. A user study of the system with non-experts showed that the users were able to develop more accurate pattern recognition systems in significantly less development time when using our system than when using state-of-the-art data mining software.

8.
Individual human travel patterns captured by mobile phone data have been quantitatively characterized by mathematical models, but the underlying activities which initiate the movement are still less explored. Because of how activity and related travel decisions are made in daily life, human activity-travel behavior exhibits a high degree of spatial and temporal regularity as well as sequential ordering. In this study, we investigate to what extent behavioral routines can reveal the activities being performed at mobile phone call locations that are captured when users initiate or receive a voice call or message. Our exploration consists of four steps. First, we define a set of comprehensive temporal variables characterizing each call location. Feature selection techniques are then applied to choose the most effective variables in the second step. Next, a set of state-of-the-art machine learning algorithms including Support Vector Machines, Logistic Regression, Decision Trees and Random Forests are employed to build classification models. Alongside, an ensemble of the results of the above models is also tested. Finally, the inference performance is further enhanced by a post-processing algorithm. Using data collected from natural mobile phone communication patterns of 80 users over a period of more than one year, we evaluated our approach via a set of extensive experiments. Based on the ensemble of the models, we achieved a prediction accuracy of 69.7%. Furthermore, the post-processing algorithm yielded a further 7.6% improvement. The experimental results demonstrate the potential to annotate mobile phone locations based on the integration of data mining techniques with the characteristics of underlying activity-travel behavior, contributing towards the semantic comprehension and further application of the massive data.

9.
The key question of this study is: How long should customer event history be for customer churn prediction? While most studies in predictive churn modeling aim to improve models by data augmentation or algorithm improvement, this study focuses on another dimension: time window optimization with respect to predictive performance. This paper first presents a formalization of the time window selection strategy, along with a literature review. Next, using logistic regression, classification trees and bagging in combination with classification trees, this study analyzes the improvement in churn-model performance by extending customer event history from one to sixteen years. The results show that, after the fifth additional year, predictive performance is only marginally increased, meaning that the company in this study can discard 69% of its data with almost no decrease in predictive performance. The practical implication is that analysts can substantially decrease data-related burdens, such as data storage, preparation and analysis. This is particularly valuable in times of big data when decreasing computational complexity is paramount.
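
One plausible reading of the 69% figure is simple arithmetic over the time window, sketched below. The abstract does not say how the figure was derived; the sketch assumes that events are spread evenly over the sixteen years and that five years of history are retained, and `discarded_share` is a hypothetical helper.

```python
def discarded_share(total_years, retained_years):
    """Share of the event history that falls outside the retained window,
    assuming events are spread evenly across years (an assumption)."""
    if not 0 <= retained_years <= total_years:
        raise ValueError("retained_years must lie between 0 and total_years")
    return (total_years - retained_years) / total_years


share = discarded_share(total_years=16, retained_years=5)
print(f"{share:.4f}")  # 0.6875, which rounds to 69%
```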

10.
To survive in today's telecommunication business it is imperative to identify customers who are inclined to move to a competitor. Therefore, customer churn prediction has become an essential issue in the telecommunication business. In such a competitive business a reliable churn predictor is regarded as priceless. This paper employs data mining classification techniques including Decision Tree, Artificial Neural Networks, K-Nearest Neighbors, and Support Vector Machine in order to compare their performance. Using the data of an Iranian mobile company, these techniques were not only tested and compared to one another, but we also compared several prominent data mining software packages. After analyzing the techniques' behavior and learning their particular strengths, we proposed a hybrid methodology which made considerable improvements to the values of some of the evaluation metrics. The results of the proposed methodology showed that Recall and Precision values above 95% are easily achievable. In addition, a new methodology for extracting influential features from the dataset was introduced and tested.

11.
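Application of the C4.5 algorithm to insurance customer churn analysis (cited 11 times: 0 self-citations, 11 by others)
Retaining and attracting customers is key to an insurance company's competitiveness, yet insurers currently analyze customer churn only roughly or by rule of thumb. This paper applies attribute-oriented induction and the C4.5 decision tree algorithm to basic insurance customer information in order to identify the characteristics of churning customers and help insurers improve customer relationships in a targeted way.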

12.
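To improve the accuracy and efficiency of churn prediction for railway customers shipping small-lot general freight ("white goods"), a churn identification method based on the CDL model is proposed according to the churn characteristics of these customers. Building on this, and to cope with the large data volume, a C4.5 decision tree churn prediction model built on the Hadoop parallel framework is proposed. Simulation experiments show that the model has good accuracy and predictive ability, and that the efficiency of the Hadoop parallel framework improves markedly as the number of samples grows, without affecting the accuracy or predictive ability of the churn prediction model.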

13.
Customer retention in telecommunication companies is one of the most important issues in customer relationship management, and customer churn prediction is a major instrument in customer retention. Churn prediction aims at identifying potential churning customers. Traditional approaches for determining potential churning customers are based only on customer personal information without considering the relationships among customers. However, the subscribers of telecommunication companies are connected with other customers, and network properties among people may affect churn. For this reason, we propose a new churn prediction procedure that examines the communication patterns among subscribers and considers a propagation process in a network, built from call detail records, which transfers churning information from churners to non-churners. A fast and effective propagation process is possible through community detection and through setting the initial energy of churners (the amount of information transferred) differently according to churn date or centrality. The proposed procedure was evaluated based on the performance of the prediction model trained with a social network feature and traditional personal features.
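
A propagation process of the kind described, where churners start with some energy that flows along call links to non-churners, might look like the sketch below. It is a generic spreading-activation scheme under stated assumptions, not the paper's procedure: the `spread` factor, the fixed number of steps, the weight-proportional split, and the function name `propagate` are all hypothetical, and community detection is left out.

```python
def propagate(graph, initial_energy, spread=0.5, steps=3):
    """Spread churn energy over a weighted call graph.

    graph: dict mapping each subscriber to a dict {neighbour: call weight};
           every neighbour must also appear as a key of `graph`.
    initial_energy: dict mapping known churners to their starting energy.
    At each step a node keeps (1 - spread) of its energy and passes the
    rest to its neighbours in proportion to the call weights.
    """
    energy = {node: initial_energy.get(node, 0.0) for node in graph}
    for _ in range(steps):
        updated = {node: 0.0 for node in graph}
        for node, neighbours in graph.items():
            total_weight = sum(neighbours.values())
            if total_weight == 0:
                # Isolated subscribers keep all of their energy.
                updated[node] += energy[node]
                continue
            updated[node] += (1 - spread) * energy[node]
            for other, weight in neighbours.items():
                updated[other] += spread * energy[node] * weight / total_weight
        energy = updated
    return energy
```

The energy that each non-churner ends up with could then serve as the social network feature fed to the prediction model alongside the personal features.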

14.
With the widespread usage of social networks, forums and blogs, customer reviews have emerged as a critical factor in customers' purchase decisions. Since the beginning of the 2000s, researchers have focused on these reviews to automatically categorize them into polarity levels such as positive, negative, and neutral. This research problem is known as sentiment classification. The objective of this study is to investigate the potential benefit of the multiple classifier systems concept for the Turkish sentiment classification problem and to propose a novel classification technique. The Vote algorithm has been used in conjunction with three classifiers, namely Naive Bayes, Support Vector Machine (SVM), and Bagging. The parameters of the SVM were optimized when it was used as an individual classifier. Experimental results showed that multiple classifier systems increase the performance of individual classifiers on Turkish sentiment classification datasets and that meta classifiers contribute to the power of these multiple classifier systems. The proposed approach achieved better performance than Naive Bayes, which was reported as the best individual classifier for these datasets, and Support Vector Machines. Multiple classifier systems (MCS) are a good approach for sentiment classification, and parameter optimization of individual classifiers must be taken into account when developing MCS-based prediction systems.

15.
Accurate identification of precipitating clouds is a challenging task. In the present work, Support Vector Machines (SVMs), Decision Trees (DT), and Random Forests (RD) algorithms were applied to extract and track mesoscale convective precipitating clouds from a series of 22 Geostationary Operational Environmental Satellite-13 meteorological image sub-scenes over the continental territory of Colombia. This study's aims are twofold: (i) to establish whether the use of five meteorological spectral channels, rather than a single infrared (IR) channel, improves rainfall object detection and (ii) to evaluate the potential of machine learning algorithms to locate precipitation clouds. Results show that while the SVM algorithm provides more accurate classification of rainfall cloud objects than the traditional IR brightness temperature threshold method, the improvement is not statistically significant. Accuracy assessment was performed using the STEP (shape (S), theme (T), edge (E), and position (P)) object-based similarity matrix method, taking as reference precipitation satellite images from the Tropical Rainfall Measuring Mission. The best thematic and geometric accuracies were obtained by applying the SVM algorithm.

16.
Remote patient tracking has recently gained increased attention, due to its lower cost and non-invasive nature. In this paper, the performance of Support Vector Machines (SVM), Least Square Support Vector Machines (LS-SVM), Multilayer Perceptron Neural Network (MLPNN), and General Regression Neural Network (GRNN) regression methods is studied in application to remote tracking of Parkinson's disease progression. Results indicate that the LS-SVM outperforms the other three methods, and that its performance is superior to that of the latest proposed regression method published in the literature.

17.
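A nonlinear prediction algorithm for network traffic is proposed based on the recursive least squares support vector machine. The least squares support vector machine first maps the raw network traffic data into a high-dimensional space and then predicts the traffic in that space, so that nonlinear prediction in the low-dimensional space becomes linear prediction in the high-dimensional space, which improves prediction performance. Simulation results show that the prediction error stays within 5%.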
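
The kernel idea in this abstract, where a fit that is nonlinear in the input space becomes a linear problem after mapping to a high-dimensional space, can be sketched with a plain (non-recursive) least squares SVM regressor. This is a textbook-style illustration, not the paper's recursive algorithm: the RBF kernel, the regularization value `gamma`, and the names `fit_ls_svm` and `predict` are assumptions.

```python
import math


def rbf(a, b, width=1.0):
    """RBF kernel: inner product in an implicit high-dimensional space."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * width ** 2))


def solve(matrix, rhs):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_ls_svm(xs, ys, gamma=100.0, width=1.0):
    """Fit by solving the LS-SVM linear system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(xs)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        system.append([1.0] + [
            rbf(xs[i], xs[j], width) + (1.0 / gamma if i == j else 0.0)
            for j in range(n)
        ])
    bias, *alphas = solve(system, [0.0] + list(ys))
    return bias, alphas


def predict(x, xs, bias, alphas, width=1.0):
    """Prediction is linear in the kernel features of x."""
    return bias + sum(a * rbf(x, xi, width) for a, xi in zip(alphas, xs))
```

For traffic prediction, each input `x` would typically be a window of past traffic values and the target the next value; that setup is also an assumption here.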

18.
Building performance has been shown to degrade significantly after commissioning, resulting in increased energy consumption and associated greenhouse gas emissions. Fault Detection and Diagnosis (FDD) protocols using existing sensor networks and IoT devices have the potential to minimize this waste by continually identifying system degradation and re-tuning control strategies to adapt to real building performance. Due to its significant contribution to greenhouse gas emissions, the performance of gas boiler systems for building heating is critical. A review of boiler performance studies has been used to develop a set of common faults and degraded performance conditions, which have been integrated into a MATLAB/Simscape emulator. This resulted in a labeled dataset with approximately 10,000 simulations of steady-state performance for each of 14 non-condensing boilers. The collected data is used for training and testing fault classification using K-nearest neighbour, Decision Tree, Random Forest, and Support Vector Machines. The results show that the Decision Tree, Random Forest, and Support Vector Machines methods provide high prediction accuracy, consistently exceeding 95%, but that generalization across multiple boilers is not possible due to low classification accuracy.

19.
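To address the limitations of data mining methods in telecom customer churn prediction, this paper proposes combining information fusion with data mining, building the churn prediction model at the data, feature, and decision levels. Churn prediction indicators are determined first; customers are then partitioned according to differences in how the customer samples are distributed in feature space, yielding customer groups with different characteristics; a churn prediction model is built for each group using a different algorithm; finally, fusion weights for the models are obtained with an artificial ant colony algorithm, and the models' predictions are weighted to give the final prediction. Experimental results show that the information-fusion-based churn prediction model does outperform traditional models.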
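
The final decision-level step, weighting each model's prediction to obtain the fused result, can be sketched as below. The weights are simply passed in; the ant colony search that the abstract uses to find them is not reproduced, and the function name `fuse` and the normalization by the weight sum are assumptions.

```python
def fuse(predictions, weights):
    """Weighted average of per-model churn probabilities.

    predictions: one list of churn probabilities per model, all of the
                 same length (one entry per customer).
    weights: one non-negative fusion weight per model.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_customers = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions)) / total
        for i in range(n_customers)
    ]
```

For example, `fuse([[0.2, 0.8], [0.6, 0.4]], [1, 3])` gives the second model three times the influence of the first.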

20.
The early detection of potential churners enables companies to target these customers using specific retention actions, and subsequently increase profits. This analytical CRM (Customer Relationship Management) approach is illustrated using real-life data of a European pay-TV company. Their very high churn rate has had a devastating effect on their customer base. This paper first develops different churn-prediction models: the introduction of Markov chains in churn prediction, and a random forest model are benchmarked to a basic logistic model. The most appropriate model is subsequently used to target those customers with a high churn probability in a field experiment. Three alternative courses of marketing action are applied: giving free incentives, organizing special customer events, and obtaining feedback on customer satisfaction through questionnaires. The results of this field experiment show that profits can be doubled using our churn-prediction model. Moreover, profits vary enormously with respect to the selected retention action, indicating that a customer satisfaction questionnaire yields the best results, a phenomenon known in the psychological literature as the ‘mere-measurement effect’.
