Customer Churn Prediction: Study Notes on Related Papers

Table of Contents

1. 《Churn prediction in telecommunication using ML》

2. 《Handling imbalanced data in churn prediction using ADASYN and Back-propagation algorithm》

3. 《Customer churn prediction for retail business》=> Useless

4. 《A comparative study of customer churn prediction in telecom industry using ensemble based classifiers》- Useless

5. 《Customer churn prediction in an internet service provider》

6. 《A review and analysis of churn prediction methods for customer retention in telecom industries》

7. 《Using deep learning to predict customer churn in a mobile telecommunication network》

8. 《Churn analysis and plan recommendation for telecom operators》

9. 《A data mining process framework for churn management in mobile telecommunication industry》

10. 《Using deep learning to predict customer churn in a mobile telecommunication network》


  • 1. 《Churn prediction in telecommunication using ML》

  • Abstract
    • Setbacks (difficulties):

      • enormous database;

      • large feature space;

      • imbalanced class distribution: number of churners << number of non-churners

    • Solutions:

      • Data imbalance:  SMOTE (synthetic minority over-sampling technique)

        • SMOTE implementation: http://shataowei.com/2017/12/01/python%E5%BC%80%E5%8F%91%EF%BC%9A%E7%89%B9%E5%BE%81%E5%B7%A5%E7%A8%8B%E4%BB%A3%E7%A0%81%E6%A8%A1%E7%89%88-%E4%B8%80/

      • Feature reduction methods, including: correlation-based feature extraction, gain ratio, information gain and OneR feature evaluation methods

      • Models: CART (classification and regression trees), bagged CART and PART (partial decision trees)

      • Evaluations: AUC, sensitivity and specificity
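The three evaluation measures listed above can be computed by hand on a toy example. This is only a minimal sketch: the labels, scores and the 0.5 threshold are made-up values, and AUC is computed through its pairwise-ranking interpretation rather than with any particular library.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = churner."""
    pairs = list(zip(y_true, y_pred))
    tp = pairs.count((1, 1))
    fn = pairs.count((1, 0))
    tn = pairs.count((0, 0))
    fp = pairs.count((0, 1))
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the chance that a random churner outscores a random
    non-churner (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores.
y_true = [1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

sens, spec = sensitivity_specificity(y_true, y_pred)
auc_value = auc(y_true, scores)
print(sens, spec, auc_value)
```

Sensitivity and specificity depend on the chosen threshold, while AUC only depends on how the scores rank churners against non-churners.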

  • Introduction
    • Common feature selection methods: PCA, gain ratio, information gain, OneR and correlation-based techniques
    • Common Sampling methods:
      • Random Oversampling (ROS):  random instances from minority class are simply replicated --> prone to overfitting
      • Random undersampling (RUS): random instances from the majority class are discarded --> may discard some useful instances
    • Models:
      • Tree and rules-based: CART, PART
      • Ensemble of trees: C5.0, bagged CART, RF, XGBoost
      • Linear: LR, linear discriminant analysis
      • Non-linear: Neural network, SVM, KNN, Naïve Bayes
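The two sampling methods from this introduction (ROS and RUS) are simple enough to sketch in a few lines. The class sizes and the fixed random seed below are arbitrary choices for illustration.

```python
import random


def random_oversample(majority, minority, rng):
    """ROS: replicate random minority instances until both classes match."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority, minority + extra


def random_undersample(majority, minority, rng):
    """RUS: keep a random majority subset the size of the minority class."""
    return rng.sample(majority, len(minority)), minority


rng = random.Random(0)  # fixed seed so the sketch is repeatable
majority = [("non-churner", i) for i in range(9)]
minority = [("churner", i) for i in range(3)]

ros_major, ros_minor = random_oversample(majority, minority, rng)
rus_major, rus_minor = random_undersample(majority, minority, rng)
print(len(ros_major), len(ros_minor))  # both classes now have 9 rows
print(len(rus_major), len(rus_minor))  # both classes now have 3 rows
```

ROS adds only exact copies, which is why it is prone to overfitting; RUS throws away six of the nine majority rows here, which is why it may discard useful instances.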
  • Related techs
    • Models: KNN, RF, Rotation Forest, AdaBoost
    • Model fusion: ordered weighted average, vote
    • Feature extraction: PCA, F-score, Fisher's ratio, Minimum Redundancy Maximum Relevance
  • Methods and materials
    • Data set:  1:3, imbalanced
    • Data preprocessing: removing useless features + sampling using SMOTE + feature selection using correlation, gain ratio, information gain and OneR
    • Intro to SMOTE (https://www.cnblogs.com/Determined22/p/5772538.html): creates new, similar instances for the minority class instead of duplicating existing ones; this softens the decision boundary, so the classifier generalizes better and is less prone to overfitting.
    • Correlation feature selection: Pearson's correlation coefficient and Spearman's correlation coefficient
    • Information gain attribute selection: based on entropy
    • Gain ratio attribute selection: overcomes the limitation of IG (used to select attributes for the terminal nodes of the decision tree)
    • OneR attribute selection: short for One Rule. Generates one rule for each predictor in the data, then selects the rule with the smallest total error as its "one rule"
    • Decision tree based classification
    • Partial tree based classification (PART): decision trees that prune themselves
    • Bagged tree classification: bootstrap aggregation, or bagging
    • Boosted classification trees:
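As a side note on the attribute-selection measures in this section (not on boosted trees), the relation between entropy, information gain and gain ratio can be sketched as below. The two toy features are hypothetical; the example uses the standard textbook definitions and shows the known limitation of information gain, its preference for attributes with many distinct values.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    n = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain divided by the entropy of the split itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data: a useful feature and a unique-ID feature.
plan = ["prepaid", "prepaid", "postpaid", "postpaid"]
customer_id = ["a", "b", "c", "d"]
churned = [1, 1, 0, 0]

print(information_gain(plan, churned), information_gain(customer_id, churned))
print(gain_ratio(plan, churned), gain_ratio(customer_id, churned))
```

Both features get the same information gain, although the ID column cannot generalize; the gain ratio halves the score of the ID column because its split entropy is twice as large.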

  • Conclusions
    • Adequate preprocessing and data balancing in case of imbalanced datasets are bound to improve the classification performances of the used classifiers.

    • SMOTE based classifier ---> improve classification performance

    • Ensemble approaches can achieve better performance

    • Correlation-based feature selection performed better than the other selection methods in this case
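The SMOTE idea used in this paper (creating new, similar minority instances rather than copies) can be sketched as follows. This is a simplified illustration, not the paper's implementation: the points, the number of neighbours k and the seed are made up, and a real project would more likely use an existing library.

```python
import random


def smote(minority, n_new, k, rng):
    """Create n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: dist2(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b)
                               for b, n in zip(base, neighbour)))
    return synthetic


rng = random.Random(42)
# Hypothetical minority-class (churner) points with two features.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6, k=2, rng=rng)
for point in new_points:
    print(point)
```

Every synthetic point lies on a line segment between two real churners, so the minority region is filled in rather than the same rows being repeated.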

  • 2. 《Handling imbalanced data in churn prediction using ADASYN and Back-propagation algorithm》

  • Abstract
    • Churn prediction difficulty: imbalanced data
    • Methods in the paper: oversampling with ADASYN (adaptive synthetic sampling) to solve the imbalance problem. ADASYN is an improvement on SMOTE (synthetic minority over-sampling technique)
    • Classification method: backpropagation algorithm
  • Introduction
    • DATA SIZE: tens of columns (attributes) & thousands of rows of data
    • Model: boosting, random forest and its modification
  • Methodology
    • Input data: 1 year of telecom data, 55 features and 200,387 rows; imbalanced data: churn : non-churn = 0.04 : 0.96
    • Feature selection: Pearson correlation equation
    • Resampling using ADASYN: the difference between ADASYN and SMOTE is that ADASYN uses a density distribution as the reference for determining the number of synthetic samples to generate
    • Constructing churn prediction with backpropagation method
      • Forward propagation of operating signal:
      • Back propagation of error signal:
    • Performance measurement
      • F1-Score & accuracy
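        • Precision = TP/(TP+FP): reflects how well the model avoids mistaking negative samples for positive ones; the higher the precision, the better the model distinguishes negative samples
        • Recall (sensitivity) = TP/(TP+FN): reflects the model's ability to identify positive samples; the higher the recall, the better the model identifies positive samples
        • F1-score = 2TP/(2TP+FN+FP) = 2*precision*recall/(precision + recall): combines the two; the higher the F1-score, the more robust the classifier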
      • Confusion Matrix
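The formulas in this performance-measurement part can be checked on a small confusion matrix. The predictions below are invented, with 4 churners in 100 rows to mirror the roughly 4% churn share mentioned in these notes.

```python
def confusion_matrix(y_true, y_pred):
    """Return (TP, FP, FN, TN) with 1 = churn as the positive class."""
    pairs = list(zip(y_true, y_pred))
    return (pairs.count((1, 1)), pairs.count((0, 1)),
            pairs.count((1, 0)), pairs.count((0, 0)))


def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fn + fp)
    return precision, recall, f1


# Hypothetical predictions: 4 churners, 96 non-churners.
y_true = [1] * 4 + [0] * 96
y_pred = [1, 1, 1, 0] + [1, 1] + [0] * 94

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
accuracy = (tp + tn) / len(y_true)
baseline_accuracy = 96 / 100  # a model that always predicts "no churn"

print(tp, fp, fn, tn)
print(precision, recall, f1, accuracy, baseline_accuracy)
```

A model that never predicts churn already reaches 96% accuracy on such data, which is one reason to report the F1-score next to accuracy.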
  • Results & analysis
    • Data source: PT Telkom Indonesia recorded from 2014.10 - 2015.9;  200387 rows, with 55 features, only  % of total data are churned. After feature extraction, 38 features are used
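How ADASYN decides the number of synthetic samples per minority instance can be sketched as below. This follows the general ADASYN recipe as I understand it (the share of majority points among the k nearest neighbours, normalized into a distribution), not necessarily the exact variant used in the paper; the points, k and beta are made-up values.

```python
def adasyn_counts(minority, majority, k=2, beta=1.0):
    """Return how many synthetic samples to create for each minority point."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    labelled = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    total = (len(majority) - len(minority)) * beta  # samples wanted overall
    ratios = []
    for p in minority:
        others = [(q, label) for q, label in labelled if q is not p]
        nearest = sorted(others, key=lambda item: dist2(p, item[0]))[:k]
        # Share of majority-class points among the k nearest neighbours.
        ratios.append(sum(1 for _, label in nearest if label == 0) / k)
    norm = sum(ratios)
    if norm == 0:
        return [0] * len(minority)
    return [round(r / norm * total) for r in ratios]


# Hypothetical data: three churners in a clean cluster, one surrounded
# by non-churners.
minority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0)]
majority = [(3.1, 3.0), (2.9, 3.0), (3.0, 3.1), (3.0, 2.9),
            (3.2, 3.2), (2.8, 2.8), (5.0, 5.0), (5.0, 6.0)]

counts = adasyn_counts(minority, majority)
print(counts)
```

In this toy case all new samples are allocated to the churner that sits among non-churners, whereas plain SMOTE would spread them evenly over the minority points.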
  • 3. 《Customer churn prediction for retail business》=> Useless

  • Abstract
    • Dataset: UCI machine learning repository; 2010.1 - 2011.1 transactions records
    • Method: preprocessing to remove NAs, validating numerical values, removing erroneous data points; perform aggregations on the data to generate invoice + customer data sets;  ML algorithms: SVM, RF, Extreme gradient boosting
  • Introduction
    • Customer churn = customer attrition = customer defection
    • Objective of this project:
      • Predict churn value for all the customers of the company for a given period of time
      • Compute the overall churn rate for the given time
      • Provide deeper insight into the sales by analyzing customers' buying pattern
      • Detect customers who are about to drop out from the business in order to take necessary steps
      • Provide clear visualizations of the churn predictions to help business come up with better strategies
      • Help business know the real value of a potential churn customer and retain him/her as a loyal customer by establishing priorities, optimizing resources, putting efficient business efforts and maximizing the value of the portfolio of the customer
      • Help business come up with personalized customer retention plans to reduce the churn rate
    • Deliverables
      • A system that can predict if a customer is a churn or not for a retail business
      • A system which can compute churn rate of the retail business
      • A system which can run multiple algorithms and compare performance  among them. Algorithms include RF, SVM, Gradient boosting to predict the customer churn for a given period of time
  • Literature survey
    • Sequential patterns
      • DEML:
      • Genetic modelling:
      • Neural Networks:
      • Logistic regression and random forests
      • Game theory
  • Design
    • Architecture design:
    • Sequence diagram
    • Data flow diagram
  • Implementation
    • Data size: 541909 rows
    • Preprocess: cleaning data + aggregation
    • Training:
    • ML model: RF; SVM;Gradient boosting
  • 4. 《A comparative study of customer churn prediction in telecom industry using ensemble based classifiers》- Useless

  • Abstract
    • Ensemble-based classifiers were compared with well-known classifiers, namely decision tree, naïve Bayes classifier, and SVM
  • Introduction
  • Literature survey
  • Working methodology
    • Decision tree: C4.5 ----> accuracy is high, but fails to respond to noise
    • Naïve bayes --->
    • SVM  ---> not suited for data with noise
    • Bagging (bootstrap aggregation):
      • Sample the dataset with replacement into k subsets
      • Train the model on (k-1) subsets and test it on the remaining subset
    • Boosting
      • Maintain a weight for each training tuple
    • Random forest:
      • Disadvantage: random forest cannot handle an imbalanced dataset
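The boosting note in this section ("maintain a weight for each training tuple") can be illustrated with one round of the classic AdaBoost update. Whether the paper uses exactly this variant is an assumption; the labels and predictions are toy values, and the weak classifier's error must lie strictly between 0 and 1 for the formula to work.

```python
from math import exp, log


def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost round: compute the classifier's vote (alpha) and
    re-weight the training tuples."""
    error = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    alpha = 0.5 * log((1 - error) / error)
    updated = [w * exp(alpha if t != p else -alpha)
               for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


y_true = [1, 1, -1, -1, -1]
y_pred = [1, -1, -1, -1, -1]  # the weak classifier misses the second tuple
weights = [0.2] * 5           # start with uniform weights

alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(alpha)
print(new_weights)
```

After the round, the misclassified tuple carries about half of the total weight, so the next weak classifier is pushed to get it right.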
  • 5. 《Customer churn prediction in an internet service provider》

  • Abstract
    • Methodology:
    • Feature engineering:
      • SMOTE oversampling method: reduces the imbalance between the number of churners and non-churners
      • Machine learning models: AdaBoost, extra-trees, KNN, neural network, XGBoost
    • Experimental Results
      • XGBoost is the best: precision = 45.71%; recall = 42.06%
    • Dataset characteristics:
      • Churn : non-churn = 2 : 98
  • Introduction
    • Problem statement: predict whether customers will renew their service (monthly subscriptions). The service expires at the end of each month, and the valid period for renewing it runs from the expiry time to 16 days later.
      • Churn customers: those who do not renew their subscription in the next 16 days, or who terminate the current service before it ends
    • Note: the status of a customer, churn or non-churn, is determined at the end of each month, regardless of the previous status
    • After this period, users who do not renew the service are identified as churn customers -> for our own case, we can define the task as predicting whether a customer will renew the subscription in the next half of the renewal term.
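The churn definition in this introduction can be turned into a small labelling function. This is a sketch under assumptions: the dates are invented, a renewal exactly on day 16 is treated as on time, and early termination of the service is not modelled.

```python
from datetime import date, timedelta

RENEWAL_WINDOW = timedelta(days=16)  # window taken from the notes


def is_churner(expiry_date, renewal_date):
    """Label a customer as churned if no renewal happened within the
    16 days after the service expired (None = never renewed)."""
    if renewal_date is None:
        return True
    return renewal_date > expiry_date + RENEWAL_WINDOW


expiry = date(2018, 3, 31)  # hypothetical end of the monthly service
print(is_churner(expiry, date(2018, 4, 10)))  # renewed inside the window
print(is_churner(expiry, date(2018, 4, 20)))  # renewed too late
print(is_churner(expiry, None))               # never renewed
```

Because the label is recomputed at the end of each month, the same customer can be a churner in one month and a non-churner in the next.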
  • Related work
    • Note: metrics in churn prediction:
      • For finding the customers most likely to churn: precision is the more effective measure
      • For the purpose of retaining the most customers: the recall of the model needs to be improved
    • Reducing imbalanced data:
      • SMOTE
    • Model:
      • KNN
      • AdaBoost: sensitive to noisy data and outliers
      • Extra-trees:
      • Neural network:
  • Data and feature engineering
    • Features were separated into 3 main groups: 1. customer information; 2. usage data; 3. service data
      • Customer info: registration date, termination date, location, service type, cable type, bandwidth, payment history, promotion, and so on
      • Customer usage data: the initial date time of connection, disconnection date  time, reason for rejection, type of modem, user's daily usage such as amount of data downloaded & uploaded
      • Customer service data: customer's inbound and outbound call phone history; customer satisfaction surveys
  • 6. 《A review and analysis of churn prediction methods for customer retention in telecom industries》

  • Abstract
    • Focusing on analyzing the churn prediction techniques to identify the churn behavior and validate the reasons for customer churn
      • Summarizes the churn prediction techniques --> deeper understanding of customer churn
      • Shows that the most accurate churn predictions come from hybrid models rather than single algorithms
  • Analysis of customer churn prediction methodologies
    • Preprocessing: the imbalanced-data problem and sampling for churn prediction
    • Ensemble methods: 
      • Reference: http://scikit-learn.org/stable/modules/ensemble.html
      • Goal: combine the predictions of several base estimators built with a given learning algorithm in order to improve generalizability / robustness over a single estimator.
      • Two families of ensemble methods:
        • Averaging methods: the driving principle is to build several estimators independently and then to average their predictions. On average, the combined estimator is usually better than any single base estimator because its variance is reduced
          • Examples: bagging methods, forests of randomized trees
            • Bagging meta-estimator
              • Bagging methods form a class of algorithms which build several instances of a black-box estimator on random subsets of the original training set and then aggregate their individual predictions to form a final prediction.
            • Forests of randomized trees
              • RF: each tree in the ensemble is built from a bootstrap sample (drawn with replacement) of the training set
              • Extra-trees
        • Boosting methods: base estimators are built sequentially and one tries to reduce the bias of the combined estimator. The motivation is to combine several weak models to produce a powerful ensemble
          • Examples: Adaboost, gradient tree boosting
    • Churn prediction from big tree
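The averaging family described in this review (bagging) can be sketched without any library: fit the same simple estimator on several bootstrap samples and let the copies vote. The one-feature threshold rule used as the "black-box" base estimator, the toy rows and the seed are all assumptions made for illustration.

```python
import random
from collections import Counter


def train_stump(rows):
    """Base estimator: pick the threshold with the fewest training errors
    for the rule 'predict churn if x >= threshold'."""
    best = None
    for threshold in sorted({x for x, _ in rows}):
        errors = sum((x >= threshold) != bool(y) for x, y in rows)
        if best is None or errors < best[0]:
            best = (errors, threshold)
    return best[1]


def bagging_fit(rows, n_estimators, rng):
    """Fit one stump per bootstrap sample (drawn with replacement)."""
    return [train_stump([rng.choice(rows) for _ in rows])
            for _ in range(n_estimators)]


def bagging_predict(thresholds, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]


rng = random.Random(1)
# Hypothetical rows: (number of complaints, churned?).
rows = [(0, 0), (1, 0), (1, 0), (2, 0), (3, 1), (4, 1), (5, 1), (6, 1)]
ensemble = bagging_fit(rows, n_estimators=11, rng=rng)

print(ensemble)
print(bagging_predict(ensemble, 0), bagging_predict(ensemble, 6))
```

The individual thresholds vary with the bootstrap sample they were fitted on; the vote smooths out that variation, which is the variance reduction the notes refer to.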
  • 7. 《Using deep learning to predict customer churn in a mobile telecommunication network》

  • Abstract
    • Auto-encoders   ----->  deep belief networks ----> multi-layer feedforward networks
    • Framework: four-layer feedforward architecture
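To make the framework noted in this abstract more concrete, here is a forward pass through a small feedforward network. Everything specific is an assumption: "four layers" is read as input, two hidden layers and output; the layer sizes, activations, random weights and input vector are made up; and no training is shown.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for an n_in -> n_out layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, layers):
    """ReLU on the hidden layers, sigmoid on the single output unit."""
    activations = features
    for index, (weights, biases) in enumerate(layers):
        is_last = index == len(layers) - 1
        activations = dense(activations, weights, biases,
                            sigmoid if is_last else relu)
    return activations[0]


rng = random.Random(0)
sizes = [6, 8, 4, 1]  # hypothetical: input, two hidden layers, output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
churn_probability = forward([0.2, 0.0, 1.0, 0.5, 0.3, 0.9], layers)
print(churn_probability)
```

With untrained weights the output is meaningless as a prediction; the sketch only shows how raw inputs flow through the layers to a value between 0 and 1.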
  • Introduction
    • Motivation:  use deep learning to avoid time-consuming feature engineering effort and ideally to increase the predictive performance of previous models.
    • Dataset intro:
      • Historical data from a telecommunication company with nearly 1.2 million customers and span over sixteen months
      • Challenging Characteristics:
        • Churn rate is very high and all customers are prepaid users
  • Churn prediction in prepaid mobile telecommunication network
    • Goal: to infer when this lack of activity may happen in the future for each active customer
    • State definition: 
  • Deep learning models for churn prediction
    • Input data & preparation:
  • 8. 《Churn analysis and plan recommendation for telecom operators》

  • Abstract
    • In this paper, we design a hybrid ML classifier to predict if a customer will churn based on the CDR parameters, and we also propose a rule engine to suggest the best plans
  • 9. 《A data mining process framework for churn management in mobile telecommunication industry》

  • Introduction
    • Aims: by using a combination of expert systems and machine learning techniques, the process framework handles churn prediction from 3 perspectives:
      • Prediction of which subscriber may churn
      • Determination of reasons why subscriber may churn
      • Recommendations of appropriate strategy for customer retention
    • Data
      • A rich chunk of telecom subscribers' demographic data
      • Subscribers' transactions information
      • Subscribers' complaints information
  • Process framework
  • Experiment
    • 1. Collect Raw Dataset
      • Subscriber data: caller number, called number, incoming route, outgoing route, amount before call, amount after call, international mobile subscriber identity, exchange id, record type, event type, date of subscription, type of service subscribed and subscriber number
      • Complaint data: request_complained_id, date of complaint, time of complaint, type of complaint, status (open and close), imputer (internal staff initiator), handle_by_person
    • 2. Data prediction model
      • Raw Data ---> features
        • 20 features
      • Churn  prediction with artificial neural network
      • Generating churn reasons and intervention strategy
        • Results obtained from churn prediction using ANN ---> decision support expert system (DSES) to generate probable reasons for churn & recommendations for customer retention.
          • DSES: - a set of if-then rules that enabled the generation of recommendations of appropriate incentives based on the credit rating of a subscriber.
            • Classify subscribers ---> high-valued, medium-valued and low-valued , so we can ignore low valued subscribers, and put more efforts on high-valued & medium valued.
            • Generate churn reasons
              • Based on the rules to determine the churn reasons.
                • The paper shows a sample of Jess rules in the DSES (the sample is not reproduced in these notes)
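The paper's Jess rule sample is not available in these notes, so the block below is only an illustration of how if-then rules of this kind could be combined with the value classes described above. It is written in Python rather than Jess, and every field name, threshold, reason and action is hypothetical.

```python
def classify_value(monthly_spend):
    """Bucket a subscriber by value; the thresholds are hypothetical."""
    if monthly_spend >= 100:
        return "high"
    if monthly_spend >= 30:
        return "medium"
    return "low"


# Each rule: (condition on a subscriber record, churn reason, action).
RULES = [
    (lambda s: s["open_complaints"] >= 3,
     "unresolved complaints",
     "escalate to support and offer a goodwill credit"),
    (lambda s: s["dropped_call_rate"] > 0.05,
     "poor network quality",
     "prioritise network checks in the subscriber's area"),
]


def advise(subscriber):
    """Return (reason, action) pairs for predicted churners worth retaining."""
    if not subscriber["predicted_churn"]:
        return []
    if classify_value(subscriber["monthly_spend"]) == "low":
        return []  # low-valued subscribers are ignored, as in the notes
    return [(reason, action) for cond, reason, action in RULES
            if cond(subscriber)]


high_value = {"predicted_churn": True, "monthly_spend": 120,
              "open_complaints": 4, "dropped_call_rate": 0.01}
low_value = dict(high_value, monthly_spend=10)

advice_high = advise(high_value)
advice_low = advise(low_value)
print(advice_high)
print(advice_low)
```

The `predicted_churn` flag stands in for the output of the ANN; the rules only add probable reasons and recommendations on top of that prediction.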

                                    

  • 10. 《Using deep learning to predict customer churn in a mobile telecommunication network》

  • Understanding and calculating churn
    • High level
      • Measure of how many customers leave over a set time period
      • Measure of how much revenue you lose through customer cancellations
    • How churn can impact the bottom line
      • Calculate Lifetime value (LTV) - to understand the value a CSM has
        • Basic LTV
        • Cost of customer acquisition (COCA)
        • Cost of Goods sold (COGS)
      • Calculate churn
        • Customer churn
        • Revenue churn
    • Analyzing churn
      • Reasons -
        • Find churn reasons to focus and prioritize
        • Know whether actions to retain customers are working
      • Methods
        • Cohort reports
          • Type 1
        • Churn by customer age: grouping your customers by age
        • Churn by customer behavior
          • Need to look at customers who use a certain feature or complete a certain action and determine its impact on churn
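The churn and LTV calculations listed in this section can be sketched with commonly used definitions. The notes do not give the formulas, so the ones below are assumptions (in particular, basic LTV is taken as average revenue per customer divided by the churn rate), and all figures are invented.

```python
def customer_churn_rate(customers_at_start, customers_lost):
    """Share of customers who left during the period."""
    return customers_lost / customers_at_start


def revenue_churn_rate(revenue_at_start, revenue_lost):
    """Share of recurring revenue lost through cancellations."""
    return revenue_lost / revenue_at_start


def basic_ltv(avg_monthly_revenue_per_customer, monthly_churn_rate):
    """Basic LTV, assuming a constant monthly churn rate so that the
    expected customer lifetime is 1 / churn rate months."""
    return avg_monthly_revenue_per_customer / monthly_churn_rate


# Hypothetical month: 200 customers at the start, 10 of them cancel.
churn = customer_churn_rate(200, 10)
rev_churn = revenue_churn_rate(10000.0, 300.0)
ltv = basic_ltv(50.0, churn)
print(churn, rev_churn, ltv)
```

Customer churn and revenue churn differ whenever the customers who leave pay more or less than average, which is why the notes list them separately.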
