Customer Churn Analysis with Tableau: Powerful EDA with Tableau and a Fully Tuned XGBoost Model on IBM Watson Telecom Churn Data

Life starts when you solve problems. As a data scientist, I love solving business problems.

What Is Churn, First of All:

Churn is the number of subscribers to a service who discontinue their subscription to that service in a given time period. For a company to expand its client base, its growth rate (i.e. its number of new customers) must exceed its churn. Churn is an important consideration in the telephone and cell phone services industry.
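
To make the definition concrete, here is a minimal sketch of the arithmetic. The function names and the numbers are my own illustration (assumptions, not taken from the data set): churn is expressed as a fraction of the customers at the start of the period, and the client base only grows when new customers outnumber the ones who left.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Fraction of the starting customer base that left during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def net_growth(new_customers: int, customers_lost: int) -> int:
    """Positive only when acquisitions exceed churn."""
    return new_customers - customers_lost


# Hypothetical period: 1000 customers at the start, 50 left, 40 joined.
rate = churn_rate(1000, 50)
growth = net_growth(40, 50)
print(f"churn rate: {rate:.1%}, net growth: {growth}")
```

With these made-up numbers the client base shrinks, because the 40 new customers do not cover the 50 who churned.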

Why Companies Worry So Much About Churn:

Photo by LYCS Architecture on Unsplash

Churn is used to indicate the strength of a company's customer division and its overall growth prospects. The lower the churn, the more revenue the company can earn from its customers.

High churn means the company has to spend money again to acquire a new customer base.

That is why companies worry so much about churn: acquiring new customers is always difficult, while retaining existing ones is usually easier. The important question is how we know who will churn.

That's where we find our business problem.

Business Problem:

Photo by Daria Nepriakhina on Unsplash

Competition in the telecom market grows day by day. Losing customers from its customer base costs the company a lot, while acquiring new customers is difficult and costly. The telecom company wants to know which customers are going to churn, and wants a model that classifies those customers so that it can take measures to retain them.

Classification Description:

I will classify the customers based on the various features collected from the telecom company. The output is whether or not each customer will churn.

Data Description:

Photo by Carson Masterson on Unsplash

The data is the IBM Watson Telecom Churn data set. Thanks to IBM for providing real-life scenario data, so that aspiring data scientists like me can learn and practice tasks that can later be replicated in real industry.

Attribute Information:

customerID : Customer identification

Gender : Whether the customer is male or female

SeniorCitizen : Whether the customer is a senior citizen or not (1, 0)

Partner : Whether the customer has a partner or not (Yes, No)

Dependents : Whether the customer has dependents or not (Yes, No)

Tenure : Number of months the customer has stayed with the company

PhoneService : Whether the customer has a phone service or not (Yes, No)

MultipleLines : Whether the customer has multiple lines or not (Yes, No, No phone service)

InternetService : Customer's internet service provider (DSL, Fiber optic, No)

OnlineSecurity : Whether the customer has online security or not (Yes, No, No internet service)

OnlineBackup : Whether the customer has online backup or not (Yes, No, No internet service)

DeviceProtection : Whether the customer has device protection or not (Yes, No, No internet service)

TechSupport : Whether the customer has tech support or not (Yes, No, No internet service)

StreamingTV : Whether the customer has streaming TV or not (Yes, No, No internet service)

StreamingMovies : Whether the customer has streaming movies or not (Yes, No, No internet service)

Contract : The contract term of the customer (Month-to-month, One year, Two years)

PaperlessBilling : Whether the customer has paperless billing or not (Yes, No)

PaymentMethod : The customer's payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))

MonthlyCharges : The amount charged to the customer monthly

TotalCharges : The total amount charged to the customer

Churn : Whether the customer churned or not (Yes or No)

We have defined our business problem. Now we will solve it using our business understanding: first with exploratory data analysis, and then with machine learning to classify the customers who churn.

Exploratory Data Analysis:

Tableau Link File

This time I will do the EDA in Tableau, as it is a very powerful tool. I have created a dedicated dashboard for it that tells the data story of telecom churn.

Tableau Dashboard Link:

https://public.tableau.com/profile/shubham.pundir#!/vizhome/TelecomChurnEDAAndInsightStory/TelecomChurnEDAAndInsightStory?publish=yes

I will link the snips here for better understanding.

In our data, about 26% of the customers churned and about 73% did not.
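
The same check can be done outside Tableau. This is only a sketch on a tiny made-up list of labels (the `churn_labels` values are an assumption for illustration, not the real data); it counts each value of the Churn column and turns the counts into shares.

```python
from collections import Counter

# Hypothetical toy labels standing in for the Churn column.
churn_labels = ["No", "No", "Yes", "No", "Yes", "No", "No", "No"]

counts = Counter(churn_labels)
total = len(churn_labels)

# Share of each class, e.g. the fraction of customers who churned.
shares = {label: count / total for label, count in counts.items()}

for label, share in shares.items():
    print(f"Churn {label}: {share:.0%}")
```

A split this uneven is what makes the later choices (stratified split, class weighting) matter.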

Insights: Contracts

In a single view, we look at the gender ratio among the churned customers by contract type, together with their charging pattern.

The figures show the combination of Contract with the average total charges, the average monthly charges, and the tenure.

Green: Female

Gold: Male

The values in the bars are the average charges and the average tenure, in months and years.

The customers who churned (Churn Yes) generate more money on the two-year and one-year contracts, which means the company takes a large loss when these customers leave.

Most of the revenue comes from the long contracts, so the company should concentrate more on retaining the longer contracts.

Insights: Tech Support

In a single view, we look at the gender ratio among the churned customers by Tech Support, together with their charging pattern.

The figures show the combination of Tech Support with the average total charges, the average monthly charges, and the tenure.

Green: Female

Gold: Male

The values in the bars are the average charges and the average tenure, in months and years.

The customers who churned (Churn Yes) show lower charges in the Tech Support view, which suggests that many people are leaving the company because tech support is not available to them.

On a monthly basis the average charge is about the same, which suggests that people are not very satisfied with the service and therefore churn.

The tenure section confirms this: the average tenure of the customers with tech support is low, and the company should take that into consideration.

Insights: Streaming TV

In a single view, we look at the gender ratio among the churned customers by Streaming TV, together with their charging pattern.

The figures show the combination of Streaming TV with the average total charges, the average monthly charges, and the tenure.

Green: Female

Gold: Male

The values in the bars are the average charges and the average tenure, in months and years.

The customers who churned (Churn Yes) generate less revenue even though they subscribed to Streaming TV. There should be more customer-centric plans to increase the revenue and retain them.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More yearly plans should be designed with a customer-centric mindset.

Insights: Streaming Movies

In a single view, we look at the gender ratio among the churned customers by Streaming Movies, together with their charging pattern.

The figures show the combination of Streaming Movies with the average total charges, the average monthly charges, and the tenure.

Green: Female

Gold: Male

The values in the bars are the average charges and the average tenure, in months and years.

The customers who churned (Churn Yes) generate less revenue even though they subscribed to Streaming Movies. There should be more customer-centric plans to increase the revenue and retain them.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More yearly movie plans should be designed with a customer-centric mindset.

Insights: Phone Service

In a single view, we look at the gender ratio among the churned customers by Phone Service, together with their charging pattern.

The figures show the combination of Phone Service with the average total charges, the average monthly charges, and the tenure.

Green: Female

Gold: Male

The values in the bars are the average charges and the average tenure, in months and years.

The customers who churned (Churn Yes) generate less revenue, and the monthly charges show that a significant number of them have left the service. This suggests that something is wrong with the service the company provides. The phone service should be more customer-centric.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More yearly phone service plans should be designed with a customer-centric mindset.

Insights: Online Security

In a single view, we look at the gender ratio among the churned customers by Online Security, together with their charging pattern.

The figures show the combination of Online Security with the average total charges, the average monthly charges, and the tenure.

Green: Female

Gold: Male

The values in the bars are the average charges and the average tenure, in months and years.

The customers who churned (Churn Yes) generate less revenue. We should dig deeper into the online security measures and resolve, one to one, the online security problems people are facing, so that customers can be retained.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More online security plans should be designed with a customer-centric mindset.

Insights: Online Backup

In a single view, we look at the gender ratio among the churned customers by Online Backup, together with their charging pattern.

The figures show the combination of Online Backup with the average total charges, the average monthly charges, and the tenure.

Green: Female

Gold: Male

The values in the bars are the average charges and the average tenure, in months and years.

The customers who churned (Churn Yes) generate less revenue. We should dig deeper into the online backup measures and resolve, one to one, the online backup problems people are facing, so that customers can be retained.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More online backup plans should be designed with a customer-centric mindset.

Insights: Internet Service

In a single view, we look at the gender ratio among the churned customers by Internet Service, together with their charging pattern.

The figures show the combination of Internet Service with the average total charges, the average monthly charges, and the tenure.

Green: Female

Gold: Male

The values in the bars are the average charges and the average tenure, in months and years.

The customers who churned (Churn Yes) generate less revenue, and fewer people are opting for fiber optic and DSL. The company should come up with flexible, customer-centric plans that provide at least basic internet, since the streaming TV and streaming movies services depend on it.

The average tenure is also very low, which means they do not even use the service for a full year before dropping it. More flexible internet service plans should be designed with a customer-centric mindset.

Insights: Contract pattern overall Data

In a single view, we look at the contract pattern among the churned customers across all the data, together with their charging pattern.

The figures show the combination of contract pattern with the average total charges, the average monthly charges, and the tenure.

Green: Churn No

Red: Churn Yes

The values in the bars are the average charges and the average tenure, in months and years.

In the total charges we can see that Churn No is creating a huge loss for the company, but month-to-month is not much to be worried about.

The monthly customers do not show much of a pattern in the contract view, but the average tenure per contract is worth looking into.

Insights: Payment Method pattern overall Data

In a single view, we look at the payment method pattern among the churned customers across all the data, together with their charging pattern.

The figures show the combination of Payment Method with the average total charges, the average monthly charges, and the tenure.

Green: Churn No

Red: Churn Yes

The values in the bars are the average charges and the average tenure, in months and years.

Pattern in Total Charges Among the Payment Methods:

Customers paying via bank transfer with total charges below 23000 are likely to be Churn Yes.

Customers paying via credit card with total charges below 24000 are likely to be Churn Yes.

Customers paying via electronic check with total charges below 15000 are likely to be Churn Yes.

Customers paying via mailed check with total charges below 500 are likely to be Churn Yes.

Pattern in Monthly Charges Among the Payment Methods:

On average, customers paying less than 80 per month are likely to be Churn Yes.

Pattern in Tenure Among the Payment Methods:

If a customer has been with the company for less than 30 months and pays via bank transfer or credit card, they are likely to be Churn Yes.

If a customer has been with the company for less than 17 months and pays via electronic check, they are likely to be Churn Yes.

If a customer has been with the company for less than 9 months and pays via mailed check, they are likely to be Churn Yes.

The company should try to increase the tenure of these payers and move them to the automatic payment options by offering more attractive cashback. This will help reduce churn.
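
The payment-method patterns come from averages per group in the Tableau view. Here is a minimal sketch of that kind of aggregation in plain Python. The rows are made up for illustration (an assumption, not the real data); only the column names PaymentMethod, Churn and Tenure follow the attribute list.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; the real data set has one row per customer.
rows = [
    {"PaymentMethod": "Electronic check", "Churn": "Yes", "Tenure": 5},
    {"PaymentMethod": "Electronic check", "Churn": "Yes", "Tenure": 15},
    {"PaymentMethod": "Electronic check", "Churn": "No", "Tenure": 40},
    {"PaymentMethod": "Mailed check", "Churn": "Yes", "Tenure": 3},
    {"PaymentMethod": "Mailed check", "Churn": "No", "Tenure": 25},
    {"PaymentMethod": "Mailed check", "Churn": "No", "Tenure": 35},
]

# Collect tenure values per (payment method, churn) group.
groups = defaultdict(list)
for row in rows:
    groups[(row["PaymentMethod"], row["Churn"])].append(row["Tenure"])

# Average tenure per group, the same quantity the bars display.
avg_tenure = {key: mean(values) for key, values in groups.items()}

for (method, churn), value in sorted(avg_tenure.items()):
    print(f"{method:17s} Churn {churn:3s} average tenure: {value}")
```

The same grouping works for MonthlyCharges and TotalCharges by swapping the column that is collected.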

Let's kick in our machine learning: we will apply XGBoost and tune the model to reach our best score (judged using the confusion matrix).

XGBoost Total Understanding:

Whenever an imbalanced data set comes up, XGBoost tends to perform really well. I use XGBoost on the imbalanced data set because I don't want to opt for upsampling or downsampling: upsampling can introduce bias, and downsampling loses valuable information.

There is a hyperparameter, scale_pos_weight, which makes XGBoost penalize mistakes on the positive class more heavily. This can help it do better on the minority class, where other algorithms often struggle.
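
A common starting value for scale_pos_weight, and the one the XGBoost documentation suggests as a typical value to consider, is the number of negative examples divided by the number of positive examples. The sketch below only computes that ratio. The helper name and the 26/74 toy split are my assumptions for illustration, and the article's tuned value may well differ.

```python
def suggested_scale_pos_weight(labels, positive="Yes"):
    """Ratio of negative to positive examples (a starting point, not a tuned value)."""
    positives = sum(1 for label in labels if label == positive)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no positive examples in labels")
    return negatives / positives


# Hypothetical labels with roughly the class balance seen in the EDA.
labels = ["Yes"] * 26 + ["No"] * 74
weight = suggested_scale_pos_weight(labels)
print(f"suggested scale_pos_weight: {weight:.2f}")
```

The resulting number would then be passed as the scale_pos_weight parameter of the model and refined during tuning.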

Machine Learning (XGBoost) Highlights:

Whenever you get imbalanced data, always go for a stratified split. It keeps the same proportion of each class in both the training and the testing set, and that is very important.
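
To show what a stratified split does, here is a minimal hand-written sketch using only the standard library. It is an illustration under my own assumptions (function name, 20% test fraction, toy labels), not the code from the article. In practice a library helper such as scikit-learn's train_test_split with its stratify argument does the same job.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_indices, test_indices) with the class mix preserved."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical imbalanced labels: 25% Yes, 75% No.
labels = ["Yes"] * 25 + ["No"] * 75
train_idx, test_idx = stratified_split(labels)

train_yes_share = sum(labels[i] == "Yes" for i in train_idx) / len(train_idx)
test_yes_share = sum(labels[i] == "Yes" for i in test_idx) / len(test_idx)
print(train_yes_share, test_yes_share)
```

Both shares come out equal to the share in the full data, which is exactly what a plain random split does not guarantee.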

Always keep track of the AUC scores while training the model. Use early stopping with 10 rounds, so that training stops once the validation AUC has stopped improving and you keep the round with the best validation AUC.
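
The early stopping rule itself is simple. The sketch below mimics it on a made-up list of validation AUC scores (the scores and the function name are assumptions for illustration): training stops once 10 rounds in a row bring no improvement, and the best round seen so far is kept. In XGBoost this behaviour is controlled by the early_stopping_rounds setting.

```python
def best_round_with_early_stopping(validation_auc, patience=10):
    """Scan per-round validation AUC and stop after `patience` rounds without improvement."""
    best_score = float("-inf")
    best_round = -1
    for round_index, score in enumerate(validation_auc):
        if score > best_score:
            # Higher AUC is better, so remember this round.
            best_score, best_round = score, round_index
        elif round_index - best_round >= patience:
            # No improvement for `patience` rounds: stop training here.
            break
    return best_round, best_score


# Hypothetical AUC curve: improves for four rounds, then plateaus.
scores = [0.70, 0.75, 0.80, 0.82] + [0.81] * 12 + [0.90]
best_round, best_score = best_round_with_early_stopping(scores)
print(best_round, best_score)
```

Note that the late 0.90 is never reached, because training has already stopped. That is the trade-off of a small patience value.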

Without tuning the hyperparameters, the model does not do a good job of correctly classifying the Churn Yes customers, which is our main concern.

Round one of hyperparameter tuning, to find out which parameters to tune further and to get better performance in classifying the Churn Yes customers.

Round two of hyperparameter tuning, for better classification.

The final hyperparameters for XGBoost.

Accuracy Measure:

We were able to classify 86% of the Churn Yes customers correctly.
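
A figure like "86% of the Churn Yes customers classified correctly" reads as the recall of the Churn Yes class, which is read off the confusion matrix. Here is a minimal sketch of how that number is computed, next to overall accuracy for comparison. The toy predictions are my assumption, so they will not reproduce the article's 86%.

```python
def confusion_counts(y_true, y_pred, positive="Yes"):
    """Count true/false positives and negatives, treating `positive` as the class of interest."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, guess in zip(y_true, y_pred):
        if truth == positive and guess == positive:
            counts["tp"] += 1
        elif truth == positive:
            counts["fn"] += 1  # a churner the model missed
        elif guess == positive:
            counts["fp"] += 1  # a loyal customer flagged as churner
        else:
            counts["tn"] += 1
    return counts


def recall(counts):
    """Share of actual Churn Yes customers that the model caught."""
    return counts["tp"] / (counts["tp"] + counts["fn"])


def accuracy(counts):
    """Share of all customers classified correctly."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


# Hypothetical test set: 7 churners and 13 non-churners.
y_true = ["Yes"] * 7 + ["No"] * 13
y_pred = ["Yes"] * 6 + ["No"] * 1 + ["No"] * 10 + ["Yes"] * 3

counts = confusion_counts(y_true, y_pred)
print(counts, round(recall(counts), 3), accuracy(counts))
```

On imbalanced data the two numbers can differ a lot, which is why the Churn Yes row of the confusion matrix is the one to watch.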

Let's plot the first decision tree to get an idea of how the model works.

Conclusion:

We have solved the business problem: the exploratory analysis of the data gave insightful measures to help retain customers at the company level, and we proposed a model that correctly identifies 86% of the customers who are going to churn.

I hope you liked this journey of business problem solving with me. Next time I will come up with another interesting business problem.

Translated from: https://medium.com/analytics-vidhya/powerful-eda-with-tableau-and-full-tuned-xgboost-model-on-ibm-watson-telecom-churn-data-6de4a4d47942
