Predicting Credit Card Leads with Python

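This article shows how Happy Customer Bank can use Python for data mining to predict potential credit card customers. The bank wants to cross-sell its credit cards to existing customers through various communication channels and has already identified a group of eligible customers. The article implements LR, RF, LightGBM, and XGBoost models to identify customers with high purchase intent.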

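I. Dataset

Happy Customer Bank is a mid-sized private bank that deals in all kinds of banking products, such as savings accounts, current accounts, investment products, and credit products.

The bank also cross-sells products to its existing customers, using different kinds of communication such as telephone calls, e-mail, recommendations on internet banking, and mobile banking.

In this case, Happy Customer Bank wants to cross-sell its credit cards to existing customers. The bank has already identified a set of customers who are eligible for these credit cards.

The bank now wants to identify which of those customers show higher intent towards a recommended credit card.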

Dataset: dataset

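The dataset mainly includes:

  1. Customer details (gender, age, region, etc.)

  2. Details of his/her relationship with the bank (Channel_Code, Vintage, Avg_Asset_Value, etc.)

Our task here is to build a model that can identify the customers who are interested in the credit card.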

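II. Models implemented in this article

Predictions were made with LR (logistic regression), RF (random forest), LightGBM, and XGBoost models.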
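
As a rough illustration of what comparing such models can look like, the sketch below trains scikit-learn's LogisticRegression and RandomForestClassifier and compares them by ROC AUC. The synthetic data from make_classification, the class weights, the hyperparameters, and the choice of ROC AUC as the metric are all assumptions made for illustration, not the setup used in this article. The LightGBM and XGBoost scikit-learn wrappers offer a similar fit/predict_proba interface and could be added to the same dictionary if those packages are installed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (assumption: NOT the bank dataset)
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.75, 0.25], random_state=42,
)

# Stratify so both splits keep the same class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42,
)

# Hypothetical model settings, chosen only to keep the example small
models = {
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=100, random_state=42),
}

auc_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Probability of the positive class (the "lead" class in this sketch)
    proba = model.predict_proba(X_test)[:, 1]
    auc_scores[name] = roc_auc_score(y_test, proba)

for name, score in auc_scores.items():
    print(f"{name}: ROC AUC = {score:.3f}")
```

A ranking metric such as ROC AUC is used here because, with an imbalanced target, plain accuracy can look good even for a model that rarely predicts a lead.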

III. Code

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
# Import dataset
df_train = pd.read_csv(r"C:\Users\Administrator\Desktop\DATA\Credit-Card-Lead-Prediction-main\train_s3TEQDk.csv")
df_train.head()
# Shape of the data
df_train.shape

# There are about 2.45 lakh (roughly 245,000) rows and 11 columns.
# Datatypes of the dataset
df_train.info()
# Five point summary for numerical variables
df_train.describe(exclude='object')

# Minimum age of the customer is found to be 23yrs and maximum age is 85yrs
# Five point summary for categorical variables
df_train.describe(include='object')

# Univariate analysis
# Count plot for the Gender variable
plt.figure(figsize=(6,5))
sns.countplot(x='Gender', data=df_train)  # pass the column via x=; recent seaborn versions require keyword arguments
plt.show()

# The dataset contains more male than female observations.
# Unique region code names
df_train['Region_Code'].unique()

# Distribution of age
plt.figure(figsize=(8,5))
sns.histplot(df_train['Age'], kde=True)  # histplot replaces the deprecated distplot
plt.show()
# The Age variable is right-skewed.
# Most customers are aged 26-28 or 46-49.

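# Distribution of Vintage (customer vintage, i.e. how long the customer has been with the bank)
plt.figure(figsize=(8,5))
sns.histplot(df_train['Vintage'], kde=True)  # histplot replaces the deprecated distplot
plt.show()
# The Vintage variable is right-skewed.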

# Occupation of customers
plt.figure(figsize=(10,5))
sns.countplot(x='Occupation', data=df_train)
plt.show()
# Most customers are self-employed; entrepreneurs are the smallest group.

# Unique channel codes
df_train['Channel_Code'].unique()
# There are 4 different channel codes in the dataset.

# Credit product held by customers
plt.figure(figsize=(10,5))
sns.countplot(x='Credit_Product', data=df_train)
plt.show()
# Most customers do not have a credit product.

# Customer activity status
plt.figure(figsize=(10,5))
sns.countplot(x='Is_Active', data=df_train)
plt.show()
# Most customers were not active in the last 3 months.

# Customers' interest in purchasing the credit card product
plt.figure(figsize=(10,5))
sns.countplot(x='Is_Lead', data=df_train)
plt.show()
# Relatively few customers show interest in buying the credit card product.
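
A count plot shows the imbalance only visually. As a small sketch, the class shares can also be read off as numbers with value_counts(normalize=True). The toy DataFrame below uses made-up values and is an assumption for illustration only; with the real data the same call would be applied to df_train['Is_Lead'].

```python
import pandas as pd

# Toy target column with hypothetical values (NOT the real data)
toy = pd.DataFrame({"Is_Lead": [0, 0, 0, 1, 0, 1, 0, 0]})

# normalize=True returns the proportion of each class instead of raw counts
class_share = toy["Is_Lead"].value_counts(normalize=True)
print(class_share)
```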

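# Bivariate analysis
# Gender against the target
# Pass figsize to plot() directly; a separate plt.figure() call would leave an empty extra figure
pd.crosstab(df_train['Gender'], df_train['Is_Lead']).plot(kind='bar', figsize=(15,5))
plt.show()
# In absolute counts, more males than females are interested in buying the credit card.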
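
Raw counts in a crosstab also reflect how many customers of each gender are in the data, so a larger bar does not by itself mean a higher lead rate. One way to check is to normalize each row, as in the sketch below. The toy data is hypothetical and chosen only to make the difference between counts and rates visible; it says nothing about the real dataset.

```python
import pandas as pd

# Hypothetical toy data (NOT the real dataset)
toy = pd.DataFrame({
    "Gender": ["Male", "Male", "Male", "Male", "Female", "Female"],
    "Is_Lead": [1, 0, 0, 0, 1, 0],
})

# Absolute counts per gender and target value
counts = pd.crosstab(toy["Gender"], toy["Is_Lead"])

# normalize="index" divides each row by its total, giving the lead rate per gender
rates = pd.crosstab(toy["Gender"], toy["Is_Lead"], normalize="index")

print(counts)
print(rates)
```

In this toy example both genders have the same number of leads, yet the lead rate differs because the group sizes differ.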

df_t