08_行销（Marketing）预测生命周期价值（CLV）

最新推荐文章于 2022-05-17 19:29:59 发布

Vivian Ouyang

最新推荐文章于 2022-05-17 19:29:59 发布

阅读量989

点赞数

分类专栏：行销学（Marketing）中的数据科学文章标签： python 机器学习

本文链接：https://blog.csdn.net/oyww710/article/details/106157868

版权

本文探讨了在营销策略中客户生命周期价值（CLV）的重要性，并介绍了如何利用Python建立回归模型预测CLV。通过分析在线零售数据集，进行了数据清理和预处理，构建了样本集，并应用线性回归模型进行建模。评估指标包括R²和中位绝对误差。

摘要由CSDN通过智能技术生成

行销（Marketing）预测生命周期价值（CLV）

在营销中，对营销活动的预算始终是挑战。我们不想花太多钱而导致ROI下降。但是，我们也不想花费太多，也没有明显的影响或结果。在确定营销策略的预算时，至关重要的是要知道运行给定的营销活动会带来多少预期回报。了解单个客户的客户生命周期价值（CLV）可以帮助营销人员证明其营销预算的合理性，并定位潜在的高价值客户。通常，获取新客户要比保留现有客户更为昂贵，因此，为了建立具有正ROI的营销策略，了解生命周期价值和与获取新客户相关的成本至关重要。例如，如果客户的平均CLV为100元，而获得新客户仅需花费10元，那么在获得新客户时，我们的业务就会产生更多的收入。

有多种方法可以计算CLV。一种方法是找到客户的平均购买金额，购买频率和寿命，并进行简单的计算以获得CLV。例如，考虑一个假设的案例，客户的平均购买金额为100，并且他/她平均每月进行五次购买。则该客户每月的平均价值为500美元，这就是平均购买金额乘以平均购买频率。现在，我们需要知道该客户的生命周期。估算客户生命周期的一种方法是查看平均每月客户流失率，即客户离开并终止与公司的业务关系的客户所占的百分比。您可以用客户流失率除以客户的生命周期。假设在我们假设的情况下，客户流失率为5％，则估计客户的生命周期为20年（1/5%)。给定该客户每月的平均价值为500元，寿命为20年，则该客户的CLV为120,000。最终的CLV金额是通过将500（每月的平均值）乘以12个月以及20年的寿命来计算的。

在本文中，我会建立一个回归模型来预测连续的变量CLV。我会使用一个在线零售的数据集，仍然是来自Kaggle，数据集是OnlineRetail.csv 。

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

/kaggle/input/onlineretail/OnlineRetail.csv

Load Packages

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, median_absolute_error

%matplotlib inline

Load the data

df=pd.read_csv(r"../input/onlineretail/OnlineRetail.csv", encoding="cp1252")

df.head(3)

	InvoiceNo	StockCode	Description	Quantity	InvoiceDate	UnitPrice	CustomerID	Country
0	536365	85123A	WHITE HANGING HEART T-LIGHT HOLDER	6	12/1/2010 8:26	2.55	17850.0	United Kingdom
1	536365	71053	WHITE METAL LANTERN	6	12/1/2010 8:26	3.39	17850.0	United Kingdom
2	536365	84406B	CREAM CUPID HEARTS COAT HANGER	8	12/1/2010 8:26	2.75	17850.0	United Kingdom

df.shape

(541909, 8)

Data Clean-Up

Negative Quantity

df.loc[df['Quantity'] <= 0].shape

(10624, 8)

df = df.loc[df['Quantity'] > 0]

Missing CustomerID

df = df[pd.notnull(df['CustomerID'])]

Excluding Incomplete Month

print('Date Range: %s ~ %s' % (df['InvoiceDate'].min(), df['InvoiceDate'].max()))

Date Range: 1/10/2011 10:32 ~ 9/9/2011 9:52

df = df.loc[df['InvoiceDate'] < '2011-12-01']

Total Sales

df['Sales'] = df['Quantity'] * df['UnitPrice']

Per Order Data


orders_df = df.groupby(

最低0.47元/天解锁文章

Vivian Ouyang

关注

0
点赞
踩
3

收藏

觉得还不错? 一键收藏
0
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫

专栏目录

08_行销（Marketing）预测生命周期价值 （CLV）

行销（Marketing）预测生命周期价值 （CLV）

Load Packages

Load the data

Data Clean-Up

08_行销（Marketing）预测生命周期价值（CLV）

行销（Marketing）预测生命周期价值（CLV）