Prophet Code in Practice (Part 1): Tuning the Trend Component

Author: 现实、狠残酷

Last modified: 2023-02-03 09:44:01

First published: 2022-10-10 16:01:06

Article link: https://blog.csdn.net/qq_34184505/article/details/127241238

Prophet Quick Start

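Limitations of Prophet

The observation at each time t can only be modeled as Gaussian-distributed.
It cannot efficiently handle a large number of related time series.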

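Data format

The input to Prophet must be a data frame with two columns: ds and y.

The ds column must contain dates (YYYY-MM-DD) or timestamps (YYYY-MM-DD HH:MM:SS).
The y column must be numeric and holds the quantity we want to forecast.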
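As a rough illustration of the two-column requirement, the sketch below reshapes an arbitrary table into the ds/y layout with pandas. The helper name to_prophet_frame and the input column names date and page_views_log are hypothetical, chosen only for this example; the three values are taken from the tail of the dataset shown in this article.

```python
import pandas as pd


def to_prophet_frame(raw: pd.DataFrame, date_col: str, value_col: str) -> pd.DataFrame:
    """Return a new frame with exactly two columns: ds (datetime) and y (numeric)."""
    out = raw[[date_col, value_col]].rename(columns={date_col: "ds", value_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])  # accepts YYYY-MM-DD or YYYY-MM-DD HH:MM:SS
    out["y"] = pd.to_numeric(out["y"])     # raises if a value cannot be read as a number
    return out.sort_values("ds").reset_index(drop=True)


# Hypothetical raw table with differently named columns and string-typed values
raw = pd.DataFrame({
    "date": ["2016-01-19", "2016-01-18", "2016-01-20"],
    "page_views_log": ["9.125871", "10.333775", "8.891374"],
})
df_example = to_prophet_frame(raw, "date", "page_views_log")
print(df_example)
```

Sorting by ds is not something the source states Prophet requires; it is done here only to keep the example output easy to read.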

Download location for example_wp_log_peyton_manning.csv:

import pandas as pd
from prophet import Prophet

# Load the dataset
df = pd.read_csv('data/example_wp_log_peyton_manning.csv')

print(df.tail(5))
"""
              ds          y
2900  2016-01-16   7.817223
2901  2016-01-17   9.273878
2902  2016-01-18  10.333775
2903  2016-01-19   9.125871
2904  2016-01-20   8.891374
"""

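Modeling workflow

The helper method Prophet.make_future_dataframe extends the historical dates by a given number of days and returns a data frame in the format Prophet expects.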

m = Prophet()
m.fit(df)
# Build the data frame of dates to predict; periods=365 appends 365 days after the last historical date
horizon = 365
future = m.make_future_dataframe(periods=horizon)
print(future.tail(5))
"""
             ds
3265 2017-01-15
3266 2017-01-16
3267 2017-01-17
3268 2017-01-18
3269 2017-01-19
"""
# Predict
forecast = m.predict(future)
# Pass the forecast data frame to Prophet.plot to plot the forecast.
fig1 = m.plot(forecast)
# Prophet.plot_components plots the forecast broken down into its components.
fig2 = m.plot_components(forecast)
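For intuition about the date-extension step in the workflow above, here is a minimal pandas sketch of the same idea: keep the historical dates and append a given number of daily dates after the last one. The function name extend_dates is hypothetical, and this is only an approximation of what make_future_dataframe produces, not its actual implementation.

```python
import pandas as pd


def extend_dates(history_ds: pd.Series, periods: int, freq: str = "D") -> pd.DataFrame:
    """Historical dates followed by `periods` new dates after the last observed one."""
    last = history_ds.max()
    # Generate periods + 1 points starting at the last date, then drop that first point
    new_dates = pd.date_range(start=last, periods=periods + 1, freq=freq)[1:]
    all_dates = pd.concat([history_ds, pd.Series(new_dates)], ignore_index=True)
    return pd.DataFrame({"ds": all_dates})


# The last five historical dates from the example dataset
history = pd.Series(pd.date_range("2016-01-16", "2016-01-20", freq="D"))
future_example = extend_dates(history, periods=365)
print(future_example.tail(3))
```

With the history ending on 2016-01-20 and 365 extra days, the last row is 2017-01-19, which agrees with the future.tail(5) output shown above.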

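To inspect the components of the forecast, use the Prophet.plot_components method. By default it shows the trend plus the yearly and weekly seasonality of the series. If holidays were included in the model, they are shown as well.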

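Prophet in Detail

Another article, "Prophet算法简介" (Introduction to the Prophet Algorithm), explains that Prophet is an additive (or multiplicative) model that decomposes into a trend term, a seasonality term, external variables (holidays), and an error (random) term. The following sections cover, one by one, how to set or tune each of these components.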
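The decomposition just described can be written as a one-line formula. The sketch below combines a trend value with additive and multiplicative terms the way Prophet's forecast is usually described: yhat = trend * (1 + multiplicative_terms) + additive_terms. The function name combine_components and all numbers are made up for illustration; the error term is left out because it is not part of the point forecast.

```python
def combine_components(trend, additive_terms=0.0, multiplicative_terms=0.0):
    """Point forecast from a trend value plus seasonal/holiday effects."""
    return trend * (1.0 + multiplicative_terms) + additive_terms


# Additive model: weekly effect +0.5 and holiday effect -0.2 are summed onto the trend
additive_yhat = combine_components(trend=8.0, additive_terms=0.5 + (-0.2))

# Multiplicative model: a +10% seasonal effect scales the trend
multiplicative_yhat = combine_components(trend=8.0, multiplicative_terms=0.10)

print(additive_yhat, multiplicative_yhat)
```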

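Trend

Set the trend type with "growth":

Linear: "linear", the default. A linear trend, optionally piecewise linear.
Logistic: "logistic". A logistic trend, optionally piecewise logistic.
No trend: "flat". A constant trend.

Linear trend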

m = Prophet(growth="linear")
m.fit(df)
horizon = 365
future = m.make_future_dataframe(periods=horizon)
forecast = m.predict(future)
fig = m.plot_components(forecast[["ds", "trend", "trend_upper", "trend_lower"]])
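To make "piecewise linear" concrete, the sketch below evaluates a trend whose slope changes at given changepoints while the line itself stays continuous. It follows the usual formulation (base rate k, offset m, rate changes delta at changepoint times s, with the offset adjusted by -s * delta); the function name and the numbers are assumptions for illustration, and time is on an arbitrary 0 to 1 scale.

```python
def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend at time t with base slope k, offset m, and slope changes at changepoints."""
    slope, offset = k, m
    for s, delta in zip(changepoints, deltas):
        if t >= s:
            slope += delta        # the growth rate changes at s
            offset -= s * delta   # offset correction keeps the trend continuous at s
    return slope * t + offset


# Slope 1.0 until t = 0.5, then slope 0.5 afterwards
for t in (0.25, 0.5, 1.0):
    print(t, piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=[0.5], deltas=[-0.5]))
```

The value at the changepoint (0.5) is the same whether it is approached from the left or computed with the new slope, which is what the offset correction is for.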

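Logistic trend

A "cap" column, the upper bound of the trend, must be provided in both the training data frame and the future (prediction) data frame.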

m = Prophet(growth="logistic")

df["cap"] = 10
m.fit(df)

horizon = 365
future = m.make_future_dataframe(periods=horizon)

future["cap"] = 10
forecast = m.predict(future)

fig = m.plot_components(forecast[["ds", "trend", "trend_upper", "trend_lower"]])
fig2 = m.plot(forecast)

df.drop(columns=["cap"], inplace=True)
future.drop(columns=["cap"], inplace=True)

If needed, a "floor" column, the lower bound of the trend, can also be provided.

m = Prophet(growth="logistic")

df["cap"] = 10
df["floor"] = 6
m.fit(df)

horizon = 365
future = m.make_future_dataframe(periods=horizon)

future["cap"] = 10
future["floor"] = 6
forecast = m.predict(future)

fig = m.plot_components(forecast[["ds", "trend", "trend_upper", "trend_lower"]])
fig2 = m.plot(forecast)

df.drop(columns=["cap", "floor"], inplace=True)
future.drop(columns=["cap", "floor"], inplace=True)
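The role of cap and floor in the two examples above can be seen from the shape of a logistic curve. The sketch below is a simplified version with a constant cap, a constant floor, and no changepoints; the function name and the rate k and midpoint m values are assumptions chosen for illustration, not fitted parameters.

```python
import math


def logistic_trend(t, k, m, cap, floor=0.0):
    """Saturating trend that stays between floor and cap; t = m is the midpoint."""
    return floor + (cap - floor) / (1.0 + math.exp(-k * (t - m)))


# Same bounds as the example above: cap = 10, floor = 6
for t in (-50.0, 0.0, 50.0):
    print(t, logistic_trend(t, k=2.0, m=0.0, cap=10.0, floor=6.0))
```

Far to the left the curve sits at the floor, far to the right it saturates at the cap, and at the midpoint it is halfway between the two.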

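No trend

When a stationarity test whose null hypothesis is non-stationarity (for example, the ADF test) returns a small p-value, the series can be treated as stationary, and a Prophet model with no trend is worth considering.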

m = Prophet(growth="flat")
m.fit(df)
horizon = 365
future = m.make_future_dataframe(periods=horizon)
forecast = m.predict(future)
fig = m.plot_components(forecast[["ds", "trend", "trend_upper", "trend_lower"]])

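Detecting and setting trend changepoints

Detecting and visualizing changepoints

The hyperparameter changepoint_range controls where potential changepoints may be placed

changepoint_range is a decimal between 0 and 1 giving the leading fraction of the training history in which changepoints may occur; the default is 0.8.
The larger changepoint_range is, the more closely the model can follow recent data, and the more prone it is to overfitting.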

m = Prophet()
m.fit(df)
horizon = 365
future = m.make_future_dataframe(periods=horizon)
forecast = m.predict(future)

from prophet.plot import add_changepoints_to_plot
fig = m.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), m, forecast)

Allow the trend to change within the first 90% of the training data:

m = Prophet(changepoint_range=0.9)  # changepoints may fall in the first 90% of the training data
m.fit(df)
horizon = 365
future = m.make_future_dataframe(periods=horizon)
forecast = m.predict(future)

from prophet.plot import add_changepoints_to_plot
fig = m.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), m, forecast)

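The hyperparameter n_changepoints controls the number of potential changepoints

n_changepoints is an integer: within the range set by changepoint_range, n_changepoints evenly spaced time points are chosen as potential changepoints; the default is 25.
The larger n_changepoints is, the more the piecewise-linear trend can bend, the more closely the model fits the training set, and the more prone it is to overfitting.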

m = Prophet(n_changepoints=50) 
m.fit(df)
horizon = 365
future = m.make_future_dataframe(periods=horizon)
forecast = m.predict(future)

from prophet.plot import add_changepoints_to_plot
fig = m.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), m, forecast)
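How changepoint_range and n_changepoints interact can be sketched without Prophet: take the leading fraction of the rows, then spread the candidate changepoints evenly over it. The function below reflects my understanding of how the candidate positions are spaced and should be read as an assumption rather than the library's exact code; the function name is hypothetical, and 2905 is the number of rows in the example dataset.

```python
import math


def candidate_changepoint_indexes(n_rows, n_changepoints=25, changepoint_range=0.8):
    """Row indexes of evenly spaced candidate changepoints in the leading part of the history."""
    hist_size = math.floor(n_rows * changepoint_range)
    n = min(n_changepoints, hist_size - 1)
    if n <= 0:
        return []
    # n + 1 evenly spaced positions over [0, hist_size - 1]; position 0 is dropped
    return [round(i * (hist_size - 1) / n) for i in range(1, n + 1)]


default_idx = candidate_changepoint_indexes(2905)
wider_idx = candidate_changepoint_indexes(2905, changepoint_range=0.9)
denser_idx = candidate_changepoint_indexes(2905, n_changepoints=50)
print(len(default_idx), default_idx[0], default_idx[-1])
print(len(wider_idx), wider_idx[-1])
print(len(denser_idx), denser_idx[-1])
```

Raising changepoint_range moves the last candidate later in the history, while raising n_changepoints only makes the grid denser within the same range.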

Specifying changepoint locations with changepoints

When the user supplies changepoints, n_changepoints and changepoint_range are ignored.

m = Prophet(changepoints=["2012-01-01", "2014-01-01"])  # place changepoints manually at these two dates
m.fit(df)
horizon = 365
future = m.make_future_dataframe(periods=horizon)
forecast = m.predict(future)

from prophet.plot import add_changepoints_to_plot
fig = m.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), m, forecast)

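The hyperparameter changepoint_prior_scale controls the prior on rate changes

changepoint_prior_scale is a decimal greater than 0 that sets the scale of the prior on how much the growth rate changes between the trend segments before and after a changepoint; the default is 0.05.
The larger changepoint_prior_scale is, the more the trend line fluctuates, the more closely the model fits the training set, and the more prone it is to overfitting.
In general, n_changepoints can be left fixed, and changepoint_prior_scale alone is tuned to control how flexibly the trend changes.

m = Prophet(changepoint_prior_scale=0.5)  # looser prior: larger rate changes are allowed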
m.fit(df)
horizon = 365
future = m.make_future_dataframe(periods=horizon)
forecast = m.predict(future)

from prophet.plot import add_changepoints_to_plot
fig = m.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), m, forecast)
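Why a larger changepoint_prior_scale gives a more flexible trend can be seen from the prior itself. Prophet places a Laplace prior centered at zero on each rate change, and the sketch below compares how strongly the same rate change is penalized under the default scale 0.05 and under 0.5. The function names and the rate change of 0.3 are illustrative assumptions; in the real model the penalty applies to rate changes on Prophet's internally scaled data.

```python
import math


def laplace_neg_log_density(delta, scale):
    """-log p(delta) for delta ~ Laplace(0, scale)."""
    return math.log(2.0 * scale) + abs(delta) / scale


def rate_change_penalty(delta, scale):
    """Extra cost of a rate change delta compared with no change at all."""
    return laplace_neg_log_density(delta, scale) - laplace_neg_log_density(0.0, scale)


tight = rate_change_penalty(0.3, scale=0.05)  # default prior scale
loose = rate_change_penalty(0.3, scale=0.5)   # scale used in the example above
print(tight, loose)
```

The same rate change costs ten times less under the looser prior, so the fit is freer to bend the trend at each candidate changepoint.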

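Uncertainty (confidence) interval of the trend

Hyperparameter interval_width

A decimal between 0 and 1 giving the width of the uncertainty interval to compute; the default is 0.8.

m = Prophet(interval_width=0.95)  # report 95% uncertainty intervals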
m.fit(df)
horizon = 365
future = m.make_future_dataframe(periods=horizon)
forecast = m.predict(future)
fig = m.plot(forecast)
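As a sketch of what interval_width means numerically: a width w corresponds to the band between the (1 - w) / 2 and (1 + w) / 2 quantiles of a set of simulated values. The code below computes those percentile levels and applies them to a stand-in sample; the function names, the simple interpolating percentile helper, and the sample values 0 to 100 are assumptions for illustration, not Prophet internals.

```python
def interval_percentiles(interval_width):
    """Lower and upper percentile levels (0-100) for a central interval of the given width."""
    lower = 100.0 * (1.0 - interval_width) / 2.0
    upper = 100.0 * (1.0 + interval_width) / 2.0
    return lower, upper


def percentile(sorted_values, p):
    """Linear-interpolation percentile of an already sorted list, with 0 <= p <= 100."""
    pos = (len(sorted_values) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


samples = [float(v) for v in range(101)]  # stand-in for simulated values: 0, 1, ..., 100
bounds = {}
for width in (0.8, 0.95):
    lo_p, hi_p = interval_percentiles(width)
    bounds[width] = (percentile(samples, lo_p), percentile(samples, hi_p))
    print(width, bounds[width])
```

Moving from 0.8 to 0.95 widens the band on both sides, which is why the shaded region in the forecast plot grows when interval_width is increased.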