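Data format
Convert the input table below into two columns named ds and y, where ds holds the timestamp and y holds the value. There is no deeper reason for this; it is simply what the Prophet framework requires (😓)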
time | abc (any column name) |
---|---|
2010/1/1 | 10.1 |
… | |
2010/12/1 | 10.1 |
2011/1/1 | 10.1 |
… | |
2011/12/1 | 10.1 |
… | |
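As a rough illustration of that conversion, the sketch below renames the two columns with pandas and parses the dates. The sample rows are hypothetical stand-ins for the table above, and "abc" is just the placeholder column name used there.

```python
import pandas as pd

# Hypothetical sample mirroring the table above; "abc" stands in for any value column.
raw = pd.DataFrame({
    "time": ["2010/1/1", "2010/2/1", "2010/3/1"],
    "abc": [10.1, 10.1, 10.1],
})

# Rename to the column names Prophet expects and parse the dates.
df = raw.rename(columns={"time": "ds", "abc": "y"})
df["ds"] = pd.to_datetime(df["ds"], format="%Y/%m/%d")

print(df)
```

Parsing ds up front is optional here, since Prophet also accepts date strings, but it makes malformed dates fail early.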
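Computing the ACF
Import the function directly with from statsmodels.graphics.tsaplots import plot_acf, then call plot_acf on the series.
Judging from the plot, the result looks reasonable, so the next step is to forecast with Prophet.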
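To show what the ACF plot is measuring, here is a minimal NumPy sketch that computes the sample autocorrelation by hand using the usual biased estimator. The helper name sample_acf and the synthetic 12-month sine series are assumptions made for illustration; they are not part of the original data.

```python
import numpy as np

def sample_acf(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )

# Hypothetical monthly series: a pure 12-month cycle over 4 years.
t = np.arange(48)
series = 10 + np.sin(2 * np.pi * t / 12)

acf = sample_acf(series, max_lag=12)
print(np.round(acf[[0, 6, 12]], 2))
```

A strong positive value at lag 12 and a strong negative value at lag 6 is the kind of pattern that suggests yearly seasonality in monthly data.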
Forecasting with Prophet
m = Prophet(growth='logistic', yearly_seasonality=2, changepoint_prior_scale=2, seasonality_mode='additive')
A few outliers show up, but apart from those the fit looks good.
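To make yearly_seasonality=2 more concrete: Prophet models seasonality with a truncated Fourier series, and this setting chooses how many sine/cosine pairs are used. The sketch below builds such features by hand with NumPy. The function name fourier_features is hypothetical, and this only illustrates the idea; it is not Prophet's internal code.

```python
import numpy as np

def fourier_features(days, period=365.25, order=2):
    """Return columns sin(2*pi*n*t/P), cos(2*pi*n*t/P) for n = 1..order."""
    t = np.asarray(days, dtype=float)
    cols = []
    for n in range(1, order + 1):
        angle = 2.0 * np.pi * n * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)

# Two years of daily time stamps, measured in days from the start.
days = np.arange(0, 730)
X = fourier_features(days)
print(X.shape)  # (730, 4)
```

With order 2 there are only four seasonal columns, so the yearly curve stays smooth; a higher order allows faster-changing seasonal shapes at the cost of a greater risk of overfitting.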
Full code
import pandas as pd
import matplotlib.pyplot as plt
from fbprophet import Prophet  # from Prophet 1.0 onward the package is named `prophet`
import numpy as np
from fbprophet.plot import plot_yearly
from matplotlib.font_manager import FontProperties
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

def predict(history_cases, Period):
    # Build the two-column frame Prophet expects: ds (date) and y (value)
    df = pd.DataFrame()
    ds = [k for k in history_cases]
    y = [history_cases[k] for k in history_cases]
    df["ds"] = ds
    df["y"] = y
    temp = np.array(y[2:])
    print(temp)
    plot_acf(temp)  # inspect the autocorrelation function (ACF) to judge whether the series is seasonal
    # Train on everything except the last three points; those are held out to validate the forecast
    tran_test_index = len(df) - 3
    df_train = df[:tran_test_index].copy()  # copy so that adding 'cap' below does not write into a slice of df
    y_min = 0
    y_max = df["y"].max() + 3
    # seasonality_prior_scale =12,
    # growth='logistic': logistic growth requires an upper bound (the 'cap' column added below)
    # yearly_seasonality=2: use a Fourier series of order 2 for the yearly seasonality
    # changepoint_prior_scale: controls how flexibly the trend fits the data
    # seasonality_mode='multiplicative' or 'additive': multiplicative mode is for cases where
    # the forecast seasonality is too large at the start of the series and too small at the end
    m = Prophet(growth='logistic', yearly_seasonality=2, changepoint_prior_scale=2,
                seasonality_mode='additive')
    df_train['cap'] = y_max
    m.fit(df_train)
    future = m.make_future_dataframe(periods=Period, freq='MS')
    future['cap'] = y_max
    forecast = m.predict(future)
    predictions = list(forecast['yhat'][-Period:])
    # Make sure the predictions stay inside [y_min, y_max]
    for i in range(len(predictions)):
        if predictions[i] > y_max:
            predictions[i] = y_max
        if predictions[i] < y_min:
            predictions[i] = y_min
    # Plot the fit together with the held-out points and the predicted points
    test_x = list(pd.to_datetime(df[tran_test_index:tran_test_index + Period]["ds"]))
    test_y = list(df[tran_test_index:tran_test_index + Period]["y"])
    fore_fig = m.plot(forecast)
    x_1 = forecast['ds'][-Period:]
    plt.scatter(test_x, test_y, color="red", s=150, label="truth")
    plt.scatter(x_1, predictions, color="green", marker='^', s=150, label="prediction")
    plt.legend()
    plt.show()
    print(predictions)
    return predictions

df = pd.read_csv("abc.csv", encoding="gbk")
df = df.interpolate(method='akima')  # fill missing values by Akima interpolation (requires SciPy)
# df.fillna(0,inplace=True)
# df.dropna(axis=0,how='any')
# df = df.dropna(axis=0,how='any')
bi_history_cases = {}
location = 'abc'
Period = 2
for i in range(len(df)):
    bi_history_cases[str(df["time"][i])] = df[location][i]  # add the point to the dictionary
preds = predict(bi_history_cases, Period)
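The clamping loop in predict can also be written as one vectorized call, and the held-out points can be scored numerically instead of only by eye. The sketch below shows one way to do both with NumPy. All numbers are hypothetical stand-ins for forecast['yhat'][-Period:] and the held-out truth, and mean absolute error is just one possible metric; the original code does not compute any.

```python
import numpy as np

# Hypothetical numbers standing in for the raw forecast and the held-out truth.
raw_predictions = np.array([-0.4, 9.8, 14.2])
truth = np.array([0.5, 10.1, 12.0])
y_min, y_max = 0.0, 13.1

# Clamp every prediction into [y_min, y_max] in one vectorized call.
predictions = np.clip(raw_predictions, y_min, y_max)

# Mean absolute error over the held-out points.
mae = np.mean(np.abs(predictions - truth))
print(predictions, round(float(mae), 3))
```

Clamping hides how far the model overshoots the bounds, so it may be worth looking at the raw values as well before clipping them.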
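References
A brief introduction to autocorrelation and partial autocorrelation (自相关与偏自相关的简单介绍)
A study of Prophet, Facebook's time series forecasting algorithm (Facebook 时间序列预测算法 Prophet 的研究), in my opinion the best Prophet answer on Zhihu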