Time series example: out-of-sample forecasts with ARIMA, one-step or multi-step

https://machinelearningmastery.com/make-sample-forecasts-arima-python/

1. Split the data into training and test sets. Here the last 7 days of temperatures are used as the test set.

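# split the dataset
from pandas import read_csv
# Series.from_csv was removed in pandas 1.0; read_csv plus squeeze returns the same Series
series = read_csv('daily-minimum-temperatures.csv', header=0, index_col=0, parse_dates=True).squeeze('columns')
split_point = len(series) - 7
dataset, validation = series[0:split_point], series[split_point:]
print('Dataset %d, Validation %d' % (len(dataset), len(validation)))
# write without a header row so the files can be read back with header=None
dataset.to_csv('dataset.csv', header=False)
validation.to_csv('validation.csv', header=False)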
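
The same hold-out split can be sketched on a plain Python list, which makes the slicing easy to check by hand. The helper name `split_last_n` and the toy values are assumptions for illustration and are not part of the original tutorial.

```python
def split_last_n(values, n_holdout):
    """Return (training, validation), holding out the last n_holdout items."""
    if not 0 < n_holdout < len(values):
        raise ValueError("n_holdout must be between 1 and len(values) - 1")
    split_point = len(values) - n_holdout
    return values[:split_point], values[split_point:]


# Toy data standing in for daily temperatures (made-up values).
temps = [10.0, 11.5, 9.0, 12.0, 13.5, 12.5, 11.0, 10.5, 9.5, 14.0]
train, validation = split_last_n(temps, 7)
print('Dataset %d, Validation %d' % (len(train), len(validation)))
```

Slicing at `len(values) - n_holdout` keeps the time order intact, which matters because a random split would leak future values into the training set.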

2.1 One-step forecast with forecast()

The result of the forecast() function is an array containing the forecast value, the standard error of the forecast, and the confidence interval information. Now, we are only interested in the first element of this forecast, as follows.

from pandas import read_csv
# legacy ARIMA class, removed in statsmodels 0.13
from statsmodels.tsa.arima_model import ARIMA
import numpy

# create a differenced series
def difference(dataset, interval=1):
	diff = list()
	for i in range(interval, len(dataset)):
		value = dataset[i] - dataset[i - interval]
		diff.append(value)
	return numpy.array(diff)

# invert a differenced value: what was subtracted must be added back before computing MSE to judge the forecast
# history[-interval] is the value `interval` positions from the end
def inverse_difference(history, yhat, interval=1):
	return yhat + history[-interval]


# load dataset (Series.from_csv was removed in pandas 1.0)
series = read_csv('dataset.csv', header=None, index_col=0, parse_dates=True).squeeze('columns')
# seasonal difference
X = series.values
days_in_year = 365
differenced = difference(X, days_in_year)
# fit model
model = ARIMA(differenced, order=(7,0,1))
model_fit = model.fit(disp=0)
# one-step out-of-sample forecast
forecast = model_fit.forecast()[0]
# invert the differenced forecast to something usable
forecast = inverse_difference(X, forecast, days_in_year)
print('Forecast: %f' % forecast)

Output: Forecast: 14.861669

This result can then be compared with the test set.
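
To see why the forecast has to be inverted before it is compared with the test set, here is a small round-trip sketch of seasonal differencing. It uses plain lists, a toy period of 4 instead of 365, and a made-up differenced forecast; all values are assumptions for illustration.

```python
def difference(values, interval=1):
    """Subtract the value `interval` steps earlier from each observation."""
    return [values[i] - values[i - interval] for i in range(interval, len(values))]


def inverse_difference(history, yhat, interval=1):
    """Add back the observation `interval` steps before the forecast step."""
    return yhat + history[-interval]


# Toy series with a period of 4 (stand-in for days_in_year = 365).
period = 4
observed = [10, 12, 15, 11, 11, 14, 16, 13]
diffed = difference(observed, period)

# Pretend the model forecasts a differenced value of 2 for the next step.
yhat_diff = 2
forecast = inverse_difference(observed, yhat_diff, period)
print(diffed, forecast)  # [1, 2, 1, 2] 13
```

The model only ever sees `diffed`, so its raw output is a change relative to the same position one period earlier, not a temperature.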

2.2 One-step forecast with predict()

The statsmodels ARIMAResults object also provides a predict() function for making forecasts.

The predict function can be used to predict arbitrary in-sample and out-of-sample time steps, including the next out-of-sample forecast time step.

The predict function requires a start and an end to be specified. These can be the indexes of the time steps relative to the beginning of the training data used to fit the model, for example:

 

# one-step out of sample forecast
start_index = len(differenced)
end_index = len(differenced)
forecast = model_fit.predict(start=start_index, end=end_index)
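
The index arithmetic can be written out as a small helper. This is a sketch only: `out_of_sample_range` is a hypothetical name, and 100 is an arbitrary training length rather than the real length of `differenced`.

```python
def out_of_sample_range(n_train, steps):
    """Return inclusive (start, end) indexes for the next `steps` forecasts.

    Indexes count from 0 at the first training observation, so the last
    training point is n_train - 1 and the first out-of-sample step is n_train.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    start = n_train
    end = n_train + steps - 1
    return start, end


print(out_of_sample_range(100, 1))  # (100, 100)
print(out_of_sample_range(100, 7))  # (100, 106)
```

Because both ends are inclusive, a one-step forecast has start equal to end, which matches the snippet above.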

The start and end can also be a datetime string or a “datetime” type; for example:

 

start_index = '1990-12-25'
end_index = '1990-12-25'
forecast = model_fit.predict(start=start_index, end=end_index)

 

# pandas.datetime was removed in pandas 2.0; use the standard library class
from datetime import datetime
start_index = datetime(1990, 12, 25)
end_index = datetime(1990, 12, 26)
forecast = model_fit.predict(start=start_index, end=end_index)
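
For daily data with no gaps, a date and an integer index describe the same time step. The sketch below shows the conversion, assuming that 1990-12-25 is the first out-of-sample day and that the model was fitted on 100 observations; `date_to_index` and both of those values are assumptions for illustration.

```python
from datetime import date


def date_to_index(target, first_forecast_date, n_train):
    """Map a calendar date to a time-step index for gap-free daily data.

    n_train is the number of observations used to fit the model, so
    first_forecast_date (the day after the last training day) maps to n_train.
    """
    offset = (target - first_forecast_date).days
    return n_train + offset


# Assumed values: 100 training observations, first forecast on 1990-12-25.
first_forecast = date(1990, 12, 25)
print(date_to_index(date(1990, 12, 25), first_forecast, 100))  # 100
print(date_to_index(date(1990, 12, 31), first_forecast, 100))  # 106
```

Dates before `first_forecast_date` give indexes below `n_train`, which correspond to in-sample time steps.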

from pandas import read_csv
# legacy ARIMA class, removed in statsmodels 0.13
from statsmodels.tsa.arima_model import ARIMA
import numpy

# create a differenced series
def difference(dataset, interval=1):
	diff = list()
	for i in range(interval, len(dataset)):
		value = dataset[i] - dataset[i - interval]
		diff.append(value)
	return numpy.array(diff)

# invert differenced value
def inverse_difference(history, yhat, interval=1):
	return yhat + history[-interval]

# load dataset (Series.from_csv was removed in pandas 1.0)
series = read_csv('dataset.csv', header=None, index_col=0, parse_dates=True).squeeze('columns')
# seasonal difference
X = series.values
days_in_year = 365
differenced = difference(X, days_in_year)
# fit model
model = ARIMA(differenced, order=(7,0,1))
model_fit = model.fit(disp=0)
# one-step out of sample forecast
start_index = len(differenced)
end_index = len(differenced)
forecast = model_fit.predict(start=start_index, end=end_index)
# invert the differenced forecast to something usable
forecast = inverse_difference(X, forecast, days_in_year)
print('Forecast: %f' % forecast)

Forecast: 14.861669

As this shows, predict() is more flexible because you can specify which time steps to forecast.

3.1 Multi-step forecast with forecast()

Here the inversion step needs a small change.

# multi-step out-of-sample forecast
forecast = model_fit.forecast(steps=7)[0]

# invert the differenced forecast to something usable
history = [x for x in X]
day = 1
for yhat in forecast:
	inverted = inverse_difference(history, yhat, days_in_year)
	print('Day %d: %f' % (day, inverted))
	history.append(inverted)
	day += 1

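Explanation: history[-interval] is the observation `interval` positions from the end of history. For a one-step forecast, adding history[-interval] to the forecast is enough.

For a multi-step forecast, each later step needs the observation one position closer to the end of the original series (history[-(interval - 1)] for the second step, and so on).

Because every inverted forecast is appended to history, history[-interval] moves forward by one position on its own, so the original function does not need to change.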
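
The effect of appending can be checked on a toy example. The sketch below inverts three differenced forecasts in two ways, once by appending to history and once by shifting the offset by hand, and shows that both give the same values. The interval of 3 and all numbers are made up for illustration.

```python
def inverse_difference(history, yhat, interval=1):
    return yhat + history[-interval]


interval = 3                       # stand-in for days_in_year = 365
observed = [5, 7, 9, 6, 8, 10]     # toy observations
diff_forecasts = [1, 1, 1]         # toy forecasts on the differenced scale

# Approach 1: append each inverted forecast, always use history[-interval].
history = list(observed)
by_append = []
for yhat in diff_forecasts:
    inverted = inverse_difference(history, yhat, interval)
    by_append.append(inverted)
    history.append(inverted)

# Approach 2: leave the series untouched and shift the offset by hand.
# This only works while the step number k stays below interval.
by_offset = [yhat + observed[-(interval - k)]
             for k, yhat in enumerate(diff_forecasts)]

print(by_append, by_offset)  # [7, 9, 11] [7, 9, 11]
```

Appending is the simpler option, and it also keeps working when the forecast horizon is longer than the interval, because later steps can then build on earlier inverted forecasts.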

Full code:

from pandas import read_csv
# legacy ARIMA class, removed in statsmodels 0.13
from statsmodels.tsa.arima_model import ARIMA
import numpy

# create a differenced series
def difference(dataset, interval=1):
	diff = list()
	for i in range(interval, len(dataset)):
		value = dataset[i] - dataset[i - interval]
		diff.append(value)
	return numpy.array(diff)

# invert differenced value
def inverse_difference(history, yhat, interval=1):
	return yhat + history[-interval]

# load dataset (Series.from_csv was removed in pandas 1.0)
series = read_csv('dataset.csv', header=None, index_col=0, parse_dates=True).squeeze('columns')
# seasonal difference
X = series.values
days_in_year = 365
differenced = difference(X, days_in_year)
# fit model
model = ARIMA(differenced, order=(7,0,1))
model_fit = model.fit(disp=0)
# multi-step out-of-sample forecast
forecast = model_fit.forecast(steps=7)[0]
# invert the differenced forecast to something usable
history = [x for x in X]
day = 1
for yhat in forecast:
	inverted = inverse_difference(history, yhat, days_in_year)
	print('Day %d: %f' % (day, inverted))
	history.append(inverted)
	day += 1

Day 1: 14.861669
Day 2: 15.628784
Day 3: 13.331349
Day 4: 11.722413
Day 5: 10.421523
Day 6: 14.415549
Day 7: 12.674711
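
To score these seven forecasts against the held-out week, an error measure such as RMSE can be used. The sketch below is a plain-Python version; the `observed` list holds placeholder numbers that are made up for illustration and are NOT the real validation data, which would be loaded from validation.csv.

```python
from math import sqrt


def rmse(forecasts, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(forecasts) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(f - o) ** 2 for f, o in zip(forecasts, observed)]
    return sqrt(sum(squared) / len(squared))


# The seven forecasts printed above.
forecasts = [14.861669, 15.628784, 13.331349, 11.722413,
             10.421523, 14.415549, 12.674711]
# Placeholder observations: made-up numbers, NOT the real validation set.
observed = [13.0, 14.0, 13.5, 12.0, 11.0, 14.0, 13.0]

print('RMSE: %.3f' % rmse(forecasts, observed))
```

RMSE is in the same units as the data (degrees), which makes it easier to interpret than MSE.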

3.2 Multi-step forecast with predict()

from pandas import read_csv
# legacy ARIMA class, removed in statsmodels 0.13
from statsmodels.tsa.arima_model import ARIMA
import numpy

# create a differenced series
def difference(dataset, interval=1):
	diff = list()
	for i in range(interval, len(dataset)):
		value = dataset[i] - dataset[i - interval]
		diff.append(value)
	return numpy.array(diff)

# invert differenced value
def inverse_difference(history, yhat, interval=1):
	return yhat + history[-interval]

# load dataset (Series.from_csv was removed in pandas 1.0)
series = read_csv('dataset.csv', header=None, index_col=0, parse_dates=True).squeeze('columns')
# seasonal difference
X = series.values
days_in_year = 365
differenced = difference(X, days_in_year)
# fit model
model = ARIMA(differenced, order=(7,0,1))
model_fit = model.fit(disp=0)
# multi-step out-of-sample forecast
start_index = len(differenced)
end_index = start_index + 6
forecast = model_fit.predict(start=start_index, end=end_index)
# invert the differenced forecast to something usable
history = [x for x in X]
day = 1
for yhat in forecast:
	inverted = inverse_difference(history, yhat, days_in_year)
	print('Day %d: %f' % (day, inverted))
	history.append(inverted)
	day += 1

Using time step indexes, we can specify the end index as 6 more time steps in the future; for example:

 

# multi-step out-of-sample forecast
start_index = len(differenced)
end_index = start_index + 6
forecast = model_fit.predict(start=start_index, end=end_index)

Day 1: 14.861669
Day 2: 15.628784
Day 3: 13.331349
Day 4: 11.722413
Day 5: 10.421523
Day 6: 14.415549
Day 7: 12.674711

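Note: I did not fully understand how the multi-step forecast works. My guess is that it corresponds to "model 2" discussed earlier.

For the second forecast step the true value at time t-1 is unknown, so rolling forward on real observations is no longer possible, and the model probably uses its own previous forecasts as inputs.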
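
The idea in this note, feeding earlier forecasts back in as inputs, can be sketched for a pure AR(p) model with known coefficients. This illustrates the general recursive scheme under those assumptions and is not a description of statsmodels internals; `recursive_ar_forecast` and the toy coefficients are hypothetical. For a model with moving-average terms, the unknown future errors would also be replaced by their expected value of zero.

```python
def recursive_ar_forecast(history, coefs, steps, const=0.0):
    """Multi-step forecast for an AR(p) model with known coefficients.

    coefs[0] multiplies the most recent value, coefs[1] the one before, etc.
    Each forecast is appended to the working window, so later steps use
    earlier forecasts wherever real observations are not available.
    """
    p = len(coefs)
    if len(history) < p:
        raise ValueError("need at least p observations")
    window = list(history)
    forecasts = []
    for _ in range(steps):
        recent = window[-p:][::-1]  # most recent value first
        yhat = const + sum(c * v for c, v in zip(coefs, recent))
        forecasts.append(yhat)
        window.append(yhat)  # feed the forecast back in as an input
    return forecasts


# Toy AR(1) with coefficient 0.5: each forecast is half the previous one.
print(recursive_ar_forecast([8.0], [0.5], steps=3))  # [4.0, 2.0, 1.0]
```

Because each step builds on the previous forecast rather than on a real observation, errors can accumulate as the horizon grows.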

