Sequence-to-Sequence Regression
Time Series - Auto Regression
For a stationary time series, an auto-regression model treats the value of a variable at time ‘t’ as a linear function of its values at the ‘p’ time steps preceding it. Mathematically, it can be written as:
$$y_{t} = C + \phi_{1}y_{t-1} + \phi_{2}y_{t-2} + ... + \phi_{p}y_{t-p} + \epsilon_{t}$$
Where, ‘p’ is the order of the auto-regressive model, that is, the number of lagged values used,
$C$ is a constant and $\phi_{1}, ..., \phi_{p}$ are the coefficients of the lagged values,
$\epsilon_{t}$ is white noise, and
$y_{t-1}, y_{t-2}, ..., y_{t-p}$ denote the values of the variable at the previous time periods.
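To make the equation concrete, here is a minimal sketch of the noise-free part of an AR(p) prediction. The function name `ar_one_step` and the numbers used are illustrative assumptions, not part of the original tutorial; the white-noise term is left out because it cannot be predicted.

```python
import numpy as np

def ar_one_step(history, c, phi):
    """Noise-free AR(p) prediction: c + phi_1*y_{t-1} + ... + phi_p*y_{t-p}."""
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    if p < 1:
        raise ValueError("phi must contain at least one coefficient")
    if len(history) < p:
        raise ValueError("need at least p past values")
    # Reverse the last p values so that lags[i] pairs with phi[i] (lag i + 1).
    lags = np.asarray(history[-p:], dtype=float)[::-1]
    return c + float(np.dot(phi, lags))

# Hypothetical AR(2) example: y_t = 0.5 + 0.6*y_{t-1} + 0.3*y_{t-2}
history = [1.0, 2.0, 3.0]
prediction = ar_one_step(history, c=0.5, phi=[0.6, 0.3])
print(prediction)
```

Here the prediction uses only the two most recent values (3.0 and 2.0), because the model has order p = 2.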
The value of ‘p’ can be chosen in several ways. One way to find a suitable value is to inspect the auto-correlation plot.
Note: Before doing any analysis, we should split the available data into training and test sets at an 8:2 ratio. The test data is used only to measure the accuracy of our model, and the assumption is that it is not available to us until after the predictions have been made. In a time series the order of the data points is essential, so the split must preserve that order: the earlier observations go to the training set and the later ones to the test set.
An auto-correlation plot, or correlogram, shows the correlation of a variable with itself at prior time steps. It is based on Pearson’s correlation and shades a 95% confidence band around zero. Let’s see what it looks like for the ‘temperature’ variable of our data.
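The idea behind a single point of the correlogram can be sketched by hand: correlate the series with a copy of itself shifted by k steps. The helper names and the synthetic series below are assumptions for illustration; the bound 1.96/sqrt(n) is the usual large-sample approximation for white noise, and the plotting library's own estimator and band may differ slightly from this sketch.

```python
import numpy as np

def lag_correlation(values, lag):
    """Pearson correlation between the series and itself shifted by `lag` steps."""
    y = np.asarray(values, dtype=float)
    if lag < 1 or lag >= len(y) - 1:
        raise ValueError("lag must satisfy 1 <= lag < len(values) - 1")
    return float(np.corrcoef(y[lag:], y[:-lag])[0, 1])

def white_noise_bound(n, z=1.96):
    """Approximate 95% bound on the sample autocorrelation of white noise."""
    return z / np.sqrt(n)

# Synthetic AR(1) series with coefficient 0.8 (hypothetical data).
rng = np.random.default_rng(0)
series = np.zeros(500)
for t in range(1, 500):
    series[t] = 0.8 * series[t - 1] + rng.normal()

r1 = lag_correlation(series, 1)
bound = white_noise_bound(len(series))
print(r1, bound)
```

A lag-1 correlation well above the bound indicates that the previous value carries information about the current one.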
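Showing ACP
In [141]:
# Chronological 80/20 split: the first 80% of the rows form the training set.
split = len(df) - int(0.2*len(df))
train, test = df['T'][0:split], df['T'][split:]
In [142]:
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

# Auto-correlation of the training temperatures for lags 0 to 100.
plot_acf(train, lags = 100)
plt.show()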
Lags whose correlation values lie outside the shaded blue region are considered to have a statistically significant correlation.
Translated from: https://www.tutorialspoint.com/time_series/time_series_server_auto_regression.htm