Time series preprocessing and analysis (AirPassengers)

  1. Stationarity test
    The first step in time series preprocessing is to check whether the series is stationary. Only a series that is not white noise is worth analyzing and forecasting.
    Stationary means that the series fluctuates around a constant level and that the range of fluctuation is bounded.
    Two types of stationary process can be distinguished. If the joint distribution function of a stochastic process does not change with time, the process is strictly stationary. If only its mean and autocovariance structure do not change with time, it is weakly stationary.[1]
    There are three common ways to check stationarity:
    (1) Plot the time series and judge from its trend.
    (2) Plot the ACF and PACF. For a stationary series, each of them either tails off or cuts off.
    (3) Test whether the series has a unit root. If it does, the series is non-stationary.
    Next, we analyze the AirPassengers data set as a time series.
    Build a time series:
    In a call to ts(), 1:30 creates a series of 30 values from 1 to 30, frequency = 4 sets a quarterly cycle, and start gives the starting date.
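    A minimal sketch of the call described above. The start value c(2015, 1) (first quarter of 2015) is an assumption chosen for illustration, since the original starting date is not given in the text.

    ```r
    # Build a quarterly series of 30 values: 1, 2, ..., 30.
    # NOTE: start = c(2015, 1) is a hypothetical starting date (2015 Q1).
    x <- ts(1:30, frequency = 4, start = c(2015, 1))

    print(x)       # printed as a year-by-quarter table
    frequency(x)   # number of observations per cycle
    start(x)       # first time point as c(year, quarter)
    end(x)         # last time point as c(year, quarter)
    ```

    AirPassengers itself ships with R as a monthly ts object (frequency 12, January 1949 to December 1960), so it does not need to be built by hand.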
    Install the "tseries" package; this is only needed on the first run.
    Install the "forecast" package; this is also only needed on the first run.
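    One possible way to make the installation step safe to rerun is sketched below. The helper ensure_package() is a hypothetical name introduced here for illustration; it is not part of either package.

    ```r
    # Install a package only when it is missing, then attach it.
    # ensure_package() is a hypothetical helper, not part of tseries or forecast.
    ensure_package <- function(pkg) {
      if (!requireNamespace(pkg, quietly = TRUE)) {
        install.packages(pkg)
      }
      library(pkg, character.only = TRUE)
    }

    ensure_package("tseries")    # provides adf.test()
    ensure_package("forecast")   # provides tsdisplay(), ndiffs(), auto.arima(), forecast()
    ```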
    Use the tsdisplay function to plot the series together with its ACF and PACF. These plots also help in choosing the parameters of the arima function (an ARIMA model is used later).
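    The plotting step might look like the sketch below. The plot title and the extra calls to acf() and pacf() are additions for illustration; they return the same correlations as numbers.

    ```r
    library(forecast)

    # AirPassengers is a monthly ts object that ships with R.
    # tsdisplay() draws the series on top and its ACF and PACF underneath.
    tsdisplay(AirPassengers, main = "AirPassengers")

    # The same correlations are available numerically from base R.
    ap_acf  <- acf(AirPassengers, plot = FALSE)
    ap_pacf <- pacf(AirPassengers, plot = FALSE)
    ap_acf$acf[1:6]   # element 1 is lag 0, which is always 1
    ```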
    The autocorrelation plot shows that the series is not very stationary, while the partial autocorrelation plot looks better: apart from a few spikes, it fluctuates around 0 overall. Stationarity can also be checked with another method, the unit root test.
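    A sketch of the unit root test with adf.test from tseries, using its default settings (an assumption, since the original call is not shown). By default the test regression includes a constant and a linear trend, and the lag order k is derived from the series length; the conclusion can change with k, so it is worth trying more than one value.

    ```r
    library(tseries)

    # Augmented Dickey-Fuller test.
    # H0: the series has a unit root (non-stationary).
    # H1: the series is stationary (the default alternative).
    adf_result <- adf.test(AirPassengers)
    print(adf_result)

    # A small p-value rejects H0; a large one means a unit root
    # cannot be ruled out.
    adf_result$p.value
    ```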
    adf.test was used to test for a unit root. In the autocorrelation plot, the autocorrelation coefficients do not drop to 0 quickly but tail off slowly. The unit root test further confirmed that there is a unit root, so the series is non-stationary.
    When a time series is non-stationary, the remedy is differencing. What is differencing?
    A first-order difference subtracts from each value the value one period before it. A k-step difference subtracts from each value the value k periods before it. If a time series becomes stationary after differencing, it is a difference-stationary series and can be analyzed with an ARIMA model.
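    The definitions above can be checked on a small made-up vector with base R's diff(). The numbers are illustrative only. Note that the lag argument gives the k-step difference, while the differences argument applies the first-order difference repeatedly.

    ```r
    x <- c(3, 5, 9, 15, 23)   # made-up values for illustration

    # First-order difference: x[t] - x[t-1]
    d1 <- diff(x)                    # 2 4 6 8

    # k-step difference with k = 2: x[t] - x[t-2]
    d_lag2 <- diff(x, lag = 2)       # 6 10 14

    # Second-order difference: the first-order difference applied twice
    d2 <- diff(x, differences = 2)   # 2 2 2
    ```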
    We can use ndiffs to determine how many orders of differencing are needed.
    Here we take a first-order difference and then check the result.
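    A sketch of these two steps, assuming the default settings of ndiffs (which uses the KPSS test unless told otherwise). The variable name ap_diff is chosen here for illustration.

    ```r
    library(forecast)

    # Estimate how many differences are needed (KPSS test by default).
    d <- ndiffs(AirPassengers)
    d

    # Apply a first-order difference, as in the text, and plot the result.
    ap_diff <- diff(AirPassengers, differences = 1)
    plot(ap_diff, main = "First difference of AirPassengers")
    ```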
    Check the autocorrelation (ACF) plot of the differenced series.
    Then check its partial autocorrelation (PACF) plot.
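    Both checks can be sketched with base R as follows. The plot titles and variable names are assumptions added for readability.

    ```r
    # First-order difference of the monthly series
    ap_diff <- diff(AirPassengers, differences = 1)

    # acf() and pacf() draw the plot and return the values invisibly.
    diff_acf  <- acf(ap_diff, main = "ACF of differenced AirPassengers")
    diff_pacf <- pacf(ap_diff, main = "PACF of differenced AirPassengers")
    ```

    Spikes that recur at multiples of 12 months in these plots point to seasonality that a plain first-order difference does not remove.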

Establish ARIMA model
The ARIMA model is the autoregressive integrated moving average model. It is built by first converting a non-stationary time series into a stationary one, and then regressing the dependent variable only on its own lagged values and on the present and lagged values of the random error term. The ARIMA model treats the data series that the forecast target forms over time as a random sequence. The dependence among this set of random variables reflects the continuity of the original data in time: the series is affected by external factors, but it also follows its own pattern of variation.[2]
Use the decompose() function to split the series into its components and plot them.
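A minimal sketch of the decomposition step with base R. It uses the default additive decomposition, which is an assumption because the original call is not shown.

```r
# Classical decomposition into trend, seasonal and random components.
# type = "additive" is the default; "multiplicative" is the alternative.
ap_parts <- decompose(AirPassengers)
plot(ap_parts)

# The 12 estimated monthly seasonal effects
ap_parts$figure
```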
As the figure shows, the seasonality of the time series is still very strong.
Start building the model:
To build the ARIMA model, we first use auto.arima from the forecast package, which estimates the parameters and automatically finds the best p and q values.
The summary() function can also be used to summarize the fit, including the training set error measures.
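The fitting and summary steps can be sketched as below, assuming auto.arima is called with its default settings on the original series. The variable name fit is chosen here for illustration.

```r
library(forecast)

# Search over candidate orders and keep the model with the lowest AICc
# (the default selection criterion).
fit <- auto.arima(AirPassengers)

# Coefficients, information criteria and training set error measures
summary(fit)

# Selected orders as a named vector: p, d, q, P, D, Q and the seasonal period
arimaorder(fit)
```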
It can be seen from the above results that the optimal model for fitting is ARIMA(2,1,1)(0,1,0)[12].
Use models to predict
Building the prediction model is the most important task when forecasting with a quantitative method. A prediction model describes the quantitative relationships between things in mathematical language or formulas. It reveals the inner regularity of things to some extent and serves as the direct basis for calculating the predicted values, so it has a great impact on forecast accuracy. Every specific prediction method is characterized by its own mathematical model, and there are many such methods.
With the fitted model, I chose to forecast 10 years ahead and plotted the result.
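A sketch of the forecasting step. Because the data are monthly, a 10-year horizon corresponds to h = 120 steps; refitting the model inside the block is only there to keep the example self-contained.

```r
library(forecast)

fit <- auto.arima(AirPassengers)

# Monthly data, so 10 years ahead is h = 10 * 12 = 120 steps.
fc <- forecast(fit, h = 10 * 12)

# Point forecasts with 80% and 95% prediction intervals (the defaults)
plot(fc)
```

The prediction intervals widen as the horizon grows, so the later years of a 10-year forecast should be read with caution.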
The forecast shows that the series keeps its upward trend over the next 10 years, and the shape of the curve closely resembles that of earlier years.
