Q1. What is Autoregression?
The autoregressive (AR) model is commonly used to model time-varying processes in the natural sciences, economics, finance, and other fields. AR models are usually discussed in the context of random processes and are often regarded as statistical tools for time series data.
A regression model, such as linear regression, models an output value as a linear combination of input values.
Example: y^ = b0 + b1*X1
Where y^ is the prediction, b0 and b1 are coefficients found by optimising the model on training data, and X1 is an input value.
This modelling technique can be applied to a time series by taking observations at previous time steps, called lag variables, as the input variables.
For example, we can predict the value for the next time step (t+1) given the observations at the last two time steps (t and t-1). As a regression model, this would look as follows:
X(t+1) = b0 + b1*X(t) + b2*X(t-1)
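Because the regression model uses data from the same input variable at previous time steps, it is referred to as an autoregression. The notation AR(p) refers to the autoregressive model of order p. The AR(p) model is written
X(t) = c + φ1*X(t-1) + φ2*X(t-2) + ... + φp*X(t-p) + ε(t)
where c is a constant, φ1, ..., φp are the model coefficients, and ε(t) is white noise.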
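To make the idea of regressing a series on its own lag concrete, here is a minimal sketch that fits an AR(1) model, X(t) = b0 + b1*X(t-1), by ordinary least squares. The function names and the noise-free sample data are assumptions made for illustration; real data would contain noise, and a library estimator would normally be used.

```python
def fit_ar1(series):
    """Fit X(t) = b0 + b1*X(t-1) by ordinary least squares."""
    lagged = series[:-1]  # inputs: X(t-1), the lag variable
    target = series[1:]   # outputs: X(t)
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(target) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, target))
    var = sum((x - mean_x) ** 2 for x in lagged)
    b1 = cov / var
    b0 = mean_y - b1 * mean_x
    return b0, b1


def forecast_next(series, b0, b1):
    """One-step-ahead forecast from the last observation."""
    return b0 + b1 * series[-1]


# Hypothetical noise-free data generated by X(t) = 2 + 0.5*X(t-1), X(0) = 0.
data = [0.0]
for _ in range(10):
    data.append(2 + 0.5 * data[-1])

b0, b1 = fit_ar1(data)
next_value = forecast_next(data, b0, b1)
print(b0, b1, next_value)
```

Because the sample data follow the AR(1) relation exactly, the fit recovers the generating coefficients up to rounding error.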
Q2. What is Moving Average?
Moving average: A moving average is the average of successive subsets (windows) of a dataset, and it gives an overall idea of the trends in the data. It is particularly useful for revealing long-term trends and can be calculated for any period. For example, if we have sales data for twenty years, we can calculate a five-year moving average, a four-year moving average, a three-year moving average, and so on. Stock market analysts often use a 50-day or 200-day moving average to help them see trends in the stock market and (hopefully) forecast where stocks are headed.
The moving average (MA) model, by contrast, predicts a time series from a linear combination of past error terms. Put simply, it uses historical error shocks to predict the future.
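The windowed average described above can be sketched in a few lines. The function name and the sales figures are hypothetical and only illustrate the calculation.

```python
def moving_average(values, window):
    """Return the average of every consecutive run of `window` values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical yearly sales figures.
sales = [10, 12, 14, 16, 18, 20]
three_period = moving_average(sales, 3)
print(three_period)
```

A longer window smooths more aggressively but yields fewer output points, since each average needs a full window of observations.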
Q3. What is Autoregressive Moving Average (ARMA)?
ARMA: It is a forecasting model in which autoregression (AR) analysis and moving average (MA) methods are both applied to well-behaved time-series data. ARMA assumes that the time series is stationary, so that when it fluctuates, it does so uniformly around a constant level. The ARMA model combines the autoregressive and moving average models: it uses both lagged values of the time series and lagged error terms to make predictions.
AR (Autoregression) model-
The autoregression (AR) model is commonly used in modern spectrum estimation. The procedure for estimating a signal's spectrum with the AR model is as follows.
- Select the AR model and match its output to the signal being studied when the input is an impulse function or white noise. The output should at least be a good approximation of the signal.
- Find the model's parameters using the known autocorrelation function or the data.
- Use the derived model parameters to estimate the power spectrum of the signal.
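The AR procedure (parameters from the autocorrelation function, then a power spectrum from the parameters) can be sketched as follows. This is one possible approach, assuming the Yule-Walker equations solved by the Levinson-Durbin recursion and a spectrum defined up to a normalisation constant; the function names and the autocovariance values are illustrative.

```python
import cmath
import math


def levinson_durbin(r, p):
    """Solve the Yule-Walker equations for an AR(p) model.

    r holds the autocovariances r[0..p]. Returns the AR coefficients
    and the variance of the driving white noise.
    """
    phi = []
    err = r[0]
    for k in range(1, p + 1):
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        refl = acc / err  # reflection coefficient for order k
        phi = [phi[j] - refl * phi[k - 2 - j] for j in range(k - 1)] + [refl]
        err *= 1 - refl ** 2
    return phi, err


def ar_power_spectrum(phi, noise_var, omega):
    """P(omega) = noise_var / |1 - sum_k phi_k * exp(-i*omega*k)|^2."""
    denom = 1 - sum(
        c * cmath.exp(-1j * omega * (k + 1)) for k, c in enumerate(phi)
    )
    return noise_var / abs(denom) ** 2


# Autocovariances of an AR(1) process with coefficient 0.5 and unit noise variance.
r = [4 / 3, 2 / 3, 1 / 3]
phi, noise_var = levinson_durbin(r, 2)
low = ar_power_spectrum(phi, noise_var, 0.0)
high = ar_power_spectrum(phi, noise_var, math.pi)
print(phi, noise_var, low, high)
```

Fitting order 2 to autocovariances that come from an AR(1) process gives a second coefficient of zero, and the spectrum is larger at low frequencies, as expected for a positive AR coefficient.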
Moving Average (MA) model-
It is a commonly used model in modern spectrum estimation and is also one of the methods of parametric spectrum analysis. The procedure for estimating the MA model's signal spectrum is as follows.
- Select the MA model and match its output to the signal under study when the input is an impulse function or white noise. The output should be at least a good approximation of the signal.
- Find the model's parameters using the known autocorrelation function.
- Estimate the signal's power spectrum using the derived model parameters.
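For the simplest case, MA(1), the same steps can be written out by hand. This sketch assumes the model X(t) = ε(t) + θ*ε(t-1), picks the invertible solution for θ from the lag-1 autocorrelation, and defines the spectrum up to a normalisation constant; the names and numbers are illustrative only.

```python
import cmath
import math


def ma1_from_autocovariance(r0, r1):
    """Recover (theta, noise_var) of an MA(1) model from r(0) and r(1).

    Uses rho1 = theta / (1 + theta^2) and keeps the invertible root |theta| <= 1.
    """
    rho1 = r1 / r0
    if rho1 == 0:
        return 0.0, r0
    if abs(rho1) > 0.5:
        raise ValueError("an MA(1) process cannot have |rho1| > 0.5")
    theta = (1 - math.sqrt(1 - 4 * rho1 ** 2)) / (2 * rho1)
    noise_var = r0 / (1 + theta ** 2)
    return theta, noise_var


def ma_power_spectrum(theta, noise_var, omega):
    """P(omega) = noise_var * |1 + sum_k theta_k * exp(-i*omega*k)|^2."""
    numer = 1 + sum(
        c * cmath.exp(-1j * omega * (k + 1)) for k, c in enumerate(theta)
    )
    return noise_var * abs(numer) ** 2


# Autocovariances of an MA(1) process with theta = 0.5 and unit noise variance.
theta1, ma_noise_var = ma1_from_autocovariance(1.25, 0.5)
ma_low = ma_power_spectrum([theta1], ma_noise_var, 0.0)
ma_high = ma_power_spectrum([theta1], ma_noise_var, math.pi)
print(theta1, ma_noise_var, ma_low, ma_high)
```

Higher-order MA models do not have such a closed form, which is one reason their parameters are usually estimated numerically.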
In ARMA spectral estimation, the AR parameters are estimated first, and the MA parameters are then estimated based on these AR parameters. The spectral estimate of the ARMA model is then obtained. The parameter estimation of the MA model is therefore often carried out as one step of ARMA spectral estimation.
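The notation ARMA(p, q) refers to the model with p autoregressive terms and q moving-average terms. This model contains the AR(p) and MA(q) models:
X(t) = c + ε(t) + φ1*X(t-1) + ... + φp*X(t-p) + θ1*ε(t-1) + ... + θq*ε(t-q)
where c is a constant, φ1, ..., φp are the autoregressive coefficients, θ1, ..., θq are the moving-average coefficients, and ε(t) is white noise.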
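An ARMA(p, q) model can be explored by feeding it a known input. The sketch below drives an ARMA(1, 1) model with an impulse, assuming that values before the start of the series are zero; the function name and coefficients are hypothetical.

```python
def simulate_arma(c, phi, theta, noise):
    """Generate X(t) = c + eps(t) + sum_i phi_i*X(t-i) + sum_j theta_j*eps(t-j).

    Values of X and eps before t = 0 are treated as zero.
    """
    x = []
    for t, eps in enumerate(noise):
        ar_part = sum(
            phi[i] * x[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0
        )
        ma_part = sum(
            theta[j] * noise[t - 1 - j]
            for j in range(len(theta))
            if t - 1 - j >= 0
        )
        x.append(c + eps + ar_part + ma_part)
    return x


# Impulse input: a single unit shock at t = 0.
impulse = [1.0, 0.0, 0.0, 0.0]
response = simulate_arma(0.0, [0.5], [0.4], impulse)
print(response)
```

The MA term affects only the step right after the shock, while the AR term makes the response decay geometrically afterwards.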
Q4. What is Autoregressive Integrated Moving Average (ARIMA)?
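ARIMA: It is a statistical analysis model that uses time-series data either to better understand the data set or to predict future trends. It builds on the ARMA model, which combines the autoregressive and moving average models and forecasts from lagged values of the time series and lagged error terms.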
An ARIMA model can be understood by outlining each of its components as follows-
- Autoregression (AR): It refers to a model in which a changing variable regresses on its own lagged, or prior, values.
- Integrated (I): It represents the differencing of raw observations to make the time series stationary, i.e., data values are replaced by the difference between the data values and the previous values.
- Moving average (MA): It incorporates the dependency between an observation and the residual errors from a moving average model applied to lagged observations.
Each component functions as a parameter with a standard notation. For ARIMA models, the standard notation is ARIMA(p, d, q), where integer values substitute for the parameters to indicate the type of ARIMA model used. The parameters can be defined as-
- p: The number of lag observations in the model, also known as the lag order.
- d: The number of times the raw observations are differenced, also known as the degree of differencing.
- q: The size of the moving average window, also known as the order of the moving average.
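The "Integrated" step is plain differencing, which is easy to show directly. In this sketch the function names are hypothetical; `d` plays the role of the differencing order in ARIMA(p, d, q).

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return list(series)


def undifference(first_value, diffs):
    """Invert one round of differencing, given the first original value."""
    restored = [first_value]
    for step in diffs:
        restored.append(restored[-1] + step)
    return restored


# A quadratic trend becomes constant after differencing twice.
squares = [1, 4, 9, 16, 25]
once = difference(squares, 1)
twice = difference(squares, 2)
restored = undifference(squares[0], once)
print(once, twice, restored)
```

Each round of differencing shortens the series by one observation, and forecasts made on the differenced scale have to be cumulated back to the original scale.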
Q5. What is SARIMA (Seasonal Autoregressive Integrated Moving-Average)?
Seasonal ARIMA: It is an extension of ARIMA that explicitly supports univariate time series data with a seasonal component, so that it can capture the seasonal fluctuations in the data.
It adds three new hyper-parameters to specify the autoregression (AR), differencing (I) and the moving average (MA) for the seasonal component of the series, as well as an additional parameter for the period of the seasonality.
Configuring a SARIMA model requires selecting hyperparameters for both the trend and seasonal elements of the series.
Trend Elements-
Three trend elements require configuration. They are the same as in the ARIMA model, specifically-
p: Trend autoregression order.
d: Trend difference order.
q: Trend moving average order.
Seasonal Elements-
Four seasonal elements that are not part of ARIMA must be configured. They are-
P: Seasonal autoregressive order.
D: Seasonal difference order.
Q: Seasonal moving average order.
m: The number of time steps in a single seasonal period.
Together, the notation for the SARIMA model is specified as-
SARIMA(p,d,q)(P,D,Q)m
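The trend elements can be chosen through careful analysis of the ACF and PACF plots, looking at the correlations at recent time steps; the seasonal elements can be chosen similarly by looking at the correlations at seasonal lags.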
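Reading a seasonal period off the ACF can be illustrated with a small sample autocorrelation function. The series below is a made-up, perfectly periodic example with period 4, and the function name is an assumption; real data would give a much noisier picture.

```python
def acf(series, lag):
    """Sample autocorrelation at the given lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


# Hypothetical quarterly pattern repeated three times (period m = 4).
quarterly = [0, 1, 0, -1] * 3
acf_by_lag = {lag: acf(quarterly, lag) for lag in range(1, 9)}
seasonal_lag = max(acf_by_lag, key=acf_by_lag.get)
print(acf_by_lag, seasonal_lag)
```

The lag with the strongest positive autocorrelation is a natural candidate for m, the number of time steps in a seasonal period.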
Q6. What is Seasonal Autoregressive Integrated Moving-Average with Exogenous Regressors (SARIMAX)?
SARIMAX: It is an extension of the SARIMA model that also includes the modelling of exogenous variables.
Exogenous variables are also called covariates and can be thought of as parallel input sequences that have observations at the same time steps as the original series. The primary series may be referred to as endogenous data to contrast it with the exogenous sequence(s). The observations for exogenous variables are included in the model directly at each time step and are not modelled in the same way as the primary endogenous sequence (e.g. as an AR, MA, etc. process).
The SARIMAX method can also be used to model the subsumed models with exogenous variables, such as ARX, MAX, ARMAX, and ARIMAX.
The method is suitable for univariate time series with trend and/or seasonal components and exogenous variables.
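To show how an exogenous variable enters directly at each time step, here is a sketch of the simplest subsumed model, ARX: an AR forecast plus a linear term in the current exogenous values. All coefficients and data are hypothetical, and library implementations of SARIMAX may combine the terms differently.

```python
def arx_forecast(y_history, exog_now, c, phi, beta):
    """Forecast y(t) = c + sum_i phi_i*y(t-i) + sum_k beta_k*x_k(t)."""
    ar_part = sum(p * y_history[-1 - i] for i, p in enumerate(phi))
    exog_part = sum(b * x for b, x in zip(beta, exog_now))
    return c + ar_part + exog_part


# Endogenous series: past sales. Exogenous input: a promotion level known for time t.
sales = [100.0, 110.0]
arx_value = arx_forecast(sales, exog_now=[3.0], c=5.0, phi=[0.5, 0.2], beta=[2.0])
print(arx_value)
```

The exogenous value for the forecast period must be known or forecast separately, because the model does not describe how that sequence evolves.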
Q7. What is Vector autoregression (VAR)?
VAR: It is a stochastic process model used to capture the linear interdependencies among multiple time series. VAR models generalise the univariate autoregressive model (AR model) by allowing for more than one evolving variable. All variables in a VAR enter the model in the same way: each variable has an equation explaining its evolution based on its own lagged values, the lagged values of the other model variables, and an error term. VAR modelling does not require as much knowledge about the forces influencing a variable as do structural models with simultaneous equations: the only prior knowledge required is a list of variables which can be hypothesised to affect each other intertemporally.
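A VAR model describes the evolution of a set of k variables over the same sample period (t = 1, ..., T) as a linear function of only their past values. The variables are collected in a k-vector ((k × 1)-matrix) y(t), which has as its i-th element, y(i,t), the observation at time t of the i-th variable. Example: if the i-th variable is GDP, then y(i,t) is the value of GDP at time t. A p-th order VAR, denoted VAR(p), is
y(t) = c + A1*y(t-1) + A2*y(t-2) + ... + Ap*y(t-p) + e(t)
where the observation y(t-i) is called the i-th lag of y, c is a k-vector of constants (intercepts), Ai is a time-invariant (k × k)-matrix, and e(t) is a k-vector of error terms satisfying:
- E[e(t)] = 0: every error term has mean zero.
- E[e(t)*e(t)'] = Ω: the contemporaneous covariance matrix of the error terms is Ω, a (k × k) positive-semidefinite matrix.
- E[e(t)*e(t-j)'] = 0 for any non-zero lag j: there is no correlation across time, and in particular no serial correlation in the individual error terms.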
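The VAR(p) relation y(t) = c + A1*y(t-1) + ... + Ap*y(t-p) + e(t) gives a one-step forecast once the error term is replaced by its mean of zero. The sketch below uses plain lists for vectors and matrices; the helper names and all numbers are hypothetical.

```python
def mat_vec(matrix, vector):
    """Multiply a (k x k) matrix, stored as a list of rows, by a k-vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def var_forecast(history, c, coeff_matrices):
    """One-step VAR(p) forecast; history[-1] is the most recent observation."""
    forecast = list(c)
    for lag, a in enumerate(coeff_matrices, start=1):
        contribution = mat_vec(a, history[-lag])
        forecast = [f + x for f, x in zip(forecast, contribution)]
    return forecast


# Two variables (k = 2), one lag (p = 1).
history = [[1.0, 3.0], [2.0, 4.0]]
c = [1.0, 0.0]
a1 = [[0.5, 0.1],
      [0.2, 0.3]]
var_value = var_forecast(history, c, [a1])
print(var_value)
```

The off-diagonal entries of each coefficient matrix are what let the past of one variable influence the forecast of another.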
Q8. What is Vector Autoregression Moving-Average (VARMA)?
VARMA: This method models the next step in each time series using an ARMA model. It is the generalisation of ARMA to multiple parallel time series, i.e., multivariate time series.
The notation for the model involves specifying the order for the AR(p) and MA(q) models as parameters to a VARMA function, e.g. VARMA(p, q). A VARMA model can also be used to develop VAR or VMA models.
This method is suitable for multivariate time series without trend and seasonal components.
Q9. What is Vector Autoregression Moving-Average with Exogenous Regressors (VARMAX)?
VARMAX: It is an extension of the VARMA model that also includes the modelling of the exogenous variables. It is the multivariate version of the ARMAX method.
Exogenous variables are also called covariates and can be thought of as parallel input sequences that have observations at the same time steps as the original series. The primary series are referred to as the endogenous data to contrast them with the exogenous sequence(s). The observations for the exogenous variables are included in the model directly at each time step and are not modelled in the same way as the primary endogenous sequences (e.g. as an AR, MA, etc. process).
This method can also be used to model subsumed models with exogenous variables, such as VARX and VMAX.
This method is suitable for multivariate time series without trend and seasonal components, but with exogenous variables.
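A one-step VARMAX(1, 1) forecast can be sketched by adding a lagged-error term and an exogenous term to the vector autoregression. This assumes the form y(t) = c + A1*y(t-1) + M1*e(t-1) + B*x(t); the matrix names and all values are hypothetical and not taken from any particular library.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix, stored as a list of rows, by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def varmax_forecast(y_prev, e_prev, x_now, c, a1, m1, b):
    """Forecast y(t) = c + A1*y(t-1) + M1*e(t-1) + B*x(t)."""
    terms = [c, mat_vec(a1, y_prev), mat_vec(m1, e_prev), mat_vec(b, x_now)]
    return [sum(parts) for parts in zip(*terms)]


y_prev = [2.0, 4.0]   # last observation of the two endogenous series
e_prev = [1.0, -1.0]  # last forecast errors
x_now = [10.0]        # current value of one exogenous variable
c = [1.0, 0.0]
a1 = [[0.5, 0.1], [0.2, 0.3]]
m1 = [[0.5, 0.0], [0.0, 0.5]]
b = [[0.1], [0.2]]
varmax_value = varmax_forecast(y_prev, e_prev, x_now, c, a1, m1, b)
print(varmax_value)
```

Setting B to zero gives a VARMA forecast, and setting M1 to zero gives VARX, which matches the idea that VARMAX subsumes those models.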
Q10. What is Simple Exponential Smoothing (SES)?
SES: This method models the next time step as an exponentially weighted linear function of observations at prior time steps.
This method is suitable for univariate time series without trend and seasonal components.
Exponential smoothing is a rule-of-thumb technique for smoothing time series data using the exponential window function. Whereas the simple moving average weights past observations equally, exponential smoothing assigns exponentially decreasing weights over time. It is an easily learned and easily applied procedure for making some determination based on prior assumptions by the user, such as seasonality. Exponential smoothing is often used for the analysis of time-series data.
Exponential smoothing is one of many window functions commonly applied to smooth data in signal processing, acting as low-pass filters to remove high-frequency noise.
The raw data sequence is often represented by {x(t)} beginning at time t = 0, and the output of the exponential smoothing algorithm is commonly written as {s(t)}, which may be regarded as a best estimate of what the next value of x will be. When the sequence of observations begins at time t = 0, the simplest form of exponential smoothing is given by the formulas:
s(0) = x(0)
s(t) = α*x(t) + (1 - α)*s(t-1), t > 0
where α is the smoothing factor, with 0 < α < 1.
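The recursion s(0) = x(0), s(t) = α*x(t) + (1 - α)*s(t-1) translates almost line for line into code. This is a minimal sketch; the function name and the sample values are chosen only for illustration.

```python
def simple_exponential_smoothing(x, alpha):
    """Return the smoothed sequence s for observations x and smoothing factor alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    s = [x[0]]  # s(0) = x(0)
    for value in x[1:]:
        s.append(alpha * value + (1 - alpha) * s[-1])
    return s


smoothed = simple_exponential_smoothing([1.0, 2.0, 3.0], alpha=0.5)
print(smoothed)
```

A value of alpha close to 1 tracks the latest observation closely, while a value close to 0 smooths more heavily and reacts more slowly to change.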