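Multivariate Time Series Regression Models: Multivariate Time Series Analysis and Forecasting, Applying the Vector Autoregression (VAR) Model to a Real-World Multivariate Dataset...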

weixin_26746401

Original article: https://towardsdatascience.com/multivariate-time-series-forecasting-456ace675971

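This article covers the key concepts of multivariate time series analysis and explains in detail how to forecast a real-world multivariate dataset with a vector autoregression (VAR) model. The Python implementation illustrates time series modeling in the context of machine learning, big data, and deep learning.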

Multivariate Time Series Analysis

A univariate time series contains a single time-dependent variable, while a multivariate time series consists of multiple time-dependent variables. We generally use multivariate time series analysis to model and explain the interdependencies and co-movements among the variables. Multivariate analysis assumes that each time-dependent variable depends not only on its own past values but also on the other variables. Multivariate time series models exploit these dependencies to provide more reliable and accurate forecasts for a given dataset, though univariate analysis generally outperforms multivariate analysis [1]. In this article, we apply a multivariate time series method called Vector Auto Regression (VAR) to a real-world dataset.

Vector Auto Regression (VAR)

A VAR model is a stochastic process model that represents each variable in a group of time-dependent variables as a linear function of its own past values and the past values of all the other variables in the group.
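
In matrix notation, a VAR of order p is commonly written as shown here. The symbols are notation introduced for illustration and are not taken from the article: y_t is the vector of k variables at time t, c is a vector of k intercepts, each A_i is a k-by-k coefficient matrix, and \varepsilon_t is the vector of error terms.

```latex
\[
y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \dots + A_p y_{t-p} + \varepsilon_t
\]
```

With k = 2 and p = 1, this reduces to a system of two equations, one per variable, each with one lagged term for every variable in the group.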

For instance, we can consider a bivariate time series analysis that describes a relationship between hourly temperature and wind speed as a function of past values [2]:

temp(t) = a1 + w11 * temp(t-1) + w12 * wind(t-1) + e1(t)

wind(t) = a2 + w21 * temp(t-1) + w22 * wind(t-1) + e2(t)

where a1 and a2 are constants; w11, w12, w21, and w22 are the coefficients; e1 and e2 are the error terms.
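
To make the recursion concrete, the two equations can be simulated step by step. The following is a minimal sketch using only the Python standard library. The function name simulate_var1 and all numeric values (constants, coefficients, starting point, noise level) are hypothetical and chosen for illustration; they are not estimates from any real temperature or wind data.

```python
import random


def simulate_var1(a, w, y0, n_steps, noise_sd=0.0, seed=0):
    """Simulate a bivariate VAR(1) process.

    a        -- (a1, a2), the two constants
    w        -- ((w11, w12), (w21, w22)), the lag-1 coefficients
    y0       -- (temp, wind) starting values
    noise_sd -- standard deviation of the Gaussian error terms
    Returns a list of (temp, wind) tuples of length n_steps + 1.
    """
    rng = random.Random(seed)
    temp, wind = y0
    path = [(temp, wind)]
    for _ in range(n_steps):
        # Draw one error term per equation
        e1 = rng.gauss(0.0, noise_sd)
        e2 = rng.gauss(0.0, noise_sd)
        # Each variable depends on the previous value of BOTH variables
        new_temp = a[0] + w[0][0] * temp + w[0][1] * wind + e1
        new_wind = a[1] + w[1][0] * temp + w[1][1] * wind + e2
        temp, wind = new_temp, new_wind
        path.append((temp, wind))
    return path


# Hypothetical parameter values, for illustration only
a = (2.0, 1.0)
w = ((0.5, 0.1), (0.2, 0.3))

path = simulate_var1(a, w, y0=(20.0, 5.0), n_steps=50, noise_sd=0.5, seed=42)
for temp, wind in path[:3]:
    print(f"temp={temp:.2f}  wind={wind:.2f}")
```

Setting noise_sd to 0 makes the dependence structure easier to see: with these coefficients, both series settle toward a fixed point determined jointly by a and w, because each variable feeds into the other's next value.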

Dataset

Statsmodels is a Python package that allows users to explore data, estimate statistical models, and perform statistical tests [3]. It also ships with example datasets, including time series data. We load one of these datasets from the package.

To load the data, we install and import the required libraries (pandas and statsmodels) and then load the dataset:

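import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.api import VAR  # VAR model class

# Load the US macroeconomic dataset bundled with statsmodels as a DataFrame
data = sm.datasets.macrodata.load_pandas().data
data.head(2)  # show the first two observations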

The output shows the first two observations of the total dataset:

[Figure: A snippet of the dataset]

The data contains several time series; we take only two time-dependent variables, “realgdp” and “realdpi”, for experiment purposes and use the “year” column as the
