Multivariate LSTM Models

Latest recommended article published on 2025-04-03 08:33:02

请叫我算术嘉


Article link: https://blog.csdn.net/ssjdoudou/article/details/90147034


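Multivariate time series data is data with more than one observation at each time step.

With multivariate time series data, there are two main kinds of model we may need:

Multiple input series.
Multiple parallel series.

1. Multiple Input Series

A problem may have two or more parallel input time series and an output time series that depends on the input time series.

The input time series are parallel because each series has an observation at the same time steps.

We can demonstrate this with a simple example of two parallel input time series where the output series is the element-wise sum of the input series.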

from numpy import array

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
# the output series is the element-wise sum of the two input series
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])

The three sequences hold the following values:

[10, 20, 30, 40, 50, 60, 70, 80, 90]
[15, 25, 35, 45, 55, 65, 75, 85, 95]
[25, 45, 65, 85, 105, 125, 145, 165, 185]

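We can reshape these three arrays of data into a single dataset where each row is a time step and each column is a separate time series. This is a standard way of storing parallel time series in a CSV file.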

from numpy import array
from numpy import hstack

# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

The complete example is listed below.

# multivariate data preparation
from numpy import array
from numpy import hstack
# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
print(dataset)

Running the example prints the dataset with one row per time step and one column for each of the two input series and the one output series.

[[ 10  15  25]
 [ 20  25  45]
 [ 30  35  65]
 [ 40  45  85]
 [ 50  55 105]
 [ 60  65 125]
 [ 70  75 145]
 [ 80  85 165]
 [ 90  95 185]]
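
Since the text describes this row-per-time-step layout as a standard way to store parallel series in a CSV file, a minimal sketch of a round trip may help. It assumes NumPy's `savetxt` and `loadtxt`, and it uses an in-memory `StringIO` buffer in place of a real file path; neither choice comes from the original article.

```python
from io import StringIO

import numpy as np

# Same toy dataset as above: two input series and their element-wise sum.
in_seq1 = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = np.array([15, 25, 35, 45, 55, 65, 75, 85, 95])
dataset = np.column_stack((in_seq1, in_seq2, in_seq1 + in_seq2))

# Write one row per time step and one column per series.
# The StringIO buffer is a stand-in for a real file (assumption).
buffer = StringIO()
np.savetxt(buffer, dataset, fmt="%d", delimiter=",")
csv_text = buffer.getvalue()
print(csv_text.splitlines()[0])

# Read the text back and check that the round trip preserves the data.
loaded = np.loadtxt(StringIO(csv_text), delimiter=",", dtype=int)
print(loaded.shape)
print(np.array_equal(loaded, dataset))
```

With a real file, the buffer would simply be replaced by a file name in both calls.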

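As with a univariate time series, we must structure these data into samples with input and output elements.

An LSTM model needs sufficient context to learn a mapping from an input sequence to an output value. LSTMs can support parallel input time series as separate variables or features. Therefore, we need to split the data into samples while maintaining the order of observations across the two input sequences.

If we choose three input time steps, then the first sample would look as follows:

Input:

10, 15
20, 25
30, 35

Output:

65

That is, the first three time steps of each parallel series are provided as input to the model, and the model associates this with the value in the output series at the third time step, in this case 65.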

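We can see that, in transforming the time series into input/output samples to train the model, we have to discard some values from the output time series for which we do not have values in the input time series at prior time steps. In turn, the choice of the number of input time steps has an important effect on how much of the training data is used.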
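
To make the trade-off concrete, the number of samples that a sliding window yields can be written as rows minus time steps plus one. The sketch below is only an illustration; `count_samples` is a hypothetical helper name, not something from the original article.

```python
def count_samples(n_rows, n_steps):
    """Number of windows of length n_steps that fit in n_rows time steps."""
    return max(n_rows - n_steps + 1, 0)


n_rows = 9  # time steps in the toy dataset
for n_steps in (2, 3, 5, 9):
    # A longer window leaves fewer samples to train on.
    print(n_steps, count_samples(n_rows, n_steps))
```

With nine rows, a window of three time steps leaves seven samples, and the first two values of the output series are never used as targets.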

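We can define a function named split_sequences() that takes a dataset as we have defined it, with rows for time steps and columns for parallel series, and returns input/output samples.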

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern:
        # all columns but the last are inputs, the last column is the output
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

We can test this function on our dataset using three time steps of each input time series as input.

The complete example is listed below.

# multivariate data preparation
from numpy import array
from numpy import hstack

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the dataset
        if end_ix > len(sequences):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
print(X.shape, y.shape)
# summarize the data
for i in range(len(X)):
    print(X[i], y[i])

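Running the example first prints the shape of the X and y components.

We can see that the X component has a three-dimensional structure.

The first dimension is the number of samples, in this case 7. The second dimension is the number of time steps per sample, in this case 3, the value specified to the function. Finally, the last dimension specifies the number of parallel time series, or the number of variables, in this case 2 for the two parallel series.

This is the exact three-dimensional structure expected by an LSTM as input. The data is ready to use without further reshaping.

We can then see that the input and output for each sample are printed, showing the three time steps for each of the two input series and the associated output for each sample.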

(7, 3, 2) (7,)

[[10 15]
 [20 25]
 [30 35]] 65
[[20 25]
 [30 35]
 [40 45]] 85
[[30 35]
 [40 45]
 [50 55]] 105
[[40 45]
 [50 55]
 [60 65]] 125
[[50 55]
 [60 65]
 [70 75]] 145
[[60 65]
 [70 75]
 [80 85]] 165
[[70 75]
 [80 85]
 [90 95]] 185
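
As an aside, the same samples can be produced without an explicit Python loop. The sketch below is an alternative, not the article's method: it assumes NumPy 1.20 or later for `sliding_window_view`, and `split_sequences_vectorized` is a hypothetical name chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def split_sequences_vectorized(sequences, n_steps):
    # Windows over the input columns; the window axis is appended last,
    # so the shape is (samples, features, n_steps).
    windows = sliding_window_view(sequences[:, :-1], n_steps, axis=0)
    # Reorder to (samples, n_steps, features), the layout printed above.
    X = windows.transpose(0, 2, 1)
    # The target is the output column at the last time step of each window.
    y = sequences[n_steps - 1:, -1]
    return X, y


in_seq1 = np.arange(10, 100, 10)
in_seq2 = in_seq1 + 5
dataset = np.column_stack((in_seq1, in_seq2, in_seq1 + in_seq2))

X, y = split_sequences_vectorized(dataset, 3)
print(X.shape, y.shape)
print(X[0], y[0])
```

The returned X is a read-only view onto the dataset, so calling `X.copy()` before passing it to code that writes to its inputs is a reasonable precaution.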

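We are now ready to fit an LSTM model on this data.

Any of the LSTM variants from the previous section can be used, such as the Vanilla, Stacked, Bidirectional, CNN, or ConvLSTM models.

We will use a Vanilla LSTM where the number of time steps and parallel series (features) are specified for the input layer via the input_shape argument.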

# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

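When making a prediction, the model expects three time steps for each of the two input time series.

We can predict the next value in the output series by providing the following input values:

80, 85
90, 95
100, 105

The shape of one sample with three time steps and two variables must be [1, 3, 2].

We would expect the next value in the sequence to be 100 + 105, or 205.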

# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
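
Before calling the model, it can be worth checking the sample's shape and the target implied by the toy rule. The sketch below uses only NumPy and hard-codes `n_steps` and `n_features` to the values used in this example, which is an assumption made to keep it self-contained.

```python
import numpy as np

n_steps, n_features = 3, 2  # values used throughout this example
x_input = np.array([[80, 85], [90, 95], [100, 105]])

# Add the leading sample axis: (3, 2) becomes (1, 3, 2).
batch = x_input.reshape((1, n_steps, n_features))
print(batch.shape)

# In this toy problem the target is the sum of the last time step.
expected = int(x_input[-1].sum())
print(expected)
```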

The complete example is listed below.

# multivariate lstm example
from numpy import array
from numpy import hstack
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the dataset
		if end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# the dataset knows the number of features, e.g. 2
n_features = X.shape[2]
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

Running the example prepares the data, fits the model, and makes a prediction.

[[208.13531]]
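
The printed value can be compared with the expected 205 using a little arithmetic. This sketch just hard-codes the number printed above; because the weights are initialized randomly, another run would likely print a somewhat different prediction.

```python
predicted = 208.13531  # value printed by the run above
expected = 100 + 105   # sum of the last input time step

abs_error = abs(predicted - expected)
rel_error = abs_error / expected
print(f"absolute error: {abs_error:.3f}")
print(f"relative error: {rel_error:.2%}")
```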

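There is another, more elaborate way to model the problem.

Each input series can be handled by a separate MLP, and the output of each of these submodels can be combined before a prediction is made for the output sequence.

We can refer to this as a multi-headed input MLP model. It may offer more flexibility or better performance depending on the specifics of the problem being modeled.

This type of model can be defined in Keras using the Keras functional API.

First, we can define the first input model as an MLP with an input layer that expects vectors with n_steps features.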


# first input model
visible1 = Input(shape=(n_steps,))
dense1 = Dense(100, activation='relu')(visible1)

We can define the second input submodel in the same way.

# second input model
visible2 = Input(shape=(n_steps,))
dense2 = Dense(100, activation='relu')(visible2)

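Now that both input submodels have been defined, we can merge the output from each model into one long vector, which can be interpreted before a prediction is made for the output sequence.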

# merge input models
merge = concatenate([dense1, dense2])
output = Dense(1)(merge)
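
To see what "one long vector" means in terms of shapes, the merge can be mimicked with plain NumPy. This is only a shape illustration: `h1` and `h2` are random stand-ins for the activations of the two 100-unit hidden layers, not real model outputs.

```python
import numpy as np

n_samples, n_units = 7, 100

# Random stand-ins for the outputs of the two hidden layers (illustration only).
rng = np.random.default_rng(0)
h1 = rng.random((n_samples, n_units))
h2 = rng.random((n_samples, n_units))

# Joining along the last axis gives one vector of 200 values per sample,
# which is what the final Dense(1) layer receives.
merged = np.concatenate([h1, h2], axis=-1)
print(merged.shape)
```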

We can then tie the inputs and outputs together.

model = Model(inputs=[visible1, visible2], outputs=output)

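The figure below provides a schematic of this model, including the input and output shapes of each layer.

This model requires the input to be provided as a list of two elements, where each element in the list contains the data for one of the submodels.

To achieve this, we can split the 3D input data into two separate arrays of input data: that is, from one array with the shape [7, 3, 2] to two 2D arrays with the shape [7, 3].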

# separate input data
X1 = X[:, :, 0]
X2 = X[:, :, 1]
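
The effect of the split can be checked on the toy data with NumPy alone. In this sketch X is rebuilt with a short list comprehension rather than split_sequences(), purely to keep the block self-contained.

```python
import numpy as np

# Rebuild the (7, 3, 2) input array from the two toy input series.
in_seq1 = np.arange(10, 100, 10)
in_seq2 = in_seq1 + 5
inputs = np.column_stack((in_seq1, in_seq2))
X = np.stack([inputs[i:i + 3] for i in range(7)])

# One 2D array per input series, each of shape (samples, time steps).
X1 = X[:, :, 0]
X2 = X[:, :, 1]
print(X1.shape, X2.shape)
print(X1[0], X2[0])

# Stacking the two arrays on a new last axis recovers the original X.
restored = np.stack([X1, X2], axis=-1)
print(np.array_equal(restored, X))
```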

These data can then be provided in order to fit the model.

# fit model
model.fit([X1, X2], y, epochs=2000, verbose=0)

Similarly, we must prepare the data for a single sample as two separate two-dimensional arrays when making a single one-step prediction.

x_input = array([[80, 85], [90, 95], [100, 105]])
x1 = x_input[:, 0].reshape((1, n_steps))
x2 = x_input[:, 1].reshape((1, n_steps))
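
A quick check of the resulting arrays shows that each one holds a single series across the three time steps. The transpose-based variant at the end is an equivalent alternative added for illustration, not something from the original article.

```python
import numpy as np

n_steps = 3
x_input = np.array([[80, 85], [90, 95], [100, 105]])

# One row per submodel input: a single sample with n_steps values.
x1 = x_input[:, 0].reshape((1, n_steps))
x2 = x_input[:, 1].reshape((1, n_steps))
print(x1, x2)

# Equivalent alternative: transpose so each series becomes a row,
# then add a sample axis and unpack the two series.
x1_alt, x2_alt = x_input.T[:, np.newaxis, :]
print(np.array_equal(x1, x1_alt), np.array_equal(x2, x2_alt))
```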

We can tie all of this together; the complete example is listed below.

# multivariate mlp example
from numpy import array
from numpy import hstack
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import concatenate

# split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
	X, y = list(), list()
	for i in range(len(sequences)):
		# find the end of this pattern
		end_ix = i + n_steps
		# check if we are beyond the dataset
		if end_ix > len(sequences):
			break
		# gather input and output parts of the pattern
		seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
		X.append(seq_x)
		y.append(seq_y)
	return array(X), array(y)

# define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
# convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))
# choose a number of time steps
n_steps = 3
# convert into input/output
X, y = split_sequences(dataset, n_steps)
# separate input data
X1 = X[:, :, 0]
X2 = X[:, :, 1]
# first input model
visible1 = Input(shape=(n_steps,))
dense1 = Dense(100, activation='relu')(visible1)
# second input model
visible2 = Input(shape=(n_steps,))
dense2 = Dense(100, activation='relu')(visible2)
# merge input models
merge = concatenate([dense1, dense2])
output = Dense(1)(merge)
model = Model(inputs=[visible1, visible2], outputs=output)
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit([X1, X2], y, epochs=2000, verbose=0)
# demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])
x1 = x_input[:, 0].reshape((1, n_steps))
x2 = x_input[:, 1].reshape((1, n_steps))
yhat = model.predict([x1, x2], verbose=0)
print(yhat)

Running the example prepares the data, fits the model, and makes a prediction.