On the Problem of Having No Independent Variables When Forecasting Time Series with LSTM

Latest recommended article published on 2023-07-09 17:54:33

Leon嘞

Category: Deep Learning. Tags: lstm, deep learning, machine learning

Article link: https://blog.csdn.net/qq_43820692/article/details/127267583

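This post is based on: https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/
LSTMs are widely reported to work well for time series forecasting, but many of the forecasting examples found online share a common flaw: the so-called forecasts are not really forecasts.
Take an example. Suppose we have 2021 precipitation, temperature, and vegetation data, and these three quantities determine soil moisture. We train an LSTM on this data. Once it is trained, we only need to feed in the 2022 precipitation, temperature, and vegetation data to predict 2022 soil moisture. We then compare the predictions against the observed 2022 soil moisture, find that the loss is small, and everything looks fine.
But what if we need to forecast a year that has not happened yet? Say I have the 2021 data and am asked to forecast 2023. I have no 2023 precipitation, temperature, or vegetation data to feed into the model, so what should I do?
A commonly used approach is the sliding window method:
For example, use X[t-3], X[t-2], X[t-1] to predict X[t], treating X[t] as the label Y. The training samples can then be constructed as follows: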

Input window (X)	Label (Y)
X[t-5], X[t-4], X[t-3]	X[t-2]
X[t-4], X[t-3], X[t-2]	X[t-1]
X[t-3], X[t-2], X[t-1]	X[t]
X[t-2], X[t-1], X[t]	X[t+1]
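
To make the window table concrete, here is a minimal sketch that builds the same input/label pairs from a short made-up series. The helper name `make_windows` and the sample values are assumptions for illustration only; it relies on NumPy's `sliding_window_view` (available in NumPy 1.20 and later).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(series, n_steps):
    """Hypothetical helper: each row of X holds n_steps consecutive values,
    and y holds the value that immediately follows each row."""
    series = np.asarray(series)
    # Each window has n_steps inputs plus 1 label, so its width is n_steps + 1.
    windows = sliding_window_view(series, n_steps + 1)
    return windows[:, :-1], windows[:, -1]


series = [10, 20, 30, 40, 50, 60, 70]  # made-up values
X, y = make_windows(series, n_steps=3)
for inputs, label in zip(X, y):
    print(inputs, "->", label)
```

Every row uses only values from the series itself, which is why no separate independent variables are needed to build the training set.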

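With the sliding window method, we can build training samples from the series itself, without any future observations, and train the LSTM on them. Applied to the example above: we first forecast the 2023 precipitation, temperature, and vegetation series from their own histories, then feed those forecasts to the soil-moisture LSTM as inputs (not as training data), which gives us a forecast of 2023 soil moisture. Keep in mind that any error in the forecast inputs carries over into the soil-moisture forecast.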
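
One way to push a one-step model further into the future is to feed each prediction back in as the newest value of the window. The sketch below illustrates that recursion under loud assumptions: `fit_linear_ar` is a hypothetical stand-in that fits a plain least-squares autoregression (it is NOT an LSTM, and is used only so the example runs without a deep learning library), and `recursive_forecast`, the series values, and the horizon are all made up for illustration.

```python
import numpy as np


def fit_linear_ar(series, n_steps):
    """Stand-in one-step model (NOT an LSTM): least-squares fit that predicts
    x[t] from the previous n_steps values plus a bias term."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + n_steps] for i in range(len(series) - n_steps)])
    y = series[n_steps:]
    A = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda window: float(np.append(window, 1.0) @ coef)


def recursive_forecast(predict_next, history, n_steps, horizon):
    """Forecast `horizon` steps ahead by sliding the window over its own predictions."""
    window = [float(v) for v in history[-n_steps:]]
    forecasts = []
    for _ in range(horizon):
        nxt = predict_next(np.asarray(window))
        forecasts.append(nxt)
        window = window[1:] + [nxt]  # drop the oldest value, append the prediction
    return forecasts


history = [10, 20, 30, 40, 50, 60, 70, 80, 90]  # made-up series
predict_next = fit_linear_ar(history, n_steps=3)
forecasts = recursive_forecast(predict_next, history, n_steps=3, horizon=3)
print(np.round(forecasts, 2))
```

On this perfectly linear toy series the forecasts should simply continue the trend. With a trained Keras model, `predict_next` could plausibly be replaced by something like `lambda w: float(model.predict(w.reshape((1, n_steps, 1)), verbose=0)[0, 0])`. Because each step consumes earlier predictions, errors tend to accumulate as the horizon grows.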

The code is sketched below:

from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
 
 
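'''
The split_sequence() function below implements this behavior: it splits a given univariate
sequence into multiple samples, where each sample has n_steps input time steps and the
output is a single time step.
'''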
# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
 
 
if __name__ == '__main__':
 
    # define input sequence
    raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
    print(raw_seq)
    # choose a number of time steps
    n_steps = 3
    # split into samples
    X, y = split_sequence(raw_seq, n_steps)
    print(X, y)
    # reshape from [samples, timesteps] into [samples, timesteps, features]
    n_features = 1
    X = X.reshape((X.shape[0], X.shape[1], n_features))
    # define model
    model = Sequential()
    model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))  # 50 hidden units; input shape is (time steps, features)
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mse')
    # fit model
    model.fit(X, y, epochs=300, batch_size=1, verbose=2)  # epochs = training passes, batch_size = samples per update, verbose=2 logs one line per epoch
    # demonstrate prediction
    x_input = array([70, 80, 90])
    x_input = x_input.reshape((1, n_steps, n_features))
    yhat = model.predict(x_input, verbose=0)
    print(x_input, yhat)