[Deep Learning in Practice: Time-Series Forecasting with an LSTM Recurrent Neural Network in Keras, Using Ten Years of Melbourne Temperature Data]

Latest recommended article published on 2023-06-28 08:11:42

勋章DhR


Category: Deep Learning in Practice. Tags: deep learning, lstm, keras, python

Article link: https://blog.csdn.net/zzjcymbq/article/details/127449609


Study notes, for reference only!

Introduction

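An RNN is designed around the recurrent nature of sequence data such as language, speech, and time series. It is a feedback network whose structure contains loops that feed its own state back into itself, which is why it is called "recurrent", and it is used specifically for sequence data. This post uses the many-to-one structure: the input is a sequence and the output is a single value, as in the earlier text classification and text generation examples, or in time-series prediction.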

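The model used here is the widely used LSTM (long short-term memory) network. In a conventional recurrent network, information has to pass through many hidden states, one step at a time, before it reaches the output layer. Intuitively, this loses information along the way; more fundamentally, it makes the network parameters hard to optimize, because gradients tend to vanish or explode over long sequences. LSTM handles this problem well through its gated cell state, which makes it a useful reference point for time-series prediction.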
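To make the many-to-one idea and the LSTM gates concrete, here is a minimal NumPy sketch of one LSTM layer run over a sequence, returning only the final hidden state. It is an illustration under stated assumptions, not the Keras implementation: the function name `lstm_last_hidden`, the gate ordering in the stacked weights, and the random weights are all made up for this example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_last_hidden(x_seq, W, U, b):
    """Run one LSTM layer over x_seq of shape (timesteps, input_dim).

    W: (4*hidden, input_dim), U: (4*hidden, hidden), b: (4*hidden,).
    Gate order in the stacked weights is an arbitrary choice for this
    sketch: input, forget, candidate, output.
    Returns the hidden state after the last timestep (many-to-one).
    """
    hidden = U.shape[1]
    h = np.zeros(hidden)  # hidden state
    c = np.zeros(hidden)  # cell state, the "long-term memory"
    for x_t in x_seq:
        z = W @ x_t + U @ h + b
        i = sigmoid(z[0:hidden])                # input gate
        f = sigmoid(z[hidden:2 * hidden])       # forget gate
        g = np.tanh(z[2 * hidden:3 * hidden])   # candidate values
        o = sigmoid(z[3 * hidden:4 * hidden])   # output gate
        c = f * c + i * g       # additive update of the cell state
        h = o * np.tanh(c)
    return h


# Hypothetical sizes: 15 timesteps of 1 feature, 4 hidden units.
rng = np.random.default_rng(0)
timesteps, input_dim, hidden = 15, 1, 4
W = rng.normal(scale=0.1, size=(4 * hidden, input_dim))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
x_seq = rng.normal(size=(timesteps, input_dim))

h_last = lstm_last_hidden(x_seq, W, U, b)
print(h_last.shape)  # (4,)
```

The additive cell-state update `c = f * c + i * g` is the part that lets information survive across many timesteps; a dense layer on top of `h_last` would then produce the single predicted value.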

Dataset

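The dataset contains roughly ten years of temperature readings for Melbourne. Temperature is the only input: an LSTM model is trained on the series to predict the temperature one day ahead.
[Figure: sample rows of the dataset]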

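Parameters. For window_size: every 15 consecutive temperatures form one input, and the 16th value is the output, i.e. the value to predict; the window then slides forward one step at a time.

"epochs": 2,
"batch_size": 10,
"window_size": 15,  window length: every 15 values form one group, sliding forward one step at a time
"train_test_split": 0.8,  fraction of the data used for training
"validation_split": 0.1,
"dropout_keep_prob": 0.2,  dropout before the fully connected layer: a random 0.2 of the units are switched off at each training step, which speeds up training, improves generalization, and guards against overfitting (despite the name, the value is described here as the fraction dropped)
"hidden_unit": 100  number of hidden units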
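The sliding window can be sketched on toy data as follows. This is only an illustration: `make_windows` is a hypothetical helper, and the input is a plain counter rather than real temperatures so the pairing of inputs and targets is easy to read.

```python
import numpy as np


def make_windows(data, window_size):
    """Slice a 1-D series into (x, y) pairs.

    Each sample takes window_size consecutive values as x and the
    value immediately after them as y.
    """
    adjusted = window_size + 1  # inputs plus the one value to predict
    windows = np.array(
        [data[i:i + adjusted] for i in range(len(data) - adjusted + 1)]
    )
    return windows[:, :-1], windows[:, -1]


# Toy series 0, 1, ..., 19 standing in for daily temperatures.
data = np.arange(20, dtype=float)
x, y = make_windows(data, window_size=15)

print(x.shape, y.shape)  # (5, 15) (5,)
print(y)                 # [15. 16. 17. 18. 19.]
```

With 20 values and a window of 15, there are 5 complete windows; the first one uses values 0 to 14 as input and 15 as the target. The `+ 1` in the range keeps the last complete window, whereas `load_timeseries` in the code section iterates over `len(data) - adjusted_window` and so yields one window fewer.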

Code

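Prepare the time-series dataset. index_col=0 makes column 0 (the date) the index, so the values fed to the model are temperatures only. Every 16 consecutive values form one sample: indices 0-14 are the input x and index 15 is the output y. The samples are then split, with 0.8 used for training and 0.2 for testing.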
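The split and the reshape to a 3-D tensor can be checked in isolation on toy data. The sketch below assumes the windows have already been built; the array `windows` is a made-up stand-in of 10 windows with 16 values each, and normalization is left out.

```python
import numpy as np

# Toy stand-in: 10 windows of 16 values (15 inputs + 1 target each).
windows = np.arange(10 * 16, dtype=float).reshape(10, 16)

train_test_split = 0.8
split = int(round(train_test_split * windows.shape[0]))  # 8 training rows

# Shuffle only the training rows; the test rows stay in time order.
train = windows[:split].copy()
rng = np.random.default_rng(0)
rng.shuffle(train)  # shuffles whole rows, so each x stays with its y

x_train, y_train = train[:, :-1], train[:, -1]
x_test, y_test = windows[split:, :-1], windows[split:, -1]

# An LSTM layer expects (samples, timesteps, features); add the feature axis.
x_train = x_train[..., np.newaxis]
x_test = x_test[..., np.newaxis]

print(x_train.shape, x_test.shape)  # (8, 15, 1) (2, 15, 1)
```

Splitting before shuffling matters for a time series: the test set is the most recent 20% of windows, so the model is evaluated on data that comes after everything it was trained on.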

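import numpy as np
import pandas as pd


# Prepare the time-series dataset
def load_timeseries(filename, params):
    # Load the series; column 0 (the date) becomes the index.
    # The squeeze=True argument was removed from read_csv in pandas 2.0,
    # so the one-column frame is squeezed into a Series afterwards.
    series = pd.read_csv(filename, sep=',', header=0, index_col=0).squeeze('columns')
    data = series.values
    adjusted_window = params['window_size'] + 1  # window_size inputs plus 1 value to predict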
    # Split data into windows
    raw = []  # unnormalized windows
    for index in range(len(data) - adjusted_window):
        raw.append(data[index:index + adjusted_window])
    # Normalize data
    result = normalize_window(raw)

    raw = np.array(raw)
    # If the raw data has N rows, sliding the window yields a 2-D array of shape [N-16, 16]
    result = np.array(result)

    # Split the input dataset into train and test

    split_train_index = int(round(params['train_test_split'] * result.shape[0]))
    train = result[:split_train_index, :]
    np.random.shuffle(train)  # adjacent windows are highly correlated, so shuffle the rows

    x_train = train[:, :-1]
    y_train = train[:, -1]
    x_test = result[split_train_index:, :-1]
    y_test = result[split_train_index:, -1]
    # Add a feature axis: each window becomes a (window_size, 1) sequence, one sample per row
    x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
    x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))
    # Keep the matching unnormalized test windows
    # order = y_train.argsort(axis=0)
    x_test_raw = raw[split_train_index:, :-1]
    y_test_raw = raw[split_train_index:, -1]

    # Last window, for next time stamp prediction
    last_raw = [data[-params['window_size']:]]  # take the final window_size values
    last = normalize_window(last_raw)
    last = np.array(last)
    last = np.reshape(last, (last.shape[0], last.shape[1], 1))
    return [x_train, y_train, x_test, y_test, x_test_raw, y_test_raw, last_raw, last