[Machine Learning Practice 7] LSTM | Complete Code + Dataset

📈 LSTM-Based Time Series Forecasting in Practice (complete code, visualization, and tuning over multiple hyperparameter sets)

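In time series forecasting, LSTM (Long Short-Term Memory) networks are widely used in financial forecasting, meteorological analysis, displacement monitoring, and similar fields because they can model dependencies across time steps. This article walks you step by step through building a complete LSTM forecasting pipeline, covering the whole workflow: data preprocessing, model design, training visualization, and prediction and evaluation.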

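This project builds on the LSTM fundamentals covered in the CSDN blog post "【时间序列预测03】-LSTM 长短期记忆网络(Long Short-Term Memory)". Using a time series dataset with two feature variables as the example, it sets up an automated training and evaluation workflow: data loading, preprocessing, model training, prediction, evaluation, and result visualization, with models, figures, and metrics saved and recorded automatically. It is aimed at readers who want to study and practice applying LSTMs and deep learning to time series.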


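🚀 Project highlights

  • 🔁 Automatic loop over multiple hyperparameter sets (filters, batch_size, learning_rate, look_back)
  • 🧠 LSTM network built with Keras + TensorFlow, with a flexible and extensible structure
  • 🧹 EarlyStopping + ModelCheckpoint integrated to limit overfitting and keep the best weights
  • 📉 Full set of evaluation metrics (MAE, RMSE, MAPE, R²) written to a CSV file
  • 📊 Training loss curves and prediction plots drawn and saved automatically
  • 📁 Clear project layout that makes it easy to organize results from different configurations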

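📌 How to use

  1. Prepare the dataset: save your data as a CSV file containing a date column and at least two numeric features (the code can be modified to support more dimensions)
  2. Set the path: replace the path in df = read_csv('../data/df.csv') with the path to your data
  3. Set the parameters: adjust n_features, hours_train, filters, and look_back as needed
  4. Run the lstm_train.py script; it iterates over every hyperparameter combination and saves the models, figures, and metrics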


1. Import the required libraries

import pandas as pd
import matplotlib.pyplot as plt
from numpy import concatenate
from sklearn.preprocessing import MinMaxScaler, LabelEncoder
from sklearn.metrics import mean_squared_error, mean_absolute_error, mean_absolute_percentage_error, r2_score
from keras.models import Sequential
from keras.layers import Dense, Flatten,LSTM
import matplotlib as mpl
import tensorflow as tf
from pandas import read_csv
from math import sqrt
from tensorflow.keras.layers import Input
import os
import random
import numpy as np


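TimeSeriesSupervised is a utility class that converts time series data into a supervised learning problem. It turns the series into input/output pairs that a machine learning model can be trained on, which is the usual setup for time series forecasting.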


from sklearn.base import BaseEstimator, TransformerMixin

class TimeSeriesSupervised(BaseEstimator, TransformerMixin):
    """
    Convert a time series dataset into a supervised learning format.
    """
    def __init__(self, look_back=1, predict_forward=1, dropnan=True):
        self.look_back = look_back
        self.predict_forward = predict_forward
        self.dropnan = dropnan

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        if isinstance(X, pd.DataFrame):
            data = X.values
        elif isinstance(X, np.ndarray):
            data = X
        else:
            raise ValueError("Input must be a NumPy array or a pandas DataFrame.")

        n_vars = 1 if len(data.shape) == 1 else data.shape[1]
        df = pd.DataFrame(data)
        cols, names = list(), list()

        # input sequence (t-n, ... t-1)
        for i in range(self.look_back, 0, -1):
            cols.append(df.shift(i))
            names += [f'var{j+1}(t-{i})' for j in range(n_vars)]

        # forecast sequence (t, t+1, ... t+n)
        for i in range(0, self.predict_forward):
            cols.append(df.shift(-i))
            if i == 0:
                names += [f'var{j+1}(t)' for j in range(n_vars)]
            else:
                names += [f'var{j+1}(t+{i})' for j in range(n_vars)]

        # put it all together
        agg = pd.concat(cols, axis=1)
        agg.columns = names

        # drop rows with NaN values
        if self.dropnan:
            agg.dropna(inplace=True)

        return agg
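
To make the output of this transformer concrete, here is a minimal NumPy-only sketch of the same windowing for the predict_forward=1 case. The helper name make_windows and the toy array are illustrative assumptions, not part of the project code.

```python
import numpy as np

def make_windows(data, look_back):
    """Flatten [x(t-look_back), ..., x(t-1), x(t)] into one row per time step t.

    This mirrors the column order produced by shifting and concatenating
    with predict_forward=1 and dropping the rows that contain NaN.
    """
    data = np.asarray(data, dtype=float)
    n_rows = data.shape[0] - look_back
    return np.stack([data[i:i + look_back + 1].ravel() for i in range(n_rows)])

# toy series: 5 time steps, 2 features
toy = np.arange(10).reshape(5, 2)
frames = make_windows(toy, look_back=2)

print(frames.shape)  # (3, 6): 5 - 2 rows, (2 + 1) * 2 columns
print(frames[0])     # [0. 1. 2. 3. 4. 5.]
```

The same arithmetic explains the shapes you get on real data: a series of N rows with F features becomes N - look_back rows of (look_back + 1) * F columns.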

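Reproducibility control (random seeds)

To get the same results every time the code runs, set a random seed for each source of randomness. This code uses Python's random module, NumPy, and TensorFlow/Keras, so each of them needs to be seeded separately.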

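# Set the PYTHONHASHSEED environment variable
# (hash randomization is fixed at interpreter start, so set it in the shell
# before launching Python if you need it to apply to this process)
os.environ['PYTHONHASHSEED'] = '42'

# Seed Python's built-in random module
random.seed(42)

# Seed NumPy
np.random.seed(42)

# Seed TensorFlow
tf.random.set_seed(42)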
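
A quick way to convince yourself that seeding works is to draw numbers twice with the same seed and compare. The sketch below uses only random and NumPy so it runs anywhere; the function name draw is an assumption for illustration. TensorFlow follows the same pattern through tf.random.set_seed.

```python
import random
import numpy as np

def draw(seed):
    """Reseed both generators, then draw one Python float and three NumPy floats."""
    random.seed(seed)
    np.random.seed(seed)
    return random.random(), np.random.rand(3)

a_py, a_np = draw(42)
b_py, b_np = draw(42)

# identical seeds give identical draws
same = bool(a_py == b_py and np.array_equal(a_np, b_np))
print(same)  # True
```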

A quick check of whether a GPU is available

physical_devices = tf.config.experimental.list_physical_devices('GPU')
for device in physical_devices:
    tf.config.experimental.set_memory_growth(device, True)
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

Num GPUs Available:  0

Turn off the default grid lines on plot axes.

mpl.rcParams['axes.grid'] = False

2. Load the data

Read and preprocess your time series data.

Set the date column as the index so that the DataFrame is indexed by time (time series format).

df = read_csv('../data/df.csv')
df
           date         p5  rain
0    2015-04-10   1.041892  12.4
1    2015-04-21   0.186168  34.0
2    2015-05-02  -2.374851  31.6
3    2015-05-13   0.548328  62.4
4    2015-05-24  -0.684310   8.2
...         ...        ...   ...
156  2019-12-21  -1.270302  20.8
157  2020-01-01   2.232585  26.8
158  2020-01-12   0.175435   1.8
159  2020-01-23  -0.253113  61.0
160  2020-02-03   0.346710  39.6

161 rows × 3 columns

dataset = df.set_index('date')
dataset
                   p5  rain
date
2015-04-10   1.041892  12.4
2015-04-21   0.186168  34.0
2015-05-02  -2.374851  31.6
2015-05-13   0.548328  62.4
2015-05-24  -0.684310   8.2
...               ...   ...
2019-12-21  -1.270302  20.8
2020-01-01   2.232585  26.8
2020-01-12   0.175435   1.8
2020-01-23  -0.253113  61.0
2020-02-03   0.346710  39.6

161 rows × 2 columns
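
One optional refinement, offered as a sketch rather than as what the project does: read_csv can parse the date column and set it as the index in one call, which gives you a real DatetimeIndex. The in-memory CSV below reuses the first three rows shown above; the variable names are assumptions.

```python
import io
import pandas as pd

# first three rows of the dataset, inlined so the example is self-contained
csv_text = """date,p5,rain
2015-04-10,1.041892,12.4
2015-04-21,0.186168,34.0
2015-05-02,-2.374851,31.6
"""

# parse_dates + index_col replaces the separate set_index('date') step
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

# spacing between consecutive samples, in days
step_days = sample.index.to_series().diff().dropna().dt.days.tolist()
print(step_days)  # [11, 11]
```

With a DatetimeIndex you can check the sampling interval directly, as above, and pandas plots label the x-axis with dates.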

Visualize the displacement data

# plot displacement
dataset['p5'].plot()
plt.xticks(rotation=45)


# plot rain
df['rain'].plot()


Inspect the summary statistics of each variable (mean, maximum, standard deviation, and so on)

# statistics of the dataset
dataset.describe().transpose()
      count       mean        std        min        25%        50%        75%         max
p5    161.0   0.634245   2.032216  -3.847115  -0.467202   0.381206   1.214105    9.427837
rain  161.0  43.162733  46.266452   0.000000   9.400000  34.000000  62.400000  357.800000

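hours_train = 100
sets the length of the training portion to 100 sample points. The time unit depends on your data: it could be hours or days, and in the rows shown above consecutive samples are 11 days apart, so "hours" is only a variable name.

n_features = 2
means the data has two variables (features), here p5 (displacement) and rain.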

# DEFINE TRAINING LENGTH (the remaining rows will be the test set)

# number of training samples
hours_train = 100

# total number of variables
n_features = 2
# number of filters/nodes
filters = 16
# learning rates
lr = 5e-3
# epochs
epochs = 1000
# batch sizes
batch_size = 9
# how many time steps back do we want the model to see
look_backs = 3

3. Data preprocessing

3.1 Convert the time series to a supervised learning format

look = look_backs

# load dataset
values = dataset.values
# ensure all data is float
values = values.astype('float32')
# normalize features
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(values)
# specify the number of lag hours
n_hours = look
# frame as supervised learning

transformer = TimeSeriesSupervised(look_back=n_hours, predict_forward=1)
reframed = transformer.fit_transform(scaled)
print(reframed.shape)

(158, 8)
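
Note that the scaler above is fitted on every row, including the rows that later become the test set. If you want the test data to stay completely unseen, a stricter variant fits the min and max on the training rows only. The sketch below shows that idea with plain NumPy; the helper names and toy values are assumptions, and it is an alternative to consider, not the method used in this project.

```python
import numpy as np

def fit_minmax(train):
    """Learn per-column min and range from the training rows only."""
    lo = train.min(axis=0)
    hi = train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero for constant columns
    return lo, span

def apply_minmax(x, lo, span):
    """Scale with the statistics learned from the training rows."""
    return (x - lo) / span

values = np.array([[1.0, 10.0],
                   [3.0, 30.0],
                   [2.0, 20.0],
                   [5.0, 50.0]])   # last row plays the role of test data
n_train = 3

lo, span = fit_minmax(values[:n_train])
scaled = apply_minmax(values, lo, span)

print(scaled[:n_train].min(), scaled[:n_train].max())  # 0.0 1.0
print(scaled[n_train])                                 # [2. 2.]
```

The trade-off is visible in the last line: test values outside the training range fall outside [0, 1], which is expected when the scaler has never seen them.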

3.2 Split the dataset

# split into train and test sets
values = reframed.values
n_train_hours = hours_train - look
train = values[:n_train_hours, :]
test = values[n_train_hours:, :]
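# split into inputs and outputs
n_obs = n_hours * n_features
# inputs: the first n_obs columns (all lagged values)
# target: column -n_features, i.e. var1(t), the scaled p5 value at time t
train_X, train_y = train[:, :n_obs], train[:, -n_features]
test_X, test_y = test[:, :n_obs], test[:, -n_features]
print(train_X.shape, len(train_X), train_y.shape)

(97, 6) 97 (97,)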
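
The row counts follow directly from the settings. As a small sanity check (the function name split_sizes is an assumption for illustration), the arithmetic can be written out like this:

```python
def split_sizes(n_rows, hours_train, look_back):
    """Return (framed rows, training rows, test rows) for the split used above."""
    n_framed = n_rows - look_back      # the first look_back rows have no full history
    n_train = hours_train - look_back  # same formula as n_train_hours
    n_test = n_framed - n_train
    return n_framed, n_train, n_test

print(split_sizes(161, 100, 3))  # (158, 97, 61)
```

So with 161 raw rows and look_back = 3 you get 158 framed rows, 97 for training and 61 for testing, which matches the printed shapes.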

3.3 Reshape the input into a 3D array

# reshape input to be 3D [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], n_hours, n_features))
test_X = test_X.reshape((test_X.shape[0], n_hours, n_features))
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)

# negative column offset of the filler column(s) used later when inverting the scaling
n = - n_features + 1

(97, 3, 2) (97,) (61, 3, 2) (61,)
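
The reshape works because each flat input row is already ordered time step by time step, with the features of one step next to each other. A tiny NumPy example (the numbers are made up so that the tens digit marks the time step and the ones digit marks the feature) shows where each value ends up:

```python
import numpy as np

n_hours, n_features = 3, 2

# one flat sample: (feature 1, feature 2) at t-3, then t-2, then t-1
flat = np.array([[10, 11, 20, 21, 30, 31]])

# [samples, timesteps, features]
cube = flat.reshape((flat.shape[0], n_hours, n_features))

print(cube.shape)  # (1, 3, 2)
print(cube[0])
# [[10 11]
#  [20 21]
#  [30 31]]
```

Each row of cube[0] is one time step and each column is one feature, which is the layout the LSTM layer expects.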

4. Build the model

4.1 Define the model

def LSTM_net(filters, lr):
    model = Sequential()
    model.add(Input(shape=(train_X.shape[1], train_X.shape[2])))  # recommended usage
    model.add(LSTM(filters, return_sequences=False))              # no input_shape argument needed
    model.add(Dense(1))
    model.compile(
        loss=tf.losses.Huber(),
        optimizer=tf.optimizers.Adam(learning_rate=lr),
        metrics=[tf.metrics.MeanAbsoluteError(), tf.metrics.MeanSquaredError()]
    )
    return model

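This code defines LSTM_net(filters, lr), the function that builds the LSTM network: given the hyperparameters filters (number of LSTM units) and lr (learning rate), it returns a compiled LSTM model.


Let's go through it line by line, with a short note on each step 👇: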

def LSTM_net(filters, lr):

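Defines a function that takes two parameters:

  • filters: the number of units in the LSTM layer (controls model capacity)
  • lr: the learning rate (controls the optimizer's step size)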

    model = Sequential()

Creates a Keras sequential model (Sequential), which suits networks like this one where layers are stacked one after another.


    model.add(Input(shape=(train_X.shape[1], train_X.shape[2])))  # recommended usage

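Adds an explicit Input layer.

  • train_X.shape[1] is the number of time steps (look_back)
  • train_X.shape[2] is the number of features (n_features)

➡️ This tells the model that each input sample is a 2D time series array of shape (time steps, features).
This is the approach Keras recommends, and it avoids the warning Do not pass input_shape to layer...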


    model.add(LSTM(filters, return_sequences=False))

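Adds an LSTM layer:

  • filters is the number of hidden units, for example 64 or 128;
  • return_sequences=False means only the hidden state of the last time step is output, which suits single-step prediction.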

    model.add(Dense(1))

Adds a Dense (fully connected) layer with a single output value, which is the next point you want to predict (single-step prediction).


    model.compile(
        loss=tf.losses.Huber(),
        optimizer=tf.optimizers.Adam(learning_rate=lr),
        metrics=[tf.metrics.MeanAbsoluteError(), tf.metrics.MeanSquaredError()]
    )
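
The compile step configures:

  • Huber loss (quadratic for small errors and linear for large ones, so it is more robust and less affected by outliers than squared error);
  • the Adam optimizer, with the learning rate lr that you pass in;
  • two tracked metrics, MAE and MSE, which are reported during training.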
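
To see why Huber loss is less sensitive to outliers, it helps to compute it by hand. The NumPy sketch below implements the standard Huber formula with delta = 1.0 (the Keras default); the function name huber is an assumption, and this is an illustration, not the Keras implementation itself.

```python
import numpy as np

def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: 0.5*e^2 when |e| <= delta, else delta*(|e| - 0.5*delta)."""
    err = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quadratic, linear)))

small = huber([0.0], [0.5])  # inside the quadratic zone
large = huber([0.0], [4.0])  # inside the linear zone

print(small)  # 0.125
print(large)  # 3.5, compared with 0.5 * 4.0**2 = 8.0 for half squared error
```

An error of 4.0 costs 3.5 instead of 8.0, so a single large residual pulls the weights much less than it would under a squared loss.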

🎯 To summarize what this function does:

It returns an LSTM model with the following structure:

Input(shape=(time_steps, features))
  ↓
LSTM(units=filters)
  ↓
Dense(1)

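It suits time series regression tasks that predict a single future value.
The structure is simple and trains quickly, which makes it a good fit for hyperparameter tuning and search.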
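
If you want a feel for how filters controls model capacity, you can count the trainable parameters by hand. The sketch below uses the standard formulas for an LSTM layer with bias (four gates, each with input weights, recurrent weights, and a bias) and for a Dense layer; the function names are assumptions for illustration.

```python
def lstm_params(units, n_features):
    """4 gates, each with (n_features + units) weights per unit plus one bias per unit."""
    return 4 * (units * (n_features + units) + units)

def dense_params(n_inputs, n_outputs):
    """One weight per input/output pair plus one bias per output."""
    return n_inputs * n_outputs + n_outputs

units, n_features = 16, 2
total = lstm_params(units, n_features) + dense_params(units, 1)

print(lstm_params(units, n_features))  # 1216
print(dense_params(units, 1))          # 17
print(total)                           # 1233
```

The count does not depend on look_back, because the same LSTM weights are reused at every time step. It grows roughly with the square of filters, which is worth keeping in mind when the training set has only about a hundred windows.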


5. Train the model

fil= filters
learning_rate = lr
batch = batch_size

# load the model
model = LSTM_net(filters=fil, lr=learning_rate)
# Stop training once val_loss has stopped improving
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',  # metric to monitor
                                              patience=20,
                                              # how many epochs to keep training without an improvement in val_loss
                                              restore_best_weights=True)  # roll back to the best weights seen
# Save the weights only when the validation loss decreases.
# When it saves, ModelCheckpoint creates any missing intermediate directories in the path.
model_checkpoint = tf.keras.callbacks.ModelCheckpoint(
    f'models/LSTM/weights/filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.weights.h5',
    monitor='val_loss', mode='min', verbose=0,
    save_best_only=True, save_weights_only=True)
# fit network
history = model.fit(train_X, train_y, epochs=epochs, batch_size=batch, validation_split=0.2, verbose=0,
                    shuffle=False, callbacks=[model_checkpoint, early_stop])
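
The patience rule is easy to misread, so here is a simplified pure-Python model of it. It is a sketch of the idea only: the real callback has more options (min_delta, for example), and the function name and loss values below are made up for illustration.

```python
def early_stop_epoch(val_losses, patience):
    """Return (epoch at which training stops, epoch with the best val_loss)."""
    best = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # improvement: remember it and reset the counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # never triggered: training ran through all epochs
    return len(val_losses) - 1, best_epoch

losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
print(early_stop_epoch(losses, patience=3))  # (5, 2)
```

Training stops at epoch 5 because epochs 3, 4, and 5 all failed to beat the best value from epoch 2. With restore_best_weights=True the model ends up with the epoch 2 weights, which is also the state the checkpoint file holds.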

Visualize the loss

plt.figure(figsize=(16, 8))
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss (Huber)')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')


6. Prediction

# load model to evaluate the test data
LSTM_model = LSTM_net(filters=fil, lr=learning_rate)
# load the last saved weight from the training
LSTM_model.load_weights(f"models/LSTM/weights/filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.weights.h5")

# Evaluate the model
yhat = LSTM_model.predict(test_X)
test_X_res = test_X.reshape((test_X.shape[0], n_hours * n_features))
# invert scaling for forecast
inv_yhat = concatenate((yhat, test_X_res[:, n:]), axis=1)
inv_yhat = scaler.inverse_transform(inv_yhat)
inv_yhat = inv_yhat[:, 0]
# invert scaling for actual
test_y = test_y.reshape((len(test_y), 1))
inv_y = concatenate((test_y, test_X_res[:, n:]), axis=1)
inv_y = scaler.inverse_transform(inv_y)
inv_y = inv_y[:, 0]
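
The concatenate step above is there because the scaler was fitted on two columns, so it expects two columns back when you invert. The filler column only pads the array to the right width. The NumPy sketch below demonstrates this with a hand-written min-max inverse standing in for MinMaxScaler; the toy numbers and names are assumptions.

```python
import numpy as np

# toy "training data" with two columns, used only to get per-column min and max
data = np.array([[1.0, 0.0],
                 [3.0, 10.0],
                 [5.0, 40.0]])
lo, hi = data.min(axis=0), data.max(axis=0)

def inverse(scaled):
    """Undo min-max scaling to [0, 1], column by column."""
    return scaled * (hi - lo) + lo

yhat = np.array([[0.0], [0.5], [1.0]])    # scaled predictions for column 0
filler = np.array([[0.3], [0.9], [0.1]])  # any values; only there to pad the width

padded = np.concatenate((yhat, filler), axis=1)
via_padding = inverse(padded)[:, 0]

# the same result without padding, using only column 0's statistics
direct = yhat[:, 0] * (hi[0] - lo[0]) + lo[0]

print(via_padding)  # [1. 3. 5.]
print(direct)       # [1. 3. 5.]
```

Because each column is scaled independently, the values in the filler column have no effect on the recovered predictions.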

7. Compute the evaluation metrics

# calculate MAE
mae = mean_absolute_error(inv_y, inv_yhat)
print('Test MAE: %.3f' % mae)
# calculate RMSE
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
print('Test RMSE: %.3f' % rmse)
# calculate MAPE (scikit-learn returns a fraction, not a percentage)
mape = mean_absolute_percentage_error(inv_y, inv_yhat)
print('Test MAPE: %.3f' % mape)
# calculate R2
r2 = r2_score(inv_y, inv_yhat)
print('Test R2: %.3f' % r2)

Test MAE: 1.517
Test RMSE: 2.290
Test MAPE: 2.182
Test R2: 0.125
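
To read these numbers correctly it helps to know exactly how each metric is defined. The NumPy sketch below reimplements the four metrics from their standard formulas (the function name and toy values are assumptions; the project itself uses the scikit-learn functions shown above).

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, MAPE (as a fraction), and R2 from their textbook definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    eps = np.finfo(np.float64).eps  # guards against division by exactly zero
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAPE": float(np.mean(np.abs(err) / np.maximum(np.abs(y_true), eps))),
        "R2": float(1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)),
    }

m = regression_metrics([2.0, 4.0], [1.0, 5.0])
print(m["MAE"], m["RMSE"], m["MAPE"], m["R2"])  # 1.0 1.0 0.375 0.0

# same absolute errors, but one actual value is close to zero
near_zero = regression_metrics([0.01, 4.0], [1.01, 5.0])
print(near_zero["MAPE"] > 50)  # True
```

Two things follow. MAPE is a fraction, so a value of 2.182 means an average relative error of about 218%. And because p5 takes values close to zero, MAPE is inflated by those points, so MAE and RMSE are the more informative numbers for this series.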

8. Visualize the prediction results

# Plot actual vs. predicted values in original units (mm), matching the y-axis label
plt.figure(figsize=(16, 8))
plt.title('Time series forecasting', size=20)
plt.plot(inv_y, label='Test Label')
plt.plot(inv_yhat, label='Predictions')
plt.legend(loc='lower right', markerscale=1)
plt.xlabel('Time step', size=20)
plt.ylabel('Differential displacement (mm)', size=20)
plt.grid(True)


Complete code

import pandas as pd
import matplotlib.pyplot as plt
from numpy import concatenate
from sklearn.preprocessing import MinMaxScaler, LabelEncoder
from sklearn.metrics import mean_squared_error, mean_absolute_error, mean_absolute_percentage_error, r2_score
from keras.models import Sequential
from keras.layers import Dense, Flatten,LSTM
import matplotlib as mpl
import tensorflow as tf
from helper import *
from pandas import read_csv
from math import sqrt
from tensorflow.keras.layers import Input
import os
import random
import numpy as np
# Set the PYTHONHASHSEED environment variable
os.environ['PYTHONHASHSEED'] = '42'

# Seed Python's built-in random module
random.seed(42)

# Seed NumPy
np.random.seed(42)

# Seed TensorFlow
tf.random.set_seed(42)

physical_devices = tf.config.experimental.list_physical_devices('GPU')
for device in physical_devices:
    tf.config.experimental.set_memory_growth(device, True)

mpl.rcParams['axes.grid'] = False


# 1. Load the dataset

df = read_csv('../data/df.csv')
dataset = df.set_index('date')

# 2. Preprocess the dataset
# DEFINE TRAINING LENGTH (the remaining rows will be the test set)

# number of training samples
hours_train = 100

# total number of variables
n_features = 2

### LSTM ###


dic = {}

# number of filters/nodes
filters = [16, 32, 64, 96, 128, 256]
# learning rates
lr = [10e-3, 5e-3, 10e-4, 5e-4, 10e-5, 5e-5]
# epochs
epochs = 1000
# batch sizes
batch_size = [9, 18, 36, 72, 144]
# how many time steps back do we want the model to see
look_backs = [3, 5, 7, 9, 12]

# Hyperparameters
dic["batch_size"] = []
dic["learning_rate"] = []
dic["filters"] = []
dic["look_backs"] = []

# test_scores
dic["MAE"] = []
dic["RMSE"] = []
dic["MAPE"] = []
dic["R2"] = []

for fil in filters:
    for learning_rate in lr:
        for batch in batch_size:
            for look in look_backs:
                print('-------------------------------------------------------------------------------------')
                print('LSTM')
                print('Filters: ', fil)
                print('Learning rate: ', learning_rate)
                print('Batch size: ', batch)
                print('Look back: ', look)
                # load dataset
                values = dataset.values
                # ensure all data is float
                values = values.astype('float32')
                # normalize features
                scaler = MinMaxScaler(feature_range=(0, 1))
                scaled = scaler.fit_transform(values)
                # specify the number of lag hours
                n_hours = look
                # frame as supervised learning
                reframed = series_to_supervised(scaled, n_hours, 1)
                print(reframed.shape)

                # split into train and test sets
                values = reframed.values
                n_train_hours = hours_train - look
                train = values[:n_train_hours, :]
                test = values[n_train_hours:, :]
                # split into input and outputs
                n_obs = n_hours * n_features
                train_X, train_y = train[:, :n_obs], train[:, -n_features]
                test_X, test_y = test[:, :n_obs], test[:, -n_features]
                print(train_X.shape, len(train_X), train_y.shape)

                # reshape input to be 3D [samples, timesteps, features]
                train_X = train_X.reshape((train_X.shape[0], n_hours, n_features))
                test_X = test_X.reshape((test_X.shape[0], n_hours, n_features))
                print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)

                n = - n_features + 1


                def LSTM_net(filters, lr):
                    model = Sequential()
                    model.add(LSTM(filters, return_sequences=False, input_shape=(train_X.shape[1], train_X.shape[2])))
                    model.add(Dense(1))
                    model.compile(loss=tf.losses.Huber(),
                                  optimizer=tf.optimizers.Adam(learning_rate=lr),
                                  metrics=[tf.metrics.MeanAbsoluteError(), tf.metrics.MeanSquaredError()])
                    model.summary()
                    return model


                # load the model
                model = LSTM_net(filters=fil, lr=learning_rate)
                # Save the models only when validation loss decrease
                early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',  # what is the metric to measure
                                                              patience=20,
                                                              # how many epochs to continue running the model after seeing an increase in val_loss
                                                              restore_best_weights=True)  # update the model weights
                model_checkpoint = tf.keras.callbacks.ModelCheckpoint(
                    f'models/LSTM/weights/filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.weights.h5',
                    monitor='val_loss', mode='min', verbose=0,
                    save_best_only=True, save_weights_only=True)  # when it saves, ModelCheckpoint creates any missing intermediate directories in the path
                # fit network
                history = model.fit(train_X, train_y, epochs=epochs, batch_size=batch, validation_split=0.2, verbose=0,
                                    shuffle=False, callbacks=[model_checkpoint, early_stop])

                plt.figure(figsize=(16, 8))
                plt.plot(history.history['loss'])
                plt.plot(history.history['val_loss'])
                plt.title('model loss (Huber)')
                plt.ylabel('loss')
                plt.xlabel('epoch')
                plt.legend(['train', 'validation'], loc='upper left')

                # create the output directory for the loss plots
                save_path = 'models/LSTM/plots/'
                os.makedirs(save_path, exist_ok=True)

                # save the figure, then close it so figures do not pile up in memory during the grid search
                filename = f"filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.png"
                plt.savefig(os.path.join(save_path, filename))
                plt.close()

                # load model to evaluate the test data
                LSTM_model = LSTM_net(filters=fil, lr=learning_rate)
                # load the last saved weight from the training
                LSTM_model.load_weights(f"models/LSTM/weights/filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.weights.h5")

                # Evaluate the model
                yhat = LSTM_model.predict(test_X)
                test_X_res = test_X.reshape((test_X.shape[0], n_hours * n_features))
                # invert scaling for forecast
                inv_yhat = concatenate((yhat, test_X_res[:, n:]), axis=1)
                inv_yhat = scaler.inverse_transform(inv_yhat)
                inv_yhat = inv_yhat[:, 0]
                # invert scaling for actual
                test_y = test_y.reshape((len(test_y), 1))
                inv_y = concatenate((test_y, test_X_res[:, n:]), axis=1)
                inv_y = scaler.inverse_transform(inv_y)
                inv_y = inv_y[:, 0]

                # calculate MAE
                mae = mean_absolute_error(inv_y, inv_yhat)
                print('Test MAE: %.3f' % mae)
                # calculate RMSE
                rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
                print('Test RMSE: %.3f' % rmse)
                # calculate MAPE
                mape = mean_absolute_percentage_error(inv_y, inv_yhat)
                print('Test MAPE: %.3f' % mape)
                # calculate R2
                r2 = r2_score(inv_y, inv_yhat)
                print('Test R2: %.3f' % r2)

                # plot and save the predictions in original units (mm), matching the y-axis label
                plt.figure(figsize=(16, 8))
                plt.title('Time series forecasting', size=20)
                plt.plot(inv_y, label='Test Label')
                plt.plot(inv_yhat, label='Predictions')
                plt.legend(loc='lower right', markerscale=1)
                plt.xlabel('Time step', size=20)
                plt.ylabel('Differential displacement (mm)', size=20)
                plt.grid(True)

                # create the output directory if it does not exist
                save_dir2 = 'models/LSTM/preds/'
                os.makedirs(save_dir2, exist_ok=True)
                # build the file name, save the figure, then close it to free memory
                filename2 = f"filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.png"
                plt.savefig(os.path.join(save_dir2, filename2), facecolor='white', edgecolor='none', bbox_inches='tight')
                plt.close()


                # save results on the dictionary
                dic["batch_size"].append(batch)
                dic["learning_rate"].append(learning_rate)
                dic["filters"].append(fil)
                dic["look_backs"].append(look)
                dic["MAE"].append(mae)
                dic["RMSE"].append(rmse)
                dic["MAPE"].append(mape)
                dic["R2"].append(r2)
                # Convert results to a dataframe
                results = pd.DataFrame(dic)

                # create the results directory if it does not exist
                results_dir = 'models/LSTM/results/'
                os.makedirs(results_dir, exist_ok=True)

                # export as CSV (rewritten after every configuration)
                results.to_csv(os.path.join(results_dir, 'LSTM_results.csv'), index=False)
                print('-------------------------------------------------------------------------------------')

print('LSTM finished!')
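
Before you launch the full script, it is worth knowing how many training runs the four nested loops produce. The sketch below enumerates the same grid with itertools.product, which is also a compact alternative to nesting the loops by hand. The lists are copied from the script above; the variable name grid is an assumption.

```python
from itertools import product

filters = [16, 32, 64, 96, 128, 256]
lr = [10e-3, 5e-3, 10e-4, 5e-4, 10e-5, 5e-5]
batch_size = [9, 18, 36, 72, 144]
look_backs = [3, 5, 7, 9, 12]

# same iteration order as the nested for loops: filters, lr, batch_size, look_backs
grid = list(product(filters, lr, batch_size, look_backs))

print(len(grid))  # 900
print(grid[0])    # (16, 0.01, 9, 3)
```

That is 900 separate trainings of up to 1000 epochs each, so you may want to start with a shortened list for each hyperparameter and widen the search once the pipeline runs end to end.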

helper.py

from pandas import DataFrame, concat

# convert series to supervised learning
def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    n_vars = 1 if type(data) is list else data.shape[1]
    df = DataFrame(data)
    cols, names = list(), list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [('var%d(t-%d)' % (j + 1, i)) for j in range(n_vars)]
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        if i == 0:
            names += [('var%d(t)' % (j + 1)) for j in range(n_vars)]
        else:
            names += [('var%d(t+%d)' % (j + 1, i)) for j in range(n_vars)]
    # put it all together
    agg = concat(cols, axis=1)
    agg.columns = names
    # drop rows with NaN values
    if dropnan:
        agg.dropna(inplace=True)
    return agg



Dataset link
