[Machine Learning Exercise 7] LSTM | Complete Code + Dataset

SSWDUT

Last modified: 2025-04-13 10:26:18


Column: 🤖 Hands-On Exercise Collection | Tags: machine learning, lstm, artificial intelligence

First published: 2025-04-10 19:52:42

Article link: https://blog.csdn.net/wangshangshang09/article/details/147126226


📈 LSTM-Based Time Series Forecasting in Practice (complete code, visualization, and tuning over multiple hyperparameter sets)

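In time series forecasting, LSTM (Long Short-Term Memory) networks are widely used in financial forecasting, meteorological analysis, displacement monitoring, and similar fields because they can model temporal dependencies. This article walks step by step through building a complete LSTM time series forecasting pipeline, from data preprocessing and model design to training visualization and prediction evaluation.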

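This project builds on the LSTM fundamentals covered in the CSDN blog post 《【时间序列预测03】-LSTM 长短期记忆网络（Long Short-Term Memory）》. Using a time series dataset with two feature variables as an example, it sets up an automated training and evaluation workflow: data loading, preprocessing, model training, prediction, evaluation, and result visualization, with models, plots, and metrics saved automatically. It is aimed at readers who want to study and practice applying LSTMs and deep learning to time series.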

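🚀 Project highlights

🔁 Automatic loop over multiple hyperparameter combinations (filters, batch_size, learning_rate, look_back)
🧠 LSTM network built with Keras + TensorFlow; the structure is flexible and easy to extend
🧹 EarlyStopping + ModelCheckpoint to limit overfitting and save the best model
📉 Full set of evaluation metrics (MAE, RMSE, MAPE, R²) written to CSV
📊 Training loss curves and prediction plots drawn and saved automatically
📁 Clear project layout that keeps the results of different configurations organized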

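📌 Usage

Prepare the dataset: save your data as a CSV file with a date column and at least two numeric features (the code can be modified to support more dimensions)
Set the path: replace df = read_csv('../data/df.csv') with the path to your data
Set the parameters: adjust n_features, hours_train, filters, look_back, etc. as needed
Run the lstm_train.py script; it iterates over every hyperparameter combination and saves the models, plots, and metrics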

1. Import the required libraries

import os
import random
from math import sqrt

import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import tensorflow as tf
from numpy import concatenate
from pandas import read_csv
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, mean_absolute_percentage_error, r2_score
from keras.models import Sequential
from keras.layers import Dense, LSTM
from tensorflow.keras.layers import Input

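TimeSeriesSupervised is a utility class that converts time series data into a supervised learning problem. It turns the series into input/output pairs that a machine learning model can be trained on, which is the usual setup for time series forecasting tasks.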


from sklearn.base import BaseEstimator, TransformerMixin

class TimeSeriesSupervised(BaseEstimator, TransformerMixin):
    """
    Convert a time series dataset into a supervised learning format.
    """
    def __init__(self, look_back=1, predict_forward=1, dropnan=True):
        self.look_back = look_back
        self.predict_forward = predict_forward
        self.dropnan = dropnan

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        if isinstance(X, pd.DataFrame):
            data = X.values
        elif isinstance(X, np.ndarray):
            data = X
        else:
            raise ValueError("Input must be a NumPy array or a pandas DataFrame.")

        n_vars = 1 if len(data.shape) == 1 else data.shape[1]
        df = pd.DataFrame(data)
        cols, names = list(), list()

        # input sequence (t-n, ... t-1)
        for i in range(self.look_back, 0, -1):
            cols.append(df.shift(i))
            names += [f'var{j+1}(t-{i})' for j in range(n_vars)]

        # forecast sequence (t, t+1, ... t+n)
        for i in range(0, self.predict_forward):
            cols.append(df.shift(-i))
            if i == 0:
                names += [f'var{j+1}(t)' for j in range(n_vars)]
            else:
                names += [f'var{j+1}(t+{i})' for j in range(n_vars)]

        # put it all together
        agg = pd.concat(cols, axis=1)
        agg.columns = names

        # drop rows with NaN values
        if self.dropnan:
            agg.dropna(inplace=True)

        return agg
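
To make the transformation concrete, here is a minimal sketch that repeats the same shift-and-concatenate idea on a toy array with pandas only. The toy values and the window length of 2 are assumptions chosen for illustration; they are not part of the project.

```python
import numpy as np
import pandas as pd

# Toy series: 5 time steps, 2 variables (values chosen so each cell is easy to trace).
toy = pd.DataFrame(np.array([[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]], dtype=float))

look_back = 2  # hypothetical window length for this illustration

# Lagged copies (t-2, t-1) followed by the current step (t), mirroring transform().
cols = [toy.shift(i) for i in range(look_back, 0, -1)] + [toy]
names = [f"var{j + 1}(t-{i})" for i in range(look_back, 0, -1) for j in range(2)]
names += [f"var{j + 1}(t)" for j in range(2)]

lagged = pd.concat(cols, axis=1)
lagged.columns = names
lagged = lagged.dropna()  # the first look_back rows have no complete history

print(lagged)
```

Each surviving row holds one window: the two earlier steps as inputs and the current step as the candidate target. Two of the five rows are dropped because their history is incomplete.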

Reproducibility control (random seeds)

To get the same results on every run, set a random seed for each source of randomness. The code uses TensorFlow, Keras, and NumPy, so each of these libraries needs its own seed.

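# Set the PYTHONHASHSEED environment variable.
# Note: this only affects Python processes started afterwards; to fix hashing for
# the current run, export the variable in the shell before launching the script.
os.environ['PYTHONHASHSEED'] = '42'

# Seed Python's built-in random module
random.seed(42)

# Seed NumPy
np.random.seed(42)

# Seed TensorFlow
tf.random.set_seed(42)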
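
The effect of seeding can be checked without training anything. The sketch below uses only Python's random module and NumPy; draw_samples is a hypothetical helper written for this illustration. TensorFlow's seed plays the same role for weight initialization, although some GPU operations may still be non-deterministic.

```python
import random
import numpy as np

def draw_samples(seed):
    """Seed Python's and NumPy's global generators, then draw a few numbers."""
    random.seed(seed)
    np.random.seed(seed)
    return random.random(), np.random.rand(3)

py_a, np_a = draw_samples(42)
py_b, np_b = draw_samples(42)
py_c, np_c = draw_samples(7)

# Same seed: identical draws. Different seed: different draws.
print(py_a == py_b, np.array_equal(np_a, np_b))
print(np.array_equal(np_a, np_c))
```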

A quick check of whether a GPU is available

physical_devices = tf.config.experimental.list_physical_devices('GPU')
for device in physical_devices:
    tf.config.experimental.set_memory_growth(device, True)
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

Num GPUs Available:  0

Turn off the default grid lines on plot axes.

mpl.rcParams['axes.grid'] = False

2. Load the data

Read and preprocess your time series data

Set the date column as the index so that the DataFrame is indexed by time (i.e., in time series form).

df = read_csv('../data/df.csv')
df

	date	p5	rain
0	2015-04-10	1.041892	12.4
1	2015-04-21	0.186168	34.0
2	2015-05-02	-2.374851	31.6
3	2015-05-13	0.548328	62.4
4	2015-05-24	-0.684310	8.2
...	...	...	...
156	2019-12-21	-1.270302	20.8
157	2020-01-01	2.232585	26.8
158	2020-01-12	0.175435	1.8
159	2020-01-23	-0.253113	61.0
160	2020-02-03	0.346710	39.6

161 rows × 3 columns

dataset = df.set_index('date')
dataset

	p5	rain
date
2015-04-10	1.041892	12.4
2015-04-21	0.186168	34.0
2015-05-02	-2.374851	31.6
2015-05-13	0.548328	62.4
2015-05-24	-0.684310	8.2
...	...	...
2019-12-21	-1.270302	20.8
2020-01-01	2.232585	26.8
2020-01-12	0.175435	1.8
2020-01-23	-0.253113	61.0
2020-02-03	0.346710	39.6

161 rows × 2 columns

Visualize the displacement data

# plot displacement
dataset['p5'].plot()
plt.xticks(rotation=45)


# plot rain
df['rain'].plot()


Inspect the summary statistics of each variable (mean, max, standard deviation, etc.)

# statistics of the dataset
dataset.describe().transpose()

	count	mean	std	min	25%	50%	75%	max
p5	161.0	0.634245	2.032216	-3.847115	-0.467202	0.381206	1.214105	9.427837
rain	161.0	43.162733	46.266452	0.000000	9.400000	34.000000	62.400000	357.800000

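hours_train = 100
The first 100 samples are used for training and the rest for testing. Despite the name, the unit is not hours: in the rows shown above, consecutive samples are 11 days apart.

n_features = 2
The data has two variables (features): p5 (displacement) and rain.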

# DEFINE TRAINING LENGTH (the remaining samples will be the test set)

# number of training samples
hours_train=100

# total number of variables
n_features = 2

# number of filters/nodes
filters = 16
# learning rates
lr = 5e-3
# epochs
epochs = 1000
# batch sizes
batch_size = 9
# how many time steps back do we want the model to see
look_backs = 3

3. Data preprocessing

3.1 Convert the time series to a supervised learning format

look = look_backs

# load dataset
values = dataset.values
# ensure all data is float
values = values.astype('float32')
# normalize features
# (note: the scaler is fit on the full series, so test-set min/max leak into training)
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(values)
# specify the number of lag hours
n_hours = look
# frame as supervised learning

transformer = TimeSeriesSupervised(look_back=n_hours, predict_forward=1)
reframed = transformer.fit_transform(scaled)
print(reframed.shape)

(158, 8)
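
The printed shape follows directly from the window arithmetic: each window loses look_back rows at the start, and every row carries look_back + 1 time steps of n_features values. A small sketch, assuming the input has no missing values; supervised_shape is a hypothetical helper, not part of the project:

```python
def supervised_shape(n_rows, n_features, look_back, predict_forward=1):
    """Shape of the lagged frame after rows with NaN are dropped."""
    rows = n_rows - look_back - (predict_forward - 1)
    cols = (look_back + predict_forward) * n_features
    return rows, cols

print(supervised_shape(161, 2, 3))  # (158, 8)

# Shapes for the look_back values tried later in the complete script.
for look in [3, 5, 7, 9, 12]:
    print(look, supervised_shape(161, 2, look))
```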

3.2 Split the dataset

# split into train and test sets
values = reframed.values
n_train_hours = hours_train - look
train = values[:n_train_hours, :]
test = values[n_train_hours:, :]
# split into input and outputs
n_obs = n_hours * n_features
train_X, train_y = train[:, :n_obs], train[:, -n_features]
test_X, test_y = test[:, :n_obs], test[:, -n_features]
print(train_X.shape, len(train_X), train_y.shape)

(97, 6) 97 (97,)
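
The slicing above is easier to follow with column labels attached. The sketch below rebuilds the label order that the supervised frame uses and shows which columns become inputs and which one becomes the target; the labels are for illustration only, since the training code works on the raw array.

```python
n_features, look_back = 2, 3

# Column labels in the order the supervised frame lays them out.
names = [f"var{j + 1}(t-{i})" for i in range(look_back, 0, -1) for j in range(n_features)]
names += [f"var{j + 1}(t)" for j in range(n_features)]

n_obs = look_back * n_features
input_cols = names[:n_obs]       # everything before time t
target_col = names[-n_features]  # the first variable at time t

print(input_cols)
print(target_col)

# The split is chronological: the first rows train, the remaining rows test.
n_rows, hours_train = 158, 100
n_train = hours_train - look_back
print(n_train, n_rows - n_train)
```

So index -n_features selects var1(t), i.e. p5 at the current step, and var2(t) (rain at the current step) is used neither as input nor as target.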

3.3 Reshape the input into a 3D array

# reshape input to be 3D [samples, timesteps, features]
train_X = train_X.reshape((train_X.shape[0], n_hours, n_features))
test_X = test_X.reshape((test_X.shape[0], n_hours, n_features))
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)

n = - n_features + 1

(97, 3, 2) (97,) (61, 3, 2) (61,)
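
The reshape only works because the flat columns are already ordered time step by time step. A minimal NumPy sketch with one made-up sample, encoded so that the tens digit is the lag and the units digit is the variable:

```python
import numpy as np

look_back, n_features = 3, 2

# One flattened sample: [v1(t-3), v2(t-3), v1(t-2), v2(t-2), v1(t-1), v2(t-1)].
flat = np.array([[31, 32, 21, 22, 11, 12]])

cube = flat.reshape((flat.shape[0], look_back, n_features))
print(cube.shape)
print(cube[0])
```

Each row of cube[0] is one time step (oldest first) and each column is one variable, which is the [samples, timesteps, features] layout the LSTM layer expects.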

4. Build the model

4.1 Define the model

def LSTM_net(filters, lr):
    model = Sequential()
    model.add(Input(shape=(train_X.shape[1], train_X.shape[2])))  # recommended usage
    model.add(LSTM(filters, return_sequences=False))              # no input_shape argument needed
    model.add(Dense(1))
    model.compile(
        loss=tf.losses.Huber(),
        optimizer=tf.optimizers.Adam(learning_rate=lr),
        metrics=[tf.metrics.MeanAbsoluteError(), tf.metrics.MeanSquaredError()]
    )
    return model

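The function LSTM_net(filters, lr) builds the LSTM network: given the hyperparameters filters (number of units) and lr (learning rate), it returns a compiled LSTM model.

Let's go through it line by line, with comments 👇: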

def LSTM_net(filters, lr):

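Define a function that takes two parameters:

filters: the number of units in the LSTM layer (controls model capacity)
lr: the learning rate (controls the optimizer's step size)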

    model = Sequential()

Create a Keras Sequential model, which suits networks like this one where layers are stacked one after another.

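    model.add(Input(shape=(train_X.shape[1], train_X.shape[2])))  # recommended usage

Add an explicit Input layer.

train_X.shape[1] is the number of time steps (look_back)
train_X.shape[2] is the number of features (n_features)

➡️ This tells the model that each sample is a two-dimensional time series window of shape (time steps, features).
This is the approach Keras recommends, and it avoids the warning Do not pass input_shape to layer...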

    model.add(LSTM(filters, return_sequences=False))

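Add an LSTM layer:

filters is the number of hidden units, for example 64 or 128 (this walkthrough uses 16); the name is borrowed from convolutional layers, and Keras calls this argument units;
return_sequences=False means only the hidden state of the last time step is output, which suits single-step prediction.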

    model.add(Dense(1))

Add a Dense (fully connected) layer with a single output, corresponding to the next point to predict (single-step prediction).

    model.compile(
        loss=tf.losses.Huber(),
        optimizer=tf.optimizers.Adam(learning_rate=lr),
        metrics=[tf.metrics.MeanAbsoluteError(), tf.metrics.MeanSquaredError()]
    )

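Uses the Huber loss (more robust, less sensitive to outliers);
the optimizer is Adam, with the learning rate lr that you pass in;
two evaluation metrics, MAE and MSE, are tracked during training.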
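
As a rough illustration of why Huber is less sensitive to outliers, the sketch below re-implements the loss in NumPy with delta = 1.0 (the Keras default) and compares it with MSE on made-up errors. It is a reference calculation, not the Keras implementation itself.

```python
import numpy as np

def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear for large ones."""
    err = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quadratic, linear)))

y_true = [0.0, 0.0, 0.0]
small = [0.1, -0.2, 0.1]     # all errors below delta
outlier = [0.1, -0.2, 5.0]   # one large error

print(huber(y_true, small))
print(huber(y_true, outlier), float(np.mean(np.square(outlier))))
```

With one large error, MSE grows with the square of that error while Huber grows only linearly. Note that the targets in this project are scaled to [0, 1], so errors above delta = 1.0 should be rare and the loss will mostly behave like half the MSE.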

🎯 To summarize what this function does:

It returns an LSTM model with the following structure:

Input(shape=(time_steps, features))
  ↓
LSTM(units=filters)
  ↓
Dense(1)

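It is suited to time series regression tasks that predict a single future value.
The structure is simple and trains quickly, which makes it convenient for hyperparameter tuning and search.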
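
To get a feel for model capacity, the trainable parameter count of this structure can be worked out by hand. The sketch assumes the standard LSTM layout (four gates, each with input weights, recurrent weights, and a bias); lstm_params and dense_params are hypothetical helpers for this calculation.

```python
def lstm_params(units, n_features):
    """Trainable weights in a standard LSTM layer (4 gates, each with a bias)."""
    return 4 * (units * (units + n_features) + units)

def dense_params(n_inputs, n_outputs=1):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_outputs + n_outputs

# Unit counts taken from the hyperparameter grid in the complete script.
for units in [16, 32, 64, 96, 128, 256]:
    total = lstm_params(units, n_features=2) + dense_params(units)
    print(units, total)
```

For 16 units this gives 1,233 parameters. The count grows roughly with the square of the unit count, so the largest settings have far more parameters than the fewer than 100 training windows available here, which is one reason early stopping matters.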

5. Train the model

fil= filters
learning_rate = lr
batch = batch_size

# build the model
model = LSTM_net(filters=fil, lr=learning_rate)
# stop once val_loss has not improved for `patience` epochs, then restore the best weights
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                              patience=20,
                                              restore_best_weights=True)
# save the weights only when the validation loss decreases;
# when ModelCheckpoint saves, it creates any missing directories in the given path
model_checkpoint = tf.keras.callbacks.ModelCheckpoint(
    f'models/LSTM/weights/filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.weights.h5',
    monitor='val_loss', mode='min', verbose=0,
    save_best_only=True, save_weights_only=True)
# fit network
history = model.fit(train_X, train_y, epochs=epochs, batch_size=batch, validation_split=0.2, verbose=0,
                    shuffle=False, callbacks=[model_checkpoint, early_stop])
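
The patience rule can be traced on a short list of validation losses. The function below is a simplified sketch of the idea, not the Keras implementation (it ignores min_delta, baseline, and start_from_epoch), and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 0-based, for a list of val losses."""
    best, best_epoch, wait = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # Improvement: remember it and reset the waiting counter.
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

losses = [0.9, 0.5, 0.4, 0.45, 0.41, 0.42, 0.39, 0.5, 0.6, 0.7]
print(early_stopping_epoch(losses, patience=3))
print(early_stopping_epoch(losses, patience=4))
```

With patience=3 training stops at epoch 5 and keeps the weights from epoch 2, missing the later improvement at epoch 6; with patience=4 it reaches that improvement. This is the trade-off behind the choice of patience=20 above.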

Visualize the loss

plt.figure(figsize=(16, 8))
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss (Huber)')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')


6. Predict

# load model to evaluate the test data
LSTM_model = LSTM_net(filters=fil, lr=learning_rate)
# load the last saved weight from the training
LSTM_model.load_weights(f"models/LSTM/weights/filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.weights.h5")

# Evaluate the model
yhat = LSTM_model.predict(test_X)
test_X_res = test_X.reshape((test_X.shape[0], n_hours * n_features))
# invert scaling for forecast
inv_yhat = concatenate((yhat, test_X_res[:, n:]), axis=1)
inv_yhat = scaler.inverse_transform(inv_yhat)
inv_yhat = inv_yhat[:, 0]
# invert scaling for actual
test_y = test_y.reshape((len(test_y), 1))
inv_y = concatenate((test_y, test_X_res[:, n:]), axis=1)
inv_y = scaler.inverse_transform(inv_y)
inv_y = inv_y[:, 0]
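
The concatenate step exists because the scaler was fit on two columns and therefore expects two columns when inverting. Min-max scaling works column by column, so whatever fills the second column has no effect on the first. A NumPy-only sketch with made-up numbers, where inverse is a hand-written stand-in for scaler.inverse_transform:

```python
import numpy as np

# Toy data with two columns on very different scales (stand-ins for p5 and rain).
data = np.array([[-2.0, 0.0], [0.0, 50.0], [2.0, 100.0], [6.0, 350.0]])
col_min, col_max = data.min(axis=0), data.max(axis=0)

def inverse(x_scaled):
    """Undo min-max scaling column by column."""
    return x_scaled * (col_max - col_min) + col_min

scaled_pred = np.array([[0.25], [0.75]])  # model outputs for column 0
padding_a = np.array([[0.0], [0.0]])      # arbitrary filler for column 1
padding_b = np.array([[0.9], [0.3]])      # a different filler

pred_a = inverse(np.concatenate((scaled_pred, padding_a), axis=1))[:, 0]
pred_b = inverse(np.concatenate((scaled_pred, padding_b), axis=1))[:, 0]
print(pred_a, pred_b)
```

Both results are identical, so using test_X_res[:, n:] as filler (with n = -1 here, the last input column) is only a way to get the right number of columns.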

7. Compute the evaluation metrics

# calculate MAE
mae = mean_absolute_error(inv_y, inv_yhat)
print('Test MAE: %.3f' % mae)
# calculate RMSE
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
print('Test RMSE: %.3f' % rmse)
# calculate MAPE
mape = mean_absolute_percentage_error(inv_y, inv_yhat)
print('Test MAPE: %.3f' % mape)
# calculate R2
r2 = r2_score(inv_y, inv_yhat)
print('Test R2: %.3f' % r2)

Test MAE: 1.517
Test RMSE: 2.290
Test MAPE: 2.182
Test R2: 0.125
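
When reading these numbers, keep in mind that scikit-learn's mean_absolute_percentage_error returns a fraction rather than a percentage, so 2.182 corresponds to roughly 218%. MAPE also grows sharply when actual values are close to zero, which is plausible here because p5 crosses zero. The sketch below recomputes the four metrics with NumPy on made-up values; regression_metrics is a hypothetical helper.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, MAPE (as a fraction), and R2 computed with NumPy."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    eps = np.finfo(np.float64).eps  # guards against division by zero
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAPE": float(np.mean(np.abs(err) / np.maximum(np.abs(y_true), eps))),
        "R2": float(1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)),
    }

# Every prediction is off by 0.5, but one actual value (0.1) is close to zero.
y_true = [2.0, 4.0, 0.1, -1.0]
y_pred = [2.5, 3.5, 0.6, -0.5]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

MAE and RMSE are both 0.5, yet MAPE is well above 1 because the single near-zero actual value contributes a relative error of 5. For series that cross zero, MAE, RMSE, and R² are usually the more informative numbers.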

8. Visualize the predictions

# plot actual vs. predicted values in original units (after inverse scaling)
plt.figure(figsize=(16, 8))
plt.title('Time series forecasting', size=20)
plt.plot(pd.DataFrame(inv_y), label='Test Label')
plt.plot(pd.DataFrame(inv_yhat), label='Predictions')
plt.legend(loc='lower right', markerscale=1)
plt.xlabel('Time step', size=20)
plt.ylabel('Differential displacement (mm)', size=20)
plt.grid(True)


Complete code

import pandas as pd
import matplotlib.pyplot as plt
from numpy import concatenate
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, mean_absolute_percentage_error, r2_score
from keras.models import Sequential
from keras.layers import Dense, LSTM
import matplotlib as mpl
import tensorflow as tf
from helper import series_to_supervised
from pandas import read_csv
from math import sqrt
from tensorflow.keras.layers import Input
import os
import random
import numpy as np
# Set the PYTHONHASHSEED environment variable
os.environ['PYTHONHASHSEED'] = '42'

# Seed Python's built-in random module
random.seed(42)

# Seed NumPy
np.random.seed(42)

# Seed TensorFlow
tf.random.set_seed(42)

physical_devices = tf.config.experimental.list_physical_devices('GPU')
for device in physical_devices:
    tf.config.experimental.set_memory_growth(device, True)

mpl.rcParams['axes.grid'] = False


# 1. Load the dataset

df = read_csv('../data/df.csv')
dataset = df.set_index('date')

# 2. Preprocess the dataset
# DEFINE TRAINING LENGTH (the remaining samples will be the test set)

# number of training samples
hours_train=100

# total number of variables
n_features = 2

### LSTM ###


dic = {}

# number of filters/nodes
filters = [16, 32, 64, 96, 128, 256]
# learning rates
lr = [10e-3, 5e-3, 10e-4, 5e-4, 10e-5, 5e-5]
# epochs
epochs = 1000
# batch sizes
batch_size = [9, 18, 36, 72, 144]
# how many time steps back do we want the model to see
look_backs = [3, 5, 7, 9, 12]

# Hyperparameters
dic["batch_size"] = []
dic["learning_rate"] = []
dic["filters"] = []
dic["look_backs"] = []

# test_scores
dic["MAE"] = []
dic["RMSE"] = []
dic["MAPE"] = []
dic["R2"] = []

for fil in filters:
    for learning_rate in lr:
        for batch in batch_size:
            for look in look_backs:
                print('-------------------------------------------------------------------------------------')
                print('LSTM')
                print('Filters: ', fil)
                print('Learning rate: ', learning_rate)
                print('Batch size: ', batch)
                print('Look back: ', look)
                # load dataset
                values = dataset.values
                # ensure all data is float
                values = values.astype('float32')
                # normalize features
                scaler = MinMaxScaler(feature_range=(0, 1))
                scaled = scaler.fit_transform(values)
                # specify the number of lag hours
                n_hours = look
                # frame as supervised learning
                reframed = series_to_supervised(scaled, n_hours, 1)
                print(reframed.shape)

                # split into train and test sets
                values = reframed.values
                n_train_hours = hours_train - look
                train = values[:n_train_hours, :]
                test = values[n_train_hours:, :]
                # split into input and outputs
                n_obs = n_hours * n_features
                train_X, train_y = train[:, :n_obs], train[:, -n_features]
                test_X, test_y = test[:, :n_obs], test[:, -n_features]
                print(train_X.shape, len(train_X), train_y.shape)

                # reshape input to be 3D [samples, timesteps, features]
                train_X = train_X.reshape((train_X.shape[0], n_hours, n_features))
                test_X = test_X.reshape((test_X.shape[0], n_hours, n_features))
                print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)

                n = - n_features + 1


                def LSTM_net(filters, lr):
                    model = Sequential()
                    # explicit Input layer, as in the walkthrough above
                    model.add(Input(shape=(train_X.shape[1], train_X.shape[2])))
                    model.add(LSTM(filters, return_sequences=False))
                    model.add(Dense(1))
                    model.compile(loss=tf.losses.Huber(),
                                  optimizer=tf.optimizers.Adam(learning_rate=lr),
                                  metrics=[tf.metrics.MeanAbsoluteError(), tf.metrics.MeanSquaredError()])
                    model.summary()
                    return model


                # build the model
                model = LSTM_net(filters=fil, lr=learning_rate)
                # stop once val_loss has not improved for `patience` epochs, then restore the best weights
                early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                              patience=20,
                                                              restore_best_weights=True)
                # save the weights only when the validation loss decreases;
                # when ModelCheckpoint saves, it creates any missing directories in the given path
                model_checkpoint = tf.keras.callbacks.ModelCheckpoint(
                    f'models/LSTM/weights/filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.weights.h5',
                    monitor='val_loss', mode='min', verbose=0,
                    save_best_only=True, save_weights_only=True)
                # fit network
                history = model.fit(train_X, train_y, epochs=epochs, batch_size=batch, validation_split=0.2, verbose=0,
                                    shuffle=False, callbacks=[model_checkpoint, early_stop])

                plt.figure(figsize=(16, 8))
                plt.plot(history.history['loss'])
                plt.plot(history.history['val_loss'])
                plt.title('model loss (Huber)')
                plt.ylabel('loss')
                plt.xlabel('epoch')
                plt.legend(['train', 'validation'], loc='upper left')

                # save the loss plot

                # create the output directory
                save_path = 'models/LSTM/plots/'
                os.makedirs(save_path, exist_ok=True)

                # save the figure, then close it so figures do not accumulate across runs
                filename = f"filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.png"
                plt.savefig(os.path.join(save_path, filename))
                plt.close()

                # load model to evaluate the test data
                LSTM_model = LSTM_net(filters=fil, lr=learning_rate)
                # load the last saved weight from the training
                LSTM_model.load_weights(f"models/LSTM/weights/filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.weights.h5")

                # Evaluate the model
                yhat = LSTM_model.predict(test_X)
                test_X_res = test_X.reshape((test_X.shape[0], n_hours * n_features))
                # invert scaling for forecast
                inv_yhat = concatenate((yhat, test_X_res[:, n:]), axis=1)
                inv_yhat = scaler.inverse_transform(inv_yhat)
                inv_yhat = inv_yhat[:, 0]
                # invert scaling for actual
                test_y = test_y.reshape((len(test_y), 1))
                inv_y = concatenate((test_y, test_X_res[:, n:]), axis=1)
                inv_y = scaler.inverse_transform(inv_y)
                inv_y = inv_y[:, 0]

                # calculate MAE
                mae = mean_absolute_error(inv_y, inv_yhat)
                print('Test MAE: %.3f' % mae)
                # calculate RMSE
                rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
                print('Test RMSE: %.3f' % rmse)
                # calculate MAPE
                mape = mean_absolute_percentage_error(inv_y, inv_yhat)
                print('Test MAPE: %.3f' % mape)
                # calculate R2
                r2 = r2_score(inv_y, inv_yhat)
                print('Test R2: %.3f' % r2)

                # plot and save the predictions (original units, after inverse scaling)
                plt.figure(figsize=(16, 8))
                plt.title('Time series forecasting', size=20)
                plt.plot(pd.DataFrame(inv_y), label='Test Label')
                plt.plot(pd.DataFrame(inv_yhat), label='Predictions')
                plt.legend(loc='lower right', markerscale=1)
                plt.xlabel('Time step', size=20)
                plt.ylabel('Differential displacement (mm)', size=20)
                plt.grid(True)

                # create the output directory if it does not exist
                save_dir2 = 'models/LSTM/preds/'
                os.makedirs(save_dir2, exist_ok=True)
                # build the file name
                filename2 = f"filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}.png"
                plt.savefig(os.path.join(save_dir2, filename2), facecolor='white', edgecolor='none', bbox_inches='tight')
                # close the figure so figures do not accumulate across runs
                plt.close()



                # save results on the dictionary
                dic["batch_size"].append(batch)
                dic["learning_rate"].append(learning_rate)
                dic["filters"].append(fil)
                dic["look_backs"].append(look)
                dic["MAE"].append(mae)
                dic["RMSE"].append(rmse)
                dic["MAPE"].append(mape)
                dic["R2"].append(r2)
                # Convert results to a dataframe
                results = pd.DataFrame(dic)

                # create the output directory
                results_dir = 'models/LSTM/results/'
                os.makedirs(results_dir, exist_ok=True)

                # export as CSV (rewritten after every run, so partial results survive an interruption)
                results.to_csv(os.path.join(results_dir, 'LSTM_results.csv'), index=False)
                print('-------------------------------------------------------------------------------------')

print('LSTM finished!')
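
The four nested loops above visit every combination in the grid. As a sketch of an alternative layout, itertools.product yields the same combinations in the same order from a single loop and makes the size of the search explicit. The run_name helper is hypothetical; it only mirrors the file-name pattern used in the script.

```python
from itertools import product

# Search space copied from the script above.
filters = [16, 32, 64, 96, 128, 256]
lr = [10e-3, 5e-3, 10e-4, 5e-4, 10e-5, 5e-5]
batch_size = [9, 18, 36, 72, 144]
look_backs = [3, 5, 7, 9, 12]

grid = list(product(filters, lr, batch_size, look_backs))
print(len(grid))  # 900

def run_name(fil, learning_rate, batch, look):
    """File-name stem for one configuration (same pattern as the script)."""
    return f"filters_{fil}_batch_size_{batch}_lr_{learning_rate}_look_back_{look}"

print(run_name(*grid[0]))
```

That is 900 training runs of up to 1000 epochs each, so it may be worth starting with a smaller grid. The largest batch size (144) also exceeds the number of training windows, which makes those runs effectively full-batch.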

helper.py

from pandas import DataFrame, concat

# convert series to supervised learning
def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    n_vars = 1 if type(data) is list else data.shape[1]
    df = DataFrame(data)
    cols, names = list(), list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [('var%d(t-%d)' % (j + 1, i)) for j in range(n_vars)]
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        if i == 0:
            names += [('var%d(t)' % (j + 1)) for j in range(n_vars)]
        else:
            names += [('var%d(t+%d)' % (j + 1, i)) for j in range(n_vars)]
    # put it all together
    agg = concat(cols, axis=1)
    agg.columns = names
    # drop rows with NaN values
    if dropnan:
        agg.dropna(inplace=True)
    return agg

Dataset link