Code walkthrough: time-series forecasting of weather changes with LSTM + self-attention

Latest recommended article published on 2024-07-12 16:55:56

YueYue_727

Tags: lstm, artificial intelligence, rnn

Original link: https://weibaohang.blog.csdn.net/article/details/128678028

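This article shows how to use PyTorch to implement a deep-learning model based on LSTM and an attention mechanism for forecasting weather time series. The author shares code snippets covering data preprocessing, model definition, the training loop, and result visualization.

Summary generated automatically by CSDN.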

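I am a deep-learning beginner. I found a project example to practice on and added some comments to the code. I hope we can encourage each other.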

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from sklearn.preprocessing import StandardScaler
from torch.utils.data import TensorDataset
from tqdm import tqdm

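Define hyperparameters

timestep = 1  # window length: how many past time steps form one input sample
batch_size = 16  # number of samples used in one training step
input_dim = 14  # number of features per time step (the 14 columns of each row)
hidden_dim = 64  # size of the LSTM hidden state
output_dim = 1  # regression task, so the model outputs a single scalar
num_layers = 3  # number of stacked LSTM layers
epochs = 10  # number of full passes over the training set
best_loss = float('inf')  # lowest validation loss seen so far; starts at infinity so the first epoch can improve on it
model_name = 'gru'  # used only as the checkpoint file name (the model itself is an LSTM)
save_path = './{}.pth'.format(model_name)  # where the model parameters are saved; model_name is formatted into the path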

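Load the time-series data

# 1. Load the time-series data
df = pd.read_csv(r'your_path\ceshi.csv', index_col=0)  # replace your_path with your own directory; the first CSV column becomes the index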
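
Later steps pick the target by position (column 1) and set input_dim to 14 by hand, so it may help to check both against the loaded frame. The following is a minimal sketch on a toy frame; the column names and values are hypothetical stand-ins, and the real CSV may be laid out differently.

```python
import pandas as pd

# Toy frame with hypothetical column names and values; the real CSV may differ.
toy_df = pd.DataFrame(
    {
        "p (mbar)": [996.5, 996.6, 996.5],
        "T (degC)": [-8.0, -8.4, -8.5],
        "rh (%)": [93.3, 93.4, 93.9],
    },
    index=pd.Index(["t0", "t1", "t2"], name="Date Time"),
)

target_name = "T (degC)"
target_idx = toy_df.columns.get_loc(target_name)  # zero-based position among the feature columns
n_features = toy_df.shape[1]                      # the value input_dim should match

print(target_idx, n_features)
```

On the real data, the same two lines applied to df would show whether 'T (degC)' really sits at position 1 and whether the frame has 14 feature columns.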

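Standardize the data

scaler = StandardScaler()  # fitted on the temperature column only; used later to undo the scaling of predictions
scaler_model = StandardScaler()  # fitted on all columns; produces the model inputs
data = scaler_model.fit_transform(np.array(df))  # standardize the whole dataset and store the result in data
# fit_transform first computes each column's mean and standard deviation, then applies (x - mean) / std.
scaler.fit_transform(np.array(df['T (degC)']).reshape(-1, 1))
# Fits the second scaler on the 'T (degC)' column alone. The return value is discarded; only the fitted mean and standard deviation are kept, for inverse_transform when plotting.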
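
To make the role of the second scaler concrete, here is a small sketch of the fit, transform, inverse-transform round trip on one column. The temperature values and the name target_scaler are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical temperature readings in degC, shaped (n_samples, 1) as sklearn expects.
temps = np.array([-8.0, -4.0, 0.0, 4.0, 8.0]).reshape(-1, 1)

target_scaler = StandardScaler()
scaled = target_scaler.fit_transform(temps)          # (x - mean) / std, per column
restored = target_scaler.inverse_transform(scaled)   # back to degC

print(scaled.ravel())
print(restored.ravel())
```

The script above fits both scalers on the full dataset, test rows included. A common alternative is to fit on the training rows only, so that test-set statistics do not leak into preprocessing.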

Build the training data

def split_data(data, timestep, input_dim):  # timestep: window length; input_dim: features per time step
    dataX = []  # input windows
    dataY = []  # targets

    # Each window of `timestep` consecutive rows goes into X; the row right after the window supplies Y.
    for index in range(len(data) - timestep):  # stop early enough that a target row always exists
        dataX.append(data[index: index + timestep])
        # Rows index .. index + timestep - 1 form one training sample.
        dataY.append(data[index + timestep][1])
        # The target is column 1 (zero-based) of the row right after the window, presumably the temperature column.
    dataX = np.array(dataX)
    dataY = np.array(dataY)
    # Convert the lists to NumPy arrays for the steps below.


    # Training-set size: the first 80% of the samples
    train_size = int(np.round(0.8 * dataX.shape[0]))

    # Split into training and test sets in time order (no shuffling)
    x_train = dataX[: train_size, :].reshape(-1, timestep, input_dim)
    y_train = dataY[: train_size]
    # The first train_size samples, reshaped to (samples, timestep, input_dim) as the recurrent network expects,
    # together with their targets.
    x_test = dataX[train_size:, :].reshape(-1, timestep, input_dim)
    y_test = dataY[train_size:]

    return [x_train, y_train, x_test, y_test]
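
The windowing inside split_data is easier to follow on a tiny array. This sketch repeats only the loop, under assumed names (make_windows, toy) that are not part of the original code.

```python
import numpy as np

def make_windows(series, timestep, target_col):
    """Pair each window of `timestep` rows with the target value of the next row."""
    xs, ys = [], []
    for start in range(len(series) - timestep):
        xs.append(series[start:start + timestep])        # rows start .. start + timestep - 1
        ys.append(series[start + timestep][target_col])  # target from the row after the window
    return np.array(xs), np.array(ys)

toy = np.arange(12, dtype=float).reshape(6, 2)  # 6 time steps, 2 features
X, y = make_windows(toy, timestep=2, target_col=1)

print(X.shape, y.shape)
print(X[0], y[0])
```

With 6 rows and a window of 2, only 4 samples come out, because the last window still needs a following row to supply its target.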

# 3. Get the training data; x_train has shape (num_samples, timestep, input_dim)
x_train, y_train, x_test, y_test = split_data(data, timestep, input_dim)

# 4. Convert the NumPy arrays to float32 tensors, which the PyTorch model requires as input.
x_train_tensor = torch.from_numpy(x_train).to(torch.float32)
y_train_tensor = torch.from_numpy(y_train).to(torch.float32)
x_test_tensor = torch.from_numpy(x_test).to(torch.float32)
y_test_tensor = torch.from_numpy(y_test).to(torch.float32)

# 5. Wrap the tensors in datasets
train_data = TensorDataset(x_train_tensor, y_train_tensor)
test_data = TensorDataset(x_test_tensor, y_test_tensor)

# 6. Create DataLoader objects, PyTorch's tool for loading and batching data, for the training and test sets.
train_loader = torch.utils.data.DataLoader(train_data,
                                           batch_size=batch_size,
                                           shuffle=True)
# batch_size: number of samples per batch. Training on small batches speeds up training and reduces memory use.
# shuffle: whether to reshuffle the samples at the start of each epoch. Training data is usually shuffled to help generalization; test data does not need it.
test_loader = torch.utils.data.DataLoader(test_data,
                                          batch_size=batch_size,
                                          shuffle=False)
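
To see what the loaders actually yield, the sketch below iterates over a DataLoader built from random tensors. The sample count of 40 is an arbitrary assumption, chosen so that the last batch comes out smaller than batch_size.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical sizes: 40 samples, 1 time step, 14 features.
x = torch.randn(40, 1, 14)
y = torch.randn(40)

loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=False)
shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in loader]

print(shapes)
```

Each target batch is one-dimensional, which is why the training loop reshapes it with reshape(-1, 1) before comparing it with the model output. If the smaller final batch is unwanted, DataLoader accepts drop_last=True.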

Define the LSTM network

class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTM, self).__init__()
        # Constructor, called when an instance is created.
        # It initializes the parent class nn.Module and then defines the model's components:
        self.hidden_dim = hidden_dim  # hidden size
        self.num_layers = num_layers  # number of LSTM layers
        # Multi-head attention layer. embed_dim is the number of features per time step
        # and must be divisible by num_heads, the number of attention heads.
        # batch_first=True (PyTorch 1.9+) makes the layer read inputs as (batch, timestep, feature);
        # without it the batch axis would be treated as the sequence axis.
        self.attention = nn.MultiheadAttention(embed_dim=input_dim, num_heads=2, batch_first=True)
        # LSTM layer that models the time series. input_dim is the number of features per time step (14 here),
        # hidden_dim the hidden size, num_layers the number of layers; batch_first=True means the first input dimension is the batch.
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        # Fully connected layer that maps the LSTM hidden state to output_dim.
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, query, key, value):  # forward pass of the model
        # query, key, value: the three inputs of the attention layer. The training loop passes the same tensor
        # for all three (self-attention); the layer applies its own linear projections internally.
#         print(query.shape) # torch.Size([16, 1, 14]) batch_size, time_step, input_dim
        attention_output, attn_output_weights = self.attention(query, key, value)
        # Returns the attention output and the attention weights.
#         print(attention_output.shape) # torch.Size([16, 1, 14]) batch_size, time_step, input_dim
        output, (h_n, c_n) = self.lstm(attention_output)
        # output: top-layer hidden state at every time step; h_n, c_n: final hidden and cell states of each layer.
#         print(output.shape) # torch.Size([16, 1, 64]) batch_size, time_step, hidden_dim
        batch_size, timestep, hidden_dim = output.shape
        output = output.reshape(-1, hidden_dim)
        # Flatten to (batch_size * timestep, hidden_dim) for the fully connected layer.
        output = self.fc(output)  # linear map to the output dimension
        output = output.reshape(batch_size, timestep, -1)  # restore (batch_size, timestep, output_dim), matching the batch-first layout
        return output[:, -1]  # prediction at the last time step, shape (batch_size, output_dim)


model = LSTM(input_dim, hidden_dim, num_layers, output_dim)  # build the network
loss_function = nn.MSELoss()  # mean squared error loss
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # Adam optimizer over the model parameters; lr=0.01 is the learning rate, which scales each update step
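
nn.MultiheadAttention reads its input as (sequence, batch, feature) unless batch_first=True is passed, while the data here is (batch, timestep, feature). The sketch below compares the attention-weight shapes under both settings; the layer names and the random input are assumptions for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch, steps, feats = 16, 1, 14
x = torch.randn(batch, steps, feats)  # same layout as one training batch

# batch_first=True: axis 0 is the batch, axis 1 the time steps.
attn_batch_first = nn.MultiheadAttention(embed_dim=feats, num_heads=2, batch_first=True)
out, weights = attn_batch_first(x, x, x)  # self-attention: query = key = value

# Default layout: axis 0 is read as the sequence, so the 16 samples attend to each other.
attn_seq_first = nn.MultiheadAttention(embed_dim=feats, num_heads=2)
_, weights_seq_first = attn_seq_first(x, x, x)

print(out.shape, weights.shape, weights_seq_first.shape)
```

With timestep = 1 and the batch-first layout, every sample has a single position to attend to, so its attention weight is exactly 1. Attention only starts mixing information across time once timestep is larger than 1.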

Train the model

for epoch in range(epochs):  # epochs: number of passes over the whole training set
    model.train()  # switch to training mode
    running_loss = 0  # accumulated training loss for this epoch
    train_bar = tqdm(train_loader)  # progress bar over the training batches
    for data in train_bar:
        x_train, y_train = data  # unpack the inputs and targets of this batch
        optimizer.zero_grad()  # reset gradients, because PyTorch accumulates them by default
        y_train_pred = model(x_train, x_train, x_train)  # forward pass; the same tensor serves as query, key and value
        loss = loss_function(y_train_pred, y_train.reshape(-1, 1))
        # Loss between predictions and targets; the targets are reshaped to (batch, 1) to match the predictions.
        loss.backward()  # backpropagation: gradients of the loss with respect to the parameters
        optimizer.step()  # update the parameters to reduce the loss

        running_loss += loss.item()  # add this batch's loss to the epoch total
        train_bar.desc = "train epoch[{}/{}] loss:{:.3f}".format(epoch + 1,
                                                                 epochs,
                                                                 loss.item())
        # Update the progress-bar text with the current epoch and this batch's loss.

    # Validation: after each epoch, evaluate the model on the test set.
    model.eval()  # evaluation mode: dropout is disabled and batch normalization uses its running statistics
    test_loss = 0  # accumulated validation loss
    with torch.no_grad():  # no gradients are recorded inside this block, so nothing is updated
        test_bar = tqdm(test_loader)  # progress bar over the validation batches
        for data in test_bar:
            x_test, y_test = data
            y_test_pred = model(x_test, x_test, x_test)  # forward pass on the validation batch
            test_loss += loss_function(y_test_pred, y_test.reshape(-1, 1)).item()  # sum over all batches, not just the last one
    test_loss /= len(test_loader)  # mean validation loss per batch

    if test_loss < best_loss:  # is this epoch's validation loss lower than the best so far?
        best_loss = test_loss  # if so, record it
        torch.save(model.state_dict(), save_path)  # and save the parameters, keeping the best model seen during training

print('Finished Training')
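
The "keep the best model" logic depends on two details: the starting value of best_loss must be larger than any real loss, and the validation loss should be averaged over all batches. Here is a minimal sketch of that pattern with a stand-in linear model and fake validation batches, all of which are assumptions and not the author's setup.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 1)  # stand-in for the LSTM model
loss_fn = nn.MSELoss()
# Fake validation batches: 3 batches of 4 samples with 3 features each.
batches = [(torch.randn(4, 3), torch.randn(4, 1)) for _ in range(3)]

best_loss = math.inf  # any finite loss beats this, so the first epoch always saves
best_state = None

model.eval()
with torch.no_grad():
    total = 0.0
    for xb, yb in batches:
        total += loss_fn(model(xb), yb).item()  # accumulate over every batch
mean_loss = total / len(batches)

if mean_loss < best_loss:
    best_loss = mean_loss
    # Clone the tensors so later training steps cannot change the snapshot.
    best_state = {k: v.clone() for k, v in model.state_dict().items()}

print(best_loss)
```

Starting from 0 instead of infinity would never trigger the save, because a mean squared error cannot be negative. In a real run the snapshot would go to disk with torch.save rather than staying in memory.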

Plot the results

plt.figure(figsize=(12, 8))  # create a figure of size (12, 8)
plt.plot(scaler.inverse_transform((model(x_train_tensor[:10000], x_train_tensor[:10000], x_train_tensor[:10000]).detach().numpy()).reshape(-1, 1)), "b", label="Predicted (Train)")
# Predict on the first 10000 training samples, convert the result to a NumPy array of shape (-1, 1), and undo the standardization.
plt.plot(scaler.inverse_transform(y_train_tensor[:10000].detach().numpy().reshape(-1, 1)), "r", label="Actual (Train)")
# Actual values for the same 10000 samples, so both curves cover the same range.

plt.legend()  # legend: the blue curve is the model's prediction on the training set, the red curve the actual values
plt.show()
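
The loss during training is measured on standardized values, so it does not read directly in degrees. As a rough sketch under made-up numbers, the same inverse_transform used for plotting can turn scaled predictions back into degC before computing an error.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical values in degC, used only to fit the scaler.
actual_degc = np.array([10.0, 12.0, 14.0, 16.0]).reshape(-1, 1)
scaler = StandardScaler().fit(actual_degc)

actual_scaled = scaler.transform(actual_degc)
pred_scaled = actual_scaled + 0.1  # pretend model output, off by 0.1 in scaled units

pred_degc = scaler.inverse_transform(pred_scaled)  # back to degC
rmse_degc = float(np.sqrt(np.mean((pred_degc - actual_degc) ** 2)))

print(rmse_degc)
```

An error of 0.1 in scaled units becomes 0.1 times the column's standard deviation in degC, which is why the two scales should not be compared directly.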

This code is reposted from 【PyTorch深度学习项目实战100例】—— 基于LSTM + 注意力机制（self-attention）进行天气变化的时间序列预测 (linked page title: LSTM+CNN实现时间序列预测(PyTorch版)_pytorh lstm+cnn预测时间序列-CSDN博客).

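I am not sure whether posting my annotations counts as infringement. If it does, please let me know and I will delete this post immediately.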