Assignment 11: LSTM

Latest recommended article published 2024-07-29 00:34:26

szf03

Tags: lstm, machine learning, artificial intelligence

Article link: https://blog.csdn.net/m0_62584837/article/details/135036205

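This article walks through the derivation of the parameter gradients in an LSTM network, emphasizes how the forget, input, and output gates guard against vanishing gradients, and gives examples of implementing the LSTM operator with NumPy and PyTorch, including a manual forward pass and the use of nn.LSTMCell.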

Exercise 6-4: Derive the gradients of the parameters in an LSTM network and analyze how it avoids vanishing gradients

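The hardest part to understand is the derivative of the cell-state update $c_{t} = f_{t}\odot c_{t-1} + i_{t}\odot \tilde{c}_{t}$ with respect to $c_{t-1}$.

In an LSTM, $f_{t}$ depends on the external state of the previous time step, $h_{t-1}$, and $h_{t-1}$ in turn depends on $c_{t-1}$.

In the expression above, the term before the plus sign is $f_{t}\odot c_{t-1}$. It looks as if $f_{t}$ were independent of $c_{t-1}$, so this dependence is easy to overlook when differentiating. In fact, as just noted, $f_{t}$ depends on $c_{t-1}$ through $h_{t-1}$.

The term after the plus sign follows the same reasoning, so it is not derived here.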
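To make the chain of dependencies concrete, here is a sketch of the derivative in the scalar case. The notation is an assumption, since the original formulas are not reproduced here: $u_f$, $u_i$, $u_c$ denote the recurrent weights of the forget gate, input gate, and candidate, the gates use a sigmoid, and $\tilde{c}_t$ uses a tanh.

$$
c_t = f_t\, c_{t-1} + i_t\, \tilde{c}_t, \qquad h_{t-1} = o_{t-1}\tanh(c_{t-1})
$$

$$
\frac{\partial c_t}{\partial c_{t-1}} = f_t + \Big[\, c_{t-1}\, f_t(1-f_t)\, u_f + \tilde{c}_t\, i_t(1-i_t)\, u_i + i_t\,(1-\tilde{c}_t^{2})\, u_c \Big]\frac{\partial h_{t-1}}{\partial c_{t-1}}, \qquad \frac{\partial h_{t-1}}{\partial c_{t-1}} = o_{t-1}\big(1-\tanh^{2}(c_{t-1})\big)
$$

The leading $f_t$ is the direct path through the cell state. The bracketed terms are the indirect paths through $h_{t-1}$, which are the ones that are easy to miss.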

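The forget, input, and output gates all take values in $\left [ 0,1 \right ]$. When a gate is close to 1, the gradient passes through it largely unattenuated, which mitigates the vanishing gradient problem well. When a gate is 0, that path has no effect on the current time step, so there is no need to propagate a parameter update through it.

So it suffices to look at the first part: $\partial c_{t} / \partial c_{t-1}$ contains $f_{t}$ directly as an additive term, so as long as $f_{t}$ is close to 1, the gradient flows back through the cell state without shrinking, and vanishing gradients are alleviated.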
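A small numerical illustration of this point, as a sketch only. It looks solely at the direct cell-state path, where the gradient of $c_T$ with respect to $c_0$ is the product of the forget-gate values, and it assumes constant gate values for simplicity. The helper name `direct_path_gradient` is made up for this example.

```python
import numpy as np


def direct_path_gradient(forget_gates):
    """Product of forget-gate values along the direct cell-state path."""
    return float(np.prod(forget_gates))


T = 50  # number of time steps (arbitrary choice for illustration)
for f in (1.0, 0.9, 0.5):
    grad = direct_path_gradient(np.full(T, f))
    print(f"f_t = {f}: direct-path gradient over {T} steps = {grad:.3e}")
```

With the gates held at 1 the product stays at 1, while smaller gate values shrink it geometrically with the number of steps.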

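Exercise 6-3P: Implement in code the LSTM run shown in the figure below

A classmate pointed out that no $h_{t-1}$ input appears in the example. The example can be modified as appropriate to add that input.

Implement the LSTM operator; the code in the lab textbook can serve as a reference.

1. Implement the LSTM operator with NumPy

2. Implement it with nn.LSTMCell

3. Implement it with nn.LSTM

1. Implement the LSTM operator with NumPy

NumPy implementation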

import numpy as np


def step_gate(x):
    """Hard 0/1 threshold; stands in for a sigmoid saturated by the large weights."""
    return 1 if x > 0 else 0


w_f = np.array([0, 100, 0, 10])   # forget gate
w_i = np.array([0, 100, 0, -10])  # input gate
w_o = np.array([0, 0, 100, -10])  # output gate
w_c = np.array([1, 0, 0, 0])      # candidate
# Each input is [x1, x2, x3, 1]; the trailing 1 multiplies the bias weight.
x = np.array(
    [[1, 0, 0, 1], [3, 1, 0, 1], [2, 0, 0, 1], [4, 1, 0, 1], [2, 0, 0, 1],
     [1, 0, 1, 1], [3, -1, 0, 1], [6, 1, 0, 1], [1, 0, 1, 1]])
y = []  # outputs
c = []  # cell state (memory) seen before each input is processed
temp = 0  # current cell state

for i in x:
    c.append(temp)  # record the cell state before this step's update
    wc = np.dot(i, w_c)  # candidate value (linear, no tanh)
    wi = step_gate(np.dot(i, w_i))  # input gate
    wf = step_gate(np.dot(i, w_f))  # forget gate
    wo = step_gate(np.dot(i, w_o))  # output gate
    temp = wc * wi + wf * temp  # update the cell state
    y.append(wo * temp)  # output computed from the updated cell state
print('memory:', c)
print('y:', y)
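The listing above does not use $h_{t-1}$. One way the example could be extended to take that input is sketched below. The function `lstm_step` and the recurrent weights `u` are hypothetical additions, not part of the exercise, and the recurrent weights are set to zero here purely as an assumption so that the result reduces to the original example.

```python
import numpy as np


def step_gate(z):
    """Hard 0/1 threshold standing in for a saturated sigmoid."""
    return 1 if z > 0 else 0


def lstm_step(x, h_prev, c_prev, w, u):
    """One step that also takes h_{t-1}; w = input weights, u = recurrent weights."""
    i = step_gate(np.dot(w["i"], x) + u["i"] * h_prev)  # input gate
    f = step_gate(np.dot(w["f"], x) + u["f"] * h_prev)  # forget gate
    o = step_gate(np.dot(w["o"], x) + u["o"] * h_prev)  # output gate
    g = np.dot(w["c"], x) + u["c"] * h_prev             # linear candidate, as in the exercise
    c = f * c_prev + i * g                              # new cell state
    h = o * c                                           # new external state / output
    return h, c


w = {"f": np.array([0, 100, 0, 10]), "i": np.array([0, 100, 0, -10]),
     "o": np.array([0, 0, 100, -10]), "c": np.array([1, 0, 0, 0])}
u = {"f": 0, "i": 0, "o": 0, "c": 0}  # assumed: zero recurrent weights
xs = np.array(
    [[1, 0, 0, 1], [3, 1, 0, 1], [2, 0, 0, 1], [4, 1, 0, 1], [2, 0, 0, 1],
     [1, 0, 1, 1], [3, -1, 0, 1], [6, 1, 0, 1], [1, 0, 1, 1]])

h, c = 0, 0
ys = []
for x_t in xs:
    h, c = lstm_step(x_t, h, c, w, u)
    ys.append(int(h))
print("y:", ys)
```

Choosing nonzero entries in `u` lets the previous output influence the gates, which is the modification the classmate's remark asks for.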

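The main things to watch here are familiarity with the formulas and the shape errors that can occur when operating on the inputs.

The important detail is the trailing 1 at the end of each input, which acts as the bias term.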
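The role of the trailing 1 can be checked with a short sketch: appending 1 to the input and appending the bias to the weight vector gives the same pre-activation as computing the weighted sum and adding the bias explicitly. The variable names below are illustrative; the numbers are taken from the input gate and the second input of the exercise.

```python
import numpy as np

w = np.array([0, 100, 0])  # input-gate weights without the bias entry
b = -10                    # input-gate bias
x = np.array([3, 1, 0])    # second input of the exercise, without the trailing 1

explicit = np.dot(w, x) + b          # weighted sum plus bias

w_aug = np.append(w, b)              # [0, 100, 0, -10]
x_aug = np.append(x, 1)              # [3, 1, 0, 1]
augmented = np.dot(w_aug, x_aug)     # bias folded into the dot product

print(explicit, augmented)
```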

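2. Implement it with nn.LSTMCell

Pay attention to the shape of the weights: weight_ih has shape (4*hidden_size, input_size), with the rows stacked in the order input gate, forget gate, candidate, output gate.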
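The weight shapes can be inspected directly. This sketch uses `hidden_size=3` (an arbitrary choice, different from the exercise) so that the factor of 4 from the four stacked gates is easy to see.

```python
import torch.nn as nn

cell = nn.LSTMCell(input_size=4, hidden_size=3)

# Collect the shape of every parameter of the cell.
shapes = {name: tuple(param.shape) for name, param in cell.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)
```

In PyTorch the four gate blocks are stacked along the first dimension in the order input, forget, cell (candidate), output, which is why the weight rows in this exercise are listed in that order.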

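import torch
import torch.nn as nn

# Shape (seq_len=9, batch=1, input_size=4); the trailing 1 in each input acts as the bias term.
x = torch.tensor([[[1, 0, 0, 1]], [[3, 1, 0, 1]], [[2, 0, 0, 1]], [[4, 1, 0, 1]], [[2, 0, 0, 1]],
                  [[1, 0, 1, 1]], [[3, -1, 0, 1]], [[6, 1, 0, 1]], [[1, 0, 1, 1]]], dtype=torch.float32)
h_t = torch.zeros(1, 1)
c_t = torch.zeros(1, 1)
# bias=False: the bias is already folded into the inputs, and the default biases are randomly initialized.
lstm_cell = nn.LSTMCell(4, 1, bias=False)

# Set the parameters; rows of weight_ih are ordered input gate, forget gate, candidate, output gate.
with torch.no_grad():
    lstm_cell.weight_ih.copy_(torch.tensor(
        [[0, 100, 0, -10], [0, 100, 0, 10], [1, 0, 0, 0], [0, 0, 100, -10]], dtype=torch.float32))
    lstm_cell.weight_hh.zero_()  # zero recurrent weights: h_{t-1} has no influence
h = []

# forward()
for x_t in x:
    h_t, c_t = lstm_cell(x_t, (h_t, c_t))
    h.append(h_t.item())

# Round the values to integers
h = [round(num) for num in h]
print("Output:", h)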

Manual forward computation
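A minimal sketch of what a manual forward step can look like, checked against nn.LSTMCell. The function name `manual_lstm_step` is made up for this example, biases are disabled to keep it short, and the weights and inputs are random rather than those of the exercise.

```python
import torch
import torch.nn as nn


def manual_lstm_step(x, h_prev, c_prev, w_ih, w_hh):
    """One bias-free LSTM step; gate rows are ordered input, forget, cell, output."""
    gates = x @ w_ih.T + h_prev @ w_hh.T   # shape (batch, 4 * hidden_size)
    i, f, g, o = gates.chunk(4, dim=1)     # split into the four gate pre-activations
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)                      # candidate cell state
    c = f * c_prev + i * g                 # new cell state
    h = o * torch.tanh(c)                  # new hidden state
    return h, c


torch.manual_seed(0)
cell = nn.LSTMCell(4, 1, bias=False)       # randomly initialized weights
x = torch.randn(3, 4)                      # batch of 3 random inputs
h0 = torch.randn(3, 1)
c0 = torch.randn(3, 1)

with torch.no_grad():
    h_ref, c_ref = cell(x, (h0, c0))
    h_man, c_man = manual_lstm_step(x, h0, c0, cell.weight_ih, cell.weight_hh)

print(torch.allclose(h_ref, h_man, atol=1e-6), torch.allclose(c_ref, c_man, atol=1e-6))
```

Note that nn.LSTMCell applies tanh to both the candidate and the cell state, whereas the NumPy version of the exercise keeps both linear, so the two do not produce identical numbers on the exercise data.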

3. Implement it with nn.LSTM

The official documentation is linked below.

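import torch
import torch.nn as nn

# Shape (seq_len=9, batch=1, input_size=4); the trailing 1 in each input acts as the bias term.
x = torch.tensor([[[1, 0, 0, 1]], [[3, 1, 0, 1]], [[2, 0, 0, 1]], [[4, 1, 0, 1]], [[2, 0, 0, 1]],
                  [[1, 0, 1, 1]], [[3, -1, 0, 1]], [[6, 1, 0, 1]], [[1, 0, 1, 1]]], dtype=torch.float32)
# bias=False: the bias is already folded into the inputs, and the default biases are randomly initialized.
lstm = nn.LSTM(input_size=4, hidden_size=1, num_layers=1, bias=False)
# Rows of weight_ih_l0 are ordered input gate, forget gate, candidate, output gate.
with torch.no_grad():
    lstm.weight_ih_l0.copy_(torch.tensor(
        [[0, 100, 0, -10], [0, 100, 0, 10], [1, 0, 0, 0], [0, 0, 100, -10]], dtype=torch.float32))
    lstm.weight_hh_l0.zero_()  # zero recurrent weights: h_{t-1} has no influence
# Initial states have shape (num_layers, batch, hidden_size).
h_t = torch.zeros(1, 1, 1)
c_t = torch.zeros(1, 1, 1)
# Forward pass over the whole sequence
outputs, (h_t, c_t) = lstm(x, (h_t, c_t))
outputs = outputs.squeeze().tolist()
y = [round(value) for value in outputs]
# Print the rounded outputs
print(y)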
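As a cross-check between the two PyTorch approaches, the sketch below copies the weights of an nn.LSTM into an nn.LSTMCell and confirms that stepping the cell by hand reproduces the sequence output. It uses random weights and random inputs rather than the exercise's values, and disables biases for brevity.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=4, hidden_size=1, num_layers=1, bias=False)
cell = nn.LSTMCell(4, 1, bias=False)
with torch.no_grad():
    # Share the layer-0 weights of the LSTM with the cell.
    cell.weight_ih.copy_(lstm.weight_ih_l0)
    cell.weight_hh.copy_(lstm.weight_hh_l0)

x = torch.randn(9, 1, 4)      # (seq_len, batch, input_size)
h = torch.zeros(1, 1)         # (batch, hidden_size)
c = torch.zeros(1, 1)

with torch.no_grad():
    seq_out, _ = lstm(x)      # initial states default to zeros
    steps = []
    for x_t in x:             # step the cell once per time step
        h, c = cell(x_t, (h, c))
        steps.append(h)
    cell_out = torch.stack(steps)   # (seq_len, batch, hidden_size)

print(torch.allclose(seq_out, cell_out, atol=1e-6))
```

nn.LSTM processes the whole sequence in one call, while nn.LSTMCell exposes each time step, which is convenient when the intermediate states need to be inspected.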