NNDL Assignment 11: LSTM

飘渺天山雪正消

Published 2023-12-18 21:57:00


Tags: lstm, artificial intelligence, rnn

Article link: https://blog.csdn.net/gdx1314520/article/details/134841232


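Exercise 6-4: Derive the gradients of the parameters in an LSTM network and analyze how well it avoids vanishing gradients

Deriving the parameter gradients of an LSTM network takes two steps: forward propagation and backpropagation.

Forward propagation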

Forget gate: $f_t$
Input gate: $i_t$
New candidate cell state: $\tilde{C}_t$
Cell state update: $C_t$
Output gate: $O_t$
Hidden state update: $h_t$
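
For reference, the six quantities listed above are usually computed as shown below. This is the common textbook form rather than necessarily the exact notation used in this post: the weight and bias names ($W_f$, $b_f$, and so on) and the concatenation $[h_{t-1}, x_t]$ are assumed.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) \\
\tilde{C}_t &= \tanh\left(W_C [h_{t-1}, x_t] + b_C\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
O_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) \\
h_t &= O_t \odot \tanh(C_t)
\end{aligned}
```

The additive form of the $C_t$ update is what matters most for the gradient analysis.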

Backpropagation

We also have: $\Delta c_t = o_{t+1} \odot [1 - \tanh^2(c_{t+1})]$

Therefore

Avoiding vanishing gradients:

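The cell state acts as a memory that can store and carry information over long spans. Through its gating mechanism, an LSTM selectively forgets or updates the contents of this memory, so important information is retained and less of it is lost.
The gates use the sigmoid function, whose output lies in (0, 1), to control the flow of information, and filter it through element-wise multiplication. Because the cell state is updated additively, $C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$, the gradient along the cell-state path is scaled by $f_t$ at each step. When the forget gate stays close to 1, the gradient passes through nearly unchanged, which mitigates vanishing gradients.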
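
The effect can be illustrated numerically. The sketch below is a scalar toy example, not a full derivation. It assumes the gradient along the direct cell-state path is the product of the forget gates, and compares it with a plain tanh RNN, where each step multiplies the gradient by $w (1 - \tanh^2(z))$. The forget-gate value 0.99, recurrent weight 0.9, pre-activation 1.0, and sequence length 50 are arbitrary illustrative values, and both function names are hypothetical.

```python
import math


def lstm_cell_path_gradient(forget_gates):
    """Gradient of C_T w.r.t. C_0 along the direct cell-state path.

    With C_t = f_t * C_{t-1} + i_t * C~_t, each step contributes a factor f_t.
    """
    grad = 1.0
    for f in forget_gates:
        grad *= f
    return grad


def rnn_path_gradient(w, pre_activations):
    """Gradient of h_T w.r.t. h_0 for a scalar RNN h_t = tanh(z_t),
    where z_t = w * h_{t-1} + (input term).

    Each step contributes a factor w * (1 - tanh(z_t)^2).
    """
    grad = 1.0
    for z in pre_activations:
        grad *= w * (1.0 - math.tanh(z) ** 2)
    return grad


T = 50  # number of time steps (illustrative)
lstm_grad = lstm_cell_path_gradient([0.99] * T)
rnn_grad = rnn_path_gradient(0.9, [1.0] * T)

print("LSTM cell-state path gradient:", lstm_grad)
print("plain RNN path gradient      :", rnn_grad)
```

With these values the LSTM path keeps a gradient of about 0.6 after 50 steps, while the plain RNN gradient is many orders of magnitude smaller. The gradient is preserved only while the forget gates stay close to 1; if they are small, it still decays.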

Exercise 6-3P: Implement in code the LSTM run shown in the figure below

1. Implementing the LSTM operator with NumPy

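import numpy as np

# Each row is one time step. The last column is a constant 1 that acts as the bias input.
x = np.array([[1, 0, 0, 1],
              [3, 1, 0, 1],
              [2, 0, 0, 1],
              [4, 1, 0, 1],
              [2, 0, 0, 1],
              [1, 0, 1, 1],
              [3, -1, 0, 1],
              [6, 1, 0, 1],
              [1, 0, 1, 1]])
# Alternative, shorter input sequence:
# x = np.array([
#               [3, 1, 0, 1],
#
#               [4, 1, 0, 1],
#               [2, 0, 0, 1],
#               [1, 0, 1, 1],
#               [3, -1, 0, 1]])

# Gate weights, one per input column. The large magnitudes saturate the gates.
inputGate_W = np.array([0, 100, 0, -10])
outputGate_W = np.array([0, 0, 100, -10])
forgetGate_W = np.array([0, 100, 0, 10])
c_W = np.array([1, 0, 0, 0])  # the candidate value is the first input column


def sigmoid(x):
    # Hard-thresholded sigmoid: the gate value is rounded to 0 or 1.
    y = 1 / (1 + np.exp(-x))
    if y >= 0.5:
        return 1
    else:
        return 0


temp = 0  # current memory cell value
y = []    # output at each time step
c = []    # memory cell value before each time step's update
for x_t in x:
    c.append(temp)
    temp_c = np.sum(np.multiply(x_t, c_W))
    temp_input = sigmoid(np.sum(np.multiply(x_t, inputGate_W)))
    temp_forget = sigmoid(np.sum(np.multiply(x_t, forgetGate_W)))
    temp_output = sigmoid(np.sum(np.multiply(x_t, outputGate_W)))
    # New memory = gated candidate + gated previous memory.
    temp = temp_c * temp_input + temp_forget * temp
    y.append(temp_output * temp)
print("memory:", c)
print("y     :", y)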
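
The code above rounds every gate to 0 or 1. As a cross-check, the sketch below reruns the same inputs and weights with an ordinary (smooth) sigmoid. It is a variant for illustration, not the assignment's required form: the function name `run_lstm` and the shortened weight names are hypothetical, and the memory is recorded after each update rather than before it.

```python
import numpy as np

# Same inputs and weights as above; the last column is the constant bias input.
x = np.array([[1, 0, 0, 1],
              [3, 1, 0, 1],
              [2, 0, 0, 1],
              [4, 1, 0, 1],
              [2, 0, 0, 1],
              [1, 0, 1, 1],
              [3, -1, 0, 1],
              [6, 1, 0, 1],
              [1, 0, 1, 1]])
W_input = np.array([0, 100, 0, -10])
W_output = np.array([0, 0, 100, -10])
W_forget = np.array([0, 100, 0, 10])
W_cand = np.array([1, 0, 0, 0])


def sigmoid(z):
    """Ordinary logistic sigmoid, with no thresholding."""
    return 1.0 / (1.0 + np.exp(-z))


def run_lstm(inputs):
    """Run the simplified cell and return (memory after each step, outputs)."""
    c = 0.0
    memory, outputs = [], []
    for x_t in inputs:
        i = sigmoid(x_t @ W_input)    # input gate
        f = sigmoid(x_t @ W_forget)   # forget gate
        o = sigmoid(x_t @ W_output)   # output gate
        c = f * c + i * (x_t @ W_cand)
        memory.append(c)
        outputs.append(o * c)
    return np.array(memory), np.array(outputs)


memory, outputs = run_lstm(x)
print("memory:", np.round(memory, 3))
print("y     :", np.round(outputs, 3))
```

Because the weights push each gate's pre-activation to a magnitude of about 10 or more, the smooth gates are nearly saturated and the rounded results agree with the thresholded version.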

2. Implementing it with nn.LSTMCell

PyTorch - torch.nn.LSTMCell

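import torch
from torch import nn

print('one layer lstm')
# A single LSTM cell: 100 input features, 20 hidden units.
cell = nn.LSTMCell(input_size=100, hidden_size=20)
# Initial hidden and cell states for a batch of 3 sequences.
h = torch.zeros(3, 20)
c = torch.zeros(3, 20)
# Random input of shape (seq_len=10, batch=3, input_size=100).
x = torch.randn(10, 3, 100)
# Step through the sequence one time step at a time.
for xt in x:
    h, c = cell(xt, (h, c))

print('h.shape: ', h.shape)
print('c.shape: ', c.shape)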
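
To connect `nn.LSTMCell` to the gate equations, the sketch below recomputes a single step by hand from the cell's own parameters and compares it with the module's output. It relies on PyTorch's documented layout, where `weight_ih` and `weight_hh` stack the input, forget, candidate, and output blocks in that order. The small sizes and the seed are arbitrary illustrative choices.

```python
import torch
from torch import nn

torch.manual_seed(0)
cell = nn.LSTMCell(input_size=4, hidden_size=3)
x_t = torch.randn(2, 4)   # (batch, input_size)
h = torch.zeros(2, 3)     # previous hidden state
c = torch.zeros(2, 3)     # previous cell state

with torch.no_grad():
    # Reference result from the module.
    h_ref, c_ref = cell(x_t, (h, c))

    # Manual computation: one affine map, then split into the four gate blocks.
    gates = (x_t @ cell.weight_ih.T + cell.bias_ih
             + h @ cell.weight_hh.T + cell.bias_hh)
    i, f, g, o = gates.chunk(4, dim=1)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)                   # candidate cell state
    c_manual = f * c + i * g            # cell state update
    h_manual = o * torch.tanh(c_manual)  # hidden state update

print(torch.allclose(h_ref, h_manual, atol=1e-5))
print(torch.allclose(c_ref, c_manual, atol=1e-5))
```

Unlike the hand-built NumPy cell above, `nn.LSTMCell` applies `tanh` to both the candidate and the cell state, so it will not reproduce the integer trace without further changes.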

3. Implementing it with nn.LSTM

PyTorch - torch.nn.LSTM

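import torch
from torch import nn

# Two stacked LSTM layers; batch_first=True means inputs are (batch, seq_len, features).
lstm = nn.LSTM(input_size=512, hidden_size=256, num_layers=2, batch_first=True)
print(lstm)
# Random input: batch of 40 sequences, 25 time steps, 512 features each.
x = torch.randn(40, 25, 512)
# output holds the top layer's hidden state at every step;
# h_n and c_n hold the final hidden and cell states of each layer.
output, (h_n, c_n) = lstm(x)
print(output.shape, h_n.shape, c_n.shape)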
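
The three returned tensors are related: for a unidirectional `nn.LSTM` with `batch_first=True`, the last time step of `output` equals the final hidden state of the top layer, `h_n[-1]`. The sketch below checks this on a smaller model. The sizes and the seed are arbitrary illustrative values.

```python
import torch
from torch import nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=8, hidden_size=5, num_layers=2, batch_first=True)
x = torch.randn(3, 7, 8)  # (batch, seq_len, input_size)

with torch.no_grad():
    output, (h_n, c_n) = lstm(x)

# output: (batch, seq_len, hidden_size); h_n, c_n: (num_layers, batch, hidden_size)
print(output.shape, h_n.shape, c_n.shape)

# The top layer's final hidden state is the last time step of output.
last_step_matches = torch.allclose(output[:, -1, :], h_n[-1])
print(last_step_matches)
```

Note that `h_n` and `c_n` keep the layer dimension first even when `batch_first=True`; only the input and `output` are affected by that flag.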

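Hung-yi Lee machine learning notes: RNN (recurrent neural networks), ZEERO~'s blog, CSDN: https://blog.csdn.net/weixin_43249038/article/details/132650998

L5W1 Assignment 1: Implementing a recurrent neural network step by step, CSDN: https://blog.csdn.net/segegse/article/details/127708468

An accessible introduction to LSTM with a Python implementation, Tencent Cloud Developer Community (tencent.com)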
