NNDL Assignment 11: LSTM

Exercise 6-4: Derive the gradients of the parameters in an LSTM network, and analyze how it avoids vanishing gradients

Let E be the loss function. The key to the analysis is that the LSTM cell state is updated additively, c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t, and that the input, forget, and output gates are sigmoid outputs in (0, 1) which, under large weights, saturate close to 0 or 1. Along the cell-state path the gradient is scaled by the forget gate rather than repeatedly multiplied by a recurrent weight matrix, so error signals can propagate across many time steps and the probability of vanishing gradients is greatly reduced. When a gate saturates at 0, the previous state is deliberately cut off: the earlier information no longer affects the current step, and no gradient needs to be passed back through that path.
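
A compact sketch of the derivation, in the standard LSTM notation (the indirect dependence of the gates on $c_{t-1}$ through $h_{t-1}$ is abbreviated away):

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1}), &
f_t &= \sigma(W_f x_t + U_f h_{t-1}), &
o_t &= \sigma(W_o x_t + U_o h_{t-1}), \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1}), &
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, &
h_t &= o_t \odot \tanh(c_t).
\end{aligned}
$$

Because the gates depend on $h_{t-1}$ but not directly on $c_{t-1}$, the Jacobian along the cell-state path is simply

$$
\frac{\partial c_t}{\partial c_{t-1}} = \operatorname{diag}(f_t),
\qquad\text{so}\qquad
\frac{\partial c_t}{\partial c_k} = \prod_{j=k+1}^{t} \operatorname{diag}(f_j),
$$

and a parameter gradient such as $\partial E/\partial W_c$ accumulates over time as

$$
\frac{\partial E}{\partial W_c}
= \sum_{t=1}^{T} \Big( \frac{\partial E}{\partial c_t} \odot i_t \odot \big(1 - \tilde{c}_t^{\,2}\big) \Big)\, x_t^{\top}.
$$

In a vanilla RNN the analogous factor is $\prod_j W^{\top}\operatorname{diag}(\sigma'(\cdot))$, which shrinks or explodes geometrically with the number of steps; in the LSTM it is $\prod_j \operatorname{diag}(f_j)$, which stays near the identity as long as the forget gates are open ($f_j \approx 1$).

The single-step Jacobian can be checked numerically with autograd. Below is a minimal sketch of my own (not part of the original assignment code): it runs one nn.LSTMCell step and verifies that the gradient of c_t with respect to c_{t-1} equals the forget gate.

import torch

torch.manual_seed(0)
input_size, hidden_size = 4, 3
x_t = torch.randn(1, input_size)
h_prev = torch.randn(1, hidden_size)
c_prev = torch.randn(1, hidden_size, requires_grad=True)

cell = torch.nn.LSTMCell(input_size, hidden_size)
h_t, c_t = cell(x_t, (h_prev, c_prev))
c_t.sum().backward()

# Recompute the forget gate from the packed weights (rows h..2h of i|f|g|o)
W_f = cell.weight_ih[hidden_size:2 * hidden_size]
U_f = cell.weight_hh[hidden_size:2 * hidden_size]
b_f = cell.bias_ih[hidden_size:2 * hidden_size] + cell.bias_hh[hidden_size:2 * hidden_size]
f_t = torch.sigmoid(x_t @ W_f.T + h_prev @ U_f.T + b_f)

print(c_prev.grad)  # equals f_t, since dc_t/dc_{t-1} = diag(f_t)
print(f_t)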

Exercise 6-3P: Implement in code the running process of the LSTM shown in the figure below
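
The weights used below encode the classic memory-register behavior of this figure: when x2 = 1 the value of x1 is added to the memory, when x2 = -1 the memory is cleared, and when x3 = 1 the memory is emitted as the output (otherwise the output is 0); the fourth input component is a constant 1 acting as the bias.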

1. Implement the LSTM operator with NumPy
import numpy as np

# Input sequence: each row is one time step [x1, x2, x3, bias]
x = np.array([[1, 0, 0, 1],
              [3, 1, 0, 1],
              [2, 0, 0, 1],
              [4, 1, 0, 1],
              [2, 0, 0, 1],
              [1, 0, 1, 1],
              [3, -1, 0, 1],
              [6, 1, 0, 1],
              [1, 0, 1, 1]])
inputGate_W = np.array([0, 100, 0, -10])
outputGate_W = np.array([0, 0, 100, -10])
forgetGate_W = np.array([0, 100, 0, 10])
c_W = np.array([1, 0, 0, 0])

def sigmoid(x):
    # The gate weights (±10, ±100) drive the sigmoid deep into saturation,
    # so thresholding at 0.5 gives the clean 0/1 gates used in the figure.
    y = 1 / (1 + np.exp(-x))
    return 1 if y >= 0.5 else 0

temp = 0  # cell state (the memory)
y = []
memory = []
for x_t in x:
    memory.append(temp)  # record the cell state before this step's update
    temp_c = np.sum(np.multiply(x_t, c_W))                         # candidate state
    temp_input = sigmoid(np.sum(np.multiply(x_t, inputGate_W)))    # input gate
    temp_forget = sigmoid(np.sum(np.multiply(x_t, forgetGate_W)))  # forget gate
    temp_output = sigmoid(np.sum(np.multiply(x_t, outputGate_W)))  # output gate
    temp = temp_c * temp_input + temp_forget * temp                # c_t = i*c~ + f*c_{t-1}
    y.append(temp_output * temp)                                   # y_t = o * c_t
print("memory:",memory)
print("y:",y)

2. Implement with nn.LSTMCell
import torch
import torch.nn as nn

# nn.LSTMCell processes one time step at a time and expects input of shape
# (batch_size, input_size); reshape x to (time_steps, batch_size, input_size)
# = (9, 1, 4) so that each x[i] below is a (1, 4) batch.
x=torch.tensor([[1,0,0,1],
                [3,1,0,1],
                [2,0,0,1],
                [4,1,0,1],
                [2,0,0,1],
                [1,0,1,1],
                [3,-1,0,1],
                [6,1,0,1],
                [1,0,1,1]],dtype=torch.float)
x=x.unsqueeze(1)
# LSTM input size and hidden size
input_size=4
hidden_size=1

# Define the LSTM cell; bias=False because the bias is folded into the constant fourth input column
lstm_cell=nn.LSTMCell(input_size,hidden_size,bias=False)
# PyTorch packs the rows of weight_ih as (W_ii | W_if | W_ig | W_io):
# input gate, forget gate, cell candidate, output gate -- in that order
lstm_cell.weight_ih.data=torch.tensor([[0,100,0,-10], # input gate
                                       [0,100,0,10],  # forget gate
                                       [1,0,0,0],     # cell candidate
                                       [0,0,100,-10]]).float() # output gate
lstm_cell.weight_hh.data=torch.zeros([4*hidden_size,hidden_size])

hx=torch.zeros(1,hidden_size)
cx=torch.zeros(1,hidden_size)
outputs=[]
for i in range(len(x)):
    hx,cx=lstm_cell(x[i],(hx,cx))
    outputs.append(hx.detach().numpy()[0][0])
outputs_rounded=[round(x) for x in outputs]
print(outputs_rounded)
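
Note that nn.LSTMCell, unlike the NumPy version above, applies tanh both to the candidate state and to the cell state before the output gate (h = o ⊙ tanh(c)), so the stored value saturates toward 1 instead of reaching 7 or 6. The rounded outputs should therefore come out as [0, 0, 0, 0, 0, 1, 0, 0, 1]: the same time steps fire as in the NumPy run, but the magnitudes are squashed.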

3. Implement with nn.LSTM
import torch
import torch.nn as nn

# x must be reshaped because nn.LSTM expects (sequence_length, batch_size, input_size)
# sequence_length = 9, batch_size = 1, input_size = 4
x = torch.tensor([[1, 0, 0, 1],
                  [3, 1, 0, 1],
                  [2, 0, 0, 1],
                  [4, 1, 0, 1],
                  [2, 0, 0, 1],
                  [1, 0, 1, 1],
                  [3, -1, 0, 1],
                  [6, 1, 0, 1],
                  [1, 0, 1, 1]], dtype=torch.float)
x = x.unsqueeze(1)

# LSTM input size and hidden size
input_size = 4
hidden_size = 1

# Define the LSTM model (single layer, bias disabled as above)
lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size, bias=False)

# Set the LSTM weights; the rows of weight_ih_l0 are packed as (W_ii | W_if | W_ig | W_io)
lstm.weight_ih_l0.data = torch.tensor([[0, 100, 0, -10],  # input gate
                                       [0, 100, 0, 10],   # forget gate
                                       [1, 0, 0, 0],      # cell candidate
                                       [0, 0, 100, -10]]).float()  # output gate
lstm.weight_hh_l0.data = torch.zeros([4 * hidden_size, hidden_size])

# Initialize the hidden state and the cell state
hx = torch.zeros(1, 1, hidden_size)
cx = torch.zeros(1, 1, hidden_size)

# Forward pass: nn.LSTM consumes the whole sequence in one call
outputs, (hx, cx) = lstm(x, (hx, cx))
outputs = outputs.squeeze().tolist()

outputs_rounded = [round(x) for x in outputs]
print(outputs_rounded)
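
A single-layer nn.LSTM is simply nn.LSTMCell unrolled over the whole sequence, so this should print the same rounded outputs as the step-by-step loop above: [0, 0, 0, 0, 0, 1, 0, 0, 1].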

