torch.nn.RNN
Constructor arguments:
input_size: the feature dimension feature_len of the embedded input data
hidden_size: the dimension of the RNN hidden state
num_layers: the number of stacked RNN layers, default 1
Forward pass of RNN
out, h_t = rnn(x, h_0)
x: input data, shape (seq_len, batch_size, feature_len)
h_0/h_t: hidden state, shape (num_layers, batch_size, hidden_size)
out: the last layer's hidden output at every time step, i.e. [h_1, h_2, ..., h_t], shape (seq_len, batch_size, hidden_size)
Each RNN layer has two weight matrices shared across time steps, W_ih and W_hh (we can ignore the bias vectors b here):
W_ih (also written W_xh): the weight matrix applied to the input x_t of shape (batch_size, feature_len); it has shape (hidden_size, feature_len) and is applied transposed
W_hh: the weight matrix applied to the previous hidden state; it has shape (hidden_size, hidden_size)
Update:
h_t = tanh(x_t W_ih^T + h_{t-1} W_hh^T)
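The update above can be checked directly against nn.RNN. A minimal sketch (the full PyTorch update also adds the bias terms b_ih and b_hh, which are included here; the small sizes are arbitrary):

```python
import torch

torch.manual_seed(0)
rnn = torch.nn.RNN(input_size=4, hidden_size=3, num_layers=1)
x = torch.randn(6, 2, 4)     # (seq_len, batch_size, feature_len)
h_0 = torch.zeros(1, 2, 3)   # (num_layers, batch_size, hidden_size)
out, h_t = rnn(x, h_0)

# Replay h_t = tanh(x_t W_ih^T + b_ih + h_{t-1} W_hh^T + b_hh) by hand
h_manual = torch.zeros(2, 3)
for x_t in x:
    h_manual = torch.tanh(x_t @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
                          + h_manual @ rnn.weight_hh_l0.T + rnn.bias_hh_l0)

print(torch.allclose(h_manual, h_t[0], atol=1e-6))  # True
```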
Single-layer RNN code verification
import torch
rnn = torch.nn.RNN(input_size=100, hidden_size=10, num_layers=1)
print(rnn._parameters.keys())
odict_keys(['weight_ih_l0', 'weight_hh_l0', 'bias_ih_l0', 'bias_hh_l0'])
print(rnn.weight_ih_l0.shape, rnn.weight_hh_l0.shape)
torch.Size([10, 100]) torch.Size([10, 10])
In weight_ih_l0, "l0" denotes layer 0.
As expected, W_ih has shape (hidden_size, feature_len) = (10, 100),
and W_hh has shape (hidden_size, hidden_size) = (10, 10).
# input data x: (seq_len, batch_size, feature_len)
x = torch.randn(8, 5, 100)
# h_0: (num_layers, batch_size, hidden_size)
h_0 = torch.zeros(1, 5, 10)
out, h_t = rnn(x, h_0)
print(out.shape, h_t.shape)
torch.Size([8, 5, 10]) torch.Size([1, 5, 10])
You can compare these shapes with the ones listed above; they match.
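Since out stacks the hidden state of every time step, its last step should coincide with h_t for a single-layer RNN. A quick sketch to confirm:

```python
import torch

rnn = torch.nn.RNN(input_size=100, hidden_size=10, num_layers=1)
x = torch.randn(8, 5, 100)
out, h_t = rnn(x, torch.zeros(1, 5, 10))

# out[-1] is the hidden state after the final time step, i.e. h_t
print(torch.allclose(out[-1], h_t[0]))  # True
```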
Multi-layer RNN code verification
import torch
rnn = torch.nn.RNN(input_size=100, hidden_size=10, num_layers=2)
print(rnn._parameters.keys())
print(rnn.weight_ih_l0.shape, rnn.weight_hh_l0.shape)
print(rnn.weight_ih_l1.shape, rnn.weight_hh_l1.shape)
odict_keys(['weight_ih_l0', 'weight_hh_l0', 'bias_ih_l0', 'bias_hh_l0', 'weight_ih_l1', 'weight_hh_l1', 'bias_ih_l1', 'bias_hh_l1'])
torch.Size([10, 100]) torch.Size([10, 10])
torch.Size([10, 10]) torch.Size([10, 10])
Note that weight_ih_l1 has shape (10, 10): layer 1's input is layer 0's hidden state of size hidden_size = 10, not the original feature_len.
# input data x: (seq_len, batch_size, feature_len)
x = torch.randn(8, 5, 100)
# h_0: (num_layers, batch_size, hidden_size)
h_0 = torch.zeros(2, 5, 10)
out, h_t = rnn(x, h_0)
print(out.shape, h_t.shape)
torch.Size([8, 5, 10]) torch.Size([2, 5, 10])
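In the multi-layer case, out contains only the top layer's outputs, while h_t holds the final hidden state of every layer. A quick sketch to check that out's last step matches the top layer of h_t:

```python
import torch

rnn = torch.nn.RNN(input_size=100, hidden_size=10, num_layers=2)
x = torch.randn(8, 5, 100)
out, h_t = rnn(x, torch.zeros(2, 5, 10))

# out comes from the top layer only, so its last time step is h_t[-1]
print(torch.allclose(out[-1], h_t[-1]))  # True
```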
torch.nn.RNNCell
Takes the same constructor arguments as RNN, except there is no num_layers parameter.
With RNN we feed in the whole sequence at once and it processes every step internally; with RNNCell we feed in one time step per call.
That is, for RNN the input x has shape (seq_len, batch_size, feature_len), while for RNNCell each input x_t has shape (batch_size, feature_len) and we feed the seq_len steps in manually, one at a time.
Forward pass of RNNCell
h_t = cell(x_t, h_t_1)
x_t: input data, shape (batch_size, feature_len)
h_t_1/h_t: hidden state, shape (batch_size, hidden_size)
RNNCell returns only h_t; if you collect every step's output yourself, you get [h_1, h_2, ..., h_t], matching RNN's out of shape (seq_len, batch_size, hidden_size).
The weights are the same as in RNN.
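One detail worth noting: the cell's parameters are named weight_ih/weight_hh without a layer suffix, since a cell is a single layer. A quick check of the shapes:

```python
import torch

cell = torch.nn.RNNCell(input_size=100, hidden_size=10)
print(cell.weight_ih.shape, cell.weight_hh.shape)
# torch.Size([10, 100]) torch.Size([10, 10])
```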
Single-layer RNNCell code verification
import torch
cell1 = torch.nn.RNNCell(input_size=100, hidden_size=10)
x = torch.randn(10, 5, 100)  # (seq_len, batch_size, feature_len)
h1 = torch.zeros(5, 10)
for x_t in x:
    h1 = cell1(x_t, h1)
print(h1.shape)
torch.Size([5, 10])
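To confirm that this manual loop really computes the same thing as nn.RNN, we can copy one RNN layer's weights into a cell and compare the final hidden states. A minimal sketch:

```python
import torch

torch.manual_seed(0)
rnn = torch.nn.RNN(input_size=100, hidden_size=10, num_layers=1)
cell = torch.nn.RNNCell(input_size=100, hidden_size=10)

# Copy the RNN layer's weights into the cell so both compute the same function
with torch.no_grad():
    cell.weight_ih.copy_(rnn.weight_ih_l0)
    cell.weight_hh.copy_(rnn.weight_hh_l0)
    cell.bias_ih.copy_(rnn.bias_ih_l0)
    cell.bias_hh.copy_(rnn.bias_hh_l0)

x = torch.randn(10, 5, 100)
out, h_t = rnn(x, torch.zeros(1, 5, 10))

h = torch.zeros(5, 10)
for x_t in x:
    h = cell(x_t, h)

print(torch.allclose(h, h_t[0], atol=1e-6))  # True
```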
Multi-layer RNNCell code verification
import torch
cell1 = torch.nn.RNNCell(input_size=100, hidden_size=30)
cell2 = torch.nn.RNNCell(input_size=30, hidden_size=10)
x = torch.randn(10, 5, 100)
h1 = torch.zeros(5, 30)
h2 = torch.zeros(5, 10)
for x_t in x:
    h1 = cell1(x_t, h1)
    h2 = cell2(h1, h2)
print(h2.shape)
torch.Size([5, 10])
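The loop above keeps only the final hidden state. To recover the same out that nn.RNN returns, collect the top cell's output at every step and stack the list. A sketch:

```python
import torch

cell1 = torch.nn.RNNCell(input_size=100, hidden_size=30)
cell2 = torch.nn.RNNCell(input_size=30, hidden_size=10)
x = torch.randn(10, 5, 100)  # (seq_len, batch_size, feature_len)
h1 = torch.zeros(5, 30)
h2 = torch.zeros(5, 10)

outputs = []
for x_t in x:
    h1 = cell1(x_t, h1)
    h2 = cell2(h1, h2)
    outputs.append(h2)  # keep the top layer's state at each step

# Stacking the per-step states gives the same shape as nn.RNN's out
out = torch.stack(outputs)
print(out.shape)  # torch.Size([10, 5, 10])
```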