001_wz_NLP_RNN and RNNCell

torch.nn.RNN

Constructor arguments:
input_size: the feature dimension of the (embedded) input, i.e. feature_len
hidden_size: the dimension of the RNN hidden state
num_layers: the number of stacked RNN layers, default 1
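These are the three arguments used throughout this post. nn.RNN also accepts a batch_first flag, which this post leaves at its default (False); a minimal sketch of what it changes, reusing the sizes from the examples below:

import torch

rnn_bf = torch.nn.RNN(input_size=100, hidden_size=10, num_layers=1, batch_first=True)
x = torch.randn(5, 8, 100)   # (batch_size, seq_len, feature_len) with batch_first=True
out, h_t = rnn_bf(x)
print(out.shape, h_t.shape)  # torch.Size([5, 8, 10]) torch.Size([1, 5, 10])

Note that batch_first only swaps the first two dimensions of x and out; h_t keeps the (num_layers, batch_size, hidden_size) layout.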

Forward pass of RNN

out, h_t = rnn(x, h_0)

x: input data of shape (seq_len, batch_size, feature_len)
h_0/h_t: hidden state, shape (num_layers, batch_size, hidden_size)
out: the hidden output of the last layer at every time step, i.e. [h_1, h_2, …, h_t], shape (seq_len, batch_size, hidden_size)
Each RNN layer has two shared weight matrices, $W_{ih}$ and $W_{hh}$ (we can ignore the bias vectors $b$ here):
W_ih (also written W_xh): the weight matrix applied to the input x_t of shape (batch_size, feature_len); it is stored with shape (hidden_size, feature_len) and applied transposed
W_hh: the weight matrix applied to the previous hidden state, shape (hidden_size, hidden_size)
Update rule:
$h_t = \tanh(x_t W_{ih}^T + h_{t-1} W_{hh}^T)$
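To make the update rule concrete, here is a minimal sketch (with small, arbitrary sizes) that replays the recurrence by hand using the layer-0 parameters of nn.RNN, including the biases, and checks the result against the module's own output:

import torch

torch.manual_seed(0)
rnn = torch.nn.RNN(input_size=4, hidden_size=3, num_layers=1)
x = torch.randn(6, 2, 4)    # (seq_len, batch_size, feature_len)
h_0 = torch.zeros(1, 2, 3)  # (num_layers, batch_size, hidden_size)
out, h_t = rnn(x, h_0)

# Replay h_t = tanh(x_t @ W_ih^T + b_ih + h_{t-1} @ W_hh^T + b_hh) step by step.
h = h_0[0]                  # (batch_size, hidden_size)
for x_t in x:               # x_t: (batch_size, feature_len)
    h = torch.tanh(x_t @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
                   + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0)
print(torch.allclose(h, h_t[0], atol=1e-6))  # True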

Single-layer RNN: code verification

import torch
rnn = torch.nn.RNN(input_size=100, hidden_size=10, num_layers=1)
print(rnn._parameters.keys())

odict_keys(['weight_ih_l0', 'weight_hh_l0', 'bias_ih_l0', 'bias_hh_l0'])

print(rnn.weight_ih_l0.shape, rnn.weight_hh_l0.shape)

torch.Size([10, 100]) torch.Size([10, 10])

In weight_ih_l0, the suffix l0 denotes layer 0.
As expected, $W_{ih}$ has shape (hidden_size, feature_len) = (10, 100),
and $W_{hh}$ has shape (hidden_size, hidden_size) = (10, 10).

# input x: (seq_len, batch_size, feature_len)
x = torch.randn(8, 5, 100)

# h_0: (num_layers, batch_size, hidden_size)
h_0 = torch.zeros(1, 5, 10)

out, h_t = rnn(x, h_0)
print(out.shape, h_t.shape)

torch.Size([8, 5, 10]) torch.Size([1, 5, 10])

Compare these shapes with the formulas above; everything matches.
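As a side note, h_0 may be omitted, in which case PyTorch initializes it to zeros; a minimal sketch, continuing with the rnn and x defined above:

out, h_t = rnn(x)            # h_0 defaults to a zero tensor
print(out.shape, h_t.shape)  # same shapes as above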

Multi-layer RNN: code verification


import torch

rnn = torch.nn.RNN(input_size=100, hidden_size=10, num_layers=2)
print(rnn._parameters.keys())
print(rnn.weight_ih_l0.shape, rnn.weight_hh_l0.shape)
print(rnn.weight_ih_l1.shape, rnn.weight_hh_l1.shape)

odict_keys(['weight_ih_l0', 'weight_hh_l0', 'bias_ih_l0', 'bias_hh_l0', 'weight_ih_l1', 'weight_hh_l1', 'bias_ih_l1', 'bias_hh_l1'])
torch.Size([10, 100]) torch.Size([10, 10])
torch.Size([10, 10]) torch.Size([10, 10])

Note that weight_ih_l1 has shape (hidden_size, hidden_size) = (10, 10): the input to layer 1 is the hidden output of layer 0, not the raw 100-dimensional features.

# input x: (seq_len, batch_size, feature_len)
x = torch.randn(8, 5, 100)

# h_0: (num_layers, batch_size, hidden_size)
h_0 = torch.zeros(2, 5, 10)

out, h_t = rnn(x, h_0)
print(out.shape, h_t.shape)

torch.Size([8, 5, 10]) torch.Size([2, 5, 10])
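Note that out holds only the top layer's hidden state at every time step, while h_t stacks the final hidden state of each layer. A quick check of where the two overlap:

# The top layer's hidden state at the last time step appears in both outputs.
print(torch.allclose(out[-1], h_t[-1]))  # True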

torch.nn.RNNCell

Takes the same constructor arguments as RNN, minus num_layers.
With RNN we feed in the whole sequence at once and the module iterates over time internally; with RNNCell we feed in one time step per call.
That is, RNN takes x of shape (seq_len, batch_size, feature_len), while RNNCell takes x_t of shape (batch_size, feature_len), and we loop over the seq_len time steps manually (10 iterations in the examples below).

Forward pass of RNNCell

h_t = cell(x_t, h_t_1)

x_t: input for one time step, shape (batch_size, feature_len)
h_t_1/h_t: hidden state, shape (batch_size, hidden_size)
A single RNNCell call returns only h_t; to get the per-step outputs [h_1, h_2, …, h_t] of shape (seq_len, batch_size, hidden_size), collect the hidden states yourself inside the loop (see the sketch after the single-layer example below).

The weights are the same as in RNN.
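To back this up, here is a minimal sketch (with small, arbitrary sizes) that copies the layer-0 parameters of an nn.RNN into an nn.RNNCell and checks that stepping the cell manually reproduces the module's output:

import torch

torch.manual_seed(0)
rnn = torch.nn.RNN(input_size=4, hidden_size=3)
cell = torch.nn.RNNCell(input_size=4, hidden_size=3)

# Copy the RNN's layer-0 parameters into the cell.
with torch.no_grad():
    cell.weight_ih.copy_(rnn.weight_ih_l0)
    cell.weight_hh.copy_(rnn.weight_hh_l0)
    cell.bias_ih.copy_(rnn.bias_ih_l0)
    cell.bias_hh.copy_(rnn.bias_hh_l0)

x = torch.randn(6, 2, 4)   # (seq_len, batch_size, feature_len)
out, h_t = rnn(x)

h = torch.zeros(2, 3)      # (batch_size, hidden_size)
steps = []
for x_t in x:              # one cell call per time step
    h = cell(x_t, h)
    steps.append(h)
print(torch.allclose(torch.stack(steps), out, atol=1e-6))  # True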

Single-layer RNNCell: code verification

import torch

cell1 = torch.nn.RNNCell(input_size=100, hidden_size=10)
# x: (seq_len, batch_size, feature_len)
x = torch.randn(10, 5, 100)
# h1: (batch_size, hidden_size)
h1 = torch.zeros(5, 10)
for x_t in x:  # iterate over the seq_len dimension
    h1 = cell1(x_t, h1)
print(h1.shape)

torch.Size([5, 10])
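As promised above, a minimal sketch of collecting the per-step hidden states into an RNN-style out tensor, continuing with cell1 and x and a fresh zero state:

# Rebuild out with shape (seq_len, batch_size, hidden_size).
h1 = torch.zeros(5, 10)
steps = []
for x_t in x:
    h1 = cell1(x_t, h1)
    steps.append(h1)
out = torch.stack(steps)
print(out.shape)  # torch.Size([10, 5, 10])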

Multi-layer RNNCell: code verification

import torch

# Two stacked cells: layer 0 maps 100 -> 30, layer 1 maps 30 -> 10.
cell1 = torch.nn.RNNCell(input_size=100, hidden_size=30)
cell2 = torch.nn.RNNCell(input_size=30, hidden_size=10)
# x: (seq_len, batch_size, feature_len)
x = torch.randn(10, 5, 100)
h1 = torch.zeros(5, 30)   # hidden state of layer 0
h2 = torch.zeros(5, 10)   # hidden state of layer 1
for x_t in x:
    h1 = cell1(x_t, h1)   # layer 0 consumes the input
    h2 = cell2(h1, h2)    # layer 1 consumes layer 0's hidden state
print(h2.shape)

torch.Size([5, 10])
