PyTorch study notes: RNN, RNNCell

bit_codertoo

Tags: natural language processing, machine learning, pytorch, deep learning

Article link: https://blog.csdn.net/bit_codertoo/article/details/103653652

Contents

Recurrent neural networks
RNN Layer
- Multi-layer RNN
nn.RNNCell
Exploding gradients
Vanishing gradients

Recurrent neural networks

$h_t$ is computed from $h_{t-1}$ and $x_t$.

Computing the derivative
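
The figure that belonged here is not available, so the following is only a sketch of the standard backpropagation-through-time result, not necessarily what the figure showed. It assumes a tanh cell, $h_i = \tanh(x_i @ W_{xh}^T + h_{i-1} @ W_{hh}^T)$, a loss $E_t$ at step $t$, and Jacobians whose entry $[a, b]$ is $\partial h_i[a] / \partial h_{i-1}[b]$:

$$\frac{\partial E_t}{\partial W_{hh}} = \sum_{k=1}^{t} \frac{\partial E_t}{\partial h_t}\,\frac{\partial h_t}{\partial h_k}\,\frac{\partial^{+} h_k}{\partial W_{hh}}, \qquad \frac{\partial h_t}{\partial h_k} = \prod_{i=k+1}^{t} \frac{\partial h_i}{\partial h_{i-1}} = \prod_{i=k+1}^{t} \mathrm{diag}\left(1 - h_i^2\right) W_{hh}$$

Here $\partial^{+} h_k / \partial W_{hh}$ is the immediate derivative with $h_{k-1}$ held fixed. The product contains $W_{hh}$ multiplied $t - k$ times, which is why gradients over long sequences can blow up or shrink toward zero.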

RNN Layer

$[batch, featureLen] @ [hiddenLen, featureLen]^T$
$[batch, hiddenLen] @ [hiddenLen, hiddenLen]^T$
correspond to
$x_t @ W_{xh}^T + h_{t-1} @ W_{hh}^T$
$x: [seqLen, batch, featureLen]$, e.g. [10, 3, 100]
$x_t: [batch, featureLen]$, e.g. [3, 100]: a batch of 3, one word per sample
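
To check the shapes above, the recurrence can be unrolled by hand and compared with what nn.RNN returns. This is a minimal sketch that assumes the default tanh nonlinearity and a single layer; the sizes (100, 10, batch 3, length 10) are just the ones used in these notes.

```python
import torch
from torch import nn

torch.manual_seed(0)

rnn = nn.RNN(input_size=100, hidden_size=10)  # single layer, tanh by default
x = torch.randn(10, 3, 100)                   # [seqLen, batch, featureLen]
h = torch.zeros(3, 10)                        # [batch, hiddenLen]

with torch.no_grad():
    # Unroll the recurrence manually with the layer's own weights.
    for x_t in x:  # x_t: [batch, featureLen]
        h = torch.tanh(
            x_t @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
            + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0
        )
    # Let the layer process the same sequence from a zero initial state.
    out, h_n = rnn(x, torch.zeros(1, 3, 10))

# Compare the hand-computed final hidden state with the layer's.
matches = torch.allclose(h, h_n[0], atol=1e-5)
print(matches)
```

Both weight matrices are stored as [hiddenLen, ...] and applied transposed, which is where the $^T$ in the shape lines comes from.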

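import torch
from torch import nn

rnn = nn.RNN(100, 10)  # input_size=100 (word-vector length), hidden_size=10
rnn.weight_hh_l0.shape  # torch.Size([10, 10]); one RNN layer has four parameters
rnn.bias_ih_l0.shape  # torch.Size([10])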
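
The four parameters of a single layer can be listed with named_parameters(). A small sketch, with the same sizes as above:

```python
import torch
from torch import nn

rnn = nn.RNN(100, 10)

# Collect every parameter name with its shape.
shapes = {name: tuple(p.shape) for name, p in rnn.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)
# weight_ih_l0 (10, 100)
# weight_hh_l0 (10, 10)
# bias_ih_l0 (10,)
# bias_hh_l0 (10,)
```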

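The constructor takes three main arguments: input_size, hidden_size, and num_layers.
Setting batch_first=True changes the input and output layout from [seq, batch, feature] to [batch, seq, feature].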
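
A short sketch of the batch_first layout. The variable names are only for illustration. Note that the hidden state keeps its [num_layers, batch, h_dim] layout either way.

```python
import torch
from torch import nn

rnn_bf = nn.RNN(100, 10, batch_first=True)
x_bf = torch.randn(3, 10, 100)   # [batch, seq, feature]
out_bf, h_bf = rnn_bf(x_bf)      # h0 defaults to zeros when omitted

out_shape = tuple(out_bf.shape)  # (3, 10, 10): [batch, seq, h_dim]
h_shape = tuple(h_bf.shape)      # (1, 3, 10): still [num_layers, batch, h_dim]
print(out_shape, h_shape)
```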

out, ht = forward(x, h0)
# Note: x here is [seq, batch, wordvec]; the whole sentence is fed in at once
# h0/ht -> [num_layers, batch, h_dim]
# out -> [seq, batch, h_dim]; it returns the output of every time step

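rnn = nn.RNN(100, 20)  # input_size=100, hidden_size=20 (must match the last dimension of h0)
x = torch.randn(10, 3, 100)  # [seq, batch, feature]
out, h = rnn(x, torch.zeros(1, 3, 20))  # out: [10, 3, 20], h: [1, 3, 20]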
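
For a single-layer, unidirectional RNN, the last time step of out should be the same tensor of values as the returned hidden state. A quick check, assuming hidden_size=20 so that it agrees with the shape of h0:

```python
import torch
from torch import nn

rnn = nn.RNN(100, 20)
x = torch.randn(10, 3, 100)             # [seq, batch, feature]
out, h = rnn(x, torch.zeros(1, 3, 20))  # h0: [num_layers, batch, h_dim]

# out holds every step's hidden state; h holds only the final one.
last_matches = torch.allclose(out[-1], h[0])
print(tuple(out.shape), tuple(h.shape), last_matches)
```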

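Multi-layer RNN

Note that in the upper layers the shape of $W_{ih}$ becomes [hidden, hidden], because their input is the hidden state of the layer below.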
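
The shape change can be seen directly on a two-layer RNN. A minimal sketch, with sizes chosen only for illustration:

```python
import torch
from torch import nn

rnn = nn.RNN(input_size=100, hidden_size=20, num_layers=2)

first = tuple(rnn.weight_ih_l0.shape)   # (20, 100): layer 0 reads the input
second = tuple(rnn.weight_ih_l1.shape)  # (20, 20): layer 1 reads layer 0's hidden state

x = torch.randn(10, 3, 100)
out, h = rnn(x)                         # h0 defaults to zeros
print(first, second, tuple(out.shape), tuple(h.shape))
```

out only contains the top layer's states, [10, 3, 20], while h has one final state per layer, [2, 3, 20].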

nn.RNNCell

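More flexible: you feed the sequence by hand, one time step per call.

It is initialized with input_size and hidden_size, the same as nn.RNN, but it has no num_layers argument: a cell is a single layer, and multiple layers are stacked manually.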

ht = rnncell(xt, ht_1)
# xt: [b, wordvec]
# ht_1/ht: [b, h_dim] (a single cell has no num_layers dimension)
# out = torch.stack([h1, h2, ..., ht]), the collection of all h

cell1 = nn.RNNCell(100, 20)
h1 = torch.zeros(3, 20)
for xt in x:  # x: [10, 3, 100], so each xt is [3, 100]
    h1 = cell1(xt, h1)
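
To get the same kind of out tensor that nn.RNN returns, the per-step hidden states can be collected and stacked. A small sketch; the outputs list is just an illustrative name:

```python
import torch
from torch import nn

cell = nn.RNNCell(100, 20)
x = torch.randn(10, 3, 100)  # [seq, batch, feature]
h = torch.zeros(3, 20)       # [batch, h_dim]

outputs = []
for xt in x:                 # xt: [batch, feature]
    h = cell(xt, h)
    outputs.append(h)        # keep the hidden state of every time step

out = torch.stack(outputs)   # [seq, batch, h_dim]
print(tuple(out.shape))
```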

Two layers with different hidden sizes

cell1 = nn.RNNCell(100, 30)
cell2 = nn.RNNCell(30, 20)  # input size equals the hidden size of cell1
h1 = torch.zeros(3, 30)
h2 = torch.zeros(3, 20)
for xt in x:
    h1 = cell1(xt, h1)
    h2 = cell2(h1, h2)

Exploding gradients

Apply clipping to w.grad

loss = criterion(output, y)
model.zero_grad()
loss.backward()
for p in model.parameters():
    print(p.grad.norm())  # check whether the gradient is exploding
    torch.nn.utils.clip_grad_norm_(p, 10)  # clip this parameter's gradient norm to at most 10
optimizer.step()
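
The loop above clips each parameter separately. A common alternative is to clip the global norm over all parameters in one call. The sketch below is self-contained; the toy model, data, and the 1e4 loss scaling are made up purely to produce large gradients.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy setup, only so that there are gradients to clip.
model = nn.RNN(100, 20)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.randn(10, 3, 100)
y = torch.randn(3, 20)

out, h = model(x)
loss = 1e4 * criterion(h[0], y)  # scaled up to force large gradients

model.zero_grad()
loss.backward()

# Clip the norm of all gradients taken together; returns the norm before clipping.
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)

# Recompute the global norm after clipping to confirm the bound.
clipped_norm = torch.norm(torch.stack([p.grad.norm() for p in model.parameters()]))
print(float(total_norm), float(clipped_norm))

optimizer.step()
```

Global clipping rescales every gradient by the same factor, so the update direction is preserved.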

Vanishing gradients
