Implementing Recurrent Neural Networks in PyTorch (Part 1): Structure and Definition of RNN and LSTM

Latest recommended article published on 2023-01-22 22:52:43

炒饭小哪吒


Tags: deep learning, neural networks, natural language processing

Article link: https://blog.csdn.net/weixin_45738220/article/details/107868920


1. RNN structure and its implementation in PyTorch

First, the RNN structure diagram:
[Image: RNN structure diagram]
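x_t is the input vector at the current time step and h_t is the output vector at the current time step. Unlike an ordinary fully connected layer, the RNN's output at the current step depends not only on the current input but also on the network's output at the previous step, h_{t-1}. The official PyTorch documentation gives the input-output relation (with the default tanh nonlinearity) as:
h_t = tanh(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh)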
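As a quick sanity check, the recurrence h_t = tanh(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh) can be unrolled by hand and compared with what nn.RNN returns. This is only a minimal sketch: the sizes (4 input features, 3 hidden units, 5 steps, batch of 2) are arbitrary choices for illustration, and it assumes a single-layer, unidirectional RNN with a zero initial state.

```python
import torch
from torch import nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=4, hidden_size=3)  # one layer, tanh by default
x = torch.randn(5, 2, 4)                   # (seq, batch, feature)
h = torch.zeros(2, 3)                      # h_0 for the manual loop: (batch, hidden)

manual_steps = []
with torch.no_grad():
    for x_t in x:  # iterate over the time dimension
        # h_t = tanh(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh)
        h = torch.tanh(x_t @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
                       + h @ rnn.weight_hh_l0.T + rnn.bias_hh_l0)
        manual_steps.append(h)
    manual_output = torch.stack(manual_steps)  # (seq, batch, hidden)
    output, h_n = rnn(x)                       # h_0 defaults to zeros

# The hand-rolled loop and the built-in layer should agree up to float error.
print(torch.allclose(manual_output, output, atol=1e-5))
```

The loop makes the dependence on h_{t-1} explicit: each step reuses the hidden state produced by the previous one.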

An RNN layer is defined in PyTorch with: torch.nn.RNN(input_size, hidden_size, num_layers, nonlinearity, bias, batch_first, dropout, bidirectional)

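input_size: number of features in the input (feature)
hidden_size: number of features in the hidden state, which is also the feature dimension of this RNN layer's output
num_layers: number of stacked RNN layers; each layer takes the output of the layer below it as its input. Default: 1
nonlinearity: 'tanh' or 'relu'. Default: 'tanh'
bias: whether to use bias terms. Default: True
batch_first: if True, the input and output have shape (batch_size, seq, feature); if False, (seq, batch_size, feature). Default: False
dropout: if non-zero, dropout with this probability is applied to the output of every layer except the last. Default: 0
bidirectional: if True, the layer becomes a bidirectional RNN. Default: False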
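To see how the non-default options change the tensor shapes, here is a small sketch. The sizes are the same illustrative ones used elsewhere in this post (10 input features, 20 hidden units, 2 layers); the choice of relu, batch_first=True and bidirectional=True is just for demonstration.

```python
import torch
from torch import nn

# With batch_first=True the input is laid out as (batch, seq, feature).
x = torch.randn(3, 5, 10)

rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2,
             nonlinearity="relu", batch_first=True, bidirectional=True)
output, h_n = rnn(x)

# A bidirectional layer concatenates both directions in the feature dimension.
print(output.shape)  # torch.Size([3, 5, 40])
# h_n holds one state per layer and direction; batch_first does not affect it.
print(h_n.shape)     # torch.Size([4, 3, 20])
```

Note that batch_first only reorders input and output; the hidden state keeps the batch in its second dimension.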

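The layer takes input and h_0 as inputs (with the defaults batch_first=False, bidirectional=False):

input: input data of shape (seq, batch_size, feature)
h_0: initial state of each RNN layer at time 0, of shape (num_layers, batch_size, hidden_size); defaults to zeros if not provided

The outputs are output and h_n:

output: the output of the last layer at every time step, of shape (seq, batch_size, hidden_size)
h_n: the output of every layer at the final time step, of shape (num_layers, batch_size, hidden_size)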
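The difference between output and h_n is easy to mix up, so a short check may help: output collects the last layer over all time steps, while h_n collects all layers at the last time step. In the unidirectional case they therefore overlap in exactly one slice. The sketch below uses the same illustrative sizes as the example in this section.

```python
import torch
from torch import nn

layer = nn.RNN(10, 20, 2)    # input_size=10, hidden_size=20, num_layers=2
x = torch.randn(5, 3, 10)    # (seq, batch, feature)
h0 = torch.zeros(2, 3, 20)   # (num_layers, batch, hidden)
output, h_n = layer(x, h0)

# output[-1]: last layer at the final time step, shape (batch, hidden)
# h_n[-1]:    final-step state of the last layer, shape (batch, hidden)
same = torch.allclose(output[-1], h_n[-1])
print(same)
```

In practice this means output[-1] (or h_n[-1]) is the usual choice when only a summary of the whole sequence is needed, and output when a per-step representation is needed.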

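import torch
from torch import nn

layer = nn.RNN(10, 20, 2)      # input_size=10, hidden_size=20, num_layers=2
input = torch.randn(5, 3, 10)  # (seq, batch, feature)
h0 = torch.randn(2, 3, 20)     # (num_layers, batch, hidden)
output, hn = layer(input, h0)
# output.size() = (5, 3, 20)
# hn.size() = (2, 3, 20)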

2. LSTM structure and its implementation in PyTorch

First, a schematic of the LSTM structure:
[Image: LSTM structure diagram]

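Compared with the RNN, each LSTM layer additionally outputs a cell state vector c. The official PyTorch documentation specifies how it is computed.
[Image: LSTM equations from the PyTorch documentation]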
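For reference, the per-step computation documented for torch.nn.LSTM can be written out as follows. The symbol names follow the PyTorch documentation: i_t, f_t, g_t and o_t are the input, forget, cell and output gates, \sigma is the sigmoid function and \odot is the element-wise product.

```latex
\begin{aligned}
i_t &= \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) \\
f_t &= \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) \\
g_t &= \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}) \\
o_t &= \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The cell state c_t is updated additively from c_{t-1}, while the hidden state h_t is a gated view of it; this is why the layer has to carry both h and c between time steps.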
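torch.nn.LSTM(input_size, hidden_size, num_layers, bias, batch_first, dropout, bidirectional, proj_size)
Unlike torch.nn.RNN, torch.nn.LSTM has no nonlinearity parameter. The remaining parameters shared with torch.nn.RNN have the same meaning, so refer to the RNN section. proj_size is specific to LSTM; the shapes below assume its default of 0.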
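The inputs are input, h_0 and c_0:

input: input data of shape (seq, batch_size, feature)
h_0: initial state of each LSTM layer at time 0, of shape (num_layers, batch_size, hidden_size)
c_0: initial cell state of each LSTM layer at time 0, of shape (num_layers, batch_size, hidden_size)

The outputs are output, h_n and c_n:

output: the output of the last layer at every time step, of shape (seq, batch_size, hidden_size)
h_n: the output of every layer at the final time step, of shape (num_layers, batch_size, hidden_size)
c_n: the cell state of every layer at the final time step, of shape (num_layers, batch_size, hidden_size)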
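To make the roles of h and c concrete, a single LSTM step can be computed by hand from the layer's own weights and compared with the module's result. This is a sketch under a few assumptions: one layer, one time step, arbitrary small sizes (4 input features, 3 hidden units, batch of 2), and the gate ordering (input, forget, cell, output) that PyTorch uses when it stacks the weights in weight_ih_l0 and weight_hh_l0.

```python
import torch
from torch import nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=4, hidden_size=3)
x = torch.randn(1, 2, 4)    # (seq=1, batch, feature): a single time step
h0 = torch.randn(1, 2, 3)   # (num_layers, batch, hidden)
c0 = torch.randn(1, 2, 3)   # (num_layers, batch, hidden)

with torch.no_grad():
    # All four gate pre-activations at once: (batch, 4 * hidden)
    gates = (x[0] @ lstm.weight_ih_l0.T + lstm.bias_ih_l0
             + h0[0] @ lstm.weight_hh_l0.T + lstm.bias_hh_l0)
    i, f, g, o = gates.chunk(4, dim=1)  # input, forget, cell, output
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)
    c1 = f * c0[0] + i * g              # new cell state
    h1 = o * torch.tanh(c1)             # new hidden state
    output, (h_n, c_n) = lstm(x, (h0, c0))

print(torch.allclose(h1, h_n[0], atol=1e-5))
print(torch.allclose(c1, c_n[0], atol=1e-5))
```

With a single time step and a single layer, output[0], h_n[0] and the hand-computed h1 all describe the same tensor.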

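import torch
from torch import nn

rnn = nn.LSTM(10, 20, 2)       # input_size=10, hidden_size=20, num_layers=2
input = torch.randn(5, 3, 10)  # (seq, batch, feature)
h0 = torch.randn(2, 3, 20)     # (num_layers, batch, hidden)
c0 = torch.randn(2, 3, 20)     # (num_layers, batch, hidden)
output, (hn, cn) = rnn(input, (h0, c0))
# output.size() = (5, 3, 20)
# hn.size() = (2, 3, 20)
# cn.size() = (2, 3, 20)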
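The initial states do not have to be passed explicitly: when the (h_0, c_0) tuple is omitted, PyTorch initialises both to zeros. The following sketch, which reuses the sizes from the example above, compares the two call styles.

```python
import torch
from torch import nn

torch.manual_seed(0)
rnn = nn.LSTM(10, 20, 2)        # input_size=10, hidden_size=20, num_layers=2
x = torch.randn(5, 3, 10)       # (seq, batch, feature)
zeros = torch.zeros(2, 3, 20)   # (num_layers, batch, hidden)

# Omitting the state tuple is equivalent to passing all-zero h_0 and c_0.
out_default, (hn_default, cn_default) = rnn(x)
out_explicit, (hn_explicit, cn_explicit) = rnn(x, (zeros, zeros))

print(torch.allclose(out_default, out_explicit))
print(torch.allclose(cn_default, cn_explicit))
```

Passing explicit states is mainly useful when a long sequence is processed in chunks and the final (h_n, c_n) of one chunk should seed the next.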
