【pytorch】pytorch lstm实现

最新推荐文章于 2024-06-19 08:11:40 发布

zkq_1986

最新推荐文章于 2024-06-19 08:11:40 发布

阅读量1.7k

点赞数 2

分类专栏： pytorch

原文链接：https://blog.csdn.net/wangwangstone/article/details/90296461

版权

pytorch 专栏收录该内容

5 篇文章 2 订阅

订阅专栏

lstm里，多层之间传递的是输出ht ，同一层内传递的细胞状态（即隐层状态）

看pytorch官网对应的参数nn.lstm(*args，**kwargs)，

默认传参就是官网文档的列出的列表传过去。对于后面有默认值（官网在参数解释第一句就有if啥的，一般传参就要带赋值号了。）

官网案例对应的就是前三个。input_size，hidden_size，num_layers

Parmerters:

input_size – The number of expected features in the input x.白话：就是你输入x的向量大小（x向量里有多少个元素）

hidden_size – The number of features in the hidden state h 。白话：就是LSTM在运行时里面的维度。隐藏层状态的维数，即隐藏层节点的个数，这个和单层感知器的结构是类似的。这个维数值是自定义的，根据具体业务需要决定，如下图：

图中input_size：就是输入层，左边蓝色方格 [i0,i1,i2,i3,i4]，hidden_size：就是隐藏层，中间黄色圆圈 [h0,h1,h2,h3,h4]。最右边蓝色圆圈 [o0,o1,o2] 的是输出层，节点个数也是按具体业务需求决定的。

num_layers – Number of recurrent layers. E.g., setting num_layers=2 would mean stacking two LSTMs together to form a stacked LSTM, with the second LSTM taking in outputs of the first LSTM and computing the final results. Default: 1

LSTM 堆叠的层数，默认值是1层，如果设置为2，第二个LSTM接收第一个LSTM的计算结果。也就是第一层输入 [ X0 X1 X2 ... Xt]，计算出 [ h0 h1 h2 ... ht ]，第二层将 [ h0 h1 h2 ... ht ] 作为 [ X0 X1 X2 ... Xt] 输入再次计算，输出最后的 [ h0 h1 h2 ... ht ]。

bias – If False, then the layer does not use bias weights b_ih and b_hh. Default: True

隐层状态是否带bias，默认为true。bias是偏置值，或者偏移值。没有偏置值就是以0为中轴，或以0为起点。偏置值的作用请参考单层感知器相关结构。

batch_first – If True, then the input and output tensors are provided as (batch, seq, feature). Default: False

batch_first:判断输入输出的第一维是否为 batch_size，默认值 False。故此参数设置可以将 batch_size 放在第一维度。因为 Torch 中，人们习惯使用Torch中带有的dataset，dataloader向神经网络模型连续输入数据，这里面就有一个 batch_size 的参数，表示一次输入多少个数据。在 LSTM 模型中，输入数据必须是一批数据，为了区分LSTM中的批量数据和dataloader中的批量数据是否相同意义，LSTM 模型就通过这个参数的设定来区分。如果是相同意义的，就设置为True，如果不同意义的，设置为False。 torch.LSTM 中 batch_size 维度默认是放在第二维度，故此参数设置可以将 batch_size 放在第一维度。如：input 默认是(4,1,5)，中间的 1 是 batch_size，指定batch_first=True后就是(1,4,5)。所以，如果你的输入数据是二维数据的话，就应该将 batch_first 设置为True;

dropout – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout. Default: 0

默认值0。是否在除最后一个 RNN 层外的其他 RNN 层后面加 dropout 层。输入值是 0-1 之间的小数，表示概率。0表示0概率dripout，即不dropout

bidirectional – If True, becomes a bidirectional LSTM. Default: False

是否是双向 RNN，默认为：false，若为 true，则：num_directions=2，否则为1。我的理解是，LSTM 可以根据数据输入从左向右推导结果。然后再用结果从右到左反推导，看原因和结果之间是否可逆。也就是原因和结果是一对一关系，还是多对一的关系。这仅仅是我肤浅的假设，有待证明。

输入数据格式：（三个输入）

input(seq_len, batch, input_size)

h_0(num_layers * num_directions, batch, hidden_size)

c_0(num_layers * num_directions, batch, hidden_size)

输出数据格式：

output(seq_len, batch, hidden_size * num_directions)

h_n(num_layers * num_directions, batch, hidden_size)

c_n(num_layers * num_directions, batch, hidden_size)

官网案例理解：
import torch