LSTM input and output dimensions
CLASS torch.nn.LSTM(*args, **kwargs)
Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence.
For each element in the input sequence, each layer computes the following function:
i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi})
f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf})
g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg})
o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho})
c_t = f_t \odot c_{t-1} + i_t \odot g_t
h_t = o_t \odot \tanh(c_t)
where:
- h_t: the hidden state at time step t
- c_t: the cell state at time step t
- x_t: the input at time step t
- h_{t-1}: the hidden state at time step t-1, or the initial hidden state (at time step 0)
- i_t, f_t, g_t, o_t: the input, forget, cell, and output gates, respectively
- \sigma: the sigmoid function
- \odot: the Hadamard (element-wise) product
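To make the equations concrete, here is a minimal sketch that computes one LSTM step by hand and checks it against `nn.LSTMCell`. It relies on the documented PyTorch weight layout, where the four gate weights are packed as `[W_ii | W_if | W_ig | W_io]` along the first dimension; the tensor sizes here are arbitrary illustration values.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
input_size, hidden_size = 4, 3
cell = nn.LSTMCell(input_size, hidden_size)

x_t = torch.rand(1, input_size)        # x_t: input at time step t
h_prev = torch.zeros(1, hidden_size)   # h_{t-1}
c_prev = torch.zeros(1, hidden_size)   # c_{t-1}

# PyTorch packs the gate weights as [W_ii | W_if | W_ig | W_io] along dim 0
gates = (x_t @ cell.weight_ih.t() + cell.bias_ih
         + h_prev @ cell.weight_hh.t() + cell.bias_hh)
i, f, g, o = gates.chunk(4, dim=1)
i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
g = torch.tanh(g)

c_t = f * c_prev + i * g               # c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t
h_t = o * torch.tanh(c_t)              # h_t = o_t ⊙ tanh(c_t)

h_ref, c_ref = cell(x_t, (h_prev, c_prev))
print(torch.allclose(h_t, h_ref, atol=1e-6))  # True
print(torch.allclose(c_t, c_ref, atol=1e-6))  # True
```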
Constructor arguments:
- input_size: dimensionality of the input features
- hidden_size: dimensionality of the hidden state h
- num_layers: number of stacked LSTM layers; default 1
- bias: whether to use bias terms; default True
- batch_first: if True, the input is shaped (batch, seq_len, input_size); default False, i.e. (seq_len, batch, input_size)
- bidirectional: whether the LSTM is bidirectional; default False
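A small sketch of the `batch_first` argument, with arbitrary example sizes. Note that `batch_first` only reorders `input` and `output`; the hidden and cell states keep the (num_layers * num_directions, batch, hidden_size) layout either way.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=100, hidden_size=20, batch_first=True)
x = torch.rand(50, 5, 100)  # (batch, seq_len, input_size) when batch_first=True
output, (h_n, c_n) = lstm(x)

print(output.size())  # torch.Size([50, 5, 20]) -- batch comes first in output too
print(h_n.size())     # torch.Size([1, 50, 20]) -- h_n is NOT batch first
```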
Inputs
Inputs: input, (h_0, c_0)
- input: shape (seq_len, batch, input_size), i.e. (number of tokens in the sentence, batch size, length of each token vector)
- h_0: shape (num_layers * num_directions, batch, hidden_size), i.e. (number of layers * number of LSTM directions (1 or 2), batch size, hidden vector dimension)
- c_0: shape (num_layers * num_directions, batch, hidden_size), same layout as h_0
- If (h_0, c_0) is not provided, both h_0 and c_0 default to all zeros.
Outputs
Outputs: output, (h_n, c_n)
- output: shape (seq_len, batch, num_directions * hidden_size), i.e. (number of tokens in the sentence, batch size, number of LSTM directions * hidden vector dimension)
- h_n: shape (num_layers * num_directions, batch, hidden_size)
- c_n: shape (num_layers * num_directions, batch, hidden_size)
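One useful relation between these two outputs: `output` holds the top layer's hidden state at every time step, while `h_n` holds the final hidden state of every layer. For a unidirectional LSTM, the last time step of `output` therefore equals the top layer's entry in `h_n`. A quick sketch (example sizes are arbitrary):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=100, hidden_size=20, num_layers=2)
x = torch.rand(5, 50, 100)  # (seq_len, batch, input_size)
output, (h_n, c_n) = lstm(x)

# output[-1]: top layer's hidden state at the last time step
# h_n[-1]:    final hidden state of the top (last) layer
print(torch.allclose(output[-1], h_n[-1]))  # True
```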
Examples
- num_layers = 2
import torch
import torch.nn as nn

x = torch.rand(5, 50, 100)  # (seq_len, batch, input_size)
lstm = nn.LSTM(100, 20, num_layers=2)
output, (hidden, cell) = lstm(x)
print("output size:{} \nhidden size:{} \ncell size:{}".format(output.size(), hidden.size(), cell.size()))
Output:
output size:torch.Size([5, 50, 20])
hidden size:torch.Size([2, 50, 20])
cell size:torch.Size([2, 50, 20])
- bidirectional = True
import torch
import torch.nn as nn

x = torch.rand(5, 50, 100)  # (seq_len, batch, input_size)
lstm = nn.LSTM(100, 20, bidirectional=True)
output, (hidden, cell) = lstm(x)
print("output size:{} \nhidden size:{} \ncell size:{}".format(output.size(), hidden.size(), cell.size()))
Output:
output size:torch.Size([5, 50, 40])
hidden size:torch.Size([2, 50, 20])
cell size:torch.Size([2, 50, 20])
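In the bidirectional case, `output` concatenates the two directions along the last dimension (hence the size 40 above): the first `hidden_size` columns are the forward pass and the rest are the backward pass. The forward direction finishes at the last time step, while the backward direction finishes at the first. A sketch checking this (example sizes are arbitrary):

```python
import torch
import torch.nn as nn

hidden_size = 20
lstm = nn.LSTM(100, hidden_size, bidirectional=True)
x = torch.rand(5, 50, 100)  # (seq_len, batch, input_size)
output, (h_n, c_n) = lstm(x)

# forward final state: last time step, first half of the feature dim
print(torch.allclose(output[-1, :, :hidden_size], h_n[0]))  # True
# backward final state: first time step, second half of the feature dim
print(torch.allclose(output[0, :, hidden_size:], h_n[1]))   # True
```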