Why the same data gives different results on an RNN cell vs. an RNN

Code

import torch
import torch.nn as nn
import torch.nn.functional as F

input_size = 2  # input_size is the dimension of each input vector, e.g. a one-hot encoding of one character such as 'o' in "hello"
batch_size = 1
seq_len = 3  # the length of the whole sequence, not of a single input vector
num_layers = 1
hidden_size = 5

data = torch.randn(seq_len, batch_size, input_size)  # (seq_len, batch_size, input_size) = (3, 1, 2)
print(data)
hidden = torch.zeros(batch_size, hidden_size)  # (batch_size, hidden_size) = (1, 5)
print(hidden)

# RNN Cell part
# the vector dimension of input and output for every sample x
Cell = nn.RNNCell(input_size=input_size, hidden_size=hidden_size)

for idx, x in enumerate(data):  # avoid shadowing the built-in `input`
    print("=" * 20, idx, "=" * 20)
    print("input shape:", x.shape)
    print(x)

    print(hidden)

    hidden = Cell(x, hidden)

    print("hidden shape:", hidden.shape)
    print(hidden)
print("=" * 20, "=", "=" * 20, "\n")

# RNN part
hidden = torch.zeros(num_layers, batch_size, hidden_size)  # (num_layers, batch_size, hidden_size) = (1, 1, 5)
RNN = nn.RNN(input_size=input_size, hidden_size=hidden_size, num_layers=num_layers)

out, hidden = RNN(data, hidden)
print("data:", data)
print("output size:", out.shape)
print("output:", out)

print("hidden size:", hidden.shape)
print("hidden:", hidden)

Output

tensor([[[ 1.5129,  0.0149]],
        [[ 0.5056, -1.7836]],
        [[ 0.4780,  0.8940]]])
tensor([[0., 0., 0., 0., 0.]])
==================== 0 ====================
input shape: torch.Size([1, 2])
tensor([[1.5129, 0.0149]])
tensor([[0., 0., 0., 0., 0.]])
hidden shape: torch.Size([1, 5])
tensor([[-0.5329,  0.1464, -0.1624, -0.6233, -0.5527]], grad_fn=<TanhBackward>)
==================== 1 ====================
input shape: torch.Size([1, 2])
tensor([[ 0.5056, -1.7836]])
tensor([[-0.5329,  0.1464, -0.1624, -0.6233, -0.5527]], grad_fn=<TanhBackward>)
hidden shape: torch.Size([1, 5])
tensor([[-0.5681,  0.0588, -0.7489,  0.1834, -0.7691]], grad_fn=<TanhBackward>)
==================== 2 ====================
input shape: torch.Size([1, 2])
tensor([[0.4780, 0.8940]])
tensor([[-0.5681,  0.0588, -0.7489,  0.1834, -0.7691]], grad_fn=<TanhBackward>)
hidden shape: torch.Size([1, 5])
tensor([[ 0.1734, -0.2558, -0.4364, -0.6313, -0.1552]], grad_fn=<TanhBackward>)
==================== = ==================== 
data: tensor([[[ 1.5129,  0.0149]],
        [[ 0.5056, -1.7836]],
        [[ 0.4780,  0.8940]]])
output size: torch.Size([3, 1, 5])
output: tensor([[[ 0.1714, -0.2880,  0.1836,  0.8961, -0.5190]],
        [[-0.3229, -0.4838, -0.0166,  0.7877, -0.0175]],
        [[ 0.7183,  0.1400,  0.0856,  0.6780, -0.1199]]],
       grad_fn=<StackBackward>)
hidden size: torch.Size([1, 1, 5])
hidden: tensor([[[ 0.7183,  0.1400,  0.0856,  0.6780, -0.1199]]],
       grad_fn=<StackBackward>)

Problem description

From the output we can see that the three hidden states h_i computed by the two approaches have different values.

My understanding is that nn.RNN simply wraps the loop over an RNN cell (here I set the RNN's num_layers to 1, kept the other parameters identical to the RNN cell's, fed in the same data, and initialized h0 to a zero vector), so the computation performed should be exactly the same.

So why do the two approaches produce different results?

Analysis

To be resolved.
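One check worth running (a minimal sketch, not from the original post): nn.RNNCell and nn.RNN are two separately constructed modules, so their weights are initialized independently at random. If the discrepancy comes only from that initialization, copying the RNN's layer-0 parameters (weight_ih_l0, weight_hh_l0, bias_ih_l0, bias_hh_l0) into the cell's matching parameters (weight_ih, weight_hh, bias_ih, bias_hh) and then unrolling the cell by hand should reproduce nn.RNN's output exactly:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
input_size, hidden_size, seq_len, batch_size = 2, 5, 3, 1

data = torch.randn(seq_len, batch_size, input_size)

rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size, num_layers=1)
cell = nn.RNNCell(input_size=input_size, hidden_size=hidden_size)

# Copy the RNN's layer-0 parameters into the cell so both use identical weights.
with torch.no_grad():
    cell.weight_ih.copy_(rnn.weight_ih_l0)
    cell.weight_hh.copy_(rnn.weight_hh_l0)
    cell.bias_ih.copy_(rnn.bias_ih_l0)
    cell.bias_hh.copy_(rnn.bias_hh_l0)

# Unroll the cell manually, step by step, from a zero hidden state.
h = torch.zeros(batch_size, hidden_size)
cell_steps = []
for x in data:
    h = cell(x, h)
    cell_steps.append(h)
cell_out = torch.stack(cell_steps)  # (seq_len, batch_size, hidden_size)

# Run the same sequence through nn.RNN in one call.
rnn_out, _ = rnn(data, torch.zeros(1, batch_size, hidden_size))

print(torch.allclose(cell_out, rnn_out, atol=1e-6))
```

If this prints True, the mismatch in the post is explained entirely by the two modules starting from different random weights, not by any difference in the recurrence itself.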
