While running a small experiment on gradients, I noticed that once the original variable passes through F.softmax, its gradient can no longer be retrieved. Example:
import torch
import torch.nn.functional as F

x = torch.randn(1, 5, requires_grad=True)  # leaf tensor created by the user
print(x)
# x = F.softmax(x, dim=1)
# print(x)
l = 0
for i in range(5):
    l = l + x[0][i]  # accumulate the elements of x into a scalar
print(l)
l.backward()
print(x.grad)  # gradient of l with respect to x
If x does not go through F.softmax(), the gradient prints as expected:
tensor([[ 1.4093, -0.2620, 0.6668, -0.3897, 1.4681]], requires_grad=True)
tensor(2.8925, grad_fn=<AddBackward0>)
tensor([[1., 1., 1., 1., 1.]])
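This is exactly what we expect: with l = x[0][0] + ... + x[0][4], the partial derivative ∂l/∂x[0][i] is 1 for every element, hence the all-ones gradient.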
If x does go through F.softmax(), its gradient can no longer be obtained:
import torch
import torch.nn.functional as F

x = torch.randn(1, 5, requires_grad=True)  # leaf tensor
print(x)
x = F.softmax(x, dim=1)  # rebinds the name x to the softmax output, a non-leaf tensor
print(x)
l = 0
for i in range(5):
    l = l + x[0][i]  # softmax outputs along dim=1 sum to 1, so l is always 1
print(l)
l.backward()
print(x.grad)  # None: .grad is only populated for leaf tensors
At this point x.grad is None (recent PyTorch versions also print a UserWarning about accessing .grad on a non-leaf tensor):
tensor([[ 1.0408, 0.5212, 0.2902, -0.7637, -0.7276]], requires_grad=True)
tensor([[0.4163, 0.2476, 0.1965, 0.0685, 0.0710]], grad_fn=<SoftmaxBackward>)
tensor(1., grad_fn=<AddBackward0>)
None
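The gradient is not actually lost. The line x = F.softmax(x, dim=1) rebinds the name x to the softmax output, which is an intermediate (non-leaf) tensor, and by default autograd only populates .grad on leaf tensors; gradients of intermediate tensors are discarded after backward(). Two common fixes: keep the original leaf under its own name, or call .retain_grad() on the intermediate tensor. Below is a minimal sketch of both (the names y and l here are my own, not from the snippet above):

import torch
import torch.nn.functional as F

x = torch.randn(1, 5, requires_grad=True)  # leaf tensor
y = F.softmax(x, dim=1)                    # non-leaf: produced by an operation
y.retain_grad()                            # ask autograd to also keep y's gradient

l = y.sum()   # same scalar as the element-wise loop above
l.backward()

print(x.grad)  # populated: the original leaf still receives its gradient
print(y.grad)  # all ones; would be None without retain_grad()

Note that x.grad here comes out (numerically) all zeros, and for a good reason: the rows of a softmax always sum to 1, so l is constant with respect to x and its true gradient is zero. That is a separate issue from .grad being None.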