Getting the Gradients of Intermediate Variables in PyTorch

潜行隐耀

Article link: https://blog.csdn.net/PanYHHH/article/details/113436204

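To save GPU memory, PyTorch keeps gradient values during backpropagation only for the leaf nodes of the computation graph and does not keep the gradients of intermediate (non-leaf) nodes, as the following example shows: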

import torch

x = torch.tensor(3., requires_grad=True)
y = x ** 2
z = 4 * y

z.backward()
print(x.grad)   # tensor(24.)
print(y.grad)   # None

After backpropagation, only the gradient of x, tensor(24.), has been kept. The gradient of y was not retained, so y.grad is None.
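To make the leaf versus intermediate distinction concrete, here is a minimal sketch that inspects the same three tensors with the is_leaf and grad_fn attributes. It only illustrates the terminology used above and is not part of the original example.

```python
import torch

x = torch.tensor(3., requires_grad=True)  # created directly by the user: a leaf node
y = x ** 2                                # produced by an operation: an intermediate node
z = 4 * y                                 # also produced by an operation

# Only x is a leaf, so only x.grad is populated by z.backward().
print(x.is_leaf, y.is_leaf, z.is_leaf)    # True False False

# Intermediate tensors carry a grad_fn that records how they were computed.
print(x.grad_fn is None)                  # True
print(y.grad_fn is None)                  # False
```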

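Sometimes, however, we do need the gradient of an intermediate variable in a model (for example, when drawing a Grad-CAM map). Here are two ways to get it: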

Method 1: torch.autograd.grad(outputs, inputs)

import torch
import torch.autograd as autograd

x = torch.tensor(3., requires_grad=True)
y = x ** 2
z = 4 * y

x_grad = autograd.grad(z, x, retain_graph=True)[0]
y_grad = autograd.grad(z, y, retain_graph=True)[0]
print(x_grad)   # tensor(24.)
print(y_grad)   # tensor(4.)

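Both the gradient of x and the gradient of y are now available, and with this method there is no need to call .backward(). The gradients are returned as a tuple (hence the [0]), and retain_graph=True keeps the graph's buffers so that it can be differentiated more than once.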
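As a small variation on Method 1, the sketch below requests both gradients in a single call by passing a list of inputs, so the graph is traversed only once and retain_graph is not needed. It also shows that the .grad attribute of x stays empty, because autograd.grad returns the gradients rather than accumulating them.

```python
import torch

x = torch.tensor(3., requires_grad=True)
y = x ** 2
z = 4 * y

# One gradient is returned per input, in the same order as the inputs list.
x_grad, y_grad = torch.autograd.grad(z, [x, y])

print(x_grad)   # tensor(24.)
print(y_grad)   # tensor(4.)
print(x.grad)   # None, since nothing was accumulated into .grad
```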

Method 2: torch.Tensor.register_hook()

import torch

x = torch.tensor(3., requires_grad=True)
y = x ** 2
z = 4 * y
features_grad = 0.


# Helper function for reading the gradient of an intermediate variable in the model
def extract(g):
    global features_grad
    features_grad = g


y.register_hook(extract)
z.backward()
y_grad = features_grad

print(x.grad)   # tensor(24.)
print(y_grad)   # tensor(4.)

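Call .register_hook() on the intermediate variable whose gradient you need before running backpropagation. The hook then receives that variable's gradient during the backward pass.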
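The same idea can be written without a global variable. The sketch below is one possible variation, not the original author's code: the names grads and save_grad are made up for illustration. It stores gradients in a dictionary keyed by name and removes the hook afterwards through the handle that register_hook() returns.

```python
import torch

x = torch.tensor(3., requires_grad=True)
y = x ** 2
z = 4 * y

grads = {}  # hypothetical container: tensor name -> gradient


def save_grad(name):
    # Build a hook that stores the incoming gradient under the given name.
    def hook(g):
        grads[name] = g  # returns None, so the gradient itself is left unchanged
    return hook


handle = y.register_hook(save_grad("y"))
z.backward()
handle.remove()  # detach the hook once it is no longer needed

print(x.grad)      # tensor(24.)
print(grads["y"])  # tensor(4.)
```

A dictionary scales more easily than one global per tensor when several intermediate variables need to be inspected, for instance several feature maps in a network.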
