On the auto-gradient mechanism and the detach function

Latest recommended article published on 2024-08-09 00:49:52

Oshrin

Category: torch, deep learning. Tags: pytorch, auto_gradient, detach

Article link: https://blog.csdn.net/qq_41563738/article/details/102781453

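Summary (generated automatically by CSDN): This article describes PyTorch's automatic gradient mechanism, explains the concepts of leaf and non-leaf nodes, and discusses how the detach function blocks gradient propagation in the computation graph, which helps when optimizing model parameters. The focus is on understanding when to use detach and how it affects backpropagation.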

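First, the definitions of leaf and non-leaf nodes:

A leaf node is a tensor whose is_leaf attribute is True and whose grad_fn is None. Leaf nodes arise in two cases:

Case 1: nodes created directly by the user (that is, not produced by an operation):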

import torch

a = torch.rand(5, 5, requires_grad=False)
b = torch.rand(5, 5, requires_grad=False)
c = torch.rand(5, 5, requires_grad=True)

print(a.is_leaf, b.is_leaf, c.is_leaf)

Output: True True True

Here a, b, and c are all leaf nodes. Any tensor the user creates directly is treated as a leaf node, whether or not requires_grad is True.
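
To make the contrast with non-leaf nodes concrete, here is a minimal sketch. The tensors d and e are hypothetical names introduced only for this illustration and do not appear in the original article. It compares is_leaf and grad_fn for tensors created directly and for tensors produced by an operation:

```python
import torch

# Leaf tensors: created directly by the user.
a = torch.rand(5, 5, requires_grad=False)
c = torch.rand(5, 5, requires_grad=True)

# Produced by an operation on a tensor that requires grad: a non-leaf node.
d = c * 2

# Produced by an operation on tensors that do not require grad:
# PyTorch still reports this as a leaf, by convention.
e = a + a

leaf_flags = (a.is_leaf, c.is_leaf, d.is_leaf, e.is_leaf)
print(leaf_flags)                       # (True, True, False, True)
print(a.grad_fn, c.grad_fn, e.grad_fn)  # None None None
print(d.grad_fn is not None)            # True
```

Only d, the result of an operation on a tensor with requires_grad=True, is a non-leaf node, and it is also the only one that carries a grad_fn.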

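Case 2 appears to be the parameters of a network (an nn.Module), as the following example shows:

import torch
import torch.nn as nn  # the alias is needed because the code below refers to nn.Module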


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(20, 100)
        self.conv1 = nn.Conv1d(3, 20, 1)
        self.linear2 = nn.Linear(100, 3)

    def forward(self, x):
        x = self.linear1(x)
        x = self.conv1(x)
        x = self.linear2(x)
        return x


net = Net()
# named_parameters() lets you inspect the gradient or other information for each layer's parameters
for name, param in net.named_parameters():
    print(param.is_leaf)


loss = torch.sum(net(torch.rand(3, 3, 20)))
loss.backward()
print(loss)

for name, param in net.named_parameters():
    print(param.grad)

Output:

True
True
True
True
True
True
tensor(17.9995, grad_fn=<SumBackward0>)
tensor([[ 0.0022,  0.0038,  0.0032,  ...,  0.0040,  0.0043,  0.0028],
        [-0.2095, -0.3591, -0.3042,  ..., -0.3819, -0.4080, -0.2680],
        [-0.2490, -0.4268, -0.3616,  ..., -0.4540, -0.4849, -0.3185],
        ...,
        [-0.6358, -1.0897, -0.9233,  ..., -1.1591, -1.2382, -0.8132],
        [ 0.7041,  1.2068,  1.0225,  ...,  1.2837,  1.3713,  0.9006],
        [ 0.7866,  1.3482,  1.1423,  ...,  1.4340,  1.5319,  1.0060]])
tensor([ 0.0065, -0.6185, -0.7351,  1.4288,  3.1651,  0.5760, -1.5725, -1.7545,
         0.3321,  3.0410,  1.4884, -1.7543,  1.7647,  1.5608,  3.0300, -1.2516,
         1.4375, -3.0647,  2.1893,  1.0579, -2.0025, -3.3547,  1.8892, -0.3292,
        -1.3015,  0.2360,  0.8959,  0.5413,  0.1860,  1.6856, -1.0035,  3.7435,