Examples of using the torch.Tensor.backward() method

Reference link: backward(gradient=None, retain_graph=None, create_graph=False)


Original documentation:

backward(gradient=None, retain_graph=None, create_graph=False)
    Computes the gradient of current tensor w.r.t. graph leaves.

    The graph is differentiated using the chain rule. If the tensor is non-scalar 
    (i.e. its data has more than one element) and requires gradient, the function 
    additionally requires specifying gradient. It should be a tensor of matching type 
    and location, that contains the gradient of the differentiated function w.r.t. self.
	
    This function accumulates gradients in the leaves - you might need to zero them 
    before calling it.

    Parameters
    
            gradient (Tensor or None) – Gradient w.r.t. the tensor. If it is a tensor, 
            it will be automatically converted to a Tensor that does not require grad 
            unless create_graph is True. None values can be specified for scalar Tensors 
            or ones that don’t require grad. If a None value would be acceptable then 
            this argument is optional.
            (A short sketch of what passing this argument computes in practice follows after the parameter list.)

            retain_graph (bool, optional) – If False, the graph used to compute the grads 
            will be freed. Note that in nearly all cases setting this option to True is 
            not needed and often can be worked around in a much more efficient way. 
            Defaults to the value of create_graph.

            create_graph (bool, optional) – If True, graph of the derivative will be 
            constructed, allowing to compute higher order derivative products. Defaults 
            to False.
            (An example of create_graph in action appears after the console session below.)
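
Before the console session, here is a minimal sketch (not part of the quoted documentation) of what the gradient argument does for a non-scalar tensor: Y.backward(gradient=alpha) produces the same leaf gradients as reducing Y to the scalar (alpha * Y).sum() and then calling backward() on that scalar, i.e. it backpropagates the vector-Jacobian product. The names X, Y and alpha are chosen to match the session below.

import torch

X = torch.tensor([4.0, 3.0, 2.0], requires_grad=True)
alpha = torch.tensor([10.0, 100.0, 1000.0])
Y = torch.stack([3*X[0] + 7*X[1]**2 + 6*X[2]**3,
                 4*X[0] + 8*X[1]**2 + 3*X[2]**3,
                 5*X[0] + 9*X[1]**2 + 1*X[2]**3])

# Reduce Y to a scalar with the weights alpha, then backpropagate that scalar.
(alpha * Y).sum().backward()
print(X.grad)   # tensor([ 5430., 59220., 16320.]) -- same result as Y.backward(gradient=alpha)
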
Microsoft Windows [Version 10.0.18363.1256]
(c) 2019 Microsoft Corporation. All rights reserved.

C:\Users\chenxuqi>conda activate ssd4pytorch1_2_0

(ssd4pytorch1_2_0) C:\Users\chenxuqi>python
Python 3.7.7 (default, May  6 2020, 11:45:54) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> alpha = torch.tensor([10.0, 100.0, 1000.0])
>>> alpha
tensor([  10.,  100., 1000.])
>>>
>>> X = torch.tensor([4.0, 3.0, 2.0],requires_grad=True)
>>> X
tensor([4., 3., 2.], requires_grad=True)
>>>
>>> Y = torch.zeros(3)
>>> Y
tensor([0., 0., 0.])
>>> x0,x1,x2 = X[0],X[1],X[2]
>>> y0 = 3*x0+7*x1**2+6*x2**3
>>> y1 = 4*x0+8*x1**2+3*x2**3
>>> y2 = 5*x0+9*x1**2+1*x2**3
>>> Y
tensor([0., 0., 0.])
>>> Y[0],Y[1],Y[2] = y0,y1,y2
>>> Y
tensor([123., 112., 109.], grad_fn=<CopySlices>)
>>> X.grad
>>> print(X.grad)
None
>>> print(Y.grad)
None
>>>
>>> Y.backward(gradient=alpha)
>>> print(Y.grad)
None
>>> print(X.grad)
tensor([ 5430., 59220., 16320.])
>>>
>>> # params.grad.zero_()
>>>
>>> Y.backward(gradient=alpha)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Anaconda3\envs\ssd4pytorch1_2_0\lib\site-packages\torch\tensor.py", line 118, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "D:\Anaconda3\envs\ssd4pytorch1_2_0\lib\site-packages\torch\autograd\__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
>>>
>>>
>>>
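
Where do the values tensor([ 5430., 59220., 16320.]) come from? X.grad holds the vector-Jacobian product J^T @ alpha, where J[i][j] = dy_i/dx_j evaluated at X = [4, 3, 2]. The RuntimeError above is also expected: retain_graph defaults to False (the value of create_graph), so the graph is freed after the first backward() call and cannot be traversed a second time. A minimal, standalone check of the numbers (a sketch, not part of the original session):

import torch

alpha = torch.tensor([10.0, 100.0, 1000.0])
x0, x1, x2 = 4.0, 3.0, 2.0
# Jacobian of [y0, y1, y2] w.r.t. [x0, x1, x2] for
# y0 = 3*x0 + 7*x1**2 + 6*x2**3, y1 = 4*x0 + 8*x1**2 + 3*x2**3, y2 = 5*x0 + 9*x1**2 + 1*x2**3
J = torch.tensor([[3.0, 14*x1, 18*x2**2],
                  [4.0, 16*x1,  9*x2**2],
                  [5.0, 18*x1,  3*x2**2]])
print(J.t() @ alpha)   # tensor([ 5430., 59220., 16320.]) -- matches X.grad above
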
>>> alpha = torch.tensor([10.0, 100.0, 1000.0])
>>> alpha
tensor([  10.,  100., 1000.])
>>> X = torch.tensor([4.0, 3.0, 2.0],requires_grad=True)
>>> Y = torch.zeros(3)
>>> x0,x1,x2 = X[0],X[1],X[2]
>>> y0 = 3*x0+7*x1**2+6*x2**3
>>> y1 = 4*x0+8*x1**2+3*x2**3
>>> y2 = 5*x0+9*x1**2+1*x2**3
>>> Y
tensor([0., 0., 0.])
>>> Y[0],Y[1],Y[2] = y0,y1,y2
>>> Y
tensor([123., 112., 109.], grad_fn=<CopySlices>)
>>> print(X.grad)
None
>>> print(Y.grad)
None
>>> Y.backward(gradient=alpha,retain_graph=True)
>>> print(X.grad)
tensor([ 5430., 59220., 16320.])
>>> print(Y.grad)
None
>>> Y.backward(gradient=alpha,retain_graph=True)
>>> print(X.grad)
tensor([ 10860., 118440.,  32640.])
>>> print(Y.grad)
None
>>> Y.backward(gradient=alpha,retain_graph=True)
>>> print(X.grad)
tensor([ 16290., 177660.,  48960.])
>>> print(Y.grad)
None
>>> X.grad.zero_()
tensor([0., 0., 0.])
>>> print(X.grad)
tensor([0., 0., 0.])
>>> print(Y.grad)
None
>>> Y.backward(gradient=alpha,retain_graph=True)
>>> print(X.grad)
tensor([ 5430., 59220., 16320.])
>>> print(Y.grad)
None
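
Note how the gradients accumulate across calls: after the second and third backward() the values in X.grad are exactly 2x and 3x the single-call result, and X.grad.zero_() resets the accumulator so the next call yields the base values again. This zero-then-backward pattern is what a typical training step does. A minimal, self-contained sketch (the parameter w, the data, the target and the learning rate are made up for illustration, not taken from the session above):

import torch

w = torch.tensor([1.0, 1.0, 1.0], requires_grad=True)   # leaf parameter
x = torch.tensor([4.0, 3.0, 2.0])
target = torch.tensor(10.0)
lr = 0.01

for step in range(3):
    if w.grad is not None:
        w.grad.zero_()                     # clear gradients accumulated in the previous step
    loss = ((w * x).sum() - target) ** 2   # scalar loss, so no gradient argument is needed
    loss.backward()
    with torch.no_grad():
        w -= lr * w.grad                   # plain gradient-descent update
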
>>>
>>>
>>> X.grad.zero_()
tensor([0., 0., 0.])
>>> print(X.grad)
tensor([0., 0., 0.])
>>> print(Y.grad)
None
>>> Y.backward(gradient=alpha)
>>> print(X.grad)
tensor([ 5430., 59220., 16320.])
>>> print(Y.grad)
None
>>>
>>>
>>>
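
The session above exercises the gradient and retain_graph arguments but not create_graph. Here is a minimal sketch of what create_graph=True enables, namely building a graph for the gradient itself so that higher-order derivatives can be taken (the variables are illustrative; torch.autograd.grad is used for brevity and accepts the same create_graph flag):

import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 3                                           # y = x^3
g, = torch.autograd.grad(y, x, create_graph=True)    # dy/dx = 3*x**2 = 12, still differentiable
g2, = torch.autograd.grad(g, x)                      # d2y/dx2 = 6*x = 12
print(g.item(), g2.item())                           # 12.0 12.0
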
