Dive into Deep Learning (PyTorch) Study Notes 4: Automatic Differentiation (Exercises) [Study Notes]

Article link: https://blog.csdn.net/weixin_50995339/article/details/140994086

Note: the code in this post was run in a Jupyter notebook.

1 Why is computing the second derivative more expensive than computing the first derivative?

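More computation steps: the second derivative is obtained by differentiating the first derivative again, so it adds work on top of the first-order pass.

Higher precision requirements: to keep the second derivative accurate, the first derivative may need to be computed more precisely, which usually costs more computing resources. This applies mainly to numerical approximations; automatic differentiation is exact up to floating-point rounding.

Computation graph complexity: in automatic differentiation, computing a second derivative means recording the first backward pass as a graph of its own and then traversing that larger graph.

Higher memory usage: computing the second derivative requires keeping more intermediate results, which increases memory consumption.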
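
As a rough illustration of the points above, here is a minimal sketch that computes a first and a second derivative in PyTorch. The function y = x^3 and the variable names are chosen only for this illustration. The first call has to pass create_graph=True so that the backward pass itself is recorded, and that recorded graph is where the extra computation and memory come from.

```python
import torch

# Illustrative function (an assumption for this sketch): y = x^3 at x = 2
x = torch.tensor(2.0, requires_grad=True)
y = x ** 3

# First derivative. create_graph=True records the backward pass so that
# the result can itself be differentiated.
(dy_dx,) = torch.autograd.grad(y, x, create_graph=True)

# Second derivative: differentiate the first derivative with respect to x.
(d2y_dx2,) = torch.autograd.grad(dy_dx, x)

print(dy_dx.item())    # 3 * x^2
print(d2y_dx2.item())  # 6 * x
```

Without create_graph=True, the first derivative would not be attached to any graph and the second call would fail.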

2 After running the backpropagation function, immediately run it again and see what happens.

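x.grad.zero_()  # reset the gradient (assumes x was defined earlier in the notebook with requires_grad=True)
y = x * x
u = y.detach()  # detach() returns a tensor that is separated from the computation graph.
# The new tensor no longer depends on the original graph, so no gradient flows back through it.
z = u * x

z.sum().backward()
x.grad == u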

z.sum().backward()
x.grad == u

Error: RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
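
Following the hint in the error message, a minimal sketch of the same computation with retain_graph=True might look like this. The definition of x is an assumption here (a small vector like the one used in the book's examples). Note that the second call does not overwrite x.grad but adds to it.

```python
import torch

# Assumed setup: a small vector that requires gradients
x = torch.arange(4.0, requires_grad=True)
y = x * x
u = y.detach()  # treated as a constant from here on
z = u * x

# Keep the saved intermediate values so the graph can be traversed again
z.sum().backward(retain_graph=True)
first = x.grad.clone()   # gradient after the first pass: u

# The second backward pass now succeeds; gradients accumulate in x.grad
z.sum().backward()
second = x.grad.clone()  # gradient after the second pass: 2 * u

print(first)
print(second)
```

If accumulation is not wanted, calling x.grad.zero_() between the two passes resets the gradient.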

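3 In the control-flow example we compute the derivative of d with respect to a. What happens if the variable a is changed to a random vector or matrix?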

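import torch

def f(a):
    b = a * 2
    while b.norm() < 1000:  # Frobenius norm (L2 norm over all elements)
        b = b * 2
    if b.sum() > 0:
        c = b
    else:
        c = 100 * b
    return c

# Compute the gradient
a = torch.randn((3, 3), requires_grad=True)
d = f(a)
d.backward()  # fails: d is a 3x3 matrix, not a scalar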

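Error: RuntimeError: grad can be implicitly created only for scalar outputs

Calling backward() without arguments only works on a scalar output. Here d has the same 3x3 shape as a, so it must either be reduced to a scalar first or be given an explicit gradient argument.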
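
One way to make this run, sketched under the assumption that the sum of the outputs is the quantity of interest, is to reduce d to a scalar before calling backward(). Since f only multiplies a by a constant k that depends on the branch taken, every element of the gradient should equal k, which is also d / a. The fixed seed is added only to make the sketch reproducible.

```python
import torch

def f(a):
    b = a * 2
    while b.norm() < 1000:
        b = b * 2
    if b.sum() > 0:
        c = b
    else:
        c = 100 * b
    return c

torch.manual_seed(0)  # assumption: fixed seed for reproducibility
a = torch.randn((3, 3), requires_grad=True)
d = f(a)

# Reduce to a scalar first; equivalent to d.backward(torch.ones_like(d))
d.sum().backward()

# f is piecewise linear in a, so the gradient is the scaling factor d / a
matches = torch.allclose(a.grad, (d / a).detach())
print(matches)
```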

4 Redesign an example of computing the gradient of a control flow, then run it and analyze the result.

def f(a):
    b = a * a
    c = b
    return c

# Compute the gradient
a = torch.randn(size=(), requires_grad=True)
d = f(a)
d.backward()
a.grad == 2 * a  # f(a) = a * a, so the gradient should be 2a
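
The function above contains no branch or loop, so as a complement here is a minimal sketch with real control flow. The function g and the helper grad_of_g are hypothetical names invented for this illustration. The gradient depends on which branch is taken at run time: for a > 0 the result is 8a^2 with gradient 16a, otherwise it is -8a with gradient -8.

```python
import torch

def g(a):
    # Hypothetical piecewise function used only for illustration
    if a > 0:
        b = a * a   # positive branch: a^2
    else:
        b = -a      # other branch: -a
    for _ in range(3):
        b = b * 2   # the loop scales the result by 2^3 = 8
    return b

def grad_of_g(value):
    # Build a fresh scalar input, run g, and return dg/da as a Python float
    a = torch.tensor(value, requires_grad=True)
    g(a).backward()
    return a.grad.item()

pos = grad_of_g(1.5)   # positive branch: 16 * 1.5
neg = grad_of_g(-2.0)  # other branch: -8
print(pos, neg)
```

Autograd only records the operations that were actually executed, so each input value is differentiated along its own path through the branch and the loop.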

5 Plot f(x) = sin(x) and its derivative with respect to x, without using cos(x) to compute the derivative.

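import torch
import matplotlib.pyplot as plt
%matplotlib inline
x = torch.arange(0.0, 20, 0.01)
x.requires_grad_(True)
x1 = x.detach()  # a copy of x separated from the computation graph, used for plotting
y1 = torch.sin(x1)  # y = sin(x), no gradient tracking
y2 = torch.sin(x)
y2.sum().backward()  # derivative of y with respect to x, stored in x.grad
plt.plot(x1, y1, label='sin(x)')  # the sin(x) curve
plt.plot(x1, x.grad, label='derivative of sin(x)')  # the derivative curve from autograd
plt.legend()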
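
As an optional sanity check, and only as a check since the exercise forbids using cos(x) for the plot itself, the gradient obtained from autograd can be compared against cos(x) numerically. This sketch repeats the setup above without the plotting; max_err is a name introduced here.

```python
import torch

# Same grid as above, created directly with gradient tracking enabled
x = torch.arange(0.0, 20, 0.01, requires_grad=True)
torch.sin(x).sum().backward()

# Largest absolute difference between the autograd result and cos(x)
max_err = (x.grad - torch.cos(x.detach())).abs().max().item()
print(max_err)
```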