Principle:
Since backward() computes the gradient, a variable can be updated directly by gradient descent:
a -= lr * a.grad    # lr is the learning rate (0.1 in the code below)
Code:
#%%
import torch

#%% Gradient descent on a single scalar: minimize loss = a**2
a = torch.tensor(2.0, dtype=torch.float32, requires_grad=True)
loss = a ** 2  # graph built once; reused below via retain_graph=True
for i in range(100):
    a.grad = torch.zeros_like(a)      # reset the gradient (it accumulates otherwise)
    loss.backward(retain_graph=True)  # d(loss)/da = 2*a at the *current* value:
                                      # the in-place .data update below also changes
                                      # the tensor autograd saved for this graph
    a.data -= 0.1 * a.grad            # descent step with learning rate 0.1
    print(a)
Output:
tensor(1.6000, requires_grad=True)
tensor(1.2800, requires_grad=True)
tensor(1.0240, requires_grad=True)
tensor(0.8192, requires_grad=True)
tensor(0.6554, requires_grad=True)
tensor(0.5243, requires_grad=True)
tensor(0.4194, requires_grad=True)
...
tensor(4.7428e-09, requires_grad=True)
tensor(3.7943e-09, requires_grad=True)
tensor(3.0354e-09, requires_grad=True)
tensor(2.4283e-09, requires_grad=True)
tensor(1.9427e-09, requires_grad=True)
tensor(1.5541e-09, requires_grad=True)
tensor(1.2433e-09, requires_grad=True)
tensor(9.9465e-10, requires_grad=True)
tensor(7.9572e-10, requires_grad=True)
tensor(6.3657e-10, requires_grad=True)
tensor(5.0926e-10, requires_grad=True)
tensor(4.0741e-10, requires_grad=True)
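For reference, this manual update is exactly what a stock optimizer performs. Below is a minimal sketch using torch.optim.SGD with the same learning rate; the loss is rebuilt inside the loop, so retain_graph=True is no longer needed, and the printed sequence should match the one above (the update is still a = a - 0.1*2a = 0.8*a):

#%% Equivalent loop using torch.optim.SGD
import torch

a = torch.tensor(2.0, dtype=torch.float32, requires_grad=True)
opt = torch.optim.SGD([a], lr=0.1)  # same learning rate as the manual loop
for i in range(100):
    opt.zero_grad()     # replaces the manual a.grad = torch.zeros_like(a)
    loss = a ** 2       # rebuild the graph each step, so no retain_graph
    loss.backward()
    opt.step()          # performs a -= 0.1 * a.grad internally
    print(a)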