For a very complex network, the gradients cannot be written down directly. But if we treat the network as a graph and propagate gradients through that graph, every gradient can be computed automatically: this is backpropagation.
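As a quick illustration of this idea (a minimal sketch, separate from the example below): PyTorch's autograd records each operation on a tracked tensor into such a graph during the forward pass, then walks the graph backwards when `backward()` is called.

```python
import torch

# Leaf node of the graph; requires_grad makes autograd track it
x = torch.tensor(2.0, requires_grad=True)

# Forward pass builds the graph: y = x^2 + 3x
y = x ** 2 + 3 * x

# Backward pass propagates gradients through the graph
y.backward()

# dy/dx = 2x + 3 = 7 at x = 2
print(x.grad.item())  # 7.0
```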
Code implementation: computing the gradient of y = wx
import torch

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

# Create the weight tensor with torch.Tensor; 1.0 is the initial value of w
w = torch.Tensor([1.0])
# w needs its gradient computed, so set requires_grad to True
w.requires_grad = True

# Build the model
def forward(x):
    return x * w

# Loss for a single sample
def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) ** 2

# Prediction before training
print("predict (before training)", 4, forward(4).item())

# Train for 100 epochs
for epoch in range(100):
    for x, y in zip(x_data, y_data):
        # Compute the loss
        l = loss(x, y)
        # Backpropagate through the loss to compute the gradient
        l.backward()
        # item() extracts the value from the gradient tensor as a Python scalar
        print('\tgrad:', x, y, w.grad.item())
        # Update the weight; 0.02 is the learning rate
        w.data = w.data - 0.02 * w.grad.data
        # Zero the gradient so it does not accumulate into the next step
        w.grad.data.zero_()
        print("Process:", epoch, l.item())

print("predict (after training)", 4, forward(4).item())
The output is:
predict (before training) 4 4.0
grad: 1.0 2.0 -2.0
Process: 0 1.0
grad: 2.0 4.0 -7.680000305175781
Process: 0 3.6864001750946045
grad: 3.0 6.0 -14.515201568603516
Process: 0 5.852529525756836
grad: 1.0 2.0 -1.0321919918060303
Process: 1 0.26635506749153137
grad: 2.0 4.0 -3.9636173248291016
Process: 1 0.981891393661499
grad: 3.0 6.0 -7.491236686706543
Process: 1 1.5588507652282715
grad: 1.0 2.0 -0.532710075378418
Process: 2 0.07094500958919525
grad: 2.0 4.0 -2.0456066131591797
Process: 2 0.26153165102005005
grad: 3.0 6.0 -3.8661975860595703
Process: 2 0.4152078926563263
grad: 1.0 2.0 -0.2749295234680176
......
Process: 97 2.2737367544323206e-13
grad: 1.0 2.0 -2.384185791015625e-07
Process: 98 1.4210854715202004e-14
grad: 2.0 4.0 -9.5367431640625e-07
Process: 98 5.684341886080802e-14
grad: 3.0 6.0 -2.86102294921875e-06
Process: 98 2.2737367544323206e-13
grad: 1.0 2.0 -2.384185791015625e-07
Process: 99 1.4210854715202004e-14
grad: 2.0 4.0 -9.5367431640625e-07
Process: 99 5.684341886080802e-14
grad: 3.0 6.0 -2.86102294921875e-06
Process: 99 2.2737367544323206e-13
predict (after training) 4 7.999999523162842
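Mutating `w.data` works, but a common and more idiomatic alternative is to perform the update inside a `torch.no_grad()` block so autograd does not track the update itself. A sketch of the same training loop in that style (same data, same learning rate; this is an equivalent rewrite, not part of the original example):

```python
import torch

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = torch.tensor([1.0], requires_grad=True)

for epoch in range(100):
    for x, y in zip(x_data, y_data):
        l = (x * w - y) ** 2
        l.backward()
        # no_grad() stops autograd from recording the update step
        with torch.no_grad():
            w -= 0.02 * w.grad
            w.grad.zero_()

print(w.item())  # converges to roughly 2.0
```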
Code implementation: computing the gradients of y = w1x^2 + w2x + b
import torch

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w1 = torch.tensor([1.0])
w2 = torch.tensor([2.0])
b = torch.tensor([3.0])
w1.requires_grad = True
w2.requires_grad = True
b.requires_grad = True

# Learning rate
r = 0.02

def forward(x):
    return x ** 2 * w1 + x * w2 + b

def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) ** 2

print('predict(before training):', 4, forward(4).item())

for epoch in range(100):
    for x, y in zip(x_data, y_data):
        l = loss(x, y)
        l.backward()
        print('\tgrad:', x, y, w1.grad.item(), w2.grad.item(), b.grad.item())
        # Gradient-descent step for each parameter
        w1.data -= r * w1.grad.data
        w2.data -= r * w2.grad.data
        b.data -= r * b.grad.data
        # Zero every gradient before the next sample
        w1.grad.data.zero_()
        w2.grad.data.zero_()
        b.grad.data.zero_()
    print('progress:', epoch, l.item())

print('predict(after training):', 4, forward(4).item())
The output is:
predict(before training): 4 27.0
grad: 1.0 2.0 8.0 8.0 8.0
grad: 2.0 4.0 47.040000915527344 23.520000457763672 11.760000228881836
grad: 3.0 6.0 -3.4847946166992188 -1.1615982055664062 -0.38719940185546875
progress: 0 0.03748084604740143
grad: 1.0 2.0 3.9485440254211426 3.9485440254211426 3.9485440254211426
grad: 2.0 4.0 5.767963409423828 2.883981704711914 1.441990852355957
grad: 3.0 6.0 -31.601348876953125 -10.533782958984375 -3.511260986328125
progress: 1 3.0822384357452393
grad: 1.0 2.0 4.896817207336426 4.896817207336426 4.896817207336426
grad: 2.0 4.0 19.595916748046875 9.797958374023438 4.898979187011719
grad: 3.0 6.0 -15.325738906860352 -5.108579635620117 -1.702859878540039
progress: 2 0.7249329686164856
grad: 1.0 2.0 3.8229713439941406 3.8229713439941406 3.8229713439941406
grad: 2.0 4.0 10.569290161132812 5.284645080566406 2.642322540283203
grad: 3.0 6.0 -18.334705352783203 -6.111568450927734 -2.037189483642578
progress: 3 1.0375351905822754
grad: 1.0 2.0 3.6837034225463867 3.6837034225463867 3.6837034225463867
grad: 2.0 4.0 11.581207275390625 5.7906036376953125 2.8953018188476562
grad: 3.0 6.0 -13.655387878417969 -4.551795959472656 -1.5172653198242188
......
progress: 96 0.01702137291431427
grad: 1.0 2.0 0.563727855682373 0.563727855682373 0.563727855682373
grad: 2.0 4.0 -2.8896255493164062 -1.4448127746582031 -0.7224063873291016
grad: 3.0 6.0 2.344860076904297 0.7816200256347656 0.2605400085449219
progress: 97 0.016970273107290268
grad: 1.0 2.0 0.5628738403320312 0.5628738403320312 0.5628738403320312
grad: 2.0 4.0 -2.885272979736328 -1.442636489868164 -0.721318244934082
grad: 3.0 6.0 2.3413238525390625 0.7804412841796875 0.2601470947265625
progress: 98 0.016919128596782684
grad: 1.0 2.0 0.5620212554931641 0.5620212554931641 0.5620212554931641
grad: 2.0 4.0 -2.88092041015625 -1.440460205078125 -0.7202301025390625
grad: 3.0 6.0 2.3378047943115234 0.7792682647705078 0.25975608825683594
progress: 99 0.016868306323885918
predict(after training): 4 8.094977378845215
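The very first gradient line in the output above (`8.0 8.0 8.0`) can be checked by hand against the chain rule: with l = (y_pred - y)^2, we have dl/dw1 = 2(y_pred - y)x^2, dl/dw2 = 2(y_pred - y)x, and dl/db = 2(y_pred - y). A sketch of that check at the initial parameter values:

```python
import torch

# Initial parameter values from the example above
w1 = torch.tensor([1.0], requires_grad=True)
w2 = torch.tensor([2.0], requires_grad=True)
b = torch.tensor([3.0], requires_grad=True)

x, y = 1.0, 2.0
y_pred = x ** 2 * w1 + x * w2 + b   # = 1 + 2 + 3 = 6
l = (y_pred - y) ** 2
l.backward()

# Analytic chain rule, computed by hand for comparison
residual = 6.0 - y                  # y_pred - y = 4.0
print(w1.grad.item(), 2 * residual * x ** 2)  # 8.0 8.0
print(w2.grad.item(), 2 * residual * x)       # 8.0 8.0
print(b.grad.item(), 2 * residual)            # 8.0 8.0
```

This matches the printed `grad: 1.0 2.0 8.0 8.0 8.0`, confirming that autograd reproduces the hand-derived gradients.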