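Lecture 3: gradient_decent source code and stochastic_gradient_decent source code.
Bilibili, 刘二大人; link: PyTorch Deep Learning Practice: Backpropagation
Reference: 错错莫课代表's "PyTorch Deep Learning Practice, Lecture 4"
backpropagation source code (with visualization)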
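Notes:
1. When printing or otherwise reading out the value of a tensor, do not use the tensor directly; append .item() to get a plain Python number.
2. Using a tensor directly in an expression extends the computational graph, so the weight update must go through .data.
3. After each weight update, clear the gradient with w.grad.data.zero_(); otherwise the next backward() call adds to the stored value (w itself keeps changing).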
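The three notes above can be seen in a small sketch. It reuses the post's single weight `w` and squared-error loss; the sample `(2.0, 4.0)` and the use of `torch.no_grad()` for the update are choices made here for illustration (the post's own code updates through `.data` instead).

```python
import torch

w = torch.tensor([1.0], requires_grad=True)


def loss(x, y):
    # Same squared-error loss as in this post: (x*w - y)^2
    return (x * w - y) ** 2


# Note 1: .item() turns a one-element tensor into a plain Python float.
loss(2.0, 4.0).backward()
first_grad = w.grad.item()        # analytic value: 2*x*(x*w - y)

# Note 3: without zeroing, a second backward() adds to the stored gradient.
loss(2.0, 4.0).backward()
accumulated_grad = w.grad.item()  # twice the single-sample gradient

# Note 2: update the weight outside the graph. torch.no_grad() is used here
# as an alternative to the .data assignment shown in the post.
with torch.no_grad():
    w -= 0.01 * w.grad
w.grad.zero_()

print(first_grad, accumulated_grad, w.item(), w.grad.item())
```

Either style keeps the update itself out of the computational graph; the point of the sketch is only that the gradient doubles when it is not cleared between backward passes.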
import torch
import matplotlib.pyplot as plt

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = torch.tensor([1.0])
w.requires_grad = True


def forward(x):
    return x * w


def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) ** 2


print("predict (before training)", 4, forward(4).item())

mse_list = []
epoch_list = []
for epoch in range(100):
    cost = 0
    for x, y in zip(x_data, y_data):
        l = loss(x, y)
        l.backward()  # backward, compute grad for Tensor whose requires_grad set to True
        print('\tgrad:', x, y, w.grad.item())
        w.data = w.data - 0.01 * w.grad.data
        w.grad.data.zero_()
        cost += l.item()
    mse_list.append(cost / len(x_data))
    epoch_list.append(epoch)
    print('progress:', epoch, l.item())

print("predict (after training)", 4, forward(4).item())

plt.plot(epoch_list, mse_list)
plt.xlabel('epoch')
plt.ylabel('cost')
plt.show()
Visualization:
Partial output:
predict (before training) 4 4.0
grad: 1.0 2.0 -2.0
grad: 2.0 4.0 -7.840000152587891
grad: 3.0 6.0 -16.228801727294922
progress: 0 7.315943717956543
grad: 1.0 2.0 -1.478623867034912
grad: 2.0 4.0 -5.796205520629883
grad: 3.0 6.0 -11.998146057128906
progress: 1 3.9987640380859375
grad: 1.0 2.0 -1.0931644439697266
grad: 2.0 4.0 -4.285204887390137
grad: 3.0 6.0 -8.870372772216797
progress: 2 2.1856532096862793
grad: 1.0 2.0 -0.8081896305084229
grad: 2.0 4.0 -3.1681032180786133
grad: 3.0 6.0 -6.557973861694336
...
progress: 98 9.094947017729282e-13
grad: 1.0 2.0 -7.152557373046875e-07
grad: 2.0 4.0 -2.86102294921875e-06
grad: 3.0 6.0 -5.7220458984375e-06
progress: 99 9.094947017729282e-13
predict (after training) 4 7.99999856948852
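The first three grad values in the output above can be checked by hand. For loss = (x*w - y)^2 the gradient with respect to w is 2*x*(x*w - y). The sketch below replays the first epoch with plain Python floats; the variable names are chosen here for illustration, and the results should agree with the printed values only up to float32 rounding.

```python
# Replay the first epoch of the per-sample updates without autograd.
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0      # same initial weight as the post
lr = 0.01    # same learning rate as the post
grads = []
for x, y in zip(x_data, y_data):
    grad = 2 * x * (x * w - y)   # d/dw of (x*w - y)**2
    grads.append(grad)
    w = w - lr * grad            # per-sample update, as in the training loop

print(grads)
```

Because each update changes w before the next sample is seen, the second and third gradients are not simply multiples of the first.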
Homework
Derivation
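The original derivation is not reproduced in the text, so the following is a sketch of the gradients that autograd computes for the homework model, written with the same names as the homework code (w1, w2, b) and its per-sample squared-error loss.

```latex
\begin{aligned}
\hat{y} &= w_1 x^2 + w_2 x + b, \qquad \text{loss} = (\hat{y} - y)^2 \\
\frac{\partial\,\text{loss}}{\partial w_1} &= 2(\hat{y} - y)\,x^2 \\
\frac{\partial\,\text{loss}}{\partial w_2} &= 2(\hat{y} - y)\,x \\
\frac{\partial\,\text{loss}}{\partial b} &= 2(\hat{y} - y)
\end{aligned}
```

As a check, with w1 = w2 = b = 1 and the sample (x, y) = (1, 2), the prediction is 3 and all three gradients equal 2, which matches the first grad line of the homework output.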
backpropagation_homework source code (with visualization)
import torch
import matplotlib.pyplot as plt

x_data = [1.0, 2.0, 3.0]  # same dataset as before
y_data = [2.0, 4.0, 6.0]

w1 = torch.tensor([1.0])
w1.requires_grad = True
w2 = torch.tensor([1.0])
w2.requires_grad = True
b = torch.tensor([1.0])
b.requires_grad = True


# Forward pass: a quadratic model
def forward(x):
    return w1 * x**2 + w2 * x + b


# Calling loss() builds the computational graph
def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) ** 2


print("predict (before training)", 4, forward(4).item())

mse_list = []
epoch_list = []
for epoch in range(100):
    cost = 0
    for x, y in zip(x_data, y_data):
        l = loss(x, y)
        l.backward()  # backward, compute grad for Tensor whose requires_grad set to True
        print('\tgrad:', x, y, w1.grad.item(), w2.grad.item(), b.grad.item())
        w1.data = w1.data - 0.01 * w1.grad.data
        w2.data = w2.data - 0.01 * w2.grad.data
        b.data = b.data - 0.01 * b.grad.data
        w1.grad.data.zero_()
        w2.grad.data.zero_()
        b.grad.data.zero_()
        cost += l.item()
    mse_list.append(cost / len(x_data))
    epoch_list.append(epoch)
    print('progress:', epoch, l.item())

print("predict (after training)", 4, forward(4).item())

plt.plot(epoch_list, mse_list)
plt.xlabel('epoch')
plt.ylabel('cost')
plt.show()
Partial output:
predict (before training) 4 21.0
grad: 1.0 2.0 2.0 2.0 2.0
grad: 2.0 4.0 22.880001068115234 11.440000534057617 5.720000267028809
grad: 3.0 6.0 77.04720306396484 25.682401657104492 8.560800552368164
progress: 0 18.321826934814453
grad: 1.0 2.0 -1.1466078758239746 -1.1466078758239746 -1.1466078758239746
grad: 2.0 4.0 -15.536651611328125 -7.7683258056640625 -3.8841629028320312
grad: 3.0 6.0 -30.432214736938477 -10.144071578979492 -3.381357192993164
progress: 1 2.858394145965576
grad: 1.0 2.0 0.3451242446899414 0.3451242446899414 0.3451242446899414
grad: 2.0 4.0 2.4273414611816406 1.2136707305908203 0.6068353652954102
grad: 3.0 6.0 19.449920654296875 6.483306884765625 2.161102294921875
...
progress: 97 0.0063657015562057495
grad: 1.0 2.0 0.3163881301879883 0.3163881301879883 0.3163881301879883
grad: 2.0 4.0 -1.7319889068603516 -0.8659944534301758 -0.4329972267150879
grad: 3.0 6.0 1.4334239959716797 0.47780799865722656 0.1592693328857422
progress: 98 0.0063416799530386925
grad: 1.0 2.0 0.31661415100097656 0.31661415100097656 0.31661415100097656
grad: 2.0 4.0 -1.7297439575195312 -0.8648719787597656 -0.4324359893798828
grad: 3.0 6.0 1.4307546615600586 0.47691822052001953 0.15897274017333984
progress: 99 0.00631808303296566
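predict (after training) 4 8.544171333312988
The final prediction is about 8.54, clearly not 8. The quadratic model is a poor match for this linear dataset: it can represent y = 2x exactly (w1 = 0, w2 = 2, b = 0), but 100 epochs at a learning rate of 0.01 do not bring the weights there.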
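One way to probe this result is to rerun the same per-sample updates for many more epochs and compare the prediction at x = 4. The sketch below does this in plain Python with hand-derived gradients instead of autograd. The helper names `train` and `predict` and the choice of 100000 epochs are assumptions made here, not part of the original homework, and how quickly the prediction approaches 8 is something to read off the run rather than from this sketch.

```python
# Pure-Python rerun of the homework's per-sample updates, using the
# hand-derived gradients of (w1*x**2 + w2*x + b - y)**2.
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]


def train(epochs, lr=0.01):
    w1, w2, b = 1.0, 1.0, 1.0   # same starting point as the post
    for _ in range(epochs):
        for x, y in zip(x_data, y_data):
            err = w1 * x**2 + w2 * x + b - y
            g1, g2, gb = 2 * err * x**2, 2 * err * x, 2 * err
            w1, w2, b = w1 - lr * g1, w2 - lr * g2, b - lr * gb
    return w1, w2, b


def predict(params, x):
    w1, w2, b = params
    return w1 * x**2 + w2 * x + b


pred_100 = predict(train(100), 4.0)
pred_long = predict(train(100000), 4.0)
print("after 100 epochs:   ", pred_100)
print("after 100000 epochs:", pred_long)
```

The 100-epoch value should land near the 8.54 reported above, since the update rule is the same apart from float precision.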
Visualization: