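Teacher Liu Er's "PyTorch Deep Learning Practice", Lecture_03: Key Points Review + Code Reproduction + Supplementary Notes
Lecture_03 Gradient Descent
The gradient points in the direction in which the function increases fastest; the negative gradient points in the direction in which it decreases fastest.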
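The statement about the gradient direction can be checked numerically. The following is a minimal sketch, assuming the same toy data and the model y = w * x used later in this post; the step size of 0.01 and the helper names are illustrative choices, not part of the lecture.

```python
# Toy data taken from the code reproduction section of this post
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

def cost(w):
    # Mean squared error of the model y = w * x
    return sum((x * w - y) ** 2 for x, y in zip(x_data, y_data)) / len(x_data)

def gradient(w):
    # Derivative of the mean squared error with respect to w
    return sum(2 * x * (x * w - y) for x, y in zip(x_data, y_data)) / len(x_data)

w0, step = 1.0, 0.01  # starting weight and an assumed small step size
g = gradient(w0)
cost_down = cost(w0 - step * g)  # move against the gradient
cost_up = cost(w0 + step * g)    # move along the gradient

print("cost at w0: %.3f" % cost(w0))
print("after a step against the gradient: %.3f" % cost_down)
print("after a step along the gradient: %.3f" % cost_up)
```

Stepping against the gradient lowers the cost, while stepping along it raises the cost, which is why the update rule subtracts the gradient.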
I. Key Points Review
(1) (Batch) Gradient Descent
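For reference, the batch update can be written out for the linear model y = wx used in the code reproduction section. This is a sketch consistent with that code; the symbols N (number of samples) and \alpha (learning rate, 0.01 in the code) are notation assumed here.

```latex
\mathrm{cost}(w) = \frac{1}{N}\sum_{n=1}^{N} (x_n w - y_n)^2
\qquad
\frac{\partial\,\mathrm{cost}}{\partial w} = \frac{1}{N}\sum_{n=1}^{N} 2\,x_n\,(x_n w - y_n)
\qquad
w \leftarrow w - \alpha\,\frac{\partial\,\mathrm{cost}}{\partial w}
```

Each update averages the gradient over all N samples before changing w.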
(2) Stochastic Gradient Descent
Note: batch gradient descent uses all samples for each update, whereas stochastic gradient descent uses a single randomly drawn sample for each update.
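For the linear model y = wx used in the code reproduction section, the single-sample update can be sketched as follows. The index n (the sample drawn for this update) and \alpha (learning rate, 0.01 in the code) are notation assumed here.

```latex
\mathrm{loss}_n(w) = (x_n w - y_n)^2
\qquad
\frac{\partial\,\mathrm{loss}_n}{\partial w} = 2\,x_n\,(x_n w - y_n)
\qquad
w \leftarrow w - \alpha\,\frac{\partial\,\mathrm{loss}_n}{\partial w}
```

Here w changes after every single sample, so one pass over N samples performs N updates instead of one.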
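(3) Mini-Batch Stochastic Gradient Descent
Batch gradient descent can process all samples in parallel, so it is fast, but the resulting model tends to perform worse. Stochastic gradient descent processes one sample at a time, and each update depends on the previous one, so the model tends to perform better but training takes longer. Mini-batch stochastic gradient descent is the compromise: each iteration uses batch_size samples.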
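The lecture code below covers only the batch and single-sample variants, so here is a minimal sketch of the mini-batch version on the same toy data. The function name minibatch_sgd, the default batch_size of 2, the fixed seed, and the per-epoch shuffling are assumptions made for illustration.

```python
import random

# Toy data taken from the code reproduction section of this post
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

def minibatch_sgd(xs, ys, w=1.0, lr=0.01, batch_size=2, epochs=100, seed=0):
    """Fit y = w * x with mini-batch SGD and return the learned w."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new random order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Average the per-sample gradients over the current mini-batch
            grad = sum(2 * xs[i] * (xs[i] * w - ys[i]) for i in batch) / len(batch)
            w -= lr * grad
    return w

w_fit = minibatch_sgd(x_data, y_data)
print("w = %.2f, prediction for x = 4: %.2f" % (w_fit, 4 * w_fit))
```

With batch_size equal to the number of samples this reduces to batch gradient descent, and with batch_size = 1 it reduces to stochastic gradient descent.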
II. Code Reproduction
(1) Gradient Descent
import numpy as np
import matplotlib.pyplot as plt
# Known data:
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
# Linear model: y = w * x; goal: predict y at x = 4
# Initial guess for the weight:
w = 1.0
# Define the model:
def forward(x):
    return x * w
# Define the cost function (mean squared error over all samples):
def cost(xs, ys):
    total = 0
    for x, y in zip(xs, ys):
        y_pred = forward(x)
        total += (y_pred - y) ** 2
    return total / len(xs)
# Define the gradient of the cost with respect to w:
def gradient(xs, ys):
    grad = 0
    for x, y in zip(xs, ys):
        grad += 2 * x * (x * w - y)
    return grad / len(xs)
print("Prediction before training:", 4, '%.2f' % (forward(4)))
for epoch in range(100):
    cost_val = cost(x_data, y_data)  # cost before this epoch's update
    grad_val = gradient(x_data, y_data)
    w -= 0.01 * grad_val  # learning rate 0.01
    print("Epoch:%d, w = %.2f, loss = %.2f" % (epoch, w, cost_val))
print("Prediction after training:", 4, '%.2f' % (forward(4)))
(2) Stochastic Gradient Descent
import numpy as np
import matplotlib.pyplot as plt
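# Known data:
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
# Linear model: y = w * x; goal: predict y at x = 4
# Initial guess for the weight:
w = 1.0
# Define the model:
def forward(x):
    return x * w
# Define the loss for a single sample:
def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) ** 2
# Define the gradient of the single-sample loss with respect to w:
def gradient(x, y):
    return 2 * x * (x * w - y)
print("Prediction before training:", 4, '%.2f' % (forward(4)))
for epoch in range(100):
    # Samples are visited in order here; w is updated after every sample
    for x, y in zip(x_data, y_data):
        grad = gradient(x, y)
        w = w - 0.01 * grad  # learning rate 0.01
        print("\tgrad:%.1f %.1f %.2f" % (x, y, grad))  # prints x, y, grad
        l = loss(x, y)
    print("Epoch:%d, w = %.2f, loss = %.2f" % (epoch, w, l))
print("Prediction after training:", 4, '%.2f' % (forward(4)))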
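The loop above walks through the samples in a fixed order, while the review section describes drawing one sample at random for each update. A minimal sketch of that random-draw variant follows; the function name sgd_random, the 300 update steps, and the fixed seed are assumptions for illustration and do not come from the lecture.

```python
import random

# Same toy data as above
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

def sgd_random(xs, ys, w=1.0, lr=0.01, steps=300, seed=0):
    """Fit y = w * x, drawing one random sample per update; returns w."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    for _ in range(steps):
        i = rng.randrange(len(xs))              # pick one sample at random
        grad = 2 * xs[i] * (xs[i] * w - ys[i])  # gradient of that sample's loss
        w -= lr * grad
    return w

w_rand = sgd_random(x_data, y_data)
print("w = %.2f, prediction for x = 4: %.2f" % (w_rand, 4 * w_rand))
```

On this noise-free data both the ordered and the random-draw versions approach w = 2; the order of the samples matters more on larger, noisier datasets.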