Linear model:
\hat{y}=\omega * x
Stochastic gradient descent:
\omega=\omega-\alpha\frac{\partial loss}{\partial \omega}
Loss function:
loss=(\hat{y}-y)^2=(x*\omega-y)^2
Derivative of the loss function:
\frac{\partial loss_n}{\partial \omega}=2*x_n*(x_n*\omega-y_n)
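The update rule and the analytic gradient above can be sketched in plain Python, without autograd. This is a minimal illustration, assuming the same learning rate (0.01), initial weight (1.0), and data as the PyTorch code in this section:

```python
# Manual stochastic gradient descent for y_hat = w * x,
# using the analytic gradient d(loss)/dw = 2 * x * (x * w - y).
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0        # initial weight
alpha = 0.01   # learning rate

for epoch in range(100):
    for x, y in zip(x_data, y_data):
        grad = 2 * x * (x * w - y)   # analytic gradient from the formula above
        w = w - alpha * grad         # SGD update

print(w)  # converges toward 2.0, since y = 2x fits the data exactly
```

Because the data were generated by y = 2x, the loss is minimized at ω = 2, and the weight approaches that value.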
Diagram:
Backpropagation computation graph:
Code:
import torch

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = torch.Tensor([1.0])   # build the weight
w.requires_grad = True    # gradients must be computed for w

def forward(x):           # the weight w here is a Tensor
    return x * w

def loss(x, y):           # define the loss function
    y_pred = forward(x)
    return (y_pred - y) ** 2

print("predict (before training)", 4, forward(4).item())
for epoch in range(100):
    for x, y in zip(x_data, y_data):
        l = loss(x, y)    # forward pass: loss is a tensor
        l.backward()      # backward pass: computes the gradient and stores it in w.grad; use w.grad.data when updating
        print('\tgrad:', x, y, w.grad.item())  # item() extracts the gradient as a Python scalar
        w.data = w.data - 0.01 * w.grad.data   # use the gradient to update the weight
        w.grad.data.zero_()  # gradients accumulate across backward passes, so clear them after each update
    print("progress:", epoch, l.item())
print("predict (after training)", 4, forward(4).item())
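To sanity-check that `l.backward()` really produces the analytic gradient 2·x·(x·ω − y), the formula can be compared against a finite-difference approximation. A sketch in plain Python (function names here are illustrative, not from the original code):

```python
def loss_fn(w, x, y):
    # same squared-error loss as in the training code
    return (x * w - y) ** 2

def analytic_grad(w, x, y):
    # the hand-derived gradient d(loss)/dw
    return 2 * x * (x * w - y)

def numeric_grad(w, x, y, eps=1e-6):
    # central-difference approximation of d(loss)/dw
    return (loss_fn(w + eps, x, y) - loss_fn(w - eps, x, y)) / (2 * eps)

w, x, y = 1.0, 2.0, 4.0
print(analytic_grad(w, x, y))   # -8.0
print(numeric_grad(w, x, y))    # approximately -8.0
```

The two values agree to within the finite-difference error, which is what autograd is computing internally for this one-parameter model.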
Result:
Exercise:
y=x^2-2x+1
x:1 2 3 4
y:0 1 4 ?
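The table matches y = x² − 2x + 1 = (x − 1)² exactly, so the missing entry can be checked directly:

```python
def f(x):
    # the exercise's target function: y = x^2 - 2x + 1 = (x - 1)^2
    return x ** 2 - 2 * x + 1

xs = [1, 2, 3, 4]
print([f(x) for x in xs])  # [0, 1, 4, 9] -> the missing value is 9
```

A model of the form ω₁x² + ω₂x + b can therefore fit this data perfectly with ω₁ = 1, ω₂ = −2, b = 1.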
Derivation:
loss=(\hat{y}-y)^2=(\omega_1x^2+\omega_2x+b-y)^2
\frac{\partial loss}{\partial \omega_1}=2x^2(\omega_1x^2+\omega_2x+b-y)
\frac{\partial loss}{\partial \omega_2}=2x(\omega_1x^2+\omega_2x+b-y)
\frac{\partial loss}{\partial b}=2(\omega_1x^2+\omega_2x+b-y)
import torch

x_data = [1.0, 2.0, 3.0]
y_data = [0.0, 1.0, 4.0]

# build the weights w1, w2 and the bias b
w1 = torch.Tensor([1.0])
w1.requires_grad = True   # enable autograd for w1
w2 = torch.Tensor([1.0])
w2.requires_grad = True
b = torch.Tensor([1.0])
b.requires_grad = True    # b must also require gradients, or it will never be updated

def forward(x):
    return w1 * x * x + w2 * x + b

def loss(x, y):           # define the loss function
    y_pred = forward(x)
    return (y_pred - y) ** 2

print("predict before training:", 4, forward(4).item())
for epoch in range(100):
    for x, y in zip(x_data, y_data):
        l = loss(x, y)
        l.backward()
        print("\tgrad:", x, y, w1.grad.item(), w2.grad.item(), b.grad.item())
        w1.data = w1.data - 0.014 * w1.grad.data
        w2.data = w2.data - 0.014 * w2.grad.data
        b.data = b.data - 0.014 * b.grad.data
        w1.grad.data.zero_()
        w2.grad.data.zero_()
        b.grad.data.zero_()
    print("progress:", epoch, l.item())
print("predict after training:", 4, forward(4).item())
Output: