1. Introduction
- The Euclidean-distance (MSE) loss is commonly used for linear regression, where the quantity being solved for is continuous.
- Regression problems predict concrete numerical values, such as house prices or sales volumes.
- A neural network for regression typically has a single output node, whose output is the predicted value.
2. Mathematical Derivation
- Assume training data $X$ with labels $Y$.
- Prediction function: $f(x_i) = \hat{y}_i = w x_i + b$
- Loss function:
$$
Loss_{MSE}(y, \hat{y}) = \frac{1}{m} \sum_{i=1}^m (f(x_i) - y_i)^2 = \frac{1}{m} \sum_{i=1}^m (y_i - w x_i - b)^2
$$
The optimal parameters are the ones that minimize this loss:
$$
(w^*, b^*) = \mathop{\arg\min}\limits_{(w,b)} \sum_{i=1}^m (y_i - w x_i - b)^2
$$
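As a quick sanity check of the definition above, the MSE can be computed directly on a few hand-picked numbers (the values below are illustrative, not from the text):

```python
import numpy as np

# illustrative predictions and targets
y = np.array([1.0, 2.0, 3.0])
y_hat = np.array([1.5, 1.5, 3.0])
m = len(y)

# MSE as defined above: (1/m) * sum((y_hat_i - y_i)^2)
mse = np.sum((y_hat - y) ** 2) / m
# squared errors are 0.25, 0.25, 0.0, so mse = 0.5 / 3
```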
2.1 Derivative Calculation
- Compute the derivatives $\frac{\partial loss}{\partial w}$ and $\frac{\partial loss}{\partial b}$ (the constant factor $\frac{1}{m}$ is dropped, since it does not change where the gradient vanishes):
$$
\frac{\partial loss}{\partial w} = 2 \sum_{i=1}^m x_i [f(x_i) - y_i]
$$
$$
\frac{\partial loss}{\partial b} = 2 \sum_{i=1}^m [f(x_i) - y_i]
$$
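These derivative formulas can be checked numerically with central finite differences; the sketch below uses made-up synthetic data and arbitrary parameter values, and matches the un-averaged loss used in the formulas above:

```python
import numpy as np

# synthetic data (illustrative values)
rng = np.random.default_rng(0)
x = rng.normal(size=20)
y = 3 * x + 1 + 0.1 * rng.normal(size=20)
w, b = 0.5, -0.2

def loss(w, b):
    # un-averaged squared error, matching the derivative formulas above
    return np.sum((w * x + b - y) ** 2)

# analytic derivatives from the formulas above
dw = 2 * np.sum(x * (w * x + b - y))
db = 2 * np.sum(w * x + b - y)

# central finite-difference approximations
eps = 1e-6
dw_num = (loss(w + eps, b) - loss(w - eps, b)) / (2 * eps)
db_num = (loss(w, b + eps) - loss(w, b - eps)) / (2 * eps)
```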
- Solve for the optimal $w$ and $b$:
$$
\begin{cases}
b = \frac{1}{m} \sum_{i=1}^m [y_i - w x_i] \\[2ex]
w = \dfrac{\sum_{i=1}^m y_i (x_i - \bar{x})}{\sum_{i=1}^m x_i^2 - \frac{1}{m}\left(\sum_{i=1}^m x_i\right)^2}
\end{cases}
$$
First solve for $b$:
$$
\begin{aligned}
\frac{\partial loss}{\partial b} &= 2 \sum_{i=1}^m [f(x_i) - y_i] \\
&= 2 \sum_{i=1}^m [w x_i + b - y_i] \\
&= 2 \Big(mb - \sum_{i=1}^m [y_i - w x_i]\Big)
\end{aligned}
$$
Setting $\frac{\partial loss}{\partial b} = 0$:
$$
\begin{aligned}
mb &= \sum_{i=1}^m [y_i - w x_i] \\
b &= \frac{1}{m} \sum_{i=1}^m [y_i - w x_i]
\end{aligned}
$$
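The formula for $b$ can be verified directly: for any fixed $w$, plugging $b = \frac{1}{m}\sum_i (y_i - w x_i)$ back into $\frac{\partial loss}{\partial b}$ drives it to zero. A small numerical check with made-up data:

```python
import numpy as np

# illustrative data; w is held fixed at an arbitrary value
rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 2 * x + 5 + 0.1 * rng.normal(size=50)
w = 1.7

# b from the formula above: the mean of (y_i - w x_i)
b = np.mean(y - w * x)

# the gradient 2 * (m*b - sum(y_i - w x_i)) should vanish at this b
grad_b = 2 * np.sum(w * x + b - y)
```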
Then solve for $w$:
$$
\begin{aligned}
\frac{\partial loss}{\partial w} &= 2 \sum_{i=1}^m x_i [f(x_i) - y_i] \\
&= 2 \Big(w \sum_{i=1}^m x_i^2 - \sum_{i=1}^m (y_i - b) x_i\Big)
\end{aligned}
$$
Setting $\frac{\partial loss}{\partial w} = 0$:
$$
w \sum_{i=1}^m x_i^2 = \sum_{i=1}^m (y_i - b) x_i
$$
Substituting $b = \frac{1}{m} \sum_{i=1}^m [y_i - w x_i]$:
$$
\begin{aligned}
w \sum_{i=1}^m x_i^2 &= \sum_{i=1}^m y_i x_i - \sum_{i=1}^m x_i \cdot \frac{1}{m} \sum_{i=1}^m [y_i - w x_i] \\
w \sum_{i=1}^m x_i^2 &= \sum_{i=1}^m y_i x_i - \frac{1}{m} \sum_{i=1}^m x_i \sum_{i=1}^m y_i + \frac{w}{m} \Big(\sum_{i=1}^m x_i\Big)^2 \\
w \Big(\sum_{i=1}^m x_i^2 - \frac{1}{m}\Big(\sum_{i=1}^m x_i\Big)^2\Big) &= \sum_{i=1}^m y_i x_i - \frac{1}{m} \sum_{i=1}^m x_i \sum_{i=1}^m y_i \\
w \Big(\sum_{i=1}^m x_i^2 - \frac{1}{m}\Big(\sum_{i=1}^m x_i\Big)^2\Big) &= \sum_{i=1}^m y_i x_i - \bar{x} \sum_{i=1}^m y_i \\
w &= \frac{\sum_{i=1}^m y_i (x_i - \bar{x})}{\sum_{i=1}^m x_i^2 - \frac{1}{m}\left(\sum_{i=1}^m x_i\right)^2}
\end{aligned}
$$
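The closed-form $w$ and $b$ derived above should agree with a standard least-squares line fit. A sketch on made-up noisy data, compared against NumPy's `np.polyfit`:

```python
import numpy as np

# noisy line y = 4x - 3 (illustrative data)
rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = 4.0 * x - 3.0 + 0.2 * rng.normal(size=100)
m = len(x)
x_bar = np.mean(x)

# closed-form solution derived above
w = np.sum(y * (x - x_bar)) / (np.sum(x ** 2) - np.sum(x) ** 2 / m)
b = np.mean(y - w * x)

# reference: NumPy's least-squares fit of a degree-1 polynomial
w_ref, b_ref = np.polyfit(x, y, 1)
```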
3. Code Implementation
3.1 Python implementation
```python
import numpy as np

# toy data (illustrative): num_train samples, one feature
num_train = 100
X = np.random.randn(num_train, 1)
y = np.random.randn(num_train)
w = np.zeros(1)
b = 0.0

# model
y_hat = np.dot(X, w) + b
# MSE loss
loss = np.sum((y_hat - y) ** 2) / num_train
# partial derivatives of the loss w.r.t. the parameters
dw = np.dot(X.T, (y_hat - y)) / num_train
db = np.sum(y_hat - y) / num_train
```
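Iterating these gradient updates recovers $w$ and $b$ by gradient descent; a minimal sketch on noise-free synthetic data (the learning rate and iteration count are illustrative choices, not from the text):

```python
import numpy as np

# noise-free line y = 2.5x + 1 (illustrative)
rng = np.random.default_rng(3)
num_train = 200
X = rng.normal(size=(num_train, 1))
y = 2.5 * X[:, 0] + 1.0

w, b, lr = np.zeros(1), 0.0, 0.1
for _ in range(500):
    y_hat = np.dot(X, w) + b
    dw = np.dot(X.T, (y_hat - y)) / num_train
    db = np.sum(y_hat - y) / num_train
    w -= lr * dw
    b -= lr * db
# w converges to ~2.5, b to ~1.0
```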
3.2 PyTorch implementation
```python
import torch

# illustrative tensors; any pair of matching shapes works
input = torch.randn(3, 5)
target = torch.randn(3, 5)

loss_fn = torch.nn.MSELoss(reduction='mean')
loss = loss_fn(input.float(), target.float())
'''
reduction takes one of three values:
none: no reduction, return the per-element losses;
mean: return the mean of the losses;
sum: return the sum of the losses.
Default: mean.
'''
```
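The three `reduction` modes can be compared side by side on hand-checkable values (the tensors below are illustrative):

```python
import torch

# squared errors are (1-1)^2=0, (2-1)^2=1, (3-1)^2=4
input = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.0, 1.0, 1.0])

loss_none = torch.nn.MSELoss(reduction='none')(input, target)  # per-element
loss_mean = torch.nn.MSELoss(reduction='mean')(input, target)  # mean of the three
loss_sum = torch.nn.MSELoss(reduction='sum')(input, target)    # sum of the three
```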