Loss Functions
In deep learning, the loss measures the error between the predicted and the true values, and this error is backpropagated to update the parameters of the whole network.
1. Mean Squared Error (MSE)
MSE (Mean Squared Error) is typically used for regression problems.
It is derived under the assumption that the error follows a Gaussian distribution; for the detailed derivation, see
机器学习之线性回归原理与Python实现.
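In brief: if the target is modeled as $y = y_{pred} + \epsilon$ with Gaussian noise $\epsilon \sim \mathcal{N}(0, \sigma^2)$, the log-likelihood of a sample is
$$\log p(y \mid y_{pred}) = -\frac{(y - y_{pred})^2}{2\sigma^2} + \mathrm{const},$$
so maximizing the likelihood is equivalent to minimizing the squared error; the factor $\frac{1}{2}$ below is kept only to simplify the derivative.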
Formula:
$$l(y_{pred}) = \frac{1}{2} (y_{pred} - y)^2$$
Its first derivative:
$$\frac{\partial l}{\partial y_{pred}} = y_{pred} - y$$
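As a quick sanity check, the derivative can be compared against a central finite difference; the sketch below uses arbitrary illustrative values:

import numpy as np

# Finite-difference check of dl/dy_pred = y_pred - y (illustrative values).
y, y_pred, h = 3.0, 2.5, 1e-6
loss = lambda p: 0.5 * (p - y) ** 2
numeric = (loss(y_pred + h) - loss(y_pred - h)) / (2 * h)  # central difference
analytic = y_pred - y
print(numeric, analytic)  # both approximately -0.5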
2. Cross-Entropy (CrossEntropy)
Cross-entropy loss is mainly used for multi-class classification, i.e. the error is computed on the softmax output.
For details, see
深度学习之一个例子(BP算法,loss函数)以及python实现.
Formula:
$$l(y_{pred}) = -\sum_c y^c \log y_{pred}^c$$
where $c$ ranges over the classes and $y^c$ is the one-hot label, i.e. 1 for the sample's true class and 0 otherwise.
Differentiating with respect to the pre-softmax activations gives:
$$\frac{\partial loss}{\partial a_c^{L+1}(x)} = f_c(x) - I(y = c)$$
where $a_c^{L+1}(x)$ is the pre-softmax activation for class $c$, $f_c(x)$ is the softmax output, and $I(y = c)$ indicates whether $c$ is the true class. For the detailed derivation, see 深度学习之一个例子(BP算法,loss函数)以及python实现.
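This formula is easy to verify numerically. The sketch below (values are illustrative only) compares finite differences of the loss with respect to the logits against $f(x) - I(y=c)$:

import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

a = np.array([0.3, 0.4, 0.3])   # pre-softmax logits
y = np.array([0., 1., 0.])      # one-hot label, true class c = 1
loss = lambda logits: -np.sum(y * np.log(softmax(logits)))

h = 1e-6
numeric = np.array([
    (loss(a + h * np.eye(3)[i]) - loss(a - h * np.eye(3)[i])) / (2 * h)
    for i in range(3)
])
print(numeric)          # ~ [ 0.322 -0.644  0.322]
print(softmax(a) - y)   # softmax output minus one-hot label: same values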
3. Python implementation
import numpy as np


class MSE(object):
    """
    L2 / squared-error loss for regression,
    computed on the raw network outputs.
    """

    def __str__(self):
        return 'MSE'

    def __call__(self, y, y_pred):
        return self.loss(y, y_pred)

    def loss(self, y, y_pred):
        """
        loss = 1/2 * (y_pred - y)^2, summed over the last axis
        :param y: :class:`ndarray <numpy.ndarray>` ground truth, shape (n, m)
        :param y_pred: :class:`ndarray <numpy.ndarray>` predictions, shape (n, m)
        :return: shape (n,)
        """
        return 0.5 * np.sum((y_pred - y) ** 2, axis=-1)

    def grad(self, y, y_pred):
        """
        First derivative: y_pred - y
        :param y: :class:`ndarray <numpy.ndarray>` ground truth, shape (n, m)
        :param y_pred: :class:`ndarray <numpy.ndarray>` predictions, shape (n, m)
        :return: shape (n, m)
        """
        return y_pred - y


class CrossEntropy(object):
    def __init__(self):
        # Small constant to avoid log(0).
        self.eps = np.finfo(float).eps

    def __str__(self):
        return 'CrossEntropy'

    def __call__(self, y, y_pred):
        return self.loss(y, y_pred)

    def loss(self, y, y_pred):
        """
        loss = - sum_x p(x) log q(x)
        :param y: :class:`ndarray <numpy.ndarray>` one-hot labels, shape (n, m)
        :param y_pred: :class:`ndarray <numpy.ndarray>` softmax outputs, shape (n, m)
        :return: shape (n,)
        """
        return -np.sum(y * np.log(y_pred + self.eps), axis=-1)

    def grad(self, y, y_pred):
        """
        First derivative with respect to the pre-softmax activations
        (i.e. the softmax layer is folded into this gradient).
        :param y: :class:`ndarray <numpy.ndarray>` one-hot labels, shape (n, m)
        :param y_pred: :class:`ndarray <numpy.ndarray>` softmax outputs, shape (n, m)
        :return: shape (n, m)
        """
        return y_pred - y
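A quick usage example of the two classes (toy values, chosen only for illustration):

y = np.array([[0., 1., 0.]])
y_pred = np.array([[0.2, 0.7, 0.1]])  # already a valid softmax output

mse = MSE()
print(mse, mse(y=y, y_pred=y_pred))   # 0.5 * sum of squared errors per row
print(mse.grad(y=y, y_pred=y_pred))   # y_pred - y

ce = CrossEntropy()
print(ce, ce(y=y, y_pred=y_pred))     # -log(0.7), about 0.357
print(ce.grad(y=y, y_pred=y_pred))    # y_pred - y, with softmax folded in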
4. Verification against PyTorch
import torch
import torch.nn.functional as F


def run_mse_fun():
    y_pred = np.array([
        [0.3, 0.4, 0.3],
        [0.1, 0.1, 0.8]
    ])
    y = np.array([
        [1., 2., 3.],
        [4., 5., 6.]
    ])
    print("*" * 10, 'custom MSE (L2)', "*" * 10)
    loss_fn = MSE()
    print(loss_fn, 'loss:', loss_fn(y=y, y_pred=y_pred).sum())
    print(loss_fn, 'grad:\n', loss_fn.grad(y=y, y_pred=y_pred))

    # PyTorch: F.mse_loss uses f(x) = (y_pred - y)^2 without the 1/2 factor,
    # so f'(x) = 2 * (y_pred - y); its loss and gradient are twice ours.
    y_pred_pytorch = torch.tensor(y_pred, dtype=torch.float32, requires_grad=True)
    y = torch.tensor(y, dtype=torch.float32)
    loss_mse = F.mse_loss(y_pred_pytorch, y, reduction='sum')
    print(loss_mse)
    loss_mse.backward()
    grad = y_pred_pytorch.grad.numpy()
    print('pytorch mse grad: \n', grad)
def run_cross_entropy_fun():
    """Test cross entropy."""
    # Custom implementation.
    def softmax(X):
        e_X = np.exp(X - np.max(X, axis=1, keepdims=True))
        return e_X / e_X.sum(axis=1, keepdims=True)

    y_before_softmax = np.array([
        [0.3, 0.4, 0.3],
        [0.1, 0.1, 0.8]
    ])
    y = np.array([
        [0, 1, 0],
        [0, 0, 1]
    ])
    print("*" * 10, 'custom cross entropy', "*" * 10)
    y_pred = softmax(y_before_softmax)
    loss_fn = CrossEntropy()
    print(loss_fn, 'loss:', loss_fn(y=y, y_pred=y_pred).sum())
    print(loss_fn, 'grad:\n', loss_fn.grad(y=y, y_pred=y_pred))

    # PyTorch results.
    # 1. cross entropy: F.cross_entropy takes the raw logits (pre-softmax)
    #    and integer class labels.
    print("*" * 10, ' pytorch cross entropy', "*" * 10)
    y_pred_pytorch = torch.tensor(y_before_softmax, dtype=torch.float32, requires_grad=True)
    y = torch.tensor(y.argmax(axis=1), dtype=torch.long)
    loss_cross_entropy = F.cross_entropy(y_pred_pytorch, y, reduction='sum').sum()
    print('pytorch cross entropy loss:', loss_cross_entropy)
    loss_cross_entropy.backward()
    grad = y_pred_pytorch.grad.numpy()
    print('pytorch cross entropy grad: \n', grad)

    # 2. NLL loss: F.nll_loss expects log-probabilities, so combining it with
    #    F.log_softmax reproduces F.cross_entropy exactly.
    print("*" * 10, ' pytorch nll loss', "*" * 10)
    y_pred_nll = torch.tensor(y_before_softmax, dtype=torch.float32, requires_grad=True)
    loss_cross_nll = F.nll_loss(F.log_softmax(y_pred_nll, dim=1), y, reduction='sum').sum()
    print('pytorch nll loss:', loss_cross_nll)
    loss_cross_nll.backward()
    grad = y_pred_nll.grad.numpy()  # read the grad of y_pred_nll, not y_pred_pytorch
    print('pytorch nll grad: \n', grad)
"""
运行结果:
MSE loss: 38.300000000000004
MSE grad:
[[-0.7 -1.6 -2.7]
[-3.9 -4.9 -5.2]]
tensor(76.6000, grad_fn=<MseLossBackward>)
pytorch mse grad:
[[ -1.4 -3.2 -5.4]
[ -7.8 -9.8 -10.4]]
********** 自定义cross entropy **********
CrossEntropy loss: 1.7227954009197912
CrossEntropy grad:
[[ 0.32204346 -0.64408693 0.32204346]
[ 0.2491434 0.2491434 -0.4982868 ]]
********** pytorch cross entropy **********
pytorch cross entropy loss: tensor(1.7228, grad_fn=<SumBackward0>)
pytorch cross entropy grad:
[[ 0.32204345 -0.64408696 0.32204345]
[ 0.2491434 0.2491434 -0.49828678]]
********** pytorch null loss **********
pytorch nll loss: tensor(-0.8576, grad_fn=<SumBackward0>)
pytorch null grad:
[[ 0.32204345 -0.64408696 0.32204345]
[ 0.2491434 0.2491434 -0.49828678]]
"""