Purpose
Mitigates the over-fitting and overconfidence problems of classification models.
Idea
A "golden mean" idea: do not over-reward a correct prediction, and do not over-punish a wrong one. Keep the distinction between right and wrong answers while leaving some slack, so that no single answer is rewarded or penalized without limit, as the sketch below illustrates.
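A minimal sketch of what this means for the target label (assuming a smoothing factor of 0.1 and 5 classes, matching the example further below): the hard one-hot target is softened, so the correct class still dominates but wrong classes keep a small non-zero probability.

import torch

# Hypothetical illustration: smooth a one-hot label with alpha = 0.1, K = 5
alpha, K = 0.1, 5
onehot = torch.tensor([1., 0., 0., 0., 0.])
smoothed = (1 - alpha) * onehot + alpha / K
print(smoothed)  # tensor([0.9200, 0.0200, 0.0200, 0.0200, 0.0200])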
Formula
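A sketch of the formula, as implied by the implementation below. With smoothing factor $\alpha$, $K$ classes, true class $y$, and predicted probabilities $p_k$, the one-hot target is replaced by the smoothed target

$$y'_k = (1 - \alpha)\,\mathbb{1}[k = y] + \frac{\alpha}{K}$$

and the cross-entropy loss against it decomposes as

$$\mathcal{L} = -\sum_{k=1}^{K} y'_k \log p_k = (1 - \alpha)\,(-\log p_y) + \alpha \cdot \frac{1}{K}\sum_{k=1}^{K}(-\log p_k),$$

which is exactly confidence * cross_loss + smoothing * smooth_loss in the code.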
Overconfidence (overconfident)
The probability the model assigns to its predictions is higher than its actual accuracy warrants; for example, a model that predicts with 0.99 confidence but is correct only 80% of the time is overconfident.
Implementation
Negative log-likelihood loss: NLLLoss
NLLLoss (negative log-likelihood loss): take the softmax of the input x and then the log (i.e. log_softmax); feeding the result to NLLLoss is equivalent to CrossEntropyLoss.
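A quick sanity check of this equivalence (a sketch; the tensor values are arbitrary):

import torch
import torch.nn.functional as F

# log_softmax followed by NLL loss matches cross_entropy exactly
logits = torch.randn(2, 5)
target = torch.tensor([0, 2])
nll = F.nll_loss(F.log_softmax(logits, dim=-1), target)
ce = F.cross_entropy(logits, target)
print(torch.allclose(nll, ce))  # True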
Code
import torch
import torch.nn as nn
import torch.nn.functional as F


class LabelSmoothingCrossEntropy(nn.Module):
    """
    Cross entropy loss with label smoothing.
    """
    def __init__(self, smoothing=0.1):
        """
        Constructor for the LabelSmoothingCrossEntropy module.
        :param smoothing: label smoothing factor
        """
        super(LabelSmoothingCrossEntropy, self).__init__()
        assert 0.0 < smoothing < 1.0
        self.smoothing = smoothing
        self.confidence = 1. - smoothing

    def forward(self, x, target):
        # --- Version 1 ---
        # logprobs = F.log_softmax(x, dim=-1)
        # nll_loss = -logprobs.gather(dim=-1, index=target.unsqueeze(1))
        # nll_loss = nll_loss.squeeze(1)  # the plain cross-entropy loss
        # # Read this together with the formula above: inside smooth_loss the
        # # correct class also receives a/K, where a is the smoothing factor
        # # and K is the number of classes.
        # smooth_loss = -logprobs.mean(dim=1)
        # loss = self.confidence * nll_loss + self.smoothing * smooth_loss

        # --- Version 2 ---
        y_hat = torch.softmax(x, dim=1)
        # cross_loss here is equivalent to nll_loss above
        cross_loss = self.cross_entropy(y_hat, target)
        smooth_loss = -torch.log(y_hat).mean(dim=1)
        # smooth_loss can also be computed as below; note that log a + log b = log(ab)
        # smooth_loss = -torch.log(torch.prod(y_hat, dim=1)) / y_hat.shape[1]
        loss = self.confidence * cross_loss + self.smoothing * smooth_loss
        return loss.mean()

    def cross_entropy(self, y_hat, y):
        return -torch.log(y_hat[range(len(y_hat)), y])
l = LabelSmoothingCrossEntropy(smoothing=0.1)
# Two samples: the correct class of the first is the 1st (index 0),
# and of the second the 3rd (index 2)
x = torch.tensor([[0.0682, -0.5742, 0.3612, -0.4870, -2.7665],
[-0.8642, -0.0828, -0.9225, -1.1206, 0.6337]])
target = torch.tensor([0, 2])
a = l(x, target)  # calling the module invokes forward()
print(a)  # tensor(1.7892)
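For comparison, recent PyTorch versions (1.10+) ship built-in label smoothing via the label_smoothing argument of cross_entropy; it uses the same smoothed target as above, so it should print the same value:

# Built-in label smoothing (PyTorch >= 1.10); expected to print tensor(1.7892)
print(F.cross_entropy(x, target, label_smoothing=0.1))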
Reference
https://towardsdatascience.com/what-is-label-smoothing-108debd7ef06