[PyTorch] Label smoothing

mjiansun

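In essence, label smoothing softens the target labels so that the model places less than full confidence in the class it predicts, which improves robustness.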

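The idea of label smoothing was first proposed for training Inception-v2 [26]. It changes the construction of the true probability to:

where ε is a small constant and K is the total number of labels.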
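
The equation referred to above did not survive in this copy. A reconstruction consistent with the code further down, which assigns 1 − ε to the true class and spreads ε uniformly over the remaining K − 1 classes, would be the following (the symbols q_i for the target probability of class i and y for the true label are assumptions):

```latex
q_i =
\begin{cases}
1 - \varepsilon, & i = y, \\
\dfrac{\varepsilon}{K - 1}, & i \neq y.
\end{cases}
```

The entries sum to one: (1 − ε) + (K − 1) · ε / (K − 1) = 1.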

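Figure 4: Visualization of the effect of label smoothing on ImageNet. Top: as ε increases, the theoretical gap between the target class and the other classes decreases. Bottom: the empirical distribution of the gap between the maximum prediction and the average of the remaining classes. With label smoothing, the distribution is clearly centered on the theoretical value and has fewer extreme values.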
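
The "theoretical gap" in the caption can be made concrete. Assuming the softmax output exactly matches the smoothed target (1 − ε for the true class, ε / (K − 1) for every other class, as in the code below), the difference between the true-class logit and any other logit is log((K − 1)(1 − ε) / ε). A minimal sketch, where the function name `theoretical_gap` and the choice K = 1000 are illustrative assumptions:

```python
import math

def theoretical_gap(eps, num_classes):
    """Logit gap between the true class and any other class
    when the softmax output equals the smoothed target."""
    p_true = 1.0 - eps                   # probability assigned to the true class
    p_other = eps / (num_classes - 1)    # probability assigned to each other class
    # softmax(z)_a / softmax(z)_b = exp(z_a - z_b), so the gap is the log-ratio.
    return math.log(p_true / p_other)

gaps = {eps: theoretical_gap(eps, 1000) for eps in (0.01, 0.1, 0.2, 0.5)}
for eps, gap in gaps.items():
    print(f"eps={eps}: gap={gap:.3f}")
```

The gap shrinks as ε grows, and it grows without bound as ε approaches 0, which is why unsmoothed one-hot targets push the true-class logit ever further from the rest.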

import torch
import torch.nn as nn

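class NMTCritierion(nn.Module):
    """
    Classification loss with optional label smoothing.
    With label_smoothing > 0, the summed KL divergence to the smoothed
    targets is used; otherwise the summed negative log-likelihood.
    """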
    def __init__(self, label_smoothing=0.0):
        super(NMTCritierion, self).__init__()
        self.label_smoothing = label_smoothing
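        # Normalize over the last (class) dimension.
        self.LogSoftmax = nn.LogSoftmax(dim=-1)

        # Both criteria sum the loss over all elements instead of averaging.
        if label_smoothing > 0:
            self.criterion = nn.KLDivLoss(reduction='sum')
        else:
            self.criterion = nn.NLLLoss(reduction='sum', ignore_index=100000)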
        self.confidence = 1.0 - label_smoothing

    def _smooth_label(self, num_tokens):
        # When label smoothing is turned on,
        # KL-divergence between q_{smoothed ground truth prob.}(w)
        # and p_{prob. computed by model}(w) is minimized.
        # If label smoothing value is set to zero, the loss
        # is equivalent to NLLLoss or CrossEntropyLoss.
        # All non-true labels are uniformly set to low-confidence.
        # Shape [1, num_tokens]; the true-class entry is overwritten later in forward().
        one_hot = torch.full((1, num_tokens), self.label_smoothing / (num_tokens - 1))
        return one_hot

    def _bottle(self, v):
        return v.view(-1, v.size(2))

    def forward(self, dec_outs, labels):
        scores = self.LogSoftmax(dec_outs)
        num_tokens = scores.size(-1)

        # Apply label smoothing to the flattened labels.
        gtruth = labels.view(-1)  # [N]
        if self.confidence < 1:
            tdata = gtruth.detach()
            # Base distribution, shape [1, M], moved to the same device as the scores.
            one_hot = self._smooth_label(num_tokens).to(scores.device)
            tmp_ = one_hot.repeat(gtruth.size(0), 1)  # [N, M]
            # Write the confidence 1 - label_smoothing at each true class;
            # tdata.unsqueeze(1) has shape [N, 1].
            tmp_.scatter_(1, tdata.unsqueeze(1), self.confidence)
            gtruth = tmp_.detach()
        loss = self.criterion(scores, gtruth)
        return loss
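
The target construction used in `forward` can also be reproduced outside the class. The following is a minimal sketch rather than the author's code: the helper name `smooth_targets` and the toy shapes are assumptions. It builds the smoothed distribution for a small batch, evaluates the summed KL divergence against log-softmax scores, and checks the statement in the comments above that without smoothing the loss matches `nn.CrossEntropyLoss`.

```python
import torch
import torch.nn as nn

def smooth_targets(labels, num_classes, eps):
    """Return an [N, num_classes] tensor holding 1 - eps at the true class
    and eps / (num_classes - 1) everywhere else."""
    targets = torch.full((labels.size(0), num_classes), eps / (num_classes - 1))
    targets.scatter_(1, labels.unsqueeze(1), 1.0 - eps)
    return targets

torch.manual_seed(0)
logits = torch.randn(4, 5)              # N = 4 samples, M = 5 classes
labels = torch.tensor([0, 2, 4, 1])     # class indices (int64)
log_probs = torch.log_softmax(logits, dim=-1)

# Smoothed case: KLDivLoss expects log-probabilities as input and
# probabilities as target.
targets = smooth_targets(labels, num_classes=5, eps=0.1)
kl_loss = nn.KLDivLoss(reduction="sum")(log_probs, targets)

# Unsmoothed case: summed NLL on log-probabilities equals summed
# cross-entropy on the raw logits.
nll_loss = nn.NLLLoss(reduction="sum")(log_probs, labels)
ce_loss = nn.CrossEntropyLoss(reduction="sum")(logits, labels)

print(targets[0])
print(kl_loss.item(), nll_loss.item(), ce_loss.item())
```

The KL divergence differs from the cross-entropy against the smoothed targets only by the entropy of the targets, which does not depend on the model, so both objectives yield the same gradients.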

Reference: https://blog.csdn.net/e01528/article/details/85019274