Notes for Label Smoothing and Mixup

Latest recommended article published on 2022-06-14 09:39:39

Leslie_May

Category: deep learning

Article link: https://blog.csdn.net/Leslie_May/article/details/115365371

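This article introduces two regularization techniques: label smoothing and Mixup. Label smoothing prevents overfitting by reducing the model's confidence in the correct label. Mixup is a data augmentation method that linearly combines training samples, encouraging the model to behave more linearly between training samples and improving generalization.

Summary generated automatically by CSDN.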

Label smoothing and Mixup

Notes for popular regularization methods used in deep learning.

Label Smoothing

Label smoothing is a mechanism to regularize the classifier layer by estimating the marginalized effect of label-dropout during training.

Why Label Smoothing Is Needed

In a classification task, the model computes a logit $z_i$ for each label given a training example $x$. These logits are then normalized by the softmax function to obtain the probability of each label $k\in \{1,...,K\}$:
$p_k = \frac{\exp(z_k)}{\sum_{i=1}^K\exp(z_i)}.$
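The softmax normalization above can be sketched in a few lines of plain Python. This is a minimal illustration, not the author's code: the function name `softmax` and the example logits are assumptions chosen for the demonstration, and classes are indexed from 0 rather than from 1 as in the text.

```python
import math

def softmax(logits):
    """Map a list of logits z_1..z_K to probabilities p_1..p_K."""
    # Subtracting the largest logit does not change the result,
    # but it keeps exp() from overflowing for large inputs.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for a 3-class problem.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The outputs are strictly positive and sum to 1, and a larger logit always maps to a larger probability.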
Consider a single ground-truth label $y$ and the cross-entropy loss:
$l = -\sum_{i=1}^K q_i\log p_i,$

where
$q_i = \begin{cases} 1, & \text{if } i = y \\ 0, & \text{if } i \ne y \end{cases}$
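With a one-hot target $q$, every term of the sum except $i = y$ vanishes, so the cross-entropy reduces to $-\log p_y$. The sketch below checks this on one example; the helper names (`softmax`, `one_hot`, `cross_entropy`) and the logits are illustrative assumptions, and labels are 0-indexed in the code.

```python
import math

def softmax(logits):
    m = max(logits)  # shift for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(y, num_classes):
    """q_i = 1 if i == y, else 0."""
    return [1.0 if i == y else 0.0 for i in range(num_classes)]

def cross_entropy(q, p):
    """-sum_i q_i * log(p_i); terms with q_i == 0 contribute nothing."""
    return -sum(qi * math.log(pi) for qi, pi in zip(q, p) if qi > 0)

logits = [2.0, 1.0, 0.1]   # hypothetical logits
y = 0                      # hypothetical ground-truth label
p = softmax(logits)

loss = cross_entropy(one_hot(y, len(logits)), p)
nll = -math.log(p[y])      # negative log-likelihood of the correct label
print(loss, nll)
```

Both values agree, which is why minimizing this loss is the same as maximizing the log-likelihood of the correct label.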

In this case, minimizing the cross-entropy loss is equivalent to maximizing the log-likelihood of the correct label. This maximum cannot be attained with finite logits; it is only approached as $z_y \gg z_k$ for all $k\ne y$, that is, as the logit of the ground-truth label becomes much greater than all other logits. Relative to $z_y$, the logits of the incorrect labels would have to approach $-\infty$, which is difficult for the model to output.
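A small numeric sketch makes this behaviour concrete. Assume, purely for illustration, that the correct logit equals a margin $m$ and the other $K-1$ logits are all 0; then $-\log p_y = \log(1 + (K-1)e^{-m})$. The function name `nll_for_margin` and the chosen margins are hypothetical.

```python
import math

def nll_for_margin(margin, num_classes=3):
    """-log p_y when z_y = margin and every other logit is 0."""
    # p_y = e^m / (e^m + K - 1), so -log p_y = log(1 + (K - 1) * e^(-m)).
    return math.log1p((num_classes - 1) * math.exp(-margin))

margins = (1.0, 5.0, 10.0, 20.0)
losses = [nll_for_margin(m) for m in margins]
for m, loss in zip(margins, losses):
    print(f"margin={m:5.1f}  loss={loss:.3e}")
```

The loss keeps shrinking as the margin grows but stays strictly positive, so the optimizer is always rewarded for pushing the logits further apart.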

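Setting the gradient of the cross-entropy loss $l$ with respect to each logit to zero gives the condition $\frac{\partial l}{\partial z_k} = p_k - q_k = 0$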
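The gradient identity $\partial l / \partial z_k = p_k - q_k$ can be checked numerically with central finite differences. The following is a sketch under assumed example values (the logits, the label `y`, and the step size `eps` are all illustrative), not part of the original derivation.

```python
import math

def softmax(logits):
    m = max(logits)  # shift for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def loss(logits, y):
    """Cross-entropy with a one-hot target on class y."""
    return -math.log(softmax(logits)[y])

logits = [2.0, 1.0, 0.1]   # hypothetical logits
y = 0                      # hypothetical ground-truth label (0-indexed)

p = softmax(logits)
q = [1.0 if k == y else 0.0 for k in range(len(logits))]
analytic = [pk - qk for pk, qk in zip(p, q)]

# Central finite differences: (l(z + eps*e_k) - l(z - eps*e_k)) / (2*eps).
eps = 1e-6
numeric = []
for k in range(len(logits)):
    plus = list(logits)
    minus = list(logits)
    plus[k] += eps
    minus[k] -= eps
    numeric.append((loss(plus, y) - loss(minus, y)) / (2 * eps))

print(analytic)
print(numeric)
```

For finite logits every $p_k$ lies strictly between 0 and 1, so with a one-hot $q$ the component for the correct label is negative and all the others are positive; none of them is exactly zero.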
