Neural Network Training Trick: Label Smoothing

Label Smoothing was proposed by Christian Szegedy et al. (2015) to prevent overfitting during training.

Motivation:

One-hot encoding:
In classification problems, the target label is usually represented as a one-hot vector: the true class is 1 and every other class is 0, while the model's output is turned into a probability distribution by softmax.
For an N-class problem, each class corresponds to an N-dimensional vector:

# Labels and their corresponding one-hot encodings:
label=[0,1,2,3,4,5,6]
one_hot_encode = [[1,0,0,0,0,0,0],
                 [0,1,0,0,0,0,0],
                 [0,0,1,0,0,0,0],
                 [0,0,0,1,0,0,0],
                 [0,0,0,0,1,0,0],
                 [0,0,0,0,0,1,0],
                 [0,0,0,0,0,0,1]]
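
The same one-hot targets can also be built programmatically. A minimal sketch, assuming PyTorch (torch.nn.functional.one_hot is the standard helper; the class count of 7 matches the example above):

import torch
import torch.nn.functional as F

label = torch.tensor([0, 1, 2, 3, 4, 5, 6])        # class indices
one_hot_encode = F.one_hot(label, num_classes=7)   # (7, 7) matrix of 0s and 1s
print(one_hot_encode)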

Drawbacks:
Take the cross-entropy loss function as an example:
loss = -\displaystyle\sum_{k=1}^{K} q(k|x)\log\left(p(k|x)\right)
The more accurate the classification, the closer the loss gets to 0; otherwise it grows toward positive infinity. However, our labels are not necessarily completely accurate, so with hard one-hot targets the cross-entropy objective does not necessarily converge to the true optimum, and the model may instead overfit and become overconfident.
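
To make this concrete, a small numeric sketch (PyTorch assumed; the three-class probabilities below are invented for illustration) evaluates the loss for a confident-correct and a confident-wrong prediction:

import torch

q = torch.tensor([1.0, 0.0, 0.0])             # hard one-hot target, true class = 0
p_good = torch.tensor([0.99, 0.005, 0.005])   # confident and correct
p_bad = torch.tensor([0.01, 0.98, 0.01])      # confident but wrong

loss_good = -(q * p_good.log()).sum()         # ~0.01, close to 0
loss_bad = -(q * p_bad.log()).sum()           # ~4.61, grows toward +inf as p(0|x) -> 0
print(loss_good.item(), loss_bad.item())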

Label Smoothing:

How the target distribution changes:
The hard target q(k|x) is replaced by a softened distribution that mixes it with the uniform distribution over the K classes:
q'(k|x) = (1-\epsilon)\,q(k|x) + \dfrac{\epsilon}{K}
\epsilon is commonly set to 0.1.

def label_smoothing(inputs, eps=0.1):
    # inputs: one-hot target tensor whose last dimension indexes the K classes
    K = inputs.size(-1)                  # number of classes
    # mix the hard targets with the uniform distribution 1/K
    return (1 - eps) * inputs + eps / K
# With eps=0.1 and K=7, the one-hot encodings above become (values rounded):
[[0.9143,0.0143,0.0143,0.0143,0.0143,0.0143,0.0143],
 [0.0143,0.9143,0.0143,0.0143,0.0143,0.0143,0.0143],
 [0.0143,0.0143,0.9143,0.0143,0.0143,0.0143,0.0143],
 [0.0143,0.0143,0.0143,0.9143,0.0143,0.0143,0.0143],
 [0.0143,0.0143,0.0143,0.0143,0.9143,0.0143,0.0143],
 [0.0143,0.0143,0.0143,0.0143,0.0143,0.9143,0.0143],
 [0.0143,0.0143,0.0143,0.0143,0.0143,0.0143,0.9143]]
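
A usage sketch (assumptions: PyTorch, a hypothetical batch of logits, K = 7) showing how the smoothed targets plug into a soft-target cross-entropy; the same effect is also available out of the box via the label_smoothing argument of F.cross_entropy in PyTorch 1.10+:

import torch
import torch.nn.functional as F

K = 7
labels = torch.tensor([0, 3, 6])                  # a small batch of class indices
logits = torch.randn(3, K)                        # stand-in for model outputs

hard = F.one_hot(labels, num_classes=K).float()   # hard one-hot targets
soft = label_smoothing(hard, eps=0.1)             # function defined above

log_probs = F.log_softmax(logits, dim=-1)         # log p(k|x)
loss = -(soft * log_probs).sum(dim=-1).mean()     # soft-target cross-entropy

# Built-in equivalent (PyTorch >= 1.10):
# loss = F.cross_entropy(logits, labels, label_smoothing=0.1)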

References:
Label Smoothing (标签平滑) [CSDN]
Label smooth [CSDN]
What methods are used to prevent overfitting in machine learning? [简书]
