Regularization

qq_34131692

Category: Study Notes. Tags: python, deep learning, algorithms, neural networks

Article link: https://blog.csdn.net/qq_34131692/article/details/110527681

1. Label Smoothing

For a classification problem with $K$ classes, whose one-hot label entries are $y_1,y_2,...,y_K$, label smoothing can be written as:

$\left\{ \begin{aligned} y_i' &=(1-\epsilon)\cdot y_i+\epsilon u(K)\\ u(K)&=\frac{1}{K} \end{aligned} \right.$
where $\epsilon=0.1$ by default and $y'$ denotes the smoothed labels.
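
As a quick check of the formula, take $K=2$ (a value chosen here only for illustration), $\epsilon=0.1$ and the one-hot label $y=(1,0)$:

$\left\{ \begin{aligned} y_1' &=(1-0.1)\cdot 1+0.1\cdot\frac{1}{2}=0.95\\ y_2' &=(1-0.1)\cdot 0+0.1\cdot\frac{1}{2}=0.05 \end{aligned} \right.$

The smoothed labels still sum to 1.
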
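To implement this, the first step is to see how YOLOv3 handles labels. When training on custom data, following the README
produces labeled txt files under data/custom/labels/. The dataloader expects that the annotation file corresponding to the image data/custom/images/train.jpg has the path data/custom/labels/train.txt. The next step was to find where the labels folder is used.
I could not work out how the labels are actually passed through the code, so I decided to implement the algorithm first and put it in utils.py.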

Function name: label_smoothing();
Input: the labels before smoothing, and the total number of classes K;
Output: the labels after label smoothing, returned as soft_labels.

def label_smoothing(labels, K, epsilon=0.1):
    """
     labels: list of int, original labels (the formula assumes 0/1 one-hot entries)
          K: int, total number of classes
    epsilon: float, default = 0.1
    """
    u_k = 1 / K  # uniform distribution over the K classes
    # results are rounded to 4 decimal places by default
    soft_labels = [round((1 - epsilon) * label + epsilon * u_k, 4) for label in labels]
    return soft_labels
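
A minimal usage sketch, assuming the labels arrive as integer class indices and are converted to one-hot vectors before smoothing. The helper names `one_hot` and `smooth_one_hot` are hypothetical and are not part of the YOLOv3 code:

```python
def one_hot(index, K):
    """Return a length-K one-hot list for an integer class index."""
    return [1 if k == index else 0 for k in range(K)]


def smooth_one_hot(index, K, epsilon=0.1):
    """Apply y' = (1 - epsilon) * y + epsilon / K to each one-hot entry."""
    return [(1 - epsilon) * y + epsilon / K for y in one_hot(index, K)]


# Hypothetical example: class index 2 out of K = 4 classes.
soft = smooth_one_hot(index=2, K=4)
print(soft)       # the true class gets about 0.925, the others about 0.025 each
print(sum(soft))  # approximately 1.0
```

Because every entry receives the same $\epsilon / K$ share, the smoothed vector remains a valid probability distribution.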

2. DropBlock

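DropBlock evolved from dropout: instead of dropping isolated points in the feature map, it drops a contiguous region around each selected point.
Two parameters: $block\_size$ is the size of the block to be dropped, and $\gamma$ controls how many activation units to drop.
Function name: drop_block();
Input: feature map A, $block\_size$, and keep_prob (from which $\gamma$ is derived);
Output: the feature map after DropBlock, A_dropblock.
The algorithm is as follows:
[Figure: DropBlock algorithm]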
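
For reference on $\gamma$: the DropBlock paper relates it to keep_prob with an extra factor that accounts for block centers being sampled only where a full block fits. A small sketch of that relation, assuming a square feature map of side `feat_size` (the function name `compute_gamma` is hypothetical):

```python
def compute_gamma(keep_prob, block_size, feat_size):
    """gamma = (1 - keep_prob) / block_size^2 * feat_size^2 / (feat_size - block_size + 1)^2"""
    valid = feat_size - block_size + 1  # side of the region where a full block fits
    return (1 - keep_prob) / block_size ** 2 * feat_size ** 2 / valid ** 2


# Assumed example values: keep_prob 0.8, block_size 5, a 416 x 416 feature map.
gamma = compute_gamma(keep_prob=0.8, block_size=5, feat_size=416)
print(gamma)  # slightly above (1 - 0.8) / 25 = 0.008
```

When feat_size is much larger than block_size, the extra factor is close to 1, so $\gamma \approx (1 - keep\_prob) / block\_size^2$ is a reasonable simplification.
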

import torch
import numpy as np
def drop_block(A, block_size=5, mode=None, keep_prob=0.8):
    """
    A: tensor, output activations of a layer, size: (N,C,H,W), 4-D tensor
    block_size: int, side length of the dropped block (assumed odd)
    mode: str, 'Inference' disables DropBlock
    keep_prob: float, target fraction of units to keep; gamma is derived from it
    """
    if mode == 'Inference' or keep_prob == 1: # keep all
        return A
    
    # compute gamma, the drop rate of the block centers
    drop_rate = 1 - keep_prob
    gamma = drop_rate / np.power(block_size,2) 
    
    # create a padding layer
    pad = torch.nn.ZeroPad2d(block_size//2) # assume block_size = 5
    A_padded = pad(A)  # padded input; if A is (1 x 3 x 416 x 416), A_padded is (1 x 3 x 420 x 420)
    
    # generate a mask that follows a Bernoulli(gamma) distribution
    # the mask has the same size as the padded input; the < comparison finds the
    # M_{i,j} below gamma, and float() converts True/False to 1/0
    M = (torch.rand(A_padded.shape, device=A.device) < gamma).float()  # M: (1 x 3 x 420 x 420)
    M_block = torch.ones_like(M)  # final block mask, all ones for now
    # zero out a block_size x block_size patch around every sampled center
    
    for n in range(M.shape[0]): #  1
        for c in range(M.shape[1]):  # 3
            for h in range(block_size//2, M.shape[2]-(block_size//2)):  # 2:418
                for w in range(block_size//2, M.shape[3]-(block_size//2)):
                    if M[n,c,h,w] == 1:  # 1 means the sample was below gamma
                        # zero the matching positions in M_block so that the mask
                        # can simply be multiplied with the input at the end
                        # offsets run from 0 to block_size//2 inclusive, so the
                        # patch spans block_size units per side
                        for i in range(block_size//2 + 1):
                            for j in range(block_size//2 + 1):
                                M_block[n,c,h+i,w+j]=0
                                M_block[n,c,h+i,w-j]=0
                                M_block[n,c,h-i,w+j]=0
                                M_block[n,c,h-i,w-j]=0
    
    # remove the padded part (height uses shape[2], width uses shape[3]),
    # then apply the mask by element-wise multiplication
    mask = M_block[:,:,(block_size//2):(M.shape[2]-(block_size//2)),(block_size//2):(M.shape[3]-(block_size//2))]
    A = A * mask
    
    # normalize the features
    A = A * mask.numel() / mask.sum()
    return A

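There is probably a better way to write this than a pile of nested for loops. At the end I ran a test input through it:
[Figure: test output]
The effective keep_prob turned out somewhat higher than the configured value. In this test the fraction of zero elements was 0.06, so the effective keep_prob was about 0.94, while the configured default was 0.8. One possible reason is that the ratio is close to the target when computed on the image including the padded part, but rises once the padded ring is removed. Overlapping blocks and blocks clipped at the border also drop fewer than $block\_size^2$ units per sampled center, which pushes the zero fraction below 1 - keep_prob.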
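
One possible way to avoid the nested loops is to expand the sampled centers with max pooling. This is a minimal sketch under stated assumptions, not the author's implementation: it assumes an odd `block_size`, a 4-D float tensor, and uses a hypothetical function name `drop_block_vectorized` with a `training` flag in place of `mode`:

```python
import torch
import torch.nn.functional as F


def drop_block_vectorized(A, block_size=5, keep_prob=0.8, training=True):
    """Loop-free DropBlock sketch for an (N, C, H, W) tensor; block_size assumed odd."""
    if not training or keep_prob == 1:
        return A
    gamma = (1 - keep_prob) / block_size ** 2
    # Sample block centers from Bernoulli(gamma) at every position.
    centers = (torch.rand(A.shape, device=A.device) < gamma).float()
    # Stride-1 max pooling spreads each center over a block_size x block_size
    # window; with padding block_size // 2 the output keeps the (H, W) size.
    dropped = F.max_pool2d(centers, kernel_size=block_size, stride=1,
                           padding=block_size // 2)
    mask = 1 - dropped
    kept = mask.sum()
    if kept == 0:  # everything was dropped; avoid dividing by zero
        return A * mask
    # Rescale so the overall activation magnitude is preserved.
    return A * mask * mask.numel() / kept


torch.manual_seed(0)
A = torch.ones(2, 3, 32, 32)
out = drop_block_vectorized(A)
print((out == 0).float().mean())  # fraction of dropped units
```

Max pooling pads implicitly, so no explicit ZeroPad2d layer or cropping step is needed in this variant.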

3. Interface Setup

Both techniques are training-time tricks, so they can be enabled through extra command-line arguments in train.py.

parser.add_argument("--label_smoothing", type=int, default=0, help="whether to use label smoothing (1) or not (0)")
parser.add_argument("--drop_block", type=int, default=0, help="whether to use DropBlock (1) or not (0)")

Later, wherever they are needed, check opt.label_smoothing and opt.drop_block.

if opt.label_smoothing:
    soft_labels = label_smoothing(labels, K=2)

if opt.drop_block:
    A = drop_block(A)
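
A self-contained sketch of how these flags could be parsed and checked, assuming integer 0/1 switches. The helper `build_parser` is hypothetical, and the argument list is passed explicitly so the snippet runs outside train.py:

```python
import argparse


def build_parser():
    """Build a parser holding only the two regularization switches."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--label_smoothing", type=int, default=0,
                        help="whether to use label smoothing (1) or not (0)")
    parser.add_argument("--drop_block", type=int, default=0,
                        help="whether to use DropBlock (1) or not (0)")
    return parser


# Equivalent to running: python train.py --label_smoothing 1
opt = build_parser().parse_args(["--label_smoothing", "1"])
enabled = [name for name in ("label_smoothing", "drop_block") if getattr(opt, name)]
print(enabled)  # ['label_smoothing']
```
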
