Data Augmentations

Heuristics-driven

Pad and Crop

He K., Zhang X., Ren S. and Sun J. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

That is:

import torchvision.transforms as T

T.Compose([
	T.Pad(4, padding_mode='reflect'),
	T.RandomCrop(32), # for CIFAR-10 and CIFAR-100
	T.RandomHorizontalFlip(),
	T.ToTensor()
])
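
Equivalently, the reflect padding can be folded into the crop itself (a small variant, assuming a torchvision version whose RandomCrop supports padding_mode):

T.Compose([
    T.RandomCrop(32, padding=4, padding_mode='reflect'), # pad by 4 on each side, then crop 32x32
    T.RandomHorizontalFlip(),
    T.ToTensor()
])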

(Figure: example of the pad-and-crop pipeline; the illustration was generated with T.RandomCrop(200).)

Cutout

DeVries T. and Taylor G. W. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552, 2017.

Official code

Intuitively, Cutout just cuts a few square holes out of the image (the authors' interpretation is rather nice: it amounts to applying a contiguous form of dropout directly at the input layer).

# uoguelph-mlrg Cutout
# https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
# code from the original authors

import torch
import numpy as np

class Cutout(object):
    """Randomly mask out one or more patches from an image.
    Args:
        n_holes (int): Number of patches to cut out of each image.
        length (int): The length (in pixels) of each square patch.
    """
    def __init__(self, n_holes, length):
        self.n_holes = n_holes
        self.length = length

    def __call__(self, img):
        """
        Args:
            img (Tensor): Tensor image of size (C, H, W).
        Returns:
            Tensor: Image with n_holes of dimension length x length cut out of it.
        """
        h = img.size(1)
        w = img.size(2)

        mask = np.ones((h, w), np.float32)

        for n in range(self.n_holes):
            y = np.random.randint(h)
            x = np.random.randint(w)

            y1 = np.clip(y - self.length // 2, 0, h)
            y2 = np.clip(y + self.length // 2, 0, h)
            x1 = np.clip(x - self.length // 2, 0, w)
            x2 = np.clip(x + self.length // 2, 0, w)

            mask[y1: y2, x1: x2] = 0.

        mask = torch.from_numpy(mask)
        mask = mask.expand_as(img)
        img = img * mask

        return img
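
A minimal usage sketch (following the paper's CIFAR-10 setting of n_holes=1 and length=16; Cutout goes after T.ToTensor() since it expects a (C, H, W) tensor):

import torchvision.transforms as T

train_transform = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    Cutout(n_holes=1, length=16), # operates on the tensor image
])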


MixUp

Zhang H., Cisse M., Dauphin Y. N. and Lopez-Paz D. mixup: Beyond empirical risk minimization. In International Conference on Learning Representations (ICLR), 2018.

Code

MixUp blends two images with a random ratio:
$$\tilde{x} = \lambda x_A + (1 - \lambda) x_B, \qquad \lambda \sim \mathrm{Beta}(\alpha, \alpha), \ \alpha \in (0, +\infty),$$
and mixes the labels with the same coefficient:
$$\tilde{y} = \lambda y_A + (1 - \lambda) y_B.$$
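
A minimal batch-level sketch (not the authors' exact code; it assumes y is already one-hot or a probability vector, and mixes a batch with a shuffled copy of itself):

import numpy as np
import torch

def mixup(x, y, alpha=1.0):
    # x: (B, C, H, W) images, y: (B, num_classes) one-hot / soft labels
    lam = np.random.beta(alpha, alpha)
    index = torch.randperm(x.size(0))
    x_tilde = lam * x + (1 - lam) * x[index]
    y_tilde = lam * y + (1 - lam) * y[index]
    return x_tilde, y_tilde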


CutMix

Yun S., Han D., Oh S. J., Chun S., Choe J. and Yoo Y. CutMix: regularization strategy to train strong classifiers with localizable features. In IEEE International Conference on Computer Vision (ICCV), 2019.

Official code

CutMix cuts a rectangular patch out of one image and fills it with the corresponding region of another image:
$$\tilde{x} = M \circ x_A + (1 - M) \circ x_B,$$
where $M \in \{0, 1\}^{H \times W}$ is a binary mask generated as follows:
$$r_x \sim U[0, W], \quad r_y \sim U[0, H], \qquad r_w = W\sqrt{1 - \lambda}, \quad r_h = H\sqrt{1 - \lambda}.$$
These define a bounding box; $M_{ij} = 0$ inside the box and $1$ elsewhere, so $\lambda$ controls the box size via
$$\frac{r_w r_h}{HW} = 1 - \lambda.$$

The labels then change accordingly:
$$\tilde{y} = \lambda y_A + (1 - \lambda) y_B,$$
where the labels are one-hot (or probability) vectors.
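
A rough batch-level sketch of the same procedure (not the official implementation; it assumes one-hot / soft labels y and mixes a batch with a shuffled copy of itself):

import numpy as np
import torch

def cutmix(x, y, alpha=1.0):
    # x: (B, C, H, W) images, y: (B, num_classes) one-hot / soft labels
    B, _, H, W = x.size()
    lam = np.random.beta(alpha, alpha)
    index = torch.randperm(B)

    # sample the box center and size so that its area is roughly (1 - lam) * H * W
    rx, ry = np.random.randint(W), np.random.randint(H)
    rw, rh = int(W * np.sqrt(1 - lam)), int(H * np.sqrt(1 - lam))
    x1, x2 = np.clip(rx - rw // 2, 0, W), np.clip(rx + rw // 2, 0, W)
    y1, y2 = np.clip(ry - rh // 2, 0, H), np.clip(ry + rh // 2, 0, H)

    x_tilde = x.clone()
    x_tilde[:, :, y1:y2, x1:x2] = x[index, :, y1:y2, x1:x2]
    # recompute lambda from the actual (possibly clipped) box area
    lam = 1 - (x2 - x1) * (y2 - y1) / (H * W)
    y_tilde = lam * y + (1 - lam) * y[index]
    return x_tilde, y_tilde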


AugMix

Hendrycks D., Mu N., Cubuk E. D., Zoph B., Gilmer J. and Lakshminarayanan B. AugMix: a simple data processing method to improve robustness and uncertainty. In International Conference on Learning Representations (ICLR), 2020.

Data-driven

AutoAugment

Cubuk E. D., Zoph B., Mané D., Vasudevan V. and Le Q. V. AutoAugment: learning augmentation strategies from data. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

AutoAugment searches for a suitable augmentation policy for the given dataset.
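
For reference, recent torchvision releases ship the searched policies; a minimal usage sketch (availability of T.AutoAugment depends on your torchvision version):

import torchvision.transforms as T

transform = T.Compose([
    T.AutoAugment(T.AutoAugmentPolicy.CIFAR10), # policy searched on CIFAR-10
    T.ToTensor(),
])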

RandAugment

Cubuk E. D., Zoph B., Shlens J. and Le Q. V. RandAugment: practical automated data augmentation with a reduced search space.

Randomly sample $N$ augmentations (with replacement) from a pool of $K$, applying each with the same magnitude $M$; both $N$ and $M$ are chosen by grid search (rather brute-force).
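
Recent torchvision releases also provide this transform; a minimal sketch (parameter names as in current torchvision, which depends on your installed version):

import torchvision.transforms as T

transform = T.Compose([
    T.RandAugment(num_ops=2, magnitude=9), # N = 2 ops per image, shared magnitude M = 9
    T.ToTensor(),
])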

DeepAugment

Hendrycks D., Basart S., Mu N., Kadavath S., Wang F., Dorundo E., Desai R., Zhu T., Parajuli S., Guo M., Song D., Steinhardt J. and Gilmer J. The many faces of robustness: a critical analysis of out-of-distribution generalization. arXiv preprint arXiv:2006.16241, 2020.

Let $f(\cdot;\theta)$ be an image-to-image network. DeepAugment perturbs $\theta$ itself, e.g. by distorting the weights or changing the activation functions, and passes clean images through the perturbed network to obtain augmented images.
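
A rough illustrative sketch of the weight-perturbation idea (not the authors' pipeline; image_to_image_net stands in for any pretrained image-to-image model, e.g. a super-resolution network or autoencoder):

import copy
import torch

@torch.no_grad()
def deepaugment_sketch(image_to_image_net, images, noise_std=0.05):
    # Perturb a copy of the network's weights and use its outputs as augmented images.
    f = copy.deepcopy(image_to_image_net)
    for p in f.parameters():
        p.add_(noise_std * torch.randn_like(p)) # random perturbation of theta
    return f(images) # distorted-but-plausible versions of the inputs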

AdversarialAugment follows a similar idea.
