Random Erasing: Principles and Code Walkthrough

00000cj

Published 2024-04-23 21:53:59

Article link: https://blog.csdn.net/ooooocj/article/details/138131894

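This post introduces Random Erasing, a data augmentation technique that randomly occludes regions of an image to make models more robust to occlusion. In image classification, object detection, and person re-identification, it complements conventional augmentations such as random cropping and improves model performance.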

Paper: Random Erasing Data Augmentation

Official implementation: https://github.com/zhunzhong07/Random-Erasing

Third-party implementation: https://github.com/huggingface/pytorch-image-models/blob/main/timm/data/random_erasing.py

Contributions of the paper

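The paper proposes Random Erasing, a new data augmentation method for training CNNs. During training it randomly selects a rectangular region of the image and overwrites its pixels with random values, producing training images with varying levels of occlusion. This reduces the risk of overfitting and makes the model robust to occlusion. Random Erasing has no parameters to learn, is easy to implement, and can be integrated with most CNN-based recognition models. Despite its simplicity, it is complementary to common augmentations such as random cropping and flipping, and it yields consistent improvements on image classification, object detection, and person re-identification.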

Method

The pseudocode of Random Erasing is given as Algorithm 1 in the paper.

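During training, Random Erasing is applied with probability \(p\), so the image is left unchanged with probability \(1-p\). When applied, it randomly selects a rectangular region \(I_e\) of the image and replaces its pixels with random values. Let the training image have size \(W\times H\), so its area is \(S=W\times H\). The area of the erased rectangle is randomly initialized to \(S_e\), with the ratio \(\frac{S_e}{S}\) lying in the specified range \([s_l,s_h]\). The aspect ratio \(r_e\) of the rectangle is randomly initialized within the specified range \([r_1,r_2]\). The size of \(I_e\) is then \(H_e=\sqrt{S_e\times r_e}\) and \(W_e=\sqrt{\frac{S_e}{r_e}}\), so that \(H_e\times W_e=S_e\) and \(\frac{H_e}{W_e}=r_e\). Next, a point \(\mathcal{P}=(x_e,y_e)\) is chosen at random in the image \(I\). If \(x_e+W_e\le W\) and \(y_e+H_e\le H\), the erased region is set to \(I_e=(x_e,y_e,x_e+W_e,y_e+H_e)\); otherwise the procedure is repeated until a valid \(I_e\) is found.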
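
The procedure described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' code: the function names `sample_erase_box` and `random_erase`, the grey-scale list-of-rows image format, and the `max_tries` cap (added so the retry loop always terminates) are assumptions made for this example.

```python
import math
import random


def sample_erase_box(W, H, s_l=0.02, s_h=0.4, r_1=0.3, r_2=1 / 0.3,
                     rng=random, max_tries=1000):
    """Sample I_e = (x_e, y_e, x_e + W_e, y_e + H_e), or None if nothing fits.

    max_tries is an assumption of this sketch; the description in the text
    simply repeats until a valid box is found.
    """
    S = W * H
    for _ in range(max_tries):
        S_e = rng.uniform(s_l, s_h) * S      # erased area, S_e / S in [s_l, s_h]
        r_e = rng.uniform(r_1, r_2)          # aspect ratio H_e / W_e
        H_e = int(math.sqrt(S_e * r_e))
        W_e = int(math.sqrt(S_e / r_e))
        x_e = rng.randint(0, W - 1)          # random point P = (x_e, y_e)
        y_e = rng.randint(0, H - 1)
        if W_e > 0 and H_e > 0 and x_e + W_e <= W and y_e + H_e <= H:
            return x_e, y_e, x_e + W_e, y_e + H_e
    return None


def random_erase(image, p=0.5, rng=random, **box_kwargs):
    """Erase a random rectangle of `image` in place with probability p.

    `image` is a list of H rows, each a list of W grey values in [0, 255].
    """
    if rng.random() >= p:                    # keep the image with probability 1 - p
        return image
    H, W = len(image), len(image[0])
    box = sample_erase_box(W, H, rng=rng, **box_kwargs)
    if box is not None:
        x0, y0, x1, y1 = box
        for y in range(y0, y1):
            for x in range(x0, x1):
                image[y][x] = rng.randint(0, 255)   # random replacement value
    return image


if __name__ == "__main__":
    rng = random.Random(0)
    img = [[-1] * 32 for _ in range(32)]     # -1 marks pixels that were not erased
    random_erase(img, p=1.0, rng=rng)
    erased = sum(v != -1 for row in img for v in row)
    print(erased, "pixels erased")
```

Because both side lengths are rounded down, the erased area never exceeds \(s_h \times S\) in this sketch.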

Random Erasing for Image Classification and Person Re-identification

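In image classification, an image is classified according to its visual content. The training data generally does not provide object locations, so we do not know where the object is. In this case Random Erasing is applied to the whole image following Algorithm 1.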

Examples of Random Erasing for image classification and person re-identification are shown in Figure 1.

Random Erasing for Object Detection

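Object detection aims to detect instances of semantic objects of a given class in an image. Because the location of every object in a training image is known, Random Erasing is implemented with three schemes: 1) Image-aware Random Erasing (IRE): the erased region is selected on the whole image, as in classification and person re-identification. 2) Object-aware Random Erasing (ORE): the erased region is selected inside the bounding box of each object; if the image contains several objects, Random Erasing is applied to each object separately. 3) Image and object-aware Random Erasing (I+ORE): erased regions are selected both on the whole image and inside each object's bounding box. Examples of the three schemes are shown in Figure 2.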
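
The difference between the three schemes comes down to which regions an erase box is sampled from. The sketch below illustrates this under several assumptions that are not from the paper: the names `sample_box_in` and `erase_boxes` are hypothetical, each region is erased independently with probability `p`, the top-left corner is sampled so that the box always fits (a simplification of the retry loop), and the default area and aspect-ratio ranges are the ones this post reports for the detection experiments.

```python
import math
import random


def sample_box_in(region, s_l=0.02, s_h=0.2, r_1=0.3, r_2=1 / 0.3,
                  rng=random, max_tries=100):
    """Sample an erase box inside region = (x0, y0, x1, y1); None if none fits."""
    x0, y0, x1, y1 = region
    W, H = x1 - x0, y1 - y0
    for _ in range(max_tries):
        S_e = rng.uniform(s_l, s_h) * W * H
        r_e = rng.uniform(r_1, r_2)
        H_e = int(math.sqrt(S_e * r_e))
        W_e = int(math.sqrt(S_e / r_e))
        if 0 < W_e <= W and 0 < H_e <= H:
            # Simplification: pick the corner so the box is guaranteed to fit.
            x_e = x0 + rng.randint(0, W - W_e)
            y_e = y0 + rng.randint(0, H - H_e)
            return x_e, y_e, x_e + W_e, y_e + H_e
    return None


def erase_boxes(image_size, object_boxes, scheme, p=0.5, rng=random):
    """Return the rectangles to erase under scheme 'IRE', 'ORE' or 'I+ORE'."""
    if scheme not in ("IRE", "ORE", "I+ORE"):
        raise ValueError(f"unknown scheme: {scheme!r}")
    W, H = image_size
    regions = []
    if scheme in ("IRE", "I+ORE"):
        regions.append((0, 0, W, H))         # the whole image
    if scheme in ("ORE", "I+ORE"):
        regions.extend(object_boxes)         # one region per object
    boxes = []
    for region in regions:
        if rng.random() >= p:                # assumption: independent per region
            continue
        box = sample_box_in(region, rng=rng)
        if box is not None:
            boxes.append(box)
    return boxes


if __name__ == "__main__":
    objects = [(10, 20, 110, 120), (150, 50, 290, 190)]
    for scheme in ("IRE", "ORE", "I+ORE"):
        print(scheme, erase_boxes((300, 200), objects, scheme, p=1.0,
                                  rng=random.Random(0)))
```

With `p=1.0` and two objects, IRE yields one rectangle, ORE two, and I+ORE three.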

Comparison with Random Cropping

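Random cropping is an effective data augmentation method: it reduces the contribution of the background to the CNN's decision and lets the model learn from parts of an object rather than focusing on the whole object. Compared with random cropping, Random Erasing preserves the overall structure of the object and only occludes some of its parts. In addition, the pixels of the erased region are reassigned random values, which can be viewed as adding noise to the image. The authors show experimentally that the two methods are complementary. Examples of Random Erasing, random cropping, and their combination are shown in Figure 3.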

Experimental results

Image classification

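Table 1 shows the effect of Random Erasing on classification, with hyperparameters \(p=0.5, s_l=0.02, s_h=0.4, r_1=\frac{1}{r_2}=0.3\). Across three datasets and different models, Random Erasing brings a clear improvement.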

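The authors also compare four ways of reassigning the pixels inside the rectangle: sampling each pixel randomly from [0, 255] (RE-R); assigning the ImageNet mean pixel value [125, 122, 114] (RE-M); assigning 0 (RE-0); and assigning 255 (RE-255). As Table 2 shows, all erasing variants outperform the baseline; RE-R and RE-M perform similarly, and both outperform RE-0 and RE-255. Subsequent experiments use RE-R by default.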
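
The four variants differ only in the value written into the erased rectangle. A minimal sketch follows, assuming an image stored as rows of RGB tuples; the helper names `erase_value` and `fill_box` are hypothetical, and sampling each channel independently for RE-R is an assumption of this example.

```python
import random

IMAGENET_MEAN = (125, 122, 114)  # mean pixel value quoted in the text


def erase_value(mode, rng=random):
    """Return one RGB pixel value for the given erasing variant."""
    if mode == "RE-R":
        # Assumption: each channel is sampled independently from [0, 255].
        return tuple(rng.randint(0, 255) for _ in range(3))
    if mode == "RE-M":
        return IMAGENET_MEAN
    if mode == "RE-0":
        return (0, 0, 0)
    if mode == "RE-255":
        return (255, 255, 255)
    raise ValueError(f"unknown mode: {mode!r}")


def fill_box(image, box, mode, rng=random):
    """Overwrite box = (x0, y0, x1, y1) of `image` (rows of RGB tuples) in place."""
    x0, y0, x1, y1 = box
    for y in range(y0, y1):
        for x in range(x0, x1):
            image[y][x] = erase_value(mode, rng)   # a fresh value for every pixel
    return image


if __name__ == "__main__":
    img = [[(9, 9, 9)] * 4 for _ in range(4)]
    fill_box(img, (1, 1, 3, 3), "RE-M")
    print(img[1][1], img[0][0])
```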

Object detection

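Table 5 shows the effect of Random Erasing on the VOC dataset, with hyperparameters \(p=0.5, s_l=0.02, s_h=0.2, r_1=\frac{1}{r_2}=0.3\). All three ways of choosing the erased region improve mAP, and I+ORE performs best.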

Person re-identification

Table 6 shows the effect of Random Erasing on person re-identification.

Code walkthrough

Below is the implementation in timm. It is quite simple, so it is not explained in further detail here.

import random
import math

import torch


def _get_pixels(per_pixel, rand_color, patch_size, dtype=torch.float32, device='cuda'):
    # NOTE I've seen CUDA illegal memory access errors being caused by the normal_()
    # paths, flip the order so normal is run on CPU if this becomes a problem
    # Issue has been fixed in master https://github.com/pytorch/pytorch/issues/19508
    if per_pixel:
        return torch.empty(patch_size, dtype=dtype, device=device).normal_()
    elif rand_color:
        return torch.empty((patch_size[0], 1, 1), dtype=dtype, device=device).normal_()
    else:
        return torch.zeros((patch_size[0], 1, 1), dtype=dtype, device=device)


class RandomErasing:
    """ Randomly selects a rectangle region in an image and erases its pixels.
        'Random Erasing Data Augmentation' by Zhong et al.
        See https://arxiv.org/pdf/1708.04896.pdf

        This variant of RandomErasing is intended to be applied to either a batch
        or single image tensor after it has been normalized by dataset mean and std.
    Args:
         probability: Probability that the Random Erasing operation will be performed.
         min_area: Minimum percentage of erased area wrt input image area.
         max_area: Maximum percentage of erased area wrt input image area.
         min_aspect: Minimum aspect ratio of erased area.
         mode: pixel color mode, one of 'const', 'rand', or 'pixel'
            'const' - erase block is constant color of 0 for all channels
            'rand'  - erase block is same per-channel random (normal) color
            'pixel' - erase block is per-pixel random (normal) color
        max_count: maximum number of erasing blocks per image, area per box is scaled by count.
            per-image count is randomly chosen between 1 and this value.
    """

    def __init__(
            self,
            probability=0.5,
            min_area=0.02,
            max_area=1/3,
            min_aspect=0.3,
            max_aspect=None,
            mode='const',
            min_count=1,
            max_count=None,
            num_splits=0,
            device='cuda',
    ):
        self.probability = probability
        self.min_area = min_area
        self.max_area = max_area
        max_aspect = max_aspect or 1 / min_aspect
        self.log_aspect_ratio = (math.log(min_aspect), math.log(max_aspect))
        self.min_count = min_count
        self.max_count = max_count or min_count
        self.num_splits = num_splits
        self.mode = mode.lower()
        self.rand_color = False
        self.per_pixel = False
        if self.mode == 'rand':
            self.rand_color = True  # per block random normal
        elif self.mode == 'pixel':
            self.per_pixel = True  # per pixel random normal
        else:
            assert not self.mode or self.mode == 'const'
        self.device = device

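    def _erase(self, img, chan, img_h, img_w, dtype):
        # apply erasing with probability self.probability, otherwise leave img unchanged
        if random.random() > self.probability:
            return
        area = img_h * img_w
        # number of blocks to erase in this image
        count = self.min_count if self.min_count == self.max_count else \
            random.randint(self.min_count, self.max_count)
        for _ in range(count):
            # up to 10 attempts to sample a block that fits inside the image
            for attempt in range(10):
                # block area, scaled down by the number of blocks
                target_area = random.uniform(self.min_area, self.max_area) * area / count
                # aspect ratio sampled uniformly in log space
                aspect_ratio = math.exp(random.uniform(*self.log_aspect_ratio))
                h = int(round(math.sqrt(target_area * aspect_ratio)))
                w = int(round(math.sqrt(target_area / aspect_ratio)))
                if w < img_w and h < img_h:
                    # sample the top-left corner so the block always fits
                    top = random.randint(0, img_h - h)
                    left = random.randint(0, img_w - w)
                    # overwrite the block in place with the configured fill
                    img[:, top:top + h, left:left + w] = _get_pixels(
                        self.per_pixel,
                        self.rand_color,
                        (chan, h, w),
                        dtype=dtype,
                        device=self.device,
                    )
                    break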

    def __call__(self, input):
        if len(input.size()) == 3:
            self._erase(input, *input.size(), input.dtype)
        else:
            batch_size, chan, img_h, img_w = input.size()
            # skip first slice of batch if num_splits is set (for clean portion of samples)
            batch_start = batch_size // self.num_splits if self.num_splits > 1 else 0
            for i in range(batch_start, batch_size):
                self._erase(input[i], chan, img_h, img_w, input.dtype)
        return input

    def __repr__(self):
        # NOTE simplified state for repr
        fs = self.__class__.__name__ + f'(p={self.probability}, mode={self.mode}'
        fs += f', count=({self.min_count}, {self.max_count}))'
        return fs
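
One detail of the listing above is worth a closer look: the aspect ratio is drawn as `math.exp(random.uniform(*self.log_aspect_ratio))`, that is, uniformly in log space over `[min_aspect, max_aspect]`. The small experiment below is a sketch of why that matters, not part of timm; the helper names `log_uniform` and `fraction_below_one` are hypothetical. With `min_aspect=0.3` and `max_aspect=1/0.3`, log-space sampling puts about half of the samples below 1 and half above, whereas sampling the ratio uniformly on the same interval would put only about (1 - 0.3) / (1/0.3 - 0.3), roughly 23%, of the samples below 1.

```python
import math
import random


def log_uniform(lo, hi, rng=random):
    """Sample from [lo, hi] uniformly in log space, as the listing above does."""
    return math.exp(rng.uniform(math.log(lo), math.log(hi)))


def fraction_below_one(sampler, n=100_000):
    """Estimate the probability that a sampled aspect ratio is below 1."""
    return sum(sampler() < 1.0 for _ in range(n)) / n


rng = random.Random(0)
min_aspect = 0.3
max_aspect = 1 / min_aspect

frac_log = fraction_below_one(lambda: log_uniform(min_aspect, max_aspect, rng))
frac_lin = fraction_below_one(lambda: rng.uniform(min_aspect, max_aspect))

print(f"log-uniform: {frac_log:.3f} of samples below 1")  # close to 0.5
print(f"uniform:     {frac_lin:.3f} of samples below 1")  # close to 0.23
```

Sampling in log space therefore treats a ratio and its reciprocal symmetrically, so tall and wide blocks are equally likely.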
