Data Augmentation Methods for Object Detection

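Applying data augmentation to images during training improves a model's ability to generalize. Data augmentation for object detection differs from ordinary image augmentation in that the bounding boxes have to be transformed along with the image.

Common augmentations include random cropping, distortion, expansion, mirroring, and deformation.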

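Random cropping

  1. Randomly crop a rectangular region (roi) from the image
  2. Compute the IoU between roi and each ground-truth box; if it is too small, crop again
  3. Keep the boxes whose centers fall inside roi
  4. Clip the kept boxes so that they do not extend beyond roi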
import math
import random

import cv2  # used by the distortion and deformation functions
import numpy as np


def _crop(image, boxes, labels):
    height, width, _ = image.shape

    if len(boxes) == 0:
        return image, boxes, labels

    while True:
        # Pick an IoU constraint at random; None returns the image unchanged
        mode = random.choice((
            None,
            (0.1, None),
            (0.3, None),
            (0.5, None),
            (0.7, None),
            (0.9, None),
            (None, None)
        ))
        if mode is None:
            return image, boxes, labels

        min_iou, max_iou = mode
        if min_iou is None:
            min_iou = float('-inf')
        if max_iou is None:
            max_iou = float('inf')
        # Try up to 50 candidate crops for this constraint
        for _ in range(50):
            scale = random.uniform(0.3, 1.)
            min_ratio = max(0.5, scale*scale)
            max_ratio = min(2, 1./scale/scale)
            ratio = math.sqrt(random.uniform(min_ratio, max_ratio))
            # Size of the random crop
            w = int(scale*ratio*width)
            h = int((scale/ratio)*height)
            # Pick the top-left corner in [0, width - w]; the +1 keeps
            # randrange valid when the crop is as wide or tall as the image
            l = random.randrange(width - w + 1)
            t = random.randrange(height - h + 1)
            roi = np.array([l, t, l+w, t+h])
            # IoU between every box and the candidate crop;
            # resample if any value falls outside [min_iou, max_iou]
            iou = matrix_iou(boxes, roi[np.newaxis])
            if not (min_iou <= iou.min() and iou.max() <= max_iou):
                continue

            image_t = image[roi[1]:roi[3], roi[0]:roi[2]]
            # Center point of every box
            centers = (boxes[:, :2] + boxes[:, 2:])/2
            # Keep the boxes whose center lies inside roi
            mask = np.logical_and(roi[:2] < centers, centers < roi[2:]).all(axis=1)
            boxes_t = boxes[mask].copy()
            labels_t = labels[mask].copy()

            # No object left in the crop: try again
            if len(boxes_t) == 0:
                continue
            # Clip the boxes to roi and shift them into crop coordinates
            boxes_t[:, :2] = np.maximum(boxes_t[:, :2], roi[:2])
            boxes_t[:, :2] -= roi[:2]
            boxes_t[:, 2:] = np.minimum(boxes_t[:, 2:], roi[2:])
            boxes_t[:, 2:] -= roi[:2]

            return image_t, boxes_t, labels_t

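def matrix_iou(a, b):
    """Pairwise IoU between boxes a (N, 4) and b (M, 4) in (x1, y1, x2, y2) form."""
    # Top-left and bottom-right corners of every pairwise intersection
    lt = np.maximum(a[:, np.newaxis, :2], b[:, :2])
    rb = np.minimum(a[:, np.newaxis, 2:], b[:, 2:])

    # Intersection area, set to zero where the two boxes do not overlap
    area_i = np.prod(rb - lt, axis=2) * (lt < rb).all(axis=2)
    area_a = np.prod(a[:, 2:] - a[:, :2], axis=1)
    area_b = np.prod(b[:, 2:] - b[:, :2], axis=1)
    return area_i / (area_a[:, np.newaxis] + area_b - area_i)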
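
To make the IoU test in the crop loop concrete, here is a small worked example. It is only a sketch: `pairwise_iou` is a hypothetical stand-alone helper written for this illustration (it follows the same broadcasting idea as `matrix_iou`), and the boxes and the crop window are made-up numbers.

```python
import numpy as np


def pairwise_iou(a, b):
    """IoU between every box in a (N, 4) and every box in b (M, 4)."""
    lt = np.maximum(a[:, np.newaxis, :2], b[:, :2])  # (N, M, 2) top-left
    rb = np.minimum(a[:, np.newaxis, 2:], b[:, 2:])  # (N, M, 2) bottom-right
    wh = np.clip(rb - lt, 0, None)                   # zero size if no overlap
    inter = wh[..., 0] * wh[..., 1]
    area_a = np.prod(a[:, 2:] - a[:, :2], axis=1)
    area_b = np.prod(b[:, 2:] - b[:, :2], axis=1)
    return inter / (area_a[:, np.newaxis] + area_b - inter)


# Two made-up ground-truth boxes and one candidate crop window
boxes = np.array([[0., 0., 10., 10.],
                  [20., 20., 30., 30.]])
roi = np.array([5., 5., 15., 15.])

iou = pairwise_iou(boxes, roi[np.newaxis])  # shape (2, 1)
# First box: intersection 5 * 5 = 25, union 100 + 100 - 25 = 175
# Second box: no overlap, so its IoU is 0
accepted = 0.1 <= iou.min()                 # the test used with min_iou = 0.1
print(iou.ravel(), accepted)
```

Because the crop loop compares `min_iou` against `iou.min()`, this window would be rejected under a 0.1 threshold: one box that does not overlap the crop is enough to fail the test.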

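Random distortion

  1. Randomly change the brightness: add a random offset to the pixel values
  2. Randomly change the contrast: multiply the pixel values by a random factor
  3. Randomly change the hue: convert the image to HSV space and add a random offset to the H channel
  4. Randomly change the saturation: multiply the S channel in HSV space by a random factor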
def _distort(image):
    def _convert(image, alpha=1, beta=0):
        # In-place image * alpha + beta, clipped to [0, 255]
        tmp = image.astype(float) * alpha + beta
        tmp[tmp < 0] = 0
        tmp[tmp > 255] = 255
        image[:] = tmp

    image = image.copy()
    # Random brightness change
    if random.randrange(2):
        _convert(image, beta=random.uniform(-32, 32))

    # Random contrast change
    if random.randrange(2):
        _convert(image, alpha=random.uniform(0.5, 1.5))
    # Convert to HSV (for 8-bit images OpenCV stores hue in [0, 180))
    image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

    # Random hue change, wrapping around at 180
    if random.randrange(2):
        tmp = image[:, :, 0].astype(int) + random.randint(-18, 18)
        tmp %= 180
        image[:, :, 0] = tmp

    # Random saturation change
    if random.randrange(2):
        _convert(image[:, :, 1], alpha=random.uniform(0.5, 1.5))

    image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR)

    return image
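
The arithmetic behind these steps can be checked on a few pixel values without OpenCV. The sketch below is illustrative only: `scale_shift` and `shift_hue` are hypothetical helper names, and the pixel values are made up. It assumes 8-bit data, where OpenCV stores hue in the range [0, 180).

```python
import numpy as np


def scale_shift(channel, alpha=1.0, beta=0.0):
    """Return channel * alpha + beta, clipped to the uint8 range [0, 255]."""
    out = channel.astype(np.float64) * alpha + beta
    return np.clip(out, 0, 255).astype(np.uint8)


def shift_hue(hue, delta):
    """Shift an 8-bit OpenCV-style hue channel, wrapping around at 180."""
    return ((hue.astype(np.int64) + delta) % 180).astype(np.uint8)


pixels = np.array([0, 100, 250], dtype=np.uint8)
brighter = scale_shift(pixels, beta=20)    # brightness: add an offset
contrast = scale_shift(pixels, alpha=1.5)  # contrast: multiply by a factor

# Hue is an angle, so it wraps instead of saturating
hue = shift_hue(np.array([5, 170], dtype=np.uint8), 15)
hue_neg = shift_hue(np.array([5], dtype=np.uint8), -10)

print(brighter, contrast, hue, hue_neg)
```

Brightness and contrast saturate at 0 and 255, while hue wraps around. That is why the hue step uses a modulo operation rather than clipping.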

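Random expansion

  1. Randomly generate a new rectangular region (roi) that is larger than the original image
  2. Place the original image inside roi
  3. Adjust the boxes
  4. Fill the blank area with the image mean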
def _expand(image, boxes, fill, p):
    # Expand with probability p; otherwise return the inputs unchanged
    if random.random() > p:
        return image, boxes

    height, width, depth = image.shape
    for _ in range(50):
        scale = random.uniform(1, 4)
        min_ratio = max(0.5, 1./scale/scale)
        max_ratio = min(2, scale*scale)
        ratio = math.sqrt(random.uniform(min_ratio, max_ratio))
        # Size of the expanded canvas relative to the original image
        ws = scale*ratio
        hs = scale/ratio
        if ws < 1 or hs < 1:
            continue
        w = int(ws*width)
        h = int(hs*height)

        left = random.randint(0, w - width)
        top = random.randint(0, h - height)
        # Shift the boxes to their position on the new canvas
        boxes_t = boxes.copy()
        boxes_t[:, :2] += (left, top)
        boxes_t[:, 2:] += (left, top)

        # Fill the canvas, then paste the original image into it
        expand_image = np.empty((h, w, depth), dtype=image.dtype)
        expand_image[:, :] = fill
        expand_image[top:top + height, left:left+width] = image
        image = expand_image

        return image, boxes_t

    # No valid canvas after 50 attempts: return the inputs unchanged
    return image, boxes
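
A deterministic version of the paste-and-shift step makes the box adjustment easy to verify by hand. This is a minimal sketch: `place_on_canvas` is a hypothetical helper, the canvas size and offsets are fixed instead of random, and the fill value is an arbitrary placeholder rather than a real dataset mean.

```python
import numpy as np


def place_on_canvas(image, boxes, canvas_hw, left, top, fill):
    """Paste image onto a larger canvas filled with `fill` and shift the boxes."""
    height, width, depth = image.shape
    canvas_h, canvas_w = canvas_hw
    canvas = np.empty((canvas_h, canvas_w, depth), dtype=image.dtype)
    canvas[:, :] = fill                                   # broadcast per channel
    canvas[top:top + height, left:left + width] = image
    shifted = boxes.copy()                                # keep the input intact
    shifted[:, :2] += (left, top)
    shifted[:, 2:] += (left, top)
    return canvas, shifted


image = np.full((2, 3, 3), 200, dtype=np.uint8)  # 2 rows, 3 columns, 3 channels
boxes = np.array([[0., 0., 3., 2.]])             # one box covering the image
canvas, shifted = place_on_canvas(
    image, boxes, canvas_hw=(4, 6), left=2, top=1, fill=(104, 117, 123))

print(canvas.shape, shifted)
```

Both corners of every box move by the same offset `(left, top)`, so box widths and heights are unchanged. Only their position on the larger canvas differs.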

Random mirroring

def _mirror(image, boxes):
    _, width, _ = image.shape
    if random.randrange(2):
        # Flip the image horizontally
        image = image[:, ::-1]
        # Flip the boxes: new x1 = width - old x2, new x2 = width - old x1
        boxes = boxes.copy()
        boxes[:, 0::2] = width - boxes[:, 2::-2]

    return image, boxes
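
The box update for a horizontal flip is easy to get wrong, because the left and right edges swap roles. The sketch below spells it out with a hypothetical helper `flip_boxes` and made-up numbers. It assumes boxes are stored as (x1, y1, x2, y2).

```python
import numpy as np


def flip_boxes(boxes, width):
    """Mirror (x1, y1, x2, y2) boxes horizontally in an image of given width."""
    flipped = boxes.copy()
    flipped[:, 0] = width - boxes[:, 2]  # new left edge comes from old right
    flipped[:, 2] = width - boxes[:, 0]  # new right edge comes from old left
    return flipped


image = np.arange(12).reshape(2, 6, 1)   # tiny image, 6 pixels wide
boxes = np.array([[1., 0., 3., 2.]])

mirrored = image[:, ::-1]                # reverse the column order
flipped = flip_boxes(boxes, image.shape[1])

print(flipped)
```

Flipping twice returns the original boxes, which is a handy sanity check when wiring this into a data pipeline.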

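Random deformation

P. Y. Simard, D. Steinkraus, and J. C. Platt, "Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis", in Proc. of the International Conference on Document Analysis and Recognition, 2003.

Elastic deformation generates, for every pixel and along each dimension, a random displacement in the interval (-1, 1). Each displacement field is then smoothed with a Gaussian filter (mean 0, standard deviation sigma), and finally a scaling factor alpha controls the range of the displacements. A pixel A(x, y) is thereby mapped to A'(x+delta_x, y+delta_y). The value of A' is obtained by interpolating the original image, and it replaces the value at the original position of A. In general, the smaller alpha is and the larger sigma is, the smaller the displacements and the closer the result stays to the original image.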
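
The description above can be written compactly as follows. The symbols U_x and U_y (uniform noise fields), G_sigma (a Gaussian kernel with standard deviation sigma), and * (2-D convolution) are notation introduced here for illustration; they do not come from the original text.

```latex
\begin{aligned}
U_x(x, y),\ U_y(x, y) &\sim \mathrm{Uniform}(-1, 1) \\
\Delta_x &= \alpha \, (G_\sigma * U_x), \qquad
\Delta_y = \alpha \, (G_\sigma * U_y) \\
A'(x, y) &= A\bigl(x + \Delta_x(x, y),\; y + \Delta_y(x, y)\bigr)
\end{aligned}
```

Because Gaussian filtering is linear, multiplying by alpha before or after the filter gives the same displacement field. The sampling positions are generally not integers, so A is evaluated there by interpolation.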

def _elastic(image, p, alpha=None, sigma=None, random_state=None):
    if random.random() > p:
        return image
    if alpha is None:
        alpha = image.shape[0] * random.uniform(0.5, 2)
    if sigma is None:
        sigma = int(image.shape[0] * random.uniform(0.5, 1))
    # Random number generator (pass a seeded RandomState for reproducibility)
    if random_state is None:
        random_state = np.random.RandomState(None)

    shape = image.shape[:2]
    dx, dy = [cv2.GaussianBlur((random_state.rand(*shape)*2-1) * alpha,
                               (sigma | 1, sigma | 1), 0) for _ in range(2)]

    x, y = np.meshgrid(np.arange(shape[1]), np.arange(shape[0]))
    x, y = np.clip(x+dx, 0, shape[1]-1).astype(np.float32), np.clip(y+dy, 0, shape[0]-1).astype(np.float32)
    return cv2.remap(image, x, y, interpolation=cv2.INTER_LINEAR, borderValue=0, borderMode=cv2.BORDER_REFLECT)

Experimental results

Before augmentation

[Image: 777d8ae2a80d03106350339be87bb957.png]

After augmentation

[Image: 4264b8d37a2e6485809f543adc0e1801.png]

References

[1] Object Detection: Data Augmentation (Numpy+Pytorch)

[2] Adding Samples: An Implementation of the Elastic Transform Algorithm

[3] https://github.com/jinfagang/ssds_pytorch/blob/master/lib/utils/data_augment.py
