[Trick 7] Rect (rectangular inference): significantly reducing inference time

满船清梦压星河HK

Last modified: 2022-04-09 10:14:48

First published: 2021-06-16 20:06:22

Article link: https://blog.csdn.net/qq_38253797/article/details/116611767

1. Square Inference (letterbox)

To explain what rectangular inference is, we first need to look at square inference.

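In square inference the network input is a square. The usual approach is letterbox: compute the ratio that scales the longer side to 416, resize both the width and the height by that ratio so the longer side reaches 416, then pad the shorter side until it also reaches 416. The result is shown in the figure below:
[Figure: letterboxed image padded to a 416x416 square]
However, this produces a lot of redundant information (the padded area). Rectangular training/inference removes that redundancy.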
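To make the redundancy concrete, here is a minimal sketch of the square letterbox arithmetic described above. The function name `square_letterbox_shape` is hypothetical and only computes sizes; it does not resize any pixels.

```python
def square_letterbox_shape(h, w, target=416):
    """Return (resized_h, resized_w, pad_h, pad_w) for square letterboxing.

    Hypothetical helper for illustration: it only does the size arithmetic.
    """
    r = target / max(h, w)                     # ratio that maps the longer side to `target`
    new_h, new_w = round(h * r), round(w * r)  # both sides scaled by the same ratio
    pad_h = target - new_h                     # total padding needed along the height
    pad_w = target - new_w                     # total padding needed along the width
    return new_h, new_w, pad_h, pad_w


print(square_letterbox_shape(375, 500))  # (312, 416, 104, 0)
```

For a 375x500 image, 104 of the 416 rows (25% of the input) are padding that carries no image content.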

2. Rectangular inference

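Procedure: set the longer side to the target size 416/512… (which must be a multiple of 32), scale the shorter side by the same ratio, then pad the shorter side by the small amount needed to make it a multiple of 32 as well.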
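The procedure can be sketched as plain size arithmetic. This is an illustration under assumptions, not the article's implementation: the name `rect_shape` is hypothetical, and it rounds the shorter side up to the next multiple of a stride of 32, which is the smallest padding that satisfies the rule.

```python
import math


def rect_shape(h, w, target=416, stride=32):
    """Return (out_h, out_w, pad_h, pad_w) for rectangular inference.

    Hypothetical helper: assumes `target` is itself a multiple of `stride`.
    """
    r = target / max(h, w)                      # longer side is scaled to `target`
    new_h, new_w = round(h * r), round(w * r)   # shorter side follows the same ratio
    out_h = math.ceil(new_h / stride) * stride  # round each side up to a stride multiple
    out_w = math.ceil(new_w / stride) * stride
    return out_h, out_w, out_h - new_h, out_w - new_w


print(rect_shape(375, 500))  # (320, 416, 8, 0)
```

Compared with the 104 rows of padding that a 416x416 square needs for the same image, only 8 rows are padded here.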

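Padding logic (for a fixed target size):
Let P be the aspect ratio (width / height) of the target size. If the image's aspect ratio is greater than P, resize the width to the target width and pad black bars on the top and bottom; if it is less than P, resize the height to the target height and pad black bars on the left and right.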
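The decision rule above can be written as a small sketch. The function name `fit_to_target` and its return layout are assumptions made for illustration; the aspect ratio is taken as width / height.

```python
def fit_to_target(h, w, target_h, target_w):
    """Return (new_h, new_w, padded_sides, total_pad) for a fixed target size.

    Hypothetical helper: only decides which side is padded and by how much.
    """
    p = target_w / target_h                  # aspect ratio P of the target
    if w / h > p:
        # Image is relatively wider than the target: width reaches the target,
        # the leftover height is padded on the top and bottom.
        new_w = target_w
        new_h = round(h * target_w / w)
        return new_h, new_w, "top/bottom", target_h - new_h
    # Image is relatively taller (or equal): height reaches the target,
    # the leftover width is padded on the left and right.
    new_h = target_h
    new_w = round(w * target_h / h)
    return new_h, new_w, "left/right", target_w - new_w


print(fit_to_target(375, 500, 320, 416))  # (312, 416, 'top/bottom', 8)
print(fit_to_target(500, 375, 416, 416))  # (416, 312, 'left/right', 104)
```

When the two aspect ratios are equal, the second branch is taken and the total padding is 0.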

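The result looks like this:
[Figure: rectangular inference example, with the shorter side padded only up to a multiple of 32]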

3. Code

The letterbox function below implements both modes, letterbox and rect.
Pass auto=False for letterbox; pass auto=True for rect.

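import cv2
import numpy as np


def letterbox(img: np.ndarray, new_shape=(416, 416), color=(114, 114, 114),
              auto=True, scale_fill=False, scale_up=True):
    """
    Resize and pad an image to fit the requested shape.
    :param img: source image, HWC, e.g. (375, 500, 3)
    :param new_shape: target size, an int or (h, w); the image is scaled to fit inside it
    :param color: padding color
    :param auto: True  rect mode: scale the longer side to the target size, scale the shorter side
                       by the same ratio, then keep only the remainder of the padding
                       (see the np.mod step), so the output is a rectangle (no distortion)
                 False letterbox mode: same scaling, then pad the shorter side on both ends
                       up to the full target size (no distortion)
    :param scale_fill: True stretches the image straight to new_shape, i.e. a plain resize
                       with no padding (distorts); only takes effect when auto=False
    :param scale_up: True  images may be scaled up or down to reach new_shape
                     False images larger than new_shape are scaled down, smaller ones keep their size
    :return: img: the letterboxed image, HWC
             ratio: (w, h) scale ratios
             (dw, dh): padding applied on each side along w and h
    """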
    shape = img.shape[:2]  # source size [h, w] = [375, 500]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)  # e.g. (512, 512)

    # scale ratio (new / old), 1.024 for the example image with new_shape=512
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scale_up:  # (for better test mAP) only scale down (r < 1); never scale up (r > 1)
        r = min(r, 1.0)

    # compute padding
    ratio = r, r  # width, height ratios  (1.024, 1.024)
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))  # wh (512, 384), aspect ratio preserved
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding dw=0 dh=128
    if auto:  # minimum rectangle: keep the aspect ratio and drop the padding that is not needed
        # The modulo keeps the padded size a multiple of 32 for a 416x416 target,
        # and a multiple of 64 for a 512x512 target
        dw, dh = np.mod(dw, 64), np.mod(dh, 64)  # wh padding dw=0 dh=0
    elif scale_fill:  # stretch: resize straight to the target size, no padding
        dw, dh = 0, 0
        new_unpad = (new_shape[1], new_shape[0])  # wh, the order cv2.resize expects
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # wh ratios

    dw /= 2  # divide padding into 2 sides (left/right and top/bottom)
    dh /= 2

    # shape: [h, w]  new_unpad: [w, h]
    if shape[::-1] != new_unpad:  # resize to new_unpad (same aspect ratio as the source)
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))  # padding on the top and bottom
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))  # padding on the left and right

    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border/pad
    return img, ratio, (dw, dh)
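To see what the modulo step does to the output size, the padding arithmetic of the function can be traced without touching any pixels. This is a sketch under assumptions: `padded_shape` is a hypothetical helper that mirrors only the size computation for the auto=True and auto=False paths, and `multiple` stands in for the constant 64 used in the modulo.

```python
def padded_shape(h, w, new_shape=(416, 416), auto=True, multiple=64):
    """Trace the output (h, w) of the padding arithmetic (hypothetical helper)."""
    r = min(new_shape[0] / h, new_shape[1] / w)          # common scale ratio
    new_h, new_w = round(h * r), round(w * r)            # size after resizing
    dh, dw = new_shape[0] - new_h, new_shape[1] - new_w  # full letterbox padding
    if auto:
        dh, dw = dh % multiple, dw % multiple            # rect mode keeps only the remainder
    return new_h + dh, new_w + dw


for target in (416, 512):
    square = padded_shape(375, 500, (target, target), auto=False)
    rect = padded_shape(375, 500, (target, target), auto=True)
    saved = 1 - (rect[0] * rect[1]) / (square[0] * square[1])
    print(target, square, rect, f"{saved:.0%} fewer pixels")
# 416 (416, 416) (352, 416) 15% fewer pixels
# 512 (512, 512) (384, 512) 25% fewer pixels
```

With a 416 target, the modulo of 64 yields 352 rows, which is a multiple of 32 but not the smallest one; `padded_shape(375, 500, multiple=32)` gives (320, 416). The pixel count is only a rough proxy for inference cost, so the percentages should be read as an estimate of the input size reduction, not as a measured speedup.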