Rectangular training

REstrat

Tags: matrix, python, computer vision

Article link: https://blog.csdn.net/REstrat/article/details/126851437

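This post covers two strategies used when training YOLOv3: square training and rectangular training. Square training pads every image to a square, which simplifies processing but introduces redundant information. Rectangular training pads only the short side, up to a multiple of 32. This reduces the redundancy and speeds up training, but the images no longer share one size, which interferes with the data loader's shuffling.

Summary generated automatically by CSDN.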

Rectangular training is a trick used in YOLOv3.

Before it, training was always square training, meaning that every input image was a square.

Square training

Code

import cv2
import numpy as np


def square(img: np.ndarray, newshape=(414, 414), color=(128, 128, 128)):
    # img: input image (H x W x C); it was read with cv2 in the test
    # newshape: target (width, height); an int means a square of that size
    # color: pixel value used for the padding
    if isinstance(newshape, int):
        newshape = (newshape, newshape)
    h, w, _ = img.shape
    # Tall and wide images are handled separately so that the output
    # always has exactly the shape newshape
    if h > w:
        # Tall image: scale the height to the target, pad left and right
        r = newshape[1] / h
        h_ = newshape[1]
        w_ = int(round(w * r))
        img = cv2.resize(img, (w_, h_))
        left_pad = int((newshape[0] - w_) / 2)
        right_pad = newshape[0] - w_ - left_pad
        img = cv2.copyMakeBorder(img, 0, 0, left_pad, right_pad, cv2.BORDER_CONSTANT, value=color)
    else:
        # Wide (or square) image: scale the width to the target, pad top and bottom
        r = newshape[0] / w
        h_ = int(round(h * r))
        w_ = newshape[0]
        img = cv2.resize(img, (w_, h_))
        bottom_pad = int((newshape[1] - h_) / 2)
        top_pad = newshape[1] - h_ - bottom_pad
        img = cv2.copyMakeBorder(img, top_pad, bottom_pad, 0, 0, cv2.BORDER_CONSTANT, value=color)
    return img

Result

Input: 321x481

Output: 414x414

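Square training gives every image the same size, which makes training convenient. The problem is that the padding introduces a lot of redundant information. To address this, YOLOv3 uses rectangular training.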
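To put a number on that redundancy, here is a minimal sketch that estimates how much of a square-padded image is padding. The helper name `square_padding_fraction` is hypothetical and not part of the post's code; it assumes the long side is scaled to the target size, as `square` does.

```python
def square_padding_fraction(h: int, w: int, size: int = 414) -> float:
    """Fraction of a size x size padded image that is filler pixels."""
    r = size / max(h, w)                # scale so the long side equals `size`
    short = int(round(min(h, w) * r))   # short side after resizing
    return (size - short) / size        # padded strip relative to the full side


if __name__ == "__main__":
    # The 321x481 test image from the post
    print(square_padding_fraction(321, 481))
```

For the 321x481 test image the short side becomes 276 pixels, so 138 of the 414 rows are grey padding, which is one third of the input.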

Rectangular training

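Rectangular training is easy to understand: the long side of the image is still brought to the target length, but the short side is only padded up to the next multiple of 32.
This introduces much less redundant information and speeds up training.
It does introduce other problems, though. The first is that the images in the dataset no longer share one size. YOLOv3 handles this by batching images of similar size together, which means the dataloader's shuffle option can no longer be used.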
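The grouping step can be sketched roughly as follows. This is an illustration under stated assumptions, not the YOLOv3 code itself: the function name `make_rect_batches`, its parameters, and the choice to sort by aspect ratio and to round both sides up to a multiple of the stride are all assumptions made for the example.

```python
import math
from typing import List, Tuple


def make_rect_batches(
    shapes: List[Tuple[int, int]],  # (height, width) of each image
    batch_size: int,
    img_size: int = 416,            # assumed target length of the long side
    stride: int = 32,
) -> List[Tuple[List[int], Tuple[int, int]]]:
    """Group image indices by aspect ratio and pick one padded shape per batch."""
    # Sort indices by aspect ratio h / w so that neighbours have similar shapes.
    order = sorted(range(len(shapes)), key=lambda i: shapes[i][0] / shapes[i][1])
    batches = []
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        # Scale each image so its long side equals img_size, then track the
        # largest height and width seen in this batch.
        max_h = max_w = 0
        for i in idx:
            h, w = shapes[i]
            r = img_size / max(h, w)
            max_h = max(max_h, round(h * r))
            max_w = max(max_w, round(w * r))
        # Round both sides up to a multiple of the stride.
        batch_shape = (
            math.ceil(max_h / stride) * stride,
            math.ceil(max_w / stride) * stride,
        )
        batches.append((idx, batch_shape))
    return batches


if __name__ == "__main__":
    demo = [(321, 481), (481, 321), (300, 500), (500, 300)]
    for indices, shape in make_rect_batches(demo, batch_size=2):
        print(indices, shape)
```

Because batch membership is fixed by the sort order, individual images can no longer be shuffled freely between batches, which is the limitation described above.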

def rectangular(img: np.ndarray, newshape=414, color=(128, 128, 128)):
    # img: input image (H x W x C); it was read with cv2 in the test
    # newshape: target length of the long side; for a tuple or list the first item is used
    # color: pixel value used for the padding
    if isinstance(newshape, (tuple, list)):
        newshape = newshape[0]
    h, w, _ = img.shape
    if h > w:
        # Tall image: scale the height to newshape, pad the width to a multiple of 32
        r = newshape / h
        h_ = newshape
        w_ = int(round(w * r))
        img = cv2.resize(img, (w_, h_))
        pad = (-w_) % 32  # 0 when w_ is already a multiple of 32
        left_pad = pad // 2
        right_pad = pad - left_pad
        img = cv2.copyMakeBorder(img, 0, 0, left_pad, right_pad, cv2.BORDER_CONSTANT, value=color)
    else:
        # Wide (or square) image: scale the width to newshape, pad the height to a multiple of 32
        r = newshape / w
        h_ = int(round(h * r))
        w_ = newshape
        img = cv2.resize(img, (w_, h_))
        pad = (-h_) % 32  # 0 when h_ is already a multiple of 32
        bottom_pad = pad // 2
        top_pad = pad - bottom_pad
        img = cv2.copyMakeBorder(img, top_pad, bottom_pad, 0, 0, cv2.BORDER_CONSTANT, value=color)
    return img
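To check what shape `rectangular` produces without loading an image, the arithmetic can be reproduced on its own. This is a small sketch; `rectangular_shape` is a hypothetical helper introduced here, and it mirrors only the size calculation, not the resizing or padding.

```python
from typing import Tuple


def rectangular_shape(h: int, w: int, newshape: int = 414, stride: int = 32) -> Tuple[int, int]:
    """Return the (height, width) that rectangular() would produce."""
    r = newshape / max(h, w)              # the long side is scaled to newshape
    short = int(round(min(h, w) * r))     # short side after resizing
    short += (-short) % stride            # pad up to the next multiple of stride
    # Tall images keep newshape as the height, all others as the width.
    return (newshape, short) if h > w else (short, newshape)


if __name__ == "__main__":
    print(rectangular_shape(321, 481))    # (288, 414)
```

For the 321x481 test image the result is 288x414, about 30% fewer pixels than the 414x414 square. Note that only the short side is rounded to a multiple of 32. The long side stays at `newshape`, so it is a multiple of 32 only if `newshape` is one, for example 416.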