Augmenting Small Object-Detection Datasets with Image Rotation (90°, 180°, 270°)

兔子Code

Last modified: 2023-09-03 16:00:20



First published: 2023-09-03 15:59:01

Article link: https://blog.csdn.net/YXD0514/article/details/132651023


1. Introduction to Image Rotation

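Rotating images is a common data augmentation technique for object detection: it increases the diversity of the training data and can improve the model's ability to generalize. When computing the bounds of the rotated image, assume the image is rotated clockwise. Finding where each new pixel comes from in the original image then requires the inverse mapping, which is the counterclockwise formula.
Clockwise rotation:
$X = x\cos\theta + y\sin\theta$, $Y = y\cos\theta - x\sin\theta$
Counterclockwise rotation:
$X = x\cos\theta - y\sin\theta$, $Y = x\sin\theta + y\cos\theta$
Here $(x, y)$ is measured with the center of the image as the origin. Both formulas assume a y-axis that points up. In image coordinates the y-axis points down, so the direction seen on screen is reversed; this is why the code in section 2 negates the angle.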
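
As a quick sanity check of the formulas above, the following sketch (not part of the original post) applies the counterclockwise formula with a negated angle to a single pixel and compares the result with what `np.rot90` actually does. The helper name `rotate_point` is hypothetical. It assumes pixel-center coordinates, so the center of a width-`w` image sits at `(w - 1) / 2`; the post's own code uses `0.5 * w`, which treats coordinates as pixel edges and differs by one pixel on the flipped axis.

```python
import math
import numpy as np

def rotate_point(x, y, w, h, k):
    """Map pixel (x, y) of a w-by-h image to its position after np.rot90(img, k).

    Hypothetical helper: uses the counterclockwise formula with a negated
    angle, because the image y-axis points down.
    """
    theta = -math.pi / 2 * k
    # Centers in pixel-center coordinates (an assumption of this sketch).
    cx, cy = (w - 1) / 2, (h - 1) / 2
    # Width and height swap for an odd number of quarter turns.
    nw, nh = (h, w) if k % 2 else (w, h)
    ncx, ncy = (nw - 1) / 2, (nh - 1) / 2
    dx, dy = x - cx, y - cy
    nx = math.cos(theta) * dx - math.sin(theta) * dy + ncx
    ny = math.sin(theta) * dx + math.cos(theta) * dy + ncy
    # round() removes floating-point noise such as 0.9999999999999999.
    return round(nx), round(ny)

# Mark one pixel and confirm the formula finds it after each rotation.
img = np.zeros((4, 6), dtype=np.uint8)   # height 4, width 6
img[1, 4] = 255                          # pixel at x=4, y=1
for k in (1, 2, 3):
    nx, ny = rotate_point(4, 1, 6, 4, k)
    assert np.rot90(img, k)[ny, nx] == 255
```

The check passes only with the negated angle, which matches the sign convention used in `rotate_im_poly`.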

2. Code and Walkthrough

import cv2, glob
import math
import numpy as np

# Collect the image paths
paths = glob.glob('./dateset/images/*')
paths.sort()
# Collect the label file paths (sorted so that they pair up with the images)
lables = glob.glob('./dateset/labels/*')
lables.sort()

# Rotate an image together with one bounding box [x1, y1, x2, y2]
# rotate_angle must be 90, 180 or 270 (counterclockwise, the direction np.rot90 uses)
def rotate_im_poly(im, box, rotate_angle):
    if rotate_angle not in (90, 180, 270):
        raise ValueError('rotate_angle must be 90, 180 or 270')

    im_w, im_h = im.shape[1], im.shape[0]
    rand_degree_cnt = rotate_angle // 90
    # np.rot90() rotates the image matrix counterclockwise by 90° per step.
    # It returns a view, so copy it into a contiguous array for OpenCV.
    dst_im = np.ascontiguousarray(np.rot90(im, rand_degree_cnt))

    # Convert the angle to radians. Note the minus sign: the image y-axis
    # points down, so a counterclockwise turn on screen is a negative angle.
    rot_degree = -90 * rand_degree_cnt
    rot_angle = rot_degree * math.pi / 180.0
    # For multiples of 90° cos and sin are exactly -1, 0 or 1;
    # rounding removes floating-point noise such as 6.1e-17.
    cos_a, sin_a = round(math.cos(rot_angle)), round(math.sin(rot_angle))
    # Center of the original image
    cx, cy = 0.5 * im_w, 0.5 * im_h
    # Center of the rotated image
    ncx, ncy = 0.5 * dst_im.shape[1], 0.5 * dst_im.shape[0]
    # Transform the two opposite corners of the box
    x1 = cos_a * (box[0] - cx) - sin_a * (box[1] - cy) + ncx
    y1 = sin_a * (box[0] - cx) + cos_a * (box[1] - cy) + ncy
    x2 = cos_a * (box[2] - cx) - sin_a * (box[3] - cy) + ncx
    y2 = sin_a * (box[2] - cx) + cos_a * (box[3] - cy) + ncy

    # The top-left corner must come before the bottom-right corner
    dst_polys = [min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)]

    return dst_im, dst_polys
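
One way to check a box transform like the one in `rotate_im_poly` is to rotate a synthetic mask and compare the bounding box of its nonzero pixels with the computed box. The sketch below is not from the original post. It uses a hypothetical closed-form helper, `rotate_box_90k`, rather than the trigonometric version, and assumes boxes are `[x1, y1, x2, y2]` in pixel-edge coordinates with `x2` and `y2` exclusive, which is the convention implied by the `0.5 * im_w` center.

```python
import numpy as np

def rotate_box_90k(box, im_w, im_h, k):
    """Rotate [x1, y1, x2, y2] by k quarter turns counterclockwise.

    Hypothetical helper; coordinates are pixel edges (x2, y2 exclusive).
    """
    x1, y1, x2, y2 = box
    w, h = im_w, im_h
    for _ in range(k % 4):
        # One np.rot90 step maps x' = y and y' = w - x,
        # so the old right edge becomes the new top edge.
        x1, y1, x2, y2 = y1, w - x2, y2, w - x1
        w, h = h, w
    return [x1, y1, x2, y2]

# Synthetic mask with one filled rectangle standing in for an object.
mask = np.zeros((40, 60), dtype=np.uint8)    # height 40, width 60
box = [10, 5, 30, 20]
mask[box[1]:box[3], box[0]:box[2]] = 1

for k in (1, 2, 3):
    ys, xs = np.nonzero(np.rot90(mask, k))
    observed = [int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1]
    assert observed == rotate_box_90k(box, 60, 40, k)
```

Because only multiples of 90° are involved, the rotated box stays axis-aligned and no enlargement of the box is needed.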

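# YOLO-style dataset: one label file per image.
# Note: each line is parsed here as `cls x1 y1 x2 y2` in integer pixel
# coordinates, not as the normalized `cls cx cy w h` form.
import os

for path, label in zip(paths, lables):
    # Read the image
    img = cv2.imread(path)
    # File name without directory or extension (works on any OS)
    stem = os.path.splitext(os.path.basename(path))[0]
    # Rotate the image once, so it is saved even if the label file is empty
    img_, _ = rotate_im_poly(img, [0, 0, 0, 0], 90)
    # Read the labels
    with open(label) as f:
        lines = f.readlines()
    with open('./' + stem + '.txt', 'w') as up:
        for line in lines:
            line = line.split()
            if not line:
                continue
            cls = line[0]
            box = [int(line[1]), int(line[2]), int(line[3]), int(line[4])]
            # Rotate the box
            _, box = rotate_im_poly(img, box, 90)
            # Write the rotated label; round() avoids truncating values like 4.999999
            up.write(cls + ' ' + ' '.join(str(round(a)) for a in box) + '\n')
    # Save the rotated image
    cv2.imwrite('./' + stem + '.jpg', img_)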
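
The loop above parses each label as integer pixel corners. If the label files instead hold the normalized YOLO form `cls cx cy w h` (all values in [0, 1]), a 90° step reduces to swapping and flipping coordinates, with no trigonometry and no need for the image size. The following is a minimal sketch under that assumption; `rotate_yolo_box` and `rotate_yolo_line` are hypothetical helpers, not part of the original post.

```python
def rotate_yolo_box(cx, cy, w, h, rotate_angle):
    """Rotate a normalized YOLO box counterclockwise by 90, 180 or 270 degrees.

    Matches np.rot90 applied rotate_angle // 90 times to the image.
    """
    if rotate_angle not in (90, 180, 270):
        raise ValueError("rotate_angle must be 90, 180 or 270")
    for _ in range(rotate_angle // 90):
        # One quarter turn: x' = y, y' = 1 - x, and width/height swap.
        cx, cy, w, h = cy, 1.0 - cx, h, w
    return cx, cy, w, h

def rotate_yolo_line(line, rotate_angle):
    """Rotate one `cls cx cy w h` label line and return the new line."""
    cls, *vals = line.split()
    box = rotate_yolo_box(*map(float, vals), rotate_angle)
    return cls + " " + " ".join(f"{v:.6f}" for v in box)

print(rotate_yolo_line("0 0.25 0.40 0.10 0.20", 90))
# prints: 0 0.400000 0.750000 0.200000 0.100000
```

Working in normalized coordinates also sidesteps the one-pixel ambiguity between pixel-center and pixel-edge conventions.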