Deep Learning: Image Augmentation Methods

Latest recommended article published on 2024-09-07 00:02:48

冰菓(笑)

Categories: Image Processing, pytorch, Object Detection

Article link: https://blog.csdn.net/a362682954/article/details/89787363

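This article discusses important image preprocessing methods in deep learning: image resizing, image flipping, and Random Distortion. Image resizing covers random jitter of the dimensions and placement of the image, which avoids the distortion caused by scaling directly. Image flipping is usually limited to horizontal (left-right) flips, and in object detection tasks the labels must be adjusted to match. Random Distortion augments images through transformations in the HSV color space, although these transformations can distort colors. A moderate brightness change may be a more reasonable image enhancement strategy.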

Good image preprocessing can effectively improve a model's accuracy. This article summarizes commonly used image preprocessing methods.

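Most models expect input images of a fixed size, whereas the images in a dataset often vary in size, so they have to be scaled to that fixed size. Calling resize() directly distorts the image whenever its aspect ratio differs from the target's, so the image needs to be padded and then scaled.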

Image Resizing

import cv2
import numpy as np
def preprocess(img, imgsize, jitter, random_placing=False):
    """
    Image preprocess for yolo input
    Pad the shorter side of the image and resize to (imgsize, imgsize)
    Args:
        img (numpy.ndarray): input image whose shape is :math:`(H, W, C)`.
            Values range from 0 to 255.
        imgsize (int): target image size after pre-processing
        jitter (float): amplitude of jitter for resizing
        random_placing (bool): if True, place the image at random position

    Returns:
        img (numpy.ndarray): input image whose shape is :math:`(C, imgsize, imgsize)`.
            Values range from 0 to 1.
        info_img : tuple of h, w, nh, nw, dx, dy.
            h, w (int): original shape of the image
            nh, nw (int): shape of the resized image without padding
            dx, dy (int): pad size
    """
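    # Check the input before it is used
    assert img is not None
    h, w, _ = img.shape
    # Reverse the channel order (e.g. BGR as loaded by cv2.imread becomes RGB)
    img = img[:, :, ::-1]
    # Random size jitter: the larger jitter is, the more the width-to-height ratio can change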
    if jitter > 0:
        # add jitter
        dw = jitter * w
        dh = jitter * h
        new_ar = (w + np.random.uniform(low=-dw, high=dw))\
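
The listing above breaks off at this point in the source. To try the jitter idea in isolation, the following sketch may help. It is not the author's remaining code: the name `jittered_size`, the `rng` parameter, the restriction `0 <= jitter < 1`, and the symmetric perturbation of the height are all assumptions, since only the width term is visible above. Scaling the longer side to `imgsize` is likewise an assumption, chosen to agree with the docstring's "pad the shorter side".

```python
import numpy as np


def jittered_size(h, w, imgsize, jitter, rng=None):
    """Pick a randomly jittered aspect ratio and the resized shape that fits in imgsize.

    Illustrative sketch under stated assumptions; not the original implementation.
    """
    if not 0 <= jitter < 1:
        # Assumption: jitter below 1 keeps both perturbed sides positive
        raise ValueError("jitter must be in [0, 1)")
    rng = np.random.default_rng() if rng is None else rng

    if jitter > 0:
        dw = jitter * w
        dh = jitter * h
        # Width term as in the listing; the height term is assumed to mirror it
        new_ar = (w + rng.uniform(-dw, dw)) / (h + rng.uniform(-dh, dh))
    else:
        new_ar = w / h

    # Assumption: the longer side is scaled to imgsize, the shorter side follows the ratio
    if new_ar < 1:
        nh = imgsize
        nw = nh * new_ar
    else:
        nw = imgsize
        nh = nw / new_ar
    return int(nw), int(nh), new_ar


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for _ in range(3):
        print(jittered_size(100, 200, 416, 0.3, rng))
```

With `jitter = 0.3` and a 200 x 100 image, the sampled ratio stays between 140/130 and 260/70, so a larger `jitter` widens the range of shapes the network sees during training.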
