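Good image preprocessing can noticeably improve a model's accuracy. This post summarizes commonly used image preprocessing methods.
Most models expect input images of a fixed size, while the images in a dataset usually vary in size, so they have to be scaled to that fixed size. Calling resize() directly distorts the image whenever its aspect ratio differs from the target's, so the image should first be padded and then scaled.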
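To make the pad-and-scale idea concrete, here is a minimal sketch of the arithmetic involved. The helper name `letterbox_geometry` is hypothetical and not from any library; it assumes a square target and centered padding, and only computes the sizes and offsets. The actual pixel resampling would then be done with a call such as `cv2.resize(img, (nw, nh))`.

```python
def letterbox_geometry(h, w, size):
    """Return (nh, nw, dx, dy) for fitting an h x w image into a size x size canvas.

    Hypothetical helper for illustration only.
    nh, nw: image size after scaling with the aspect ratio preserved
    dx, dy: left and top padding needed to center the scaled image
    """
    scale = size / max(h, w)      # the longer side is scaled to exactly `size`
    nh = int(round(h * scale))
    nw = int(round(w * scale))
    dx = (size - nw) // 2         # horizontal padding on the left
    dy = (size - nh) // 2         # vertical padding on the top
    return nh, nw, dx, dy


# A 480 x 640 (H x W) image fitted into a 416 x 416 model input
print(letterbox_geometry(480, 640, 416))  # prints (312, 416, 0, 52)
```

The scaled image keeps its 4:3 aspect ratio (416 / 312), whereas resizing straight to 416 x 416 would force it to 1:1 and stretch the content vertically.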
Resizing images
import cv2
import numpy as np


def preprocess(img, imgsize, jitter, random_placing=False):
    """
    Image preprocess for yolo input
    Pad the shorter side of the image and resize to (imgsize, imgsize)
    Args:
        img (numpy.ndarray): input image whose shape is :math:`(H, W, C)`.
            Values range from 0 to 255.
        imgsize (int): target image size after pre-processing
        jitter (float): amplitude of jitter for resizing
        random_placing (bool): if True, place the image at random position
    Returns:
        img (numpy.ndarray): input image whose shape is :math:`(C, imgsize, imgsize)`.
            Values range from 0 to 1.
        info_img : tuple of h, w, nh, nw, dx, dy.
            h, w (int): original shape of the image
            nh, nw (int): shape of the resized image without padding
            dx, dy (int): pad size
    """
    # Check the input before touching its attributes
    assert img is not None
    h, w, _ = img.shape
    # Reverse the channel order (e.g. OpenCV BGR to RGB)
    img = img[:, :, ::-1]
    # Random size jitter: the larger jitter is, the more the width and height vary
    if jitter > 0:
        # add jitter
        dw = jitter * w
        dh = jitter * h
        new_ar = (w + np.random.uniform(low=-dw, high=dw))\