Common Image Augmentation Methods

咕里个咚

Last modified: 2022-04-14 15:29:58

Tags: python, opencv, artificial intelligence

First published: 2021-07-25 16:02:58

Article link: https://blog.csdn.net/guligedong/article/details/119081444

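The previous post covered how OpenCV and PIL read images and how the two differ. Is that all they can do? Surely they are good for more than reading an image. Of course they are: both can also process images. I lump these operations together under the name augmentation, a fancy word, isn't it? All of this is really preparation for deep learning. As I understand it, augmentation increases the number and variety of training samples and makes a model more robust. The more you have seen, the more you know, and computer vision works the same way, except that what it sees differs a bit from what we humans see: it looks at high-dimensional features of the image. Enough rambling, let's get into today's image augmentation!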

As before, here is a picture you can experiment with. The star is still Miemie the dog.

Not every dog has an air this noble.

augmentation 1: changing the image width and height

def augment_resize(img,size=(224,224),img_type='cv2'):
    """
    using for image resize ,default size = 224*224
    :param img:
    :param size: tuple (w,h)
    :param img_type:
    :return:
    """
    if img_type=='cv2':
        return cv2.resize(img, size)
    else:
        im_resize = img.resize(size)
        return im_resize

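Here is a function that works for images read by either PIL or OpenCV. The first argument is the image after reading. The second is the target width and height (the first element is the width, the second is the height). You might ask why 224 and not some other number. Because I feel like it, that's why. The third argument defaults to an OpenCV image, since I'm more used to OpenCV.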

Let's call this function from the main block and see what it does (you need to import the required libraries first, of course).

from PIL import Image,ImageDraw,ImageFilter
import numpy as np
from PIL import ImageEnhance
import cv2
import random

if __name__ == '__main__':

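    path = r'./miemie3.jpg'
    ori_image = cv2.imread(path)
    # cv2.imread returns None instead of raising when the file cannot be read.
    if ori_image is None:
        raise FileNotFoundError(path)
    cv2.imshow('ori_image',ori_image)
    resize_image = augment_resize(ori_image)
    cv2.imshow('resize_image', resize_image)
    cv2.waitKey()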

Here is the result:

Notice how it got smaller? I wish it could stay this small forever. When you're little you have no worries and no cares. How nice.

Also notice the operating system: it's Win11. I have to say the Win11 UI is a bit easier on the eyes.

augmentation 2: rotating the image

def augment_rotate(img,angle=90,img_type='cv2'):
    """
    using for image rotate,default degree=90
    :param img:
    :param angle: rotate degree
    :param img_type:
    :return:
    """
    if img_type=='cv2':
        height, width = img.shape[:2]
        # Rotate about the image center; a positive angle rotates counter-clockwise.
        center = (width / 2, height / 2)
        matrix = cv2.getRotationMatrix2D(center, angle, scale=1)
        # warpAffine takes dsize as (width, height); uncovered areas are filled with black.
        img = cv2.warpAffine(img, matrix, (width, height), borderValue=(0,0,0))
        return img
    else:
        img_rotate = img.rotate(angle)
        return img_rotate

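It still takes three arguments. The first and third were covered above; the second is the rotation angle. As you can see, rotating with OpenCV takes a lot more code than with PIL, but more code also means more choices: cv2.getRotationMatrix2D can rotate about any point, whereas PIL's rotate turns about the image center by default.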

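I won't paste the main block again; just call the function yourself. If you can't, go home, shut yourself in and study some Python. Straight to the result: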

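Observant readers may have already noticed that this is a counter-clockwise rotation. Hey, this is kong, listening to a clock that runs backwards. Let me recommend a rap song, 《hey kong》. Even more observant readers may have noticed that the rotated image has black borders. Why is that? Think about it, and while you're at it think about how to fix it. That's your homework.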

augmentation 3: mirror flipping the image

def augment_left_flip(img,img_type='cv2'):
    """
    using for mirror horizontal flip
    :param img:
    :param img_type:
    :return:
    """
    if img_type=='cv2':
        # 1 means a horizontal flip (mirror left to right)
        out = cv2.flip(img, 1)
        return out
    else:
        out = img.transpose(Image.FLIP_LEFT_RIGHT)
        return out

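Only two arguments, both already covered, so I won't say more. Every time I say I won't say more and then write this many words anyway, something's off. Here's the picture: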

We have the left-right mirror, so it would be unreasonable not to have the top-bottom flip. Code first, then the picture:

def augment_top_flip(img,img_type='cv2'):
    """
    using for mirror vertical flip
    :param img:
    :param img_type:
    :return:
    """
    if img_type=='cv2':
        out = cv2.flip(img, 0)
        return out
    else:
        out = img.transpose(Image.FLIP_TOP_BOTTOM)
        return out

Honestly, it makes me a little dizzy.
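If the flip codes 1 and 0 are hard to remember, it may help to see what the two flips do to a tiny array. This sketch uses plain NumPy slicing instead of cv2.flip or PIL, on a made-up 2x3 array, just to show where each pixel ends up.

```python
import numpy as np

# Tiny 2x3 single-channel "image" so the result is easy to read.
img = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.uint8)

left_right = img[:, ::-1]   # columns reversed, like a horizontal (left-right) flip
top_bottom = img[::-1, :]   # rows reversed, like a vertical (top-bottom) flip

print(left_right.tolist())  # [[3, 2, 1], [6, 5, 4]]
print(top_bottom.tolist())  # [[4, 5, 6], [1, 2, 3]]
```

Note that these slices are views on the original array, so call .copy() on them if you plan to modify the result or hand it to other code.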

augmentation 4: random cropping

def augment_randomcut(img,img_type='cv2',w_ratio=2,h_ratio=2):
    """
    using for random cut image
    :param img:
    :param img_type:
    :param w_ratio: the crop width is the image width divided by this value
    :param h_ratio: the crop height is the image height divided by this value
    :return:
    """
    if img_type=='cv2':
        # Size of the crop: the original size divided by the ratios.
        height = img.shape[0] // h_ratio
        width = img.shape[1] // w_ratio
        # Pick a top-left corner so that the crop stays inside the image.
        x = random.randint(0, img.shape[1] - width)
        y = random.randint(0, img.shape[0] - height)
        cropped = img[y:y + height, x:x + width]
        return cropped
    else:
        width, height = img.size
        crop_w, crop_h = width // w_ratio, height // h_ratio
        # Same bounds as the cv2 branch, so the box never leaves the image.
        x = random.randint(0, width - crop_w)
        y = random.randint(0, height - crop_h)
        box = (x, y, x + crop_w, y + crop_h)
        image = img.crop(box)
        return image

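I didn't feel like saying much just now, but this one needs a few words. The function takes four parameters. The third and fourth control the size of the random crop: the crop is the image width divided by w_ratio and the image height divided by h_ratio, so the default of 2 gives a crop half as wide and half as tall. Here's the result: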

Now let's do a fixed-size crop as well.

def augment_constantcut(img,left_top,right_bottom,img_type='cv2'):
    """
    using for constant cut image
    :param img:
    :param left_top: tuple (xmin,ymin)
    :param right_bottom:  tuple (xmax,ymax)     assert xmax>xmin ymax>ymin
    :param img_type:
    :return:
    """
    if img_type=='cv2':
        cropped = img[left_top[1]:right_bottom[1], left_top[0]:right_bottom[0]]
        return cropped
    else:
        box = (left_top[0],left_top[1],right_bottom[0],right_bottom[1])
        image = img.crop(box)
        return image

Four parameters in total. The second and third are the top-left and bottom-right corner coordinates. Here's the result:

augmentation 5: changing the image brightness

def augment_light(img,img_type='cv2',factor=1.5):
    """
    using for change image light
    :param img:
    :param img_type:
    :param factor:  The degree of change in brightness
    :return:
    """
    if img_type == 'cv2':
        new_img = cv2.addWeighted(img,factor,img,0,0)
        return new_img
    else:
        enh_bri = ImageEnhance.Brightness(img)
        new_img = enh_bri.enhance(factor=factor)
        return new_img

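The third parameter is how much the brightness changes: a factor above 1 brightens the image and a factor below 1 darkens it. Here's the result: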

I have to say, Miemie the dog truly is dazzling.
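What happens to pixels that are already bright? Multiplying by the factor can push them past 255, and the result is clamped there. Here is a NumPy-only sketch of the multiply-and-clip idea; it is not the code used above, and the function name scale_brightness and the sample values are made up for illustration.

```python
import numpy as np

def scale_brightness(img, factor=1.5):
    """Multiply every pixel by `factor` and clamp the result to the uint8 range."""
    scaled = img.astype(np.float32) * factor
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

# Three sample pixel values: dark, medium, bright.
pixels = np.array([[10, 100, 200]], dtype=np.uint8)
brighter = scale_brightness(pixels, 1.5)
print(brighter.tolist())  # [[15, 150, 255]]
```

The bright pixel would have become 300 but is capped at 255, which is why a large factor washes out the highlights.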

augmentation 6: adding noise to the image

def augment_add_noise(img,img_type='cv2',num_noise=500):
    """
    using for random add noise on image
    :param img:
    :param img_type:
    :param num_noise:  number of noise points
    :return:
    """
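    if img_type=='cv2':
        # Work on a copy so the caller's image is not modified (the PIL branch also copies).
        img1 = img.copy()
        for num in range(num_noise):
            row = random.randint(0, img.shape[0] - 1)
            col = random.randint(0, img.shape[1] - 1)
            # Alternate between black (pepper) and white (salt) pixels.
            if num % 2 == 0:
                img1[row, col] = 0
            else:
                img1[row, col] = 255
        return img1
    else:
        img1 = img.copy()
        width, height = img.size  # PIL size is (width, height)
        for noise in range(num_noise):
            x = np.random.randint(0, width)
            y = np.random.randint(0, height)
            # The colour tuples below assume a 3-channel RGB image.
            if noise % 2:
                img1.putpixel((x, y), (255, 255, 255))
            else:
                img1.putpixel((x, y), (0, 0, 0))
        return img1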

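The third parameter is the number of noise points, which alternate between black and white pixels (salt-and-pepper noise). Here's the picture: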

This noise is downright disrespectful to Miemie the dog.
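The pixel-by-pixel loop is fine for a few hundred points, but NumPy can set all the points in one go. Below is a sketch of a vectorized variant for array images; the function name add_salt_pepper and the seed parameter (for repeatable results) are my own additions and not part of the code above.

```python
import numpy as np

def add_salt_pepper(img, num_noise=500, seed=None):
    """Return a copy of `img` with about half black and half white noise pixels."""
    rng = np.random.default_rng(seed)
    noisy = img.copy()
    # Draw all row and column indices at once.
    rows = rng.integers(0, img.shape[0], size=num_noise)
    cols = rng.integers(0, img.shape[1], size=num_noise)
    half = num_noise // 2
    noisy[rows[:half], cols[:half]] = 0      # pepper
    noisy[rows[half:], cols[half:]] = 255    # salt
    return noisy

# Hypothetical flat gray test image.
gray = np.full((64, 64, 3), 128, dtype=np.uint8)
noisy = add_salt_pepper(gray, num_noise=200, seed=0)
print(noisy.shape)
```

Because positions are drawn with replacement, a few points may land on the same pixel, so the number of changed pixels can be slightly below num_noise.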

augmentation 7: blurring the image

def augment_blur(img, img_type='cv2', blur_type='gaussion', kernel=3):
    """
    using for blur image
    :param img:
    :param img_type:
    :param blur_type: you can select gaussion or median
    :param kernel: kernel size
    :return:
    """
    if img_type == 'cv2':
        if blur_type == "gaussion":
            return cv2.GaussianBlur(img, (kernel, kernel), sigmaX=1)
        else:
            return cv2.medianBlur(img, kernel)

    else:
        if blur_type=='gaussion':
            return img.filter(ImageFilter.GaussianBlur(radius=kernel))
        else:
            return img.filter(ImageFilter.MedianFilter(kernel))

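The fourth parameter is the size of the blur kernel, and the third is the blur type, which defaults to Gaussian blur (spelled 'gaussion' in the code); anything else selects the median blur. In the OpenCV branch the kernel size should be an odd number such as 3 or 5. In the PIL branch the same value is passed to GaussianBlur as the blur radius rather than as a kernel size, so the same number does not give the same amount of blur in both branches. Gauss, hmm, awesome plus. Let's see how well it blurs the dog. Here's the picture: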

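It really is blurred (糊了, which is also what you call out when you win a hand of mahjong). Three fan, 8 yuan each, sorry about that.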

augmentation 8: occluding the image

def augment_mask(img,mask_num,img_type='cv2'):
    """
    using for random set mask
    :param img:
    :param mask_num: number of mask
    :param img_type:
    :return:
    """
    if img_type == 'cv2':
        img_copy = img.copy()
        mask_height = img.shape[0] // 10
        mask_width = img.shape[1] // 10
        for index in range(mask_num):
            x = random.randint(0, img.shape[1] - mask_width)
            y = random.randint(0, img.shape[0] - mask_height)
            img_copy[y:y + mask_height, x:x + mask_width] = 0
        return img_copy
    else:
        width, height = img.size  # PIL size is (width, height)
        img1 = img.copy()
        draw = ImageDraw.Draw(img1)
        # mask_num is an int, so iterate over range(mask_num).
        for index in range(mask_num):
            x = random.randint(0, 9*width//10)
            y = random.randint(0, 9*height//10)
            draw.rectangle((x, y, x + width//10, y + height//10), fill=(0, 0, 0))
        return img1

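What does occlusion mean here? The picture makes it clear. The second parameter is the number of occluding blocks, each a black rectangle one tenth of the image's width and height. Here's the picture: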