(2) Preparing Images for AI Model Training

Latest recommended article published on 2024-03-23 23:51:36

寒冰屋


Original article: https://www.codeproject.com/Articles/5293069/Preparing-Images-for-AI-Model-Training

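In this article, I will show you how to collect, preprocess, and augment the data needed to train the model.

Introduction

In the previous article of this series, we discussed the different approaches you can take to create a face mask detector. In this article, we will prepare a dataset for the mask detector solution.

The process of collecting images, preprocessing them, and augmenting the resulting dataset is essentially the same for any image dataset. We will take the long way in order to cover the real-world situation in which data is scarce. I obtained the images from two different sources, and I will show you how to normalize and augment them for later labeling.

Although there are several automated tools that make this process effortless, we will do it the hard way to learn more.

We will use a Roboflow dataset containing 149 images of people wearing face masks, all with black padding and "the same dimensions", and a second set of images from a completely different source on Kaggle that contains only human faces (without masks). With these two datasets representing the two classes, faces with a mask and faces without a mask, let's walk through the steps needed to obtain a normalized and augmented dataset.

Roboflow dataset normalization

I will use Kaggle notebooks to run the code in this article because they give easy access to computing power and come preconfigured with all the tools we need, so we do not have to install Python, TensorFlow, or anything else. They are not mandatory, though. If you prefer, you can get the same results by running a Jupyter Notebook locally.

In this case, I downloaded the dataset manually, zipped it, and uploaded it to a Kaggle notebook. To launch a Kaggle notebook, go to https://kaggle.com, log in, go to Notebooks in the left panel, and click New Notebook. Once it is running, upload the zip file and run the following cells.

Basic library imports: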

import os # to explore directories
import matplotlib.pyplot as plt #to plot images
#import matplotlib.image as mpimg
import cv2 #to make image transformations
from PIL import Image,ImageOps #for images handling

Let's explore the image dimensions. We will read each image, get its shape, and collect the unique shapes found in the dataset:

#Image size exploration
shapes = []
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            shapes.append(cv2.imread(os.path.join(dirname, filename)).shape)
 
print('Unique shapes at imageset: ',set(shapes))

Here I got something unexpected. This is the output:

Unique shapes at imageset:  {(415, 415, 3), (415, 416, 3), (416, 415, 3), (416, 416, 3)}
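The set above shows which shapes occur, but not how common each one is. A minimal sketch of a frequency count, assuming the `shapes` list from the previous cell holds one `(height, width, channels)` tuple per image; the helper name `count_shapes` and the sample values are hypothetical:

```python
from collections import Counter

def count_shapes(shapes):
    """Return (shape, count) pairs, most frequent shape first."""
    return Counter(shapes).most_common()

# Hypothetical sample standing in for the list built with cv2.imread(...).shape
sample_shapes = [(416, 416, 3), (415, 416, 3), (416, 416, 3), (416, 415, 3)]
for shape, count in count_shapes(sample_shapes):
    print(shape, count)
```

Knowing the dominant shape can help when picking the target size for normalization.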

As you know, we cannot feed images of different dimensions to the model. Let's normalize them to a single size (415x415):

def make_square(image, minimum_size=256, fill=(0, 0, 0)):
    # Paste the image centered on a square canvas filled with black
    x, y = image.size
    size = max(minimum_size, x, y)
    new_image = Image.new('RGB', (size, size), fill)
    new_image.paste(image, ((size - x) // 2, (size - y) // 2))
    return new_image

counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            counter += 1
            new_image = Image.open(os.path.join(dirname, filename))
            new_image = make_square(new_image)
            new_image = new_image.resize((415, 415))
            new_image.save("/kaggle/working/"+str(counter)+"-roboflow.jpg")
            if counter == 150:
                break
    if counter >= 150:
        break  # the inner break only leaves the file loop
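The centering arithmetic used by make_square can be checked without opening any image. A small sketch, assuming the same rule as the function above (a square canvas whose side is the largest of the width, the height, and a minimum size); `square_padding` is a hypothetical helper and not part of Pillow:

```python
def square_padding(width, height, minimum_size=256):
    """Return the square canvas side and the (left, top) paste offset."""
    size = max(minimum_size, width, height)
    left = (size - width) // 2
    top = (size - height) // 2
    return size, (left, top)

print(square_padding(415, 416))
print(square_padding(100, 50))
```

For a 415x416 image the canvas side is 416 and the offset rounds down to (0, 0), so the one-pixel difference ends up as padding on a single edge. An image smaller than the minimum size is centered on a 256x256 canvas before it is resized.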

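In Kaggle, /kaggle/working is a convenient directory for saving files so that they are available as output.

Before downloading the normalized dataset, run this cell to zip all the images so that the final archive is easier to find: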

!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf  /kaggle/working/*.jpg
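Outside a notebook, the `!zip` shell magic is not available. A rough equivalent that uses only the Python standard library, assuming the images sit in one directory; the function name `archive_images` and the paths in the demonstration are illustrative:

```python
import os
import shutil
import tempfile

def archive_images(source_dir, archive_base):
    """Zip every file under source_dir and return the path of the archive."""
    return shutil.make_archive(archive_base, 'zip', root_dir=source_dir)

# Demonstration on a throwaway directory instead of /kaggle/working
with tempfile.TemporaryDirectory() as tmp:
    images_dir = os.path.join(tmp, 'images')
    os.makedirs(images_dir)
    for name in ('1-roboflow.jpg', '2-roboflow.jpg'):
        with open(os.path.join(images_dir, name), 'wb') as handle:
            handle.write(b'placeholder bytes, not a real JPEG')
    archive_path = archive_images(images_dir, os.path.join(tmp, 'output'))
    archive_name = os.path.basename(archive_path)
    archive_exists = os.path.exists(archive_path)
    print(archive_name, archive_exists)
```

In this sketch the archive is written next to the image directory rather than inside it, which keeps the archive out of its own contents.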

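Now you can look for the output.zip file in the directory browser on the right.

Normalizing the human faces dataset

The approach for this task is slightly different from the one we chose for the Roboflow dataset above. This time, the dataset contains more than 4,000 images, all with completely different dimensions. Go to the dataset link and launch a Jupyter Notebook from there. We will select the first 150 images.

Basic imports: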

import os # to explore directories
import matplotlib.pyplot as plt #to plot images
import cv2 #to make image transformations
from PIL import Image #for images handling

If you want to explore the dataset:

#How many images do we have?
counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            counter += 1
print('Images in directory: ',counter)
 
#Let's explore an image
%matplotlib inline
plt.figure()
image = cv2.imread('/kaggle/input/human-faces/Humans/1 (719).jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(image)
plt.show()
 
 
#Image size exploration
shapes = []
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            shapes.append(cv2.imread(os.path.join(dirname, filename)).shape)
 
print('Unique shapes at imageset: ',set(shapes))

The last cell returns a wide variety of dimensions, so normalization is essential. Let's resize all the images to 415x415 with black padding:

def make_square(image, minimum_size=256, fill=(0, 0, 0, 0)):
    # Paste the image centered on a square RGBA canvas
    x, y = image.size
    size = max(minimum_size, x, y)
    new_image = Image.new('RGBA', (size, size), fill)
    new_image.paste(image, ((size - x) // 2, (size - y) // 2))
    return new_image

counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        if filename.endswith('.jpg'):
            counter += 1
            test_image = Image.open(os.path.join(dirname, filename))
            new_image = make_square(test_image)
            new_image = new_image.convert("RGB")  # JPEG cannot store an alpha channel
            new_image = new_image.resize((415, 415))
            new_image.save("/kaggle/working/"+str(counter)+"-kaggle.jpg")
            if counter == 150:
                break
    if counter >= 150:
        break  # the inner break only leaves the file loop
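A `break` placed in the inner loop only leaves that loop, so os.walk moves on to the next directory. One alternative for taking just the first N images is a generator combined with itertools.islice. This is a sketch; `iter_jpgs` and the directory layout in the demonstration are assumptions made for illustration:

```python
import os
import tempfile
from itertools import islice

def iter_jpgs(root):
    """Yield the path of every .jpg file found under root."""
    for dirname, _, filenames in os.walk(root):
        for filename in sorted(filenames):
            if filename.lower().endswith('.jpg'):
                yield os.path.join(dirname, filename)

# Demonstration on a temporary tree with four placeholder images and one text file
with tempfile.TemporaryDirectory() as tmp:
    for sub in ('a', 'b'):
        os.makedirs(os.path.join(tmp, sub))
    for sub, name in [('a', '1.jpg'), ('a', '2.jpg'), ('a', 'notes.txt'),
                      ('b', '3.jpg'), ('b', '4.jpg')]:
        open(os.path.join(tmp, sub, name), 'w').close()
    first_three = [os.path.basename(p) for p in islice(iter_jpgs(tmp), 3)]
    print(first_three)
```

With this shape, the processing loop becomes a single `for path in islice(iter_jpgs('/kaggle/input'), 150):` and needs no counter check to stop.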

To download the dataset:

!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf  /kaggle/working/*.jpg

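Now you can easily find it in the right panel.

Dataset augmentation

Once both datasets are normalized, it is time to join the data and augment the resulting set. Data augmentation gives us a way to artificially generate more training data from a relatively small dataset. Augmentation is often necessary because a model generally needs a large amount of data to achieve good results during training.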
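To make the idea concrete, here is a toy sketch of two augmentations applied to a tiny grayscale image stored as nested lists. It illustrates the principle only and is not how Keras implements its transformations; the function names are hypothetical:

```python
def adjust_brightness(pixels, factor):
    """Scale every pixel by factor and clip the result to the 0-255 range."""
    return [[min(255, max(0, round(value * factor))) for value in row]
            for row in pixels]

def flip_horizontal(pixels):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in pixels]

original = [[10, 100, 200],
            [50, 150, 250]]

darker = adjust_brightness(original, 0.5)
brighter = adjust_brightness(original, 1.5)
mirrored = flip_horizontal(original)
variants = [original, darker, brighter, mirrored]
print(len(variants), "training samples from 1 source image")
```

Each variant keeps the same label as its source image, which is what lets a small dataset stand in for a larger one.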

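Unzip the two files on your computer, put all the images in the same folder, zip them, launch a new Kaggle notebook (here is mine), and upload the resulting file.

Next, let's look at what you need to do to augment the data. We could cut corners by using an automated service, but we decided to do everything ourselves to learn more.

Basic imports: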

import numpy as np
from numpy import expand_dims
import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import cv2
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from PIL import Image

Let's go straight to the augmentation. We will use the ImageDataGenerator class from Keras, which is widely used in the computer vision community:

def data_augmentation(filename):
    
    """
    This function will perform data augmentation:
    for each one of the images, will create expanded/reduced, darker/lighter, rotated images. 5 for every modification type.
    In total, we will create 15 extra images for every one in the original dataset.
    """
    
    image_data = []
    #reading the image
    image = cv2.imread(filename,3)
    #image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    #expanding the image dimension to one sample
    samples = expand_dims(image, 0)
    # creating the image data augmentation generators
    datagen1 = ImageDataGenerator(zoom_range=[0.5,1.2])
    datagen2 = ImageDataGenerator(brightness_range=[0.2,1.0])
    datagen3 = ImageDataGenerator(rotation_range=20)
      
    # preparing iterators
    it1 = datagen1.flow(samples, batch_size=1)
    it2 = datagen2.flow(samples, batch_size=1)
    it3 = datagen3.flow(samples, batch_size=1)
    image_data.append(image)
    for i in range(5):
        # generating batch of images
        batch1 = next(it1)
        batch2 = next(it2)
        batch3 = next(it3)
        # convert to unsigned integers
        image1 = batch1[0].astype('uint8')
        image2 = batch2[0].astype('uint8')
        image3 = batch3[0].astype('uint8')
        #appending to the list of images
        image_data.append(image1)
        image_data.append(image2)
        image_data.append(image3)
        
    return image_data

To apply it, let's iterate over every image in the /kaggle/input directory and save all the results in /kaggle/working for later download:

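counter = 0
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
        result = data_augmentation(os.path.join(dirname, filename))
        # result holds the original image plus 15 augmented copies
        for augmented in result:
            counter += 1  # unique name per image so no file is overwritten
            cv2.imwrite('/kaggle/working/'+str(counter)+'.jpg', augmented)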
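One way to give every saved image a distinct file name that also records which source image it came from is to combine the source name with a variant index. A sketch under that assumption; `augmented_name` and the example input path are hypothetical, and index 0 corresponds to the unmodified original because data_augmentation appends it first:

```python
import os

def augmented_name(source_path, variant_index, out_dir='/kaggle/working'):
    """Build an output path such as <out_dir>/<source stem>-aug03.jpg."""
    stem = os.path.splitext(os.path.basename(source_path))[0]
    return os.path.join(out_dir, f'{stem}-aug{variant_index:02d}.jpg')

# 1 original + 3 transformation types x 5 samples each = 16 images per source
images_per_source = 1 + 3 * 5
names = [augmented_name('/kaggle/input/12-kaggle.jpg', i)
         for i in range(images_per_source)]
print(names[0])
print(len(set(names)))
```

Keeping the source stem in the name makes it easier to trace an augmented image back to its original during labeling.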

Again, before downloading, just run the following lines to make the file easier to find in the right panel:

!zip -r /kaggle/working/output.zip /kaggle/working/
!rm -rf  /kaggle/working/*.jpg

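Now you can download the output.zip file.

Next steps

In the next article, we will see how to properly label the resulting images in order to train a YOLO model. Stay tuned!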

https://www.codeproject.com/Articles/5293069/Preparing-Images-for-AI-Model-Training
