Building a Dataset with ImageDataGenerator

ImageDataGenerator belongs to Keras's image preprocessing module; TensorFlow 2.0 has already integrated the Keras API.

This article uses ImageDataGenerator to work through a basic machine learning workflow:

  1. Examine and understand the data
  2. Build the input pipeline
  3. Build the model
  4. Train the model
  5. Test the model
  6. Improve the model and repeat the process

1. Examine and understand the data

  • Import the necessary packages

from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten, Dropout, MaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator

import os
import numpy as np
import matplotlib.pyplot as plt
  • Download the image data

This article uses the cats-vs-dogs classification dataset as an example.

_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'

path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)

PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')

You can print the PATH variable to check where the images were saved: print(PATH)

The image files are organized as follows:

cats_and_dogs_filtered
|__ train
    |______ cats: [cat.0.jpg, cat.1.jpg, cat.2.jpg ....]
    |______ dogs: [dog.0.jpg, dog.1.jpg, dog.2.jpg ...]
|__ validation
    |______ cats: [cat.2000.jpg, cat.2001.jpg, cat.2002.jpg ....]
    |______ dogs: [dog.2000.jpg, dog.2001.jpg, dog.2002.jpg ...]
  • Split into training/validation datasets

Since this dataset already comes with a train/validation split organized by folder, the different datasets can be generated directly from those folders.

The sections below use the .flow_from_directory(directory) method to generate the datasets, so first construct the path names for the training/validation directories:

train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')

train_cats_dir = os.path.join(train_dir, 'cats')  # directory with our training cat pictures
train_dogs_dir = os.path.join(train_dir, 'dogs')  # directory with our training dog pictures
validation_cats_dir = os.path.join(validation_dir, 'cats')  # directory with our validation cat pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs')  # directory with our validation dog pictures

Check the sizes of the training/validation sets:

num_cats_tr = len(os.listdir(train_cats_dir))
num_dogs_tr = len(os.listdir(train_dogs_dir))

num_cats_val = len(os.listdir(validation_cats_dir))
num_dogs_val = len(os.listdir(validation_dogs_dir))

total_train = num_cats_tr + num_dogs_tr
total_val = num_cats_val + num_dogs_val

print('total training cat images:', num_cats_tr)
print('total training dog images:', num_dogs_tr)

print('total validation cat images:', num_cats_val)
print('total validation dog images:', num_dogs_val)
print("--")
print("Total training images:", total_train)
print("Total validation images:", total_val)

2. Build the input pipeline

Define some parameters up front for convenience:

batch_size = 128
epochs = 15
IMG_HEIGHT = 150
IMG_WIDTH = 150
  • Construct the ImageDataGenerator

The ImageDataGenerator class exposes many image preprocessing parameters; for example, rescale normalizes pixel values. To help prevent overfitting, you can also apply data augmentation operations such as horizontal flips and random rotations (see the preview sketch after the argument list below). The full set of ImageDataGenerator constructor arguments is:

Class ImageDataGenerator

Generate batches of tensor image data with real-time data augmentation.

Arguments:

  • featurewise_center: Boolean. Set input mean to 0 over the dataset, feature-wise.
  • samplewise_center: Boolean. Set each sample mean to 0.
  • featurewise_std_normalization: Boolean. Divide inputs by std of the dataset, feature-wise.
  • samplewise_std_normalization: Boolean. Divide each input by its std.
  • zca_epsilon: epsilon for ZCA whitening. Default is 1e-6.
  • zca_whitening: Boolean. Apply ZCA whitening.
  • rotation_range: Int. Degree range for random rotations.
  • width_shift_range: Float, 1-D array-like or int
    • float: fraction of total width, if < 1, or pixels if >= 1.
    • 1-D array-like: random elements from the array.
    • int: integer number of pixels from interval (-width_shift_range, +width_shift_range)
    • With width_shift_range=2 possible values are integers [-1, 0, +1], same as with width_shift_range=[-1, 0, +1], while with width_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0).
  • height_shift_range: Float, 1-D array-like or int
    • float: fraction of total height, if < 1, or pixels if >= 1.
    • 1-D array-like: random elements from the array.
    • int: integer number of pixels from interval (-height_shift_range, +height_shift_range)
    • With height_shift_range=2 possible values are integers [-1, 0, +1], same as with height_shift_range=[-1, 0, +1], while with height_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0).
  • brightness_range: Tuple or list of two floats. Range for picking a brightness shift value from.
  • shear_range: Float. Shear Intensity (Shear angle in counter-clockwise direction in degrees)
  • zoom_range: Float or [lower, upper]. Range for random zoom. If a float, [lower, upper] = [1-zoom_range, 1+zoom_range].
  • channel_shift_range: Float. Range for random channel shifts.
  • fill_mode: One of {"constant", "nearest", "reflect" or "wrap"}. Default is 'nearest'. Points outside the boundaries of the input are filled according to the given mode:
    • 'constant': kkkkkkkk|abcd|kkkkkkkk (cval=k)
    • 'nearest': aaaaaaaa|abcd|dddddddd
    • 'reflect': abcddcba|abcd|dcbaabcd
    • 'wrap': abcdabcd|abcd|abcdabcd
  • cval: Float or Int. Value used for points outside the boundaries when fill_mode = "constant".
  • horizontal_flip: Boolean. Randomly flip inputs horizontally.
  • vertical_flip: Boolean. Randomly flip inputs vertically.
  • rescale: rescaling factor. Defaults to None. If None or 0, no rescaling is applied, otherwise we multiply the data by the value provided (after applying all other transformations).
  • preprocessing_function: function that will be applied on each input. The function will run after the image is resized and augmented. The function should take one argument: one image (Numpy tensor with rank 3), and should output a Numpy tensor with the same shape.
  • data_format: Image data format, either "channels_first" or "channels_last". "channels_last" mode means that the images should have shape (samples, height, width, channels), "channels_first" mode means that the images should have shape (samples, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".
  • validation_split: Float. Fraction of images reserved for validation (strictly between 0 and 1).
  • dtype: Dtype to use for the generated arrays.
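
Before wiring augmentation into the training pipeline, it can help to preview what a configuration does. The following is a minimal sketch (not part of the original tutorial code): it pushes a single image through a generator via .flow() and plots a few augmented variants. The file path is a placeholder for any training image:

from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array
import numpy as np
import matplotlib.pyplot as plt

# Illustrative augmentation settings, purely for preview purposes.
preview_gen = ImageDataGenerator(rotation_range=45, horizontal_flip=True)

# Placeholder path -- substitute any image from the training set.
img = img_to_array(load_img('cat.0.jpg', target_size=(150, 150)))
batch = np.expand_dims(img, axis=0)  # .flow() expects a rank-4 batch

fig, axes = plt.subplots(nrows=1, ncols=4, figsize=(16, 4))
for ax, augmented in zip(axes, preview_gen.flow(batch, batch_size=1)):
    ax.imshow(augmented[0].astype('uint8'))
    ax.axis('off')
plt.show()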

 

  • Build the training dataset:

train_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our training data
train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
                                                           directory=train_dir,
                                                           shuffle=True,
                                                           target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                           class_mode='binary')
  • Build the validation dataset:

validation_image_generator = ImageDataGenerator(rescale=1./255)
val_data_gen = validation_image_generator.flow_from_directory(batch_size=batch_size,
                                                              directory=validation_dir,
                                                              target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                              class_mode='binary')
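
flow_from_directory infers the class labels from the sub-directory names (here cats and dogs, assigned in alphabetical order). You can inspect the mapping it produced:

print(train_data_gen.class_indices)  # {'cats': 0, 'dogs': 1}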

Visualize a few images to sanity-check the preprocessing:

# The next function returns a batch from the dataset.
# The return value of next function is in form of (x_train, y_train)
# where x_train is training features and y_train, its labels.
# Discard the labels to only visualize the training images.
sample_training_images, _ = next(train_data_gen)

def plot_images(images_arr):
    fig, axes = plt.subplots(nrows=1, ncols=5, figsize=(20, 20))
    axes = axes.flatten()

    for img, ax in zip(images_arr, axes):
        ax.imshow(img)
        ax.axis('off')

    plt.tight_layout()
    plt.show()

plot_images(sample_training_images[:5])

3. Build the model

model = Sequential([
    Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    MaxPooling2D(),
    Conv2D(filters=32, kernel_size=3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(filters=64, kernel_size=3, padding='same', activation='relu'),
    MaxPooling2D(),  
    Flatten(),
    Dense(units=512, activation='relu'),
    Dense(units=1, activation='sigmoid')
])
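
With 'same' padding and three rounds of 2x2 max pooling, the 150x150 feature maps shrink to 75, then 37, then 18, so Flatten emits 18*18*64 = 20736 values into the 512-unit dense layer; model.summary() below confirms these shapes.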
  • Compile the model

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.summary()

4. Train the model

During training, save the model weights every 5 epochs.

checkpoint_path = 'training/cp-{epoch:04d}.ckpt'

""" Create the model save method """
# Create a callback that saves the model's weights every 5 epochs.
# Note: save_freq counts batches, not epochs, so multiply by the
# number of steps per epoch.
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 verbose=1,
                                                 save_weights_only=True,
                                                 save_freq=5 * (total_train // batch_size))
model.save_weights(checkpoint_path.format(epoch=0))

""" Train the model """
history = model.fit_generator(
    generator=train_data_gen,
    steps_per_epoch=total_train//batch_size,
    epochs=epochs,
    callbacks=[cp_callback],
    validation_data=val_data_gen,
    validation_steps=total_val//batch_size
)
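
Note that fit_generator is deprecated on TensorFlow 2.1 and later; model.fit accepts generators directly and takes the same arguments:

history = model.fit(train_data_gen,
                    steps_per_epoch=total_train//batch_size,
                    epochs=epochs,
                    callbacks=[cp_callback],
                    validation_data=val_data_gen,
                    validation_steps=total_val//batch_size)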

After training finishes, you can visualize how the training went:

""" Visualize training results """
print(history.history)
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))

plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

As the plots show, training and validation accuracy diverge markedly; the model only reaches about 70% accuracy on the validation set.

5. Test the model

Load the checkpoints saved during training and feed in the validation data for testing:

checkpoint_dir = os.path.dirname(checkpoint_path)
latest = tf.train.latest_checkpoint(checkpoint_dir)
model.load_weights(latest)
loss, acc = model.evaluate_generator(generator=val_data_gen, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

In fact, when a validation set is passed to model.fit_generator(), validation is already performed automatically at the end of each epoch.

6. Improve the model and repeat the process

Notice that during training the model reaches about 90% accuracy, yet only about 70% on the validation samples: the model has overfit.

When the number of training samples is small and the model has many trainable parameters, overfitting arises easily.

There are many ways to curb overfitting: enlarge the training set, add Dropout layers to the model, add regularization terms, and so on (a brief regularization sketch follows below).
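
As a sketch of the regularization option (an illustrative variant, not used in the rest of this post), an L2 weight penalty could be attached to the dense layer:

from tensorflow.keras import regularizers

# Hypothetical variant of the 512-unit layer with an L2 weight penalty.
Dense(units=512, activation='relu',
      kernel_regularizer=regularizers.l2(0.001))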

  • Here data augmentation is implemented with rotation_range, width_shift_range, height_shift_range, horizontal_flip, and zoom_range:
train_image_generator = ImageDataGenerator(rescale=1./255,
                                           rotation_range=45,
                                           width_shift_range=.15,
                                           height_shift_range=.15,
                                           horizontal_flip=True,
                                           zoom_range=0.5)
train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
                                                           directory=train_dir,
                                                           shuffle=True,
                                                           target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                           class_mode='binary')
  • At the same time, add Dropout layers to the model:
model = Sequential([
    Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    MaxPooling2D(),
    Dropout(rate=0.2),
    Conv2D(filters=32, kernel_size=3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(filters=64, kernel_size=3, padding='same', activation='relu'),
    MaxPooling2D(),
    Dropout(rate=0.2),  
    Flatten(),
    Dense(units=512, activation='relu'),
    Dense(units=1, activation='sigmoid')
])

Training the model again shows that the overfitting has been suppressed.

The complete code is as follows:

from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten, Dropout, MaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator

import os
import numpy as np
import matplotlib.pyplot as plt

have_data = False
training_mode = True

batch_size = 128
epochs = 15
IMG_HEIGHT = 150
IMG_WIDTH = 150

checkpoint_path = 'training_adv/cp-{epoch:04d}.ckpt'

""" Load date """
if have_data:
    PATH = '/home/<user-id>/.keras/datasets/cats_and_dogs_filtered'
else:
    _URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
    path_to_zip = tf.keras.utils.get_file(fname='cats_and_dogs.zip', origin=_URL, extract=True)
    PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')

train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')

train_cats_dir = os.path.join(train_dir, 'cats')  # directory with our training cat pictures
train_dogs_dir = os.path.join(train_dir, 'dogs')  # directory with our training dog pictures
validation_cats_dir = os.path.join(validation_dir, 'cats')  # directory with our validation cat pictures
validation_dogs_dir = os.path.join(validation_dir, 'dogs')  # directory with our validation dog pictures

""" Understand the data counts """
num_cats_tr = len(os.listdir(train_cats_dir))
num_dogs_tr = len(os.listdir(train_dogs_dir))
num_cats_val = len(os.listdir(validation_cats_dir))
num_dogs_val = len(os.listdir(validation_dogs_dir))

total_train = num_cats_tr + num_dogs_tr
total_val = num_cats_val + num_dogs_val

print('total training cat images:', num_cats_tr)
print('total training dog images:', num_dogs_tr)

print('total validation cat images:', num_cats_val)
print('total validation dog images:', num_dogs_val)
print("--")
print("Total training images:", total_train)
print("Total validation images:", total_val)

""" Data preparation """
train_image_generator = ImageDataGenerator(rescale=1./255,
                                           rotation_range=45,
                                           width_shift_range=.15,
                                           height_shift_range=.15,
                                           horizontal_flip=True,
                                           zoom_range=0.5)

validation_image_generator = ImageDataGenerator(rescale=1./255)

train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
                                                           directory=train_dir,
                                                           shuffle=True,
                                                           target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                           class_mode='binary')
val_data_gen = validation_image_generator.flow_from_directory(batch_size=batch_size,
                                                              directory=validation_dir,
                                                              target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                              class_mode='binary')

""" Visualize training images """
# The next function returns a batch from the dataset.
# The return value of next function is in form of (x_train, y_train)
# where x_train is training features and y_train, its labels.
# Discard the labels to only visualize the training images.
sample_training_images, _ = next(train_data_gen)

def plot_images(images_arr):
    fig, axes = plt.subplots(nrows=1, ncols=5, figsize=(20, 20))
    axes = axes.flatten()

    for img, ax in zip(images_arr, axes):
        ax.imshow(img)
        ax.axis('off')

    plt.tight_layout()
    plt.show()

plot_images(sample_training_images[:5])

""" Create the model """
model = Sequential([
    Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    MaxPooling2D(),
    Dropout(rate=0.2),
    Conv2D(filters=32, kernel_size=3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(filters=64, kernel_size=3, padding='same', activation='relu'),
    MaxPooling2D(),
    Dropout(rate=0.2),  
    Flatten(),
    Dense(units=512, activation='relu'),
    Dense(units=1, activation='sigmoid')
])

""" Compile the model """
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.summary()

if training_mode:
    """ Create the model save method """
    # Create a callback that saves the model's weights every 5 epochs.
    # Note: save_freq counts batches, not epochs, so multiply by the
    # number of steps per epoch.
    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                     verbose=1,
                                                     save_weights_only=True,
                                                     save_freq=5 * (total_train // batch_size))
    model.save_weights(checkpoint_path.format(epoch=0))

    """ Train the model """
    history = model.fit_generator(
        generator=train_data_gen,
        steps_per_epoch=total_train//batch_size,
        epochs=epochs,
        callbacks=[cp_callback],
        validation_data=val_data_gen,
        validation_steps=total_val//batch_size
    )

    """ Visualize training results """
    print(history.history)
    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']

    epochs_range = range(epochs)

    plt.figure(figsize=(8, 8))

    plt.subplot(1, 2, 1)
    plt.plot(epochs_range, acc, label='Training Accuracy')
    plt.plot(epochs_range, val_acc, label='Validation Accuracy')
    plt.legend(loc='lower right')
    plt.title('Training and Validation Accuracy')

    plt.subplot(1, 2, 2)
    plt.plot(epochs_range, loss, label='Training Loss')
    plt.plot(epochs_range, val_loss, label='Validation Loss')
    plt.legend(loc='upper right')
    plt.title('Training and Validation Loss')
    plt.show()

else:
    checkpoint_dir = os.path.dirname(checkpoint_path)
    latest = tf.train.latest_checkpoint(checkpoint_dir)
    model.load_weights(latest)
    loss, acc = model.evaluate_generator(generator=val_data_gen, verbose=2)
    print("Restored model, accuracy: {:5.2f}%".format(100*acc))

 
