Augmenting a dataset with TensorFlow

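cifar10_input.py contains a very useful function, distorted_inputs. It applies distortions to the training data and so performs data augmentation: when the dataset is small and far from sufficient, operations such as flipping and random cropping produce additional samples and make better use of each image.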
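To make the idea concrete before looking at the TensorFlow code, here is a minimal NumPy sketch of two of the distortions mentioned above (random crop and random horizontal flip). The function name random_crop_and_flip, the 24x24 crop size, and the random stand-in image are illustrative assumptions, not part of cifar10_input.py.

```python
import numpy as np


def random_crop_and_flip(image, crop_height, crop_width, rng):
    """Return a random crop of `image`, mirrored left-right about half the time."""
    height, width, _ = image.shape
    # Pick the top-left corner so that the crop fits inside the image.
    top = rng.integers(0, height - crop_height + 1)
    left = rng.integers(0, width - crop_width + 1)
    crop = image[top:top + crop_height, left:left + crop_width, :]
    # Flip horizontally with probability 0.5.
    if rng.random() < 0.5:
        crop = crop[:, ::-1, :]
    return crop


rng = np.random.default_rng(0)
# Random pixels standing in for one 32x32 RGB CIFAR-10 image (assumption).
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
# Each call yields a different variant of the same source image.
augmented = [random_crop_and_flip(image, 24, 24, rng) for _ in range(4)]
```

Every call draws a new crop position and flip decision, which is how a single image can contribute several distinct training samples.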

The core code is in lines 169~183 of cifar10_input.py:

 

# Image processing for training the network. Note the many random
# distortions applied to the image.

# Randomly crop a [height, width] section of the image.
distorted_image = tf.random_crop(reshaped_image, [height, width, 3])

# Randomly flip the image horizontally.
distorted_image = tf.image.random_flip_left_right(distorted_image)

# Because these operations are not commutative, consider randomizing
# the order their operation.
# NOTE: since per_image_standardization zeros the mean and makes
# the stddev unit, this likely has no effect see tensorflow#1458.
distorted_image = tf.image.random_brightness(distorted_image,
                                             max_delta=63)
distorted_image = tf.image.random_contrast(distorted_image,
                                           lower=0.2, upper=1.8)

# Subtract off the mean and divide by the variance of the pixels.
float_image = tf.image.per_image_standardization(distorted_image)

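tf.random_crop()       randomly crops the image
tf.image.random_flip_left_right(distorted_image)    randomly flips the image left-right
tf.image.random_brightness()         randomly changes the brightness
tf.image.random_contrast()       randomly changes the contrast
tf.image.per_image_standardization()   subtracts the pixel mean and divides by the pixel standard deviation (image standardization)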
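The last operation in the list is the least obvious one, so here is a NumPy sketch of what it computes. TensorFlow documents per_image_standardization as (x - mean) / adjusted_stddev, where adjusted_stddev = max(stddev, 1/sqrt(N)) and N is the number of elements in the image. The function name standardize is an illustrative assumption, and this is a sketch of the formula rather than TensorFlow's implementation.

```python
import numpy as np


def standardize(image):
    """NumPy sketch of per-image standardization: zero mean, unit stddev."""
    x = image.astype(np.float32)
    mean = x.mean()
    # Lower-bound the stddev so that a uniform image does not divide by zero.
    adjusted_stddev = max(float(x.std()), 1.0 / (x.size ** 0.5))
    return (x - mean) / adjusted_stddev
```

Because the result has zero mean and unit standard deviation regardless of the input's overall brightness and contrast, the comment in the excerpt above notes that the preceding random brightness and contrast steps likely have no effect.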

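Project directory layout: load_data.py sits next to a data directory that holds the CIFAR-10 batch files data_batch_1 ... data_batch_5 (the paths the script reads from).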

Reading the CIFAR-10 data and a simple data augmentation implementation (load_data.py):

import os
import pickle
import numpy as np


def show_img(data):
    from matplotlib import pyplot as plt
    plt.figure("Image")  # name of the figure window
    plt.imshow(data)
    plt.axis('off')  # hide the axes
    plt.title('image')  # figure title
    plt.savefig('fix.jpg')
    plt.show()


def data_aug(img):
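    import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.random_crop)
    # Optional distortions; uncomment the ones you want to apply.
    # img = tf.image.random_flip_left_right(img)
    # img = tf.image.flip_up_down(img)
    # img = tf.random_crop(img, [22, 22, 3])
    img = tf.image.per_image_standardization(img)
    # Run the graph to turn the tensor back into a NumPy array.
    with tf.Session() as sess:
        img = sess.run(img)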
    return img


def unpickle(file):
    with open(file, 'rb') as fo:
        # Avoid shadowing the built-in `dict`; keys are bytes, e.g. b'data'.
        batch = pickle.load(fo, encoding='bytes')
    return batch


def get_data(file):
    images = []
    labels = []
    IMAGE_SIZE = 32
    IMAGE_DEPTH = 3
    for i in range(1, 6):  # the CIFAR-10 training data is split into data_batch_1 ... data_batch_5
        file_path = os.path.join(file, 'data_batch_' + str(i))
        ret = unpickle(file_path)  # load one batch
        # print([k for k in ret.keys()])  # show the dictionary keys
        # print(ret[b'data'].shape)   # show the shape of the image data
        # print(len(ret[b'labels']))   # show the number of labels
        images = np.r_[images, ret[b'data']] if len(images) > 0 else ret[b'data']
        labels = np.r_[labels, ret[b'labels']] if len(labels) > 0 else ret[b'labels']

    images = np.reshape(images, (images.shape[0], IMAGE_DEPTH, IMAGE_SIZE, IMAGE_SIZE)) \
        .transpose(0, 2, 3, 1) \
        .astype("uint8")
    labels = np.reshape(labels, (len(labels), 1))
    # show_img(images[1])
    aug = data_aug(images[1])  # data augmentation
    show_img(aug)

    # print(images.shape)
    # print(labels.shape)
    return images, labels


if __name__ == '__main__':
    file_path = './data'
    get_data(file_path)
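The reshape-then-transpose step in get_data is easy to get wrong, so the following sketch walks through it on a tiny example. Each CIFAR-10 row stores all red values first, then all green, then all blue; the 2x2 image and its pixel values here are made up purely for illustration.

```python
import numpy as np

size, depth = 2, 3  # a tiny 2x2 RGB "image" instead of 32x32 (assumption)
# One flat row: four red values, then four green, then four blue.
row = np.array([[10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33]], dtype=np.uint8)

# (N, depth*size*size) -> (N, depth, size, size) -> (N, size, size, depth)
images = row.reshape(-1, depth, size, size).transpose(0, 2, 3, 1)

top_left = images[0, 0, 0]      # R, G, B of the top-left pixel
bottom_right = images[0, 1, 1]  # R, G, B of the bottom-right pixel
```

After the transpose, the last axis holds the three channel values of a single pixel, which is the height x width x channels layout that plt.imshow and the tf.image functions expect.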

 
