TensorFlow Data Augmentation Code (Flipping, Brightness, Cropping)

Alen_Ii

Article link: https://blog.csdn.net/qq_41419675/article/details/80653865

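This article covers TensorFlow's image-processing operations for data augmentation: decoding and encoding JPEG and PNG images, and resizing, cropping, flipping, and adjusting the brightness of images. These operations help train more robust models.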

I. Encoding and Decoding

1. tf.image.decode_jpeg(contents, channels=None, ratio=None, fancy_upscaling=None, try_recover_truncated=None, acceptable_fraction=None, name=None)

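Decodes a JPEG-encoded image into a uint8 tensor. The result is 3-D, with shape [height, width, channels].

contents is the encoded image: a 0-D (scalar) tensor of type string.

channels is the number of color channels in the decoded image. With the default value 0, the number of channels in the encoded image is used; 1 outputs a grayscale image; 3 outputs an RGB image.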

Example:

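import tensorflow as tf   # TensorFlow 1.x API (input queues and sessions)
import matplotlib.pyplot as plt   # used by the display snippets further down
img_path = '/Users/apple/Downloads/Medlinker/util/1.jpg'   # path to the image file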

def decode_img(path):
    file_queue = tf.train.string_input_producer([path])    # note that the input is a list; the shuffle parameter controls whether its order is shuffled
    image_reader = tf.WholeFileReader()
    key, image = image_reader.read(file_queue)
    image_decode = tf.image.decode_jpeg(image, channels=0)

    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        # sess.run(tf.global_variables_initializer())
        x = sess.run(image_decode)
        coord.request_stop()
        coord.join(threads)
    return x


x = decode_img(img_path)
print(x.shape)

2. tf.image.encode_jpeg is the inverse of decode_jpeg: it encodes a uint8 image tensor of shape [height, width, channels] as a JPEG string.

3. tf.image.decode_png(contents, channels=None, name=None)

Decodes a PNG-encoded image into a uint8 tensor. The result is 3-D, with shape [height, width, channels].

channels accepts four values:

0: Use the number of channels in the PNG-encoded image.
1: output a grayscale image.
3: output an RGB image.
4: output an RGBA image.

4. tf.image.encode_png is the inverse of decode_png: it encodes a uint8 image tensor as a PNG string.

II. Resizing

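1. tf.image.resize_images(images, size, method=ResizeMethod.BILINEAR, align_corners=False)

Resizes the images to a new shape. The target size is passed as a single (new_height, new_width) pair, for example a tuple, not as two separate arguments. Older documentation shows separate new_height and new_width parameters; these were replaced by the single size argument.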

def resize_img(path):
    file_queue = tf.train.string_input_producer([path])
    image_reader = tf.WholeFileReader()
    key, image = image_reader.read(file_queue)
    decode_ = tf.image.decode_png(image)
    image_resize = tf.image.resize_images(decode_, (666, 55))   # size is a (new_height, new_width) tuple
    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        x = sess.run(image_resize)
        coord.request_stop()
        coord.join(threads)
    return x

2. tf.image.resize_image_with_crop_or_pad(image, target_height, target_width)

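Crops and/or pads the image to the target size: dimensions that are too large are cropped around the center, and dimensions that are too small are padded evenly with zeros. Here the height and width are passed as separate arguments, not as a tuple.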

image_resize = tf.image.resize_image_with_crop_or_pad(decode_, 666, 55)
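
To make the crop-or-pad rule concrete, here is a minimal plain-Python sketch on a single-channel image stored as a list of rows. The helper names crop_or_pad_1d and crop_or_pad are hypothetical, and this is not TensorFlow's implementation; it only assumes centered cropping and zero padding, as described above.

```python
def crop_or_pad_1d(items, target, fill):
    """Center-crop or pad a list so that it has exactly `target` items."""
    diff = target - len(items)
    if diff < 0:
        # Too long: keep a window of `target` items around the center.
        start = (-diff) // 2
        return items[start:start + target]
    # Too short (or equal): split the padding between both ends.
    before = diff // 2
    after = diff - before
    return [fill] * before + items + [fill] * after


def crop_or_pad(image, target_height, target_width):
    """Hypothetical helper: apply the 1-D rule to rows, then to columns."""
    zero_row = [0] * len(image[0])
    rows = crop_or_pad_1d(image, target_height, zero_row)
    return [crop_or_pad_1d(list(row), target_width, 0) for row in rows]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(crop_or_pad(image, 1, 5))  # height cropped to 1, width padded to 5
print(crop_or_pad(image, 5, 1))  # height padded to 5, width cropped to 1
```

Each dimension is handled independently, so one call can crop the height while padding the width, which is what happens with a target such as 666 x 55 on most photos.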

3. tf.image.crop_to_bounding_box(image, offset_height, offset_width, target_height, target_width)

image_resize = tf.image.crop_to_bounding_box(decode_, 100, 200, 300, 400)  # (100, 200) is the top-left corner as (offset_height, offset_width); (300, 400) is the target (height, width)
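
The argument order (row offset first, then column offset, then height, then width) is easy to mix up, so here is a small plain-Python sketch of the same indexing on a list of rows. The helper crop_to_box is hypothetical and only illustrates which pixels the four numbers select; it is not the TensorFlow op.

```python
def crop_to_box(image, offset_height, offset_width, target_height, target_width):
    """Return the target_height x target_width window whose top-left
    corner sits at row offset_height, column offset_width."""
    rows = image[offset_height:offset_height + target_height]
    return [row[offset_width:offset_width + target_width] for row in rows]


# A 4 x 5 grid in which each value encodes its position as 10 * row + column.
grid = [[10 * r + c for c in range(5)] for r in range(4)]

# Start at row 1, column 2 and take 2 rows and 3 columns.
print(crop_to_box(grid, 1, 2, 2, 3))  # [[12, 13, 14], [22, 23, 24]]
```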

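4. tf.image.decode_and_crop_jpeg(contents, crop_window, channels=0, ...)

Decodes a JPEG image and crops it in a single step, combining decode and crop. crop_window is [crop_y, crop_x, crop_height, crop_width]: the top-left corner of the crop followed by the target size.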

def resize_img(path):
    file_queue = tf.train.string_input_producer([path])
    image_reader = tf.WholeFileReader()
    key, image = image_reader.read(file_queue)
    # decode_ = tf.image.decode_png(image)
    image_resize = tf.image.decode_and_crop_jpeg(image, [40, 50, 200, 300])   # crop_window: [crop_y, crop_x, crop_height, crop_width]
    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        x = sess.run(image_resize)
        coord.request_stop()
        coord.join(threads)
    return x

x = resize_img(img_path)
plt.ion()
plt.imshow(x)
plt.pause(10)

III. Flipping and Transposing

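1. tf.image.flip_up_down(image) flips the image vertically (upside down).

2. tf.image.random_flip_up_down(image, seed=None) randomly flips the image vertically, with a 50% chance.

3. tf.image.flip_left_right(image) flips the image horizontally (left to right).

4. tf.image.random_flip_left_right(image, seed=None) randomly flips the image horizontally, with a 50% chance.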

def flip_img(path):
    file_queue = tf.train.string_input_producer([path])
    image_reader = tf.WholeFileReader()
    key, image = image_reader.read(file_queue)
    decode_img = tf.image.decode_png(image)
    flip_im = tf.image.random_flip_up_down(decode_img)
    with tf.Session() as sess:
        coord = tf.train.Coordinator()   # create a coordinator to manage the queue-runner threads
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        x = sess.run(flip_im)
        coord.request_stop()
        coord.join(threads)
    return x


x = flip_img(img_path)
plt.ion()
plt.imshow(x)
plt.pause(10)

5. tf.image.transpose_image(image) transposes the image, that is, swaps its height and width dimensions.
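
The flips and the transpose in this section are pure index rearrangements, which a plain-Python sketch on a list of rows can show without TensorFlow. The function names below mirror the ops for readability but are hypothetical stand-ins, and the rng argument is an assumption used here in place of the seed parameter.

```python
import random


def flip_up_down(image):
    """Reverse the order of the rows."""
    return image[::-1]


def flip_left_right(image):
    """Reverse the pixels inside every row."""
    return [row[::-1] for row in image]


def transpose(image):
    """Swap the height and width axes: rows become columns."""
    return [list(column) for column in zip(*image)]


def random_flip_left_right(image, rng):
    """Flip horizontally with probability 0.5, otherwise return the input."""
    return flip_left_right(image) if rng.random() < 0.5 else image


image = [[1, 2, 3],
         [4, 5, 6]]
print(flip_up_down(image))     # [[4, 5, 6], [1, 2, 3]]
print(flip_left_right(image))  # [[3, 2, 1], [6, 5, 4]]
print(transpose(image))        # [[1, 4], [2, 5], [3, 6]]
print(random_flip_left_right(image, random.Random(0)))
```

Because a random flip returns either the original or its mirror image, labels that do not depend on orientation stay valid, which is why these ops are common augmentations.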

IV. Image Adjustments

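1. tf.image.adjust_brightness(image, delta, min_value=None, max_value=None) adjusts the brightness by adding delta to the pixel values. delta may be negative.

2. tf.image.random_brightness(image, max_delta, seed=None) adjusts the brightness by a delta picked at random from [-max_delta, max_delta). max_delta must be non-negative.

3. tf.image.adjust_contrast(images, contrast_factor, min_value=None, max_value=None) adjusts the contrast.

4. tf.image.random_contrast(image, lower, upper, seed=None) adjusts the contrast by a factor picked at random from [lower, upper].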

def adjust_img(path):
    file_queue = tf.train.string_input_producer([path])
    image_reader = tf.WholeFileReader()
    key, image = image_reader.read(file_queue)
    decode_img = tf.image.decode_jpeg(image)
    adjusted = tf.image.adjust_brightness(decode_img, 0.2)    # delta=0.2 is added to the pixel values on a [0, 1) scale; larger values give a brighter image
    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        x = sess.run(adjusted)
        coord.request_stop()
        coord.join(threads)
    return x

x = adjust_img(img_path)
plt.ion()
plt.imshow(x)
plt.pause(10)
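
The arithmetic behind the brightness and contrast ops can be sketched in plain Python on one channel of float pixels in the range 0 to 1. This is an illustration under stated assumptions, not TensorFlow's code: the helper names are hypothetical, the clip to [0, 1] stands in for the saturation that occurs when a uint8 image is converted back, and rng replaces the seed parameter.

```python
import random


def adjust_brightness(pixels, delta):
    """Add delta to every pixel, then clip the result to [0, 1]."""
    return [min(1.0, max(0.0, p + delta)) for p in pixels]


def adjust_contrast(pixels, factor):
    """Scale each pixel's distance from the channel mean by factor."""
    mean = sum(pixels) / len(pixels)
    return [(p - mean) * factor + mean for p in pixels]


def random_brightness(pixels, max_delta, rng):
    """Pick delta uniformly between -max_delta and max_delta, then apply it."""
    delta = rng.uniform(-max_delta, max_delta)
    return adjust_brightness(pixels, delta)


channel = [0.2, 0.4, 0.6]
print(adjust_brightness(channel, 0.2))   # every value moves up by 0.2
print(adjust_contrast(channel, 2.0))     # values spread away from the mean 0.4
print(random_brightness(channel, 0.1, random.Random(0)))
```

Brightness shifts all pixels by the same amount, while contrast stretches or compresses them around the mean, so a factor above 1 increases contrast and a factor between 0 and 1 reduces it.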

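5. tf.image.per_image_standardization(image) standardizes (whitens) the image: the values in the 3-D tensor are shifted and scaled so that their mean becomes 0 and their variance becomes 1.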
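
As a rough sketch of that computation, the snippet below standardizes a flat list of pixel values in plain Python. It follows the documented formula (x - mean) / adjusted_stddev, where adjusted_stddev = max(stddev, 1 / sqrt(num_elements)) guards against division by zero on uniform images. The helper is a hypothetical stand-in, not the TensorFlow op, and it assumes the image has already been flattened across height, width, and channels.

```python
import math


def per_image_standardization(values):
    """Shift and scale values to zero mean and (normally) unit variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    # Lower bound on the divisor so a constant image does not divide by zero.
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(v - mean) / adjusted_stddev for v in values]


pixels = [1.0, 2.0, 3.0, 4.0]
standardized = per_image_standardization(pixels)
print(standardized)
print(per_image_standardization([5.0, 5.0, 5.0, 5.0]))  # constant image gives all zeros
```

The output is floating point and contains negative values, so it is meant as network input rather than something to display directly.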