TensorFlow Study Notes (6): Image Data Processing

JoeYF_

Article link: https://blog.csdn.net/qyf394613530/article/details/79321872

TFRecord Input Data Format

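TensorFlow provides a unified format for storing data: TFRecord. The data in a TFRecord file is stored as tf.train.Example records. A tf.train.Example contains a dictionary that maps feature names to values. Each feature name is a string, and each value is a list of byte strings (BytesList), a list of floats (FloatList), or a list of integers (Int64List).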
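
As a rough illustration of that dictionary, the sketch below builds a single tf.train.Example that holds all three value types and then parses it back. The feature names ('name', 'score', 'count') and their values are made up for this example; only the tf.train message types come from the description above.

```python
import tensorflow as tf

# Hypothetical feature names and values, chosen only for illustration.
example = tf.train.Example(features=tf.train.Features(feature={
    'name': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'digit'])),
    'score': tf.train.Feature(float_list=tf.train.FloatList(value=[0.5, 0.25])),
    'count': tf.train.Feature(int64_list=tf.train.Int64List(value=[7])),
}))

# An Example is a protocol buffer, so it serializes to a byte string
# and can be parsed back without running a graph.
serialized = example.SerializeToString()
restored = tf.train.Example.FromString(serialized)
feature_map = restored.features.feature
```

Every value is a list, even when it holds a single element, which is why the helper functions in the next section wrap their argument in square brackets.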

Converting the MNIST input data to TFRecord format

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

# Build an integer feature
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

# Build a byte-string feature
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

mnist = input_data.read_data_sets('MNIST_data', dtype=tf.uint8, one_hot=True)
images = mnist.train.images
labels = mnist.train.labels
pixels = images.shape[1]
num_examples = mnist.train.num_examples

filename = 'mnist.tfrecords'
# Create a writer for the TFRecord file
writer = tf.python_io.TFRecordWriter(filename)
for index in range(num_examples):
    # Convert the image matrix to a byte string
    images_raw = images[index].tobytes()
    example = tf.train.Example(features=tf.train.Features(feature={
        'pixels': _int64_feature(pixels),
        'label': _int64_feature(np.argmax(labels[index])),
        'image_raw': _bytes_feature(images_raw)}))
    writer.write(example.SerializeToString())
writer.close()

Reading data from a TFRecord file

import tensorflow as tf

# Create a reader for the examples stored in the TFRecord file
reader = tf.TFRecordReader()
# Create a queue that holds the list of input files
filename_queue = tf.train.string_input_producer(['mnist.tfrecords'])
# Read one example from the file
_, serialized_example = reader.read(filename_queue)
# Parse the example that was just read
features = tf.parse_single_example(serialized_example, features={
    # tf.FixedLenFeature parses each feature into a Tensor
    'image_raw': tf.FixedLenFeature([], tf.string),
    'pixels': tf.FixedLenFeature([], tf.int64),
    'label': tf.FixedLenFeature([], tf.int64)
})

# tf.decode_raw turns the byte string back into the image's pixel array
images = tf.decode_raw(features['image_raw'], tf.uint8)
labels = tf.cast(features['label'], tf.int32)
pixels = tf.cast(features['pixels'], tf.int32)

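sess = tf.Session()
# Start the queue-runner threads that feed the input pipeline
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

# Each run reads one example from the TFRecord file
for i in range(10):
    image, label, pixel = sess.run([images, labels, pixels])

# Stop the threads and release the session
coord.request_stop()
coord.join(threads)
sess.close()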
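
tf.decode_raw yields a flat vector of pixel values, so the result still has to be reshaped before it can be treated as an image. The same round trip can be sketched in NumPy alone, assuming a 28x28 MNIST-sized image; the random array is only a stand-in for real data, and the names side, raw, flat and restored are hypothetical.

```python
import numpy as np

# Stand-in for one MNIST image: 28x28 pixels, uint8.
side = 28
original = np.random.RandomState(0).randint(0, 256, size=(side, side)).astype(np.uint8)

# Writing side: the matrix is flattened into raw bytes with tobytes().
raw = original.tobytes()

# Reading side: reinterpret the bytes as uint8, then restore the 2-D shape.
flat = np.frombuffer(raw, dtype=np.uint8)
restored = flat.reshape(side, side)
```

The dtype passed when decoding has to match the dtype used when the bytes were written; otherwise the byte string is split into the wrong number of elements.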

TensorFlow Image Preprocessing Functions

Image encoding and decoding

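An image file does not store the pixel values of the image matrix directly; it stores the result of compressing and encoding them. Restoring an image to a matrix therefore requires a decoding step. TensorFlow provides encoding and decoding functions for the JPEG and PNG formats: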

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) image data
image_raw_data = tf.gfile.FastGFile('picture.jpeg', 'rb').read()

with tf.Session() as sess:
    # tf.image.decode_jpeg decodes a JPEG image and
    # tf.image.decode_png decodes a PNG image; both return a tensor
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(sess.run(img_data))

    plt.imshow(sess.run(img_data))
    plt.show()

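    # Convert to float32, which is more convenient for later processing
    img_data = tf.image.convert_image_dtype(img_data, dtype=tf.float32)

    # tf.image.encode_jpeg expects uint8 input, so convert back before
    # encoding the tensor as JPEG and saving it again
    encode_img = tf.image.encode_jpeg(
        tf.image.convert_image_dtype(img_data, dtype=tf.uint8))
    with tf.gfile.FastGFile('picture_out.jpeg', 'wb') as f:
        f.write(sess.run(encode_img))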
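
The idea that a file holds encoded bytes rather than the pixel matrix can be illustrated without TensorFlow. The sketch below uses zlib from the Python standard library purely as a stand-in codec (PNG's compression is DEFLATE-based, while JPEG uses a different, lossy scheme); the 64x64 test pattern is made up for the example.

```python
import zlib

# A tiny 64x64 grayscale "image" stored row by row, one byte per pixel.
width, height = 64, 64
raw_pixels = bytes((x // 8) * 32 % 256 for _ in range(height) for x in range(width))

# Encoding: what would be written to disk. zlib is only a stand-in codec here.
encoded = zlib.compress(raw_pixels)

# Decoding: required before the pixel matrix can be used again.
decoded = zlib.decompress(encoded)
```

Because this stand-in is lossless, the decoded bytes equal the original pixels exactly; a JPEG round trip would generally not give back identical values.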

Resizing images

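There are two ways to change an image's size. The first, tf.image.resize_images, uses an interpolation algorithm so that the resized image preserves as much of the original image's information as possible.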

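# img_data has already been decoded and converted to float32

# The second argument is the target size, [new_height, new_width]
# method=0   bilinear interpolation
# method=1   nearest-neighbor interpolation
# method=2   bicubic interpolation
# method=3   area interpolation
resized = tf.image.resize_images(img_data, [300, 300], method=0)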
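
To make the method options a little more concrete, here is a NumPy sketch of the simplest one, nearest-neighbor interpolation. It is an illustration of the idea, not TensorFlow's implementation, and resize_nearest is a hypothetical helper name.

```python
import numpy as np

def resize_nearest(image, new_height, new_width):
    """Nearest-neighbor resize of an (H, W, C) array; illustrative only."""
    height, width = image.shape[:2]
    # Map each output row/column back to a source row/column.
    rows = np.arange(new_height) * height // new_height
    cols = np.arange(new_width) * width // new_width
    return image[rows[:, None], cols[None, :]]

image = np.arange(2 * 2 * 1).reshape(2, 2, 1)
resized = resize_nearest(image, 4, 4)
```

Each source pixel is simply repeated, which is fast but blocky; the other methods blend neighboring pixels to produce smoother results.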

The second is tf.image.resize_image_with_crop_or_pad, which crops or pads the image.

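# tf.image.resize_image_with_crop_or_pad crops or pads an image to the target size.
# If the target is larger than the source, the image is padded with zeros;
# otherwise the centered region of the source is cropped out.
cropped = tf.image.resize_image_with_crop_or_pad(img_data, 1000, 1000)
padded = tf.image.resize_image_with_crop_or_pad(img_data, 3000, 3000)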
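
The crop-or-pad behavior can be sketched in NumPy as follows. This is a simplified illustration under the assumption of an (H, W, C) array, with crop_or_pad as a hypothetical helper name; it is not the library's code.

```python
import numpy as np

def crop_or_pad(image, target_height, target_width):
    """Center-crop or zero-pad an (H, W, C) array to the target size."""
    height, width, channels = image.shape
    # Crop the centered region if the source is larger than the target.
    crop_h, crop_w = min(height, target_height), min(width, target_width)
    top, left = (height - crop_h) // 2, (width - crop_w) // 2
    cropped = image[top:top + crop_h, left:left + crop_w]
    # Place the result in the middle of a zero canvas if it is smaller.
    canvas = np.zeros((target_height, target_width, channels), dtype=image.dtype)
    pad_top, pad_left = (target_height - crop_h) // 2, (target_width - crop_w) // 2
    canvas[pad_top:pad_top + crop_h, pad_left:pad_left + crop_w] = cropped
    return canvas

image = np.ones((4, 4, 3), dtype=np.float32)
smaller = crop_or_pad(image, 2, 2)
larger = crop_or_pad(image, 6, 6)
```

Unlike interpolation-based resizing, this never rescales the content: pixels are either kept as they are, discarded, or surrounded by zeros.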

tf.image.central_crop also supports cropping an image by a fraction of its size.

# central_fraction: the fraction of the image to keep, a real number in (0, 1]
crop = tf.image.central_crop(img_data, central_fraction=0.5)
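
One way to picture a fractional central crop is to remove an equal margin from each side of both dimensions. The NumPy sketch below follows that idea; the helper central_crop here is a hypothetical stand-in written for this note, and its rounding may differ from the library's.

```python
import numpy as np

def central_crop(image, central_fraction):
    """Keep the central part of an (H, W, C) array; illustrative only."""
    if not 0.0 < central_fraction <= 1.0:
        raise ValueError('central_fraction must be in (0, 1]')
    height, width = image.shape[:2]
    # Remove an equal margin from each side of both dimensions.
    top = int((height - height * central_fraction) / 2)
    left = int((width - width * central_fraction) / 2)
    return image[top:height - top, left:width - left]

image = np.zeros((8, 12, 3))
half = central_crop(image, 0.5)
```

The fraction applies to each dimension separately, so a fraction of 0.5 keeps a quarter of the image area.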

In addition, tf.image.crop_to_bounding_box and tf.image.pad_to_bounding_box crop or pad an image to a given region.
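
Both functions take an offset (top-left corner) and a target height and width. A NumPy sketch of what such region-based cropping and padding amount to is shown below; crop_to_box and pad_to_box are hypothetical helper names, and the 4x4 test image is made up.

```python
import numpy as np

def crop_to_box(image, offset_height, offset_width, target_height, target_width):
    """Cut out a rectangle whose top-left corner is at the given offsets."""
    return image[offset_height:offset_height + target_height,
                 offset_width:offset_width + target_width]

def pad_to_box(image, offset_height, offset_width, target_height, target_width):
    """Place the image at the given offsets on a zero canvas of the target size."""
    height, width, channels = image.shape
    canvas = np.zeros((target_height, target_width, channels), dtype=image.dtype)
    canvas[offset_height:offset_height + height,
           offset_width:offset_width + width] = image
    return canvas

image = np.arange(4 * 4).reshape(4, 4, 1)
patch = crop_to_box(image, 1, 1, 2, 2)
framed = pad_to_box(patch, 1, 2, 4, 5)
```

In contrast to the centered crop-or-pad above, the caller decides exactly where the region sits.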

Flipping images

The following code flips an image vertically, horizontally, and along the diagonal (transpose).

flipped = tf.image.flip_up_down(img_data)
flipped = tf.image.flip_left_right(img_data)

transposed = tf.image.transpose_image(img_data)
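
On an (H, W, C) array, these three operations correspond to simple index manipulations. A NumPy sketch on a made-up 2x3 image:

```python
import numpy as np

image = np.arange(2 * 3).reshape(2, 3, 1)  # rows: [0, 1, 2] and [3, 4, 5]

up_down = image[::-1, :, :]              # reverse the row order
left_right = image[:, ::-1, :]           # reverse the column order
transposed = image.transpose(1, 0, 2)    # swap the height and width axes
```

Note that the transpose changes the shape from (2, 3, 1) to (3, 2, 1), while the two flips keep it unchanged.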

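Randomly flipping training images helps the trained model recognize objects in different orientations. The following APIs flip an image with a probability of 50%: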

flipped = tf.image.random_flip_up_down(img_data)
flipped = tf.image.random_flip_left_right(img_data)
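
The random variant can be pictured as a coin toss followed by an optional flip. A minimal sketch with Python's random module follows; the helper name, the fixed seed and the 100 trials are choices made for this example only.

```python
import random
import numpy as np

def random_flip_left_right(image, rng):
    """Mirror an (H, W, C) array with probability 0.5; illustrative only."""
    if rng.random() < 0.5:
        return image[:, ::-1, :]
    return image

rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = np.arange(2 * 3).reshape(2, 3, 1)
outputs = [random_flip_left_right(image, rng) for _ in range(100)]
flipped_count = sum(int(out[0, 0, 0] == 2) for out in outputs)
```

Over many training steps the model therefore sees both the original and the mirrored version of each image.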

Adjusting image color

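# x and y below are placeholders for values you choose

# Brightness: add -0.5 and +0.5 to every pixel
adjusted = tf.image.adjust_brightness(img_data, -0.5)
adjusted = tf.image.adjust_brightness(img_data, +0.5)
# Randomly adjust brightness by a delta drawn from [-x, x)
adjusted = tf.image.random_brightness(img_data, x)

# Contrast: the second argument is a multiplicative factor applied to each
# pixel's distance from the channel mean (here -0.5 and +0.5), not an offset
adjusted = tf.image.adjust_contrast(img_data, -0.5)
adjusted = tf.image.adjust_contrast(img_data, +0.5)
# Randomly adjust contrast by a factor drawn from [x, y]
adjusted = tf.image.random_contrast(img_data, x, y)

# Hue: shift by +0.1 and +0.5
adjusted = tf.image.adjust_hue(img_data, 0.1)
adjusted = tf.image.adjust_hue(img_data, 0.5)
# Randomly shift hue by a delta drawn from [-x, x]; x must be in [0, 0.5]
adjusted = tf.image.random_hue(img_data, x)

# Saturation: the second argument is a multiplicative factor (here -0.5 and 0.5)
adjusted = tf.image.adjust_saturation(img_data, -0.5)
adjusted = tf.image.adjust_saturation(img_data, 0.5)
# Randomly adjust saturation by a factor drawn from [x, y]
adjusted = tf.image.random_saturation(img_data, x, y)

# Standardize the image to zero mean and unit variance
adjusted = tf.image.per_image_standardization(img_data)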
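
The arithmetic behind three of these adjustments is short enough to sketch in NumPy. The functions below are simplified stand-ins written for this note, assuming a float image in [0, 1] with shape (H, W, C); they are not the library's implementations, and the random test image is made up.

```python
import numpy as np

def adjust_brightness(image, delta):
    """Add the same offset to every pixel."""
    return image + delta

def adjust_contrast(image, factor):
    """Scale each pixel's distance from its channel mean."""
    mean = image.mean(axis=(0, 1), keepdims=True)
    return (image - mean) * factor + mean

def standardize(image):
    """Shift and scale to zero mean and unit variance."""
    # The lower bound on the divisor guards against division by zero
    # for uniform images.
    min_std = 1.0 / np.sqrt(image.size)
    return (image - image.mean()) / max(image.std(), min_std)

image = np.random.RandomState(0).uniform(0.0, 1.0, size=(8, 8, 3))
brighter = np.clip(adjust_brightness(image, 0.5), 0.0, 1.0)
flatter = adjust_contrast(image, 0.5)
normalized = standardize(image)
```

Brightness is additive while contrast is multiplicative around the mean, which is why a contrast factor of 1.0 leaves the image unchanged. Adjusted values can leave the [0, 1] range, so clipping afterwards is a common final step.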

Working with bounding boxes

TensorFlow provides tools for marking the objects of interest in an image. tf.image.draw_bounding_boxes draws bounding boxes onto an image.

    img_data = tf.image.resize_images(img_data, [108, 192], method=0)
    # tf.image.draw_bounding_boxes expects a 4-D batch of images, so add a
    # leading dimension after decoding; it indexes the image within the batch
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_data, tf.float32), 0)
    # Each bounding box is four numbers, [ymin, xmin, ymax, xmax], given as
    # positions relative to the image size (multiply by height/width for pixels)
    boxes = tf.constant([[[0.1, 0.2, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
    result = tf.image.draw_bounding_boxes(batched, boxes)
    # Show the image with the boxes drawn on it
    plt.imshow(sess.run(result[0]))
    plt.show()
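
Since the box coordinates are relative, converting them to pixel positions is a matter of multiplying by the image height and width. The sketch below does this for the two boxes used above on the 108x192 image; to_pixel_box is a hypothetical helper, and rounding to the nearest pixel is an assumption of this sketch.

```python
def to_pixel_box(box, image_height, image_width):
    """Convert [ymin, xmin, ymax, xmax] in relative units to pixel units."""
    ymin, xmin, ymax, xmax = box
    return [int(round(ymin * image_height)), int(round(xmin * image_width)),
            int(round(ymax * image_height)), int(round(xmax * image_width))]

# The two boxes from the snippet above, on the 108x192 resized image.
boxes = [[0.1, 0.2, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]
pixel_boxes = [to_pixel_box(box, 108, 192) for box in boxes]
```

Keeping boxes in relative units means the same annotations remain valid after the image is resized.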

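In addition, tf.image.sample_distorted_bounding_box can take random crops of an image, which improves the model's robustness by making it less sensitive to the size of the object being recognized.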

    boxes = tf.constant([[[0.1, 0.2, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
    # The bounding boxes constrain where the random crop may be taken
    begin, size, bbox_for_draw = tf.image.sample_distorted_bounding_box(tf.shape(img_data), bounding_boxes=boxes)
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_data, tf.float32), 0)
    # Draw the sampled bounding box
    image_with_box = tf.image.draw_bounding_boxes(batched, bbox_for_draw)
    # Extract the cropped slice
    distorted_image = tf.slice(img_data, begin, size)

Complete image preprocessing example

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# Randomly adjust the colors of an image; color_ordering selects the order of the adjustments
def distort_color(image, color_ordering=0):
    if color_ordering == 0:
        image = tf.image.random_brightness(image, max_delta=32./255.)
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
    elif color_ordering == 1:
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_brightness(image, max_delta=32./255.)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)
    #...more
    return tf.clip_by_value(image, 0.0, 1.0)

# Preprocess an image and resize it to the size of the network's input layer.
# Given a decoded image, the target size, and the bounding boxes on the image,
# apply the preprocessing steps to the image.
# Note: this only handles training data; prediction data does not need the
# random transformations.
def preprocess_for_train(image, height, width, bbox):
    # With no bounding box, treat the whole image as the region of interest
    if bbox is None:
        bbox = tf.constant([0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])
    # Convert the image tensor to float32
    if image.dtype != tf.float32:
        image = tf.image.convert_image_dtype(image, dtype=tf.float32)

    # Randomly crop a region of the image to reduce the effect of object size
    # on the recognition algorithm
    bbox_begin, bbox_size, _ = tf.image.sample_distorted_bounding_box(
        tf.shape(image), bounding_boxes=bbox, min_object_covered=0.1)
    distorted_image = tf.slice(image, bbox_begin, bbox_size)

    # Resize the crop to the input size; the resize method is chosen at random
    distorted_image = tf.image.resize_images(distorted_image, [height, width], method=np.random.randint(4))
    # Randomly flip the image left to right
    distorted_image = tf.image.random_flip_left_right(distorted_image)
    # Adjust the colors in a randomly chosen order
    distorted_image = distort_color(distorted_image, np.random.randint(2))
    return distorted_image


image_raw_data = tf.gfile.FastGFile("jpg", "rb").read()
with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
    # Run 6 times to get 6 different images
    for i in range(6):
        result = preprocess_for_train(img_data, 299, 299, boxes)
        plt.imshow(result.eval())
        plt.show()
