TensorFlow Deep Learning Framework (11): Image Data Processing

Latest recommended article published on 2022-05-18 16:30:01

ouprince

Category: TensorFlow Deep Learning Notes. Tags: tensorflow, image processing

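Image encoding and decoding

An image file does not store the pixel matrix directly; it stores the result of compressing and encoding it. To turn an image back into a three-dimensional array (height, width, channels), it has to be decoded first. TensorFlow provides encoding and decoding functions for the JPEG and PNG formats.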

# Read the raw bytes of the original image (TensorFlow 1.x API)
import tensorflow as tf
import matplotlib.pyplot as plt

# Open the file in binary mode ("rb"): a JPEG file is not text
image_raw_data = tf.gfile.FastGFile("data/2345_image_file_copy_1.jpg", "rb").read()

with tf.Session() as sess:
    # tf.image.decode_png decodes PNG images;
    # tf.image.decode_jpeg decodes JPEG images. The result is a tensor.
    img_data = tf.image.decode_jpeg(image_raw_data)
    # Print the image as a 3-D array (height, width, channels)
    print(img_data.eval())

    # Visualize the decoded image with pyplot
    plt.imshow(img_data.eval())
    plt.show()

    # Encode the 3-D array back into JPEG bytes
    encoded_image = tf.image.encode_jpeg(img_data)
    # Save the encoded image, again in binary mode
    with tf.gfile.GFile("data/ou.jpg", "wb") as f:
        f.write(encoded_image.eval())
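
Because JPEG and PNG each have their own decoder, it can help to check which format a file really is before decoding it. The following is a minimal plain-Python sketch and not part of the original post: `detect_image_format` is a hypothetical helper, not a TensorFlow function. It relies only on the standard file signatures (JPEG data starts with the bytes FF D8 FF, PNG data with the 8-byte signature 89 50 4E 47 0D 0A 1A 0A).

```python
# Hypothetical helper (not a TensorFlow API): guess the image format from
# the leading "magic" bytes so that the matching decoder can be chosen.
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def detect_image_format(raw_bytes):
    """Return "jpeg", "png", or "unknown" for the given raw file bytes."""
    if raw_bytes.startswith(JPEG_MAGIC):
        return "jpeg"
    if raw_bytes.startswith(PNG_MAGIC):
        return "png"
    return "unknown"
```

With a result of "jpeg" one would pass the bytes to tf.image.decode_jpeg, and with "png" to tf.image.decode_png.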

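Resizing images

Image sizes generally vary, but the number of input nodes of a neural network is fixed. Before the pixels are fed to the network, all images therefore have to be brought to the same size. That is the job of image resizing.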

# TensorFlow provides four resizing methods, all wrapped in
# tf.image.resize_images.
# First convert the decoded image to floating point, which is easier to process.
img_data = tf.image.convert_image_dtype(img_data, dtype=tf.float32)

# tf.image.resize_images takes the image and the target size as [height, width].
# method = 0: bilinear interpolation
# method = 1: nearest neighbor
# method = 2: bicubic interpolation
# method = 3: area interpolation
resized = tf.image.resize_images(img_data, [300, 400], method=0)
print(resized.get_shape())

# Convert back to uint8 before encoding and saving the resized image
resized = tf.image.convert_image_dtype(resized, dtype=tf.uint8)
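
To make the simplest of the four methods concrete, here is a plain-Python sketch of nearest-neighbor resizing on a 2-D image stored as a list of rows. It is an illustration under assumptions, not TensorFlow's implementation: `resize_nearest` is a hypothetical name, and the index mapping (floor of the scaled output index) is one common convention that may differ from TensorFlow's exact pixel-center handling.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2-D image (list of rows) with nearest-neighbor sampling."""
    height, width = len(image), len(image[0])
    resized = []
    for y in range(new_height):
        # Map the output row back to a source row (floor of the scaled index).
        src_y = min(y * height // new_height, height - 1)
        row = []
        for x in range(new_width):
            # Same mapping for the column.
            src_x = min(x * width // new_width, width - 1)
            row.append(image[src_y][src_x])
        resized.append(row)
    return resized
```

Every output pixel is a copy of exactly one input pixel, which is why this method is fast but produces blocky results when enlarging. The other methods blend several neighboring pixels instead.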

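Cropping and padding: tf.image.resize_image_with_crop_or_pad brings an image to a target size without rescaling it. Along each dimension, if the original image is larger than the target, the central part is cropped out; otherwise the image is padded evenly with zeros (that is, with black).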

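# Crop (target smaller than the original image); the arguments are height, width
cropped = tf.image.resize_image_with_crop_or_pad(img_data, 300, 400)
# Pad (target larger than the original image)
padded = tf.image.resize_image_with_crop_or_pad(img_data, 3000, 4000)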
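
The crop-or-pad rule can be sketched in plain Python for a 2-D image stored as a list of rows. This is only an illustration of the idea described above: `center_crop_or_pad` is a hypothetical helper, and the way odd differences are split between the two sides is an assumption that may not match TensorFlow pixel for pixel.

```python
def center_crop_or_pad(image, target_height, target_width, fill=0):
    """Crop the center of a 2-D image, or pad it with `fill`, to the target size."""
    height, width = len(image), len(image[0])

    # Offsets into the source when it is larger than the target (crop) ...
    crop_top = max((height - target_height) // 2, 0)
    crop_left = max((width - target_width) // 2, 0)
    # ... and offsets into the output when the source is smaller (pad).
    pad_top = max((target_height - height) // 2, 0)
    pad_left = max((target_width - width) // 2, 0)

    copy_height = min(height, target_height)
    copy_width = min(width, target_width)

    # Start from an all-`fill` canvas and copy the overlapping region into it.
    output = [[fill] * target_width for _ in range(target_height)]
    for y in range(copy_height):
        for x in range(copy_width):
            output[pad_top + y][pad_left + x] = image[crop_top + y][crop_left + x]
    return output
```

Each dimension is handled independently, so an image can be cropped along its width and padded along its height in the same call.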

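TensorFlow can also crop an image by ratio: tf.image.central_crop keeps the central region of the image, and the fraction must lie in (0, 1].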

# Keep the central 50% of the image along each dimension
central_cropped = tf.image.central_crop(img_data, 0.5)
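
As a rough illustration of what "central 50%" means in pixels, the sketch below computes the crop rectangle for a given fraction. `central_crop_box` is a hypothetical helper; the rounding shown is an assumption and TensorFlow's exact rounding may differ by a pixel.

```python
def central_crop_box(height, width, fraction):
    """Return (top, left, crop_height, crop_width) for a centered crop."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    top = int((height - height * fraction) / 2)
    left = int((width - width * fraction) / 2)
    # Remove the same margin on both sides so the crop stays centered.
    return top, left, height - 2 * top, width - 2 * left
```

Note that a fraction of 0.5 halves both the height and the width, so the cropped image holds a quarter of the original pixels.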

Flipping images

TensorFlow can flip an image upside down, left to right, or along the diagonal.

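# Flip upside down
flipped = tf.image.flip_up_down(img_data)
# Flip left to right
flipped = tf.image.flip_left_right(img_data)
# Flip along the diagonal (swap height and width)
flipped = tf.image.transpose_image(img_data)

# In many image recognition tasks, flipping an image should not change the
# recognition result. When training such a network, the training images can
# therefore be flipped at random so that the model learns to recognize
# objects in different orientations.
# Flip upside down with probability 0.5
flipped = tf.image.random_flip_up_down(img_data)
# Flip left to right with probability 0.5
flipped = tf.image.random_flip_left_right(img_data)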
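
The three flips are simple index rearrangements, which the following plain-Python sketch shows on a 2-D image stored as a list of rows. The function names mirror the TensorFlow ones for readability but are local illustrations, and the `rng` parameter is an assumption added here so the random variant can be tested.

```python
import random


def flip_up_down(image):
    """Reverse the order of the rows."""
    return [list(row) for row in reversed(image)]


def flip_left_right(image):
    """Reverse the order of the pixels within each row."""
    return [list(reversed(row)) for row in image]


def transpose(image):
    """Swap rows and columns (flip along the main diagonal)."""
    return [list(column) for column in zip(*image)]


def random_flip_left_right(image, rng=random):
    """Flip left to right with probability 0.5, otherwise return a copy."""
    if rng.random() < 0.5:
        return flip_left_right(image)
    return [list(row) for row in image]
```

In a training pipeline the random variant would be applied to each image every time it is read, so the network sees both orientations over the course of training.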

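Adjusting image color

As with flipping, adjusting the brightness, contrast, saturation, and hue of an image does not affect the recognition result in many image recognition applications. During training these properties can therefore be adjusted at random, so that the trained model is affected as little as possible by such irrelevant factors.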

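# Illustrative parameter values (not from the original post)
max_delta = 0.5
lower, upper = 0.5, 1.5

# Decrease the brightness by 0.5
adjusted = tf.image.adjust_brightness(img_data, -0.5)
# Increase the brightness by 0.5
adjusted = tf.image.adjust_brightness(img_data, 0.5)
# Adjust the brightness by a random delta in [-max_delta, max_delta)
adjusted = tf.image.random_brightness(img_data, max_delta)

# Scale the contrast by a factor of -5 (the factor is a multiplier, not an offset)
adjusted = tf.image.adjust_contrast(img_data, -5)
# Scale the contrast by a factor of 5
adjusted = tf.image.adjust_contrast(img_data, 5)
# Scale the contrast by a random factor in [lower, upper]
adjusted = tf.image.random_contrast(img_data, lower, upper)

# Shift the hue by 0.1
adjusted = tf.image.adjust_hue(img_data, 0.1)
# Shift the hue by a random delta; max_delta must be in [0, 0.5]
adjusted = tf.image.random_hue(img_data, max_delta)

# Scale the saturation by a factor of -5 (again a multiplier)
adjusted = tf.image.adjust_saturation(img_data, -5)
# Scale the saturation by a factor of 5
adjusted = tf.image.adjust_saturation(img_data, 5)
# Scale the saturation by a random factor in [lower, upper]
adjusted = tf.image.random_saturation(img_data, lower, upper)

# Standardize the image so that its values have zero mean and unit variance
adjusted = tf.image.per_image_standardization(img_data)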
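
The arithmetic behind three of these adjustments is short enough to sketch in plain Python on a flat list of pixel values. This is a simplified illustration, not TensorFlow's code: the functions below are local helpers, the contrast version uses one mean for the whole list (TensorFlow works per channel), and the lower bound on the standard deviation follows the formula in the TensorFlow documentation as I understand it.

```python
import math


def adjust_brightness(pixels, delta):
    """Add `delta` to every pixel value (an offset)."""
    return [p + delta for p in pixels]


def adjust_contrast(pixels, factor):
    """Scale each pixel's distance from the mean by `factor` (a multiplier)."""
    mean = sum(pixels) / len(pixels)
    return [(p - mean) * factor + mean for p in pixels]


def standardize(pixels):
    """Shift the values to zero mean and scale them to unit variance."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # The lower bound on the divisor avoids division by zero for an image
    # whose pixels all have the same value.
    stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(p - mean) / stddev for p in pixels]
```

The sketch also shows why a negative contrast factor is unusual: it mirrors every value around the mean, which inverts the image rather than merely lowering its contrast.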

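Working with bounding boxes

In many image recognition datasets, the objects of interest in an image are marked with bounding boxes. TensorFlow provides some tools for working with them.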

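# tf.image.draw_bounding_boxes draws bounding boxes onto images.
# It requires floating-point image data, and its input is a batch,
# that is, a 4-D tensor holding several images.
batched = tf.expand_dims(
    tf.image.convert_image_dtype(img_data, tf.float32), 0)
# Give all bounding boxes for each image. A box is four numbers,
# [ymin, xmin, ymax, xmax], expressed as positions relative to the image
# size, so every value lies in [0, 1]. The tensor is 3-D:
# [batch, boxes per image, 4].
boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
# Draw the boxes
result = tf.image.draw_bounding_boxes(batched, boxes)

# Like random flipping and random color adjustment, randomly cropping an
# informative part of the image is another way to make a model more robust.
# tf.image.sample_distorted_bounding_box does the random cropping.
boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
# The bounding boxes tell the sampling algorithm which parts of the image
# are "informative".
begin, size, bbox_for_draw = tf.image.sample_distorted_bounding_box(
    tf.shape(img_data), bounding_boxes=boxes)
batched = tf.expand_dims(
    tf.image.convert_image_dtype(img_data, tf.float32), 0)
image_with_box = tf.image.draw_bounding_boxes(batched, bbox_for_draw)
# Crop the sampled region. The algorithm is random, so each run gives a
# different result; evaluate image_with_box and distorted_image in the same
# sess.run call so that both use the same sampled box.
distorted_image = tf.slice(img_data, begin, size)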
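
To show how a relative box turns into an actual crop, the plain-Python sketch below converts a [ymin, xmin, ymax, xmax] box into the begin/size pair that a slice needs and then applies it to an image stored as nested lists [height][width][channels]. Both helpers are hypothetical and only mimic the idea: the rounding is an assumption, and -1 is used in the size to mean "everything up to the end", the same convention that tf.slice follows.

```python
def box_to_begin_size(box, height, width):
    """Convert a relative [ymin, xmin, ymax, xmax] box to pixel begin/size."""
    ymin, xmin, ymax, xmax = box
    top, left = round(ymin * height), round(xmin * width)
    bottom, right = round(ymax * height), round(xmax * width)
    # -1 in the last position means "keep every channel".
    return [top, left, 0], [bottom - top, right - left, -1]


def slice_image(image, begin, size):
    """Crop a [height][width][channels] nested list; a size of -1 means "to the end"."""
    def span(start, length, total):
        stop = total if length == -1 else start + length
        return start, stop

    y0, y1 = span(begin[0], size[0], len(image))
    x0, x1 = span(begin[1], size[1], len(image[0]))
    c0, c1 = span(begin[2], size[2], len(image[0][0]))
    return [[pixel[c0:c1] for pixel in row[x0:x1]] for row in image[y0:y1]]
```

For example, the first box from the listing above, [0.05, 0.05, 0.9, 0.7], on an image 100 pixels high and 200 pixels wide corresponds to a crop that starts at row 5, column 10 and is 85 rows high and 130 columns wide.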