TensorFlow Image Processing (Summary)

那郎

Article link: https://blog.csdn.net/weixin_44816589/article/details/104425967

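1 Image encoding and decoding

An image in RGB color mode can be viewed as a three-dimensional array of shape (height, width, channels) with 3 channels.
On disk, however, an image is not stored as a raw array but as the result of compression encoding, so it has to be decoded before it can be processed.
Common image formats: JPEG (file extension .jpeg or .jpg), PNG, and GIF.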
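To make the "three-dimensional array" view concrete, here is a minimal NumPy sketch. The tiny 2 x 3 image and its pixel values are made up for illustration; a decoded photo has the same layout, only larger.

```python
import numpy as np

# A hypothetical 2x3 RGB image: 2 rows (height), 3 columns (width), 3 channels.
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[0, 0] = [255, 0, 0]   # top-left pixel is pure red
image[1, 2] = [0, 0, 255]   # bottom-right pixel is pure blue

height, width, channels = image.shape
red_channel = image[:, :, 0]  # 2-D slice holding only the red values
print(image.shape, red_channel.shape)
```

Indexing the last axis selects one color channel, which is why the channel count shows up as the third dimension of the decoded data.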

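1.1 JPEG/JPG format

Notes:
1. In TensorFlow 1.x, files were often opened with tf.gfile.FastGFile(), which is deprecated in favor of tf.gfile.GFile(). In TensorFlow 2.x the same class is tf.io.gfile.GFile(); tf.gfile.GFile() remains reachable through tensorflow.compat.v1.
2. The file must be opened in "rb" (binary read) mode, because an image file holds binary data rather than text. See Joweay's blog for an explanation: https://blog.csdn.net/Joweay/article/details/89182311
3. Encoding: tf.image.encode_jpeg(); decoding: tf.image.decode_jpeg().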
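The need for binary mode can be illustrated without TensorFlow. The sketch below uses Python's built-in open() rather than tf.gfile, and the four bytes written to the temporary file are only a stand-in for a real JPEG (they are the first bytes such a file typically starts with).

```python
import os
import tempfile

# Hypothetical stand-in for an image file: a JPEG stream starts with
# the start-of-image marker FF D8, here followed by FF E0.
jpeg_like_bytes = b"\xff\xd8\xff\xe0"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fake.jpg")
    with open(path, "wb") as f:
        f.write(jpeg_like_bytes)

    # Binary mode returns the bytes unchanged.
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode tries to decode the bytes as characters and fails,
    # because the byte 0xFF is never valid in UTF-8.
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        text_mode_failed = False
    except UnicodeDecodeError:
        text_mode_failed = True

print(raw == jpeg_like_bytes, text_mode_failed)
```

The decoder needs exactly the bytes stored on disk, which only binary mode guarantees.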

The code is as follows:

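import tensorflow.compat.v1 as tf
import matplotlib.pyplot as plt

# Note: this import is for TensorFlow 2.x.
# With TensorFlow 1.x, use instead:
# import tensorflow as tf
# Under TensorFlow 2.x, eager execution has to be turned off so that
# tf.Session() and Tensor.eval() behave as they did in 1.x.
tf.disable_eager_execution()

# Read the raw (still encoded) bytes of the image file.
with tf.gfile.GFile("myq.jpg", "rb") as f:
    image = f.read()

with tf.Session() as sess:
    # Decoding
    img_after_decode = tf.image.decode_jpeg(image)
    print("img_after_decode \n", img_after_decode.eval())

    plt.imshow(img_after_decode.eval())
    plt.show()

    # Re-encode the decoded data. Interested readers can supply their own myq.jpg,
    # run the script, and compare myq.jpg with myq1.jpg to check whether they are
    # identical (JPEG compression is lossy, so the bytes generally differ).
    myq_encode_image = tf.image.encode_jpeg(img_after_decode)
    with tf.gfile.GFile("myq1.jpg", "wb") as f:
        f.write(myq_encode_image.eval())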

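1.2 PNG format

Encoding and decoding work as for JPEG/JPG. Encoding: tf.image.encode_png(); decoding: tf.image.decode_png().
Note: the decoded data can differ in shape (number of channels) between JPEG/JPG and PNG images, because a PNG image may store transparency (alpha) data in addition to the three RGB channels.
The code is similar to 1.1; only the image file and the API calls need to change, so it is not repeated here.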

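1.3 GIF format

Decoding: tf.image.decode_gif(), which returns a 4-D tensor of shape [num_frames, height, width, 3]. Note that tf.image provides no matching encode_gif() function; for GIF only decoding is available.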

Reference:
蒋子阳 (Jiang Ziyang), tensorflow 深度学习算法原理与编程开发 (TensorFlow deep learning: algorithm principles and programming)

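2 Image flipping

Purpose: flipping images produces more training data while keeping the extra storage cost acceptable.
Random left-right flip:
tf.image.random_flip_left_right(image, seed)
Left-right flip:
tf.image.flip_left_right(image)
Random up-down flip:
tf.image.random_flip_up_down(image, seed)
Up-down flip:
tf.image.flip_up_down(image)
Flip along the diagonal (transpose):
tf.image.transpose_image(image) (named tf.image.transpose() in TensorFlow 2.x)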
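What each flip does to the pixel grid can be sketched with plain NumPy slicing. This illustrates the semantics only and is not TensorFlow's implementation; the 2 x 3 test image and the seeded generator are assumptions made for the example.

```python
import numpy as np

# A hypothetical 2x3 single-channel image (height=2, width=3, channels=1).
image = np.arange(6).reshape(2, 3, 1)

flipped_lr = image[:, ::-1, :]          # left-right: reverse the width axis
flipped_ud = image[::-1, :, :]          # up-down: reverse the height axis
transposed = image.transpose(1, 0, 2)   # diagonal: swap height and width

# A "random" flip applies the flip with probability 0.5.
rng = np.random.default_rng(seed=0)
maybe_flipped = flipped_lr if rng.random() < 0.5 else image

print(image[:, :, 0])       # rows: [0 1 2], [3 4 5]
print(flipped_lr[:, :, 0])  # rows: [2 1 0], [5 4 3]
print(flipped_ud[:, :, 0])  # rows: [3 4 5], [0 1 2]
print(transposed[:, :, 0])  # rows: [0 3], [1 4], [2 5]
```

None of these operations changes pixel values; they only rearrange positions, which is why flipping is a cheap way to enlarge a training set.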

Taking the random left-right flip as an example, the code is as follows:

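import tensorflow.compat.v1 as tf
import matplotlib.pyplot as plt

# Needed under TensorFlow 2.x so that tf.Session() and Tensor.eval() work.
tf.disable_eager_execution()

with tf.gfile.GFile("myq.jpg", "rb") as f:
    image = f.read()

with tf.Session() as sess:
    # Decoding
    img_after_decode = tf.image.decode_jpeg(image)

    print("img_after_decode \n", img_after_decode.eval())
    plt.imshow(img_after_decode.eval())
    plt.show()

    # Random left-right flip, followed by encoding
    flipped = tf.image.random_flip_left_right(img_after_decode)
    myq_encode_image = tf.image.encode_jpeg(flipped)

    # Each sess.run()/eval() call makes a new random decision, so fetch both
    # tensors in one call to make the saved file match the displayed image.
    flipped_value, encoded_value = sess.run([flipped, myq_encode_image])
    plt.imshow(flipped_value)
    plt.show()

    with tf.gfile.GFile("myq_flipped.jpg", "wb") as f:
        f.write(encoded_value)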

3 Image rotation

To be completed.

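4 Image color adjustment

Brightness:
tf.image.random_brightness(image, max_delta, seed)
tf.image.adjust_brightness(image, delta)
delta should lie in (-1, 1). A positive value makes the image brighter; a negative value makes it darker. For random_brightness(), delta is drawn at random from [-max_delta, max_delta), so max_delta must be non-negative.
Contrast: random_contrast(); adjust_contrast().
Saturation: random_saturation(); adjust_saturation().
Hue: random_hue(); adjust_hue(). These expect a 3-channel RGB image and do not work with 4 channels, so a PNG image with an alpha channel must first be reduced to RGB, for example by converting it to JPG/JPEG or by decoding it with tf.image.decode_png(contents, channels=3).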
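The effect of delta can be sketched in NumPy as follows. This is an illustration under the assumption that brightness is adjusted by adding delta to pixel values scaled to [0, 1] and saturating at the ends of the range; the helper names and the one-pixel test image are made up for the example.

```python
import numpy as np

def adjust_brightness(image_uint8, delta):
    """Add `delta` to every pixel of a uint8 image, working in the [0, 1] float range."""
    scaled = image_uint8.astype(np.float32) / 255.0
    shifted = np.clip(scaled + delta, 0.0, 1.0)   # saturate instead of wrapping around
    return np.round(shifted * 255.0).astype(np.uint8)

def random_brightness(image_uint8, max_delta, rng):
    """Pick delta uniformly from [-max_delta, max_delta) and apply it."""
    delta = rng.uniform(-max_delta, max_delta)
    return adjust_brightness(image_uint8, delta)

image = np.array([[[0, 102, 255]]], dtype=np.uint8)   # one pixel, three channels
brighter = adjust_brightness(image, 0.2)
darker = adjust_brightness(image, -0.2)
randomized = random_brightness(image, 0.5, np.random.default_rng(seed=0))
print(brighter, darker, randomized)
```

Values that would leave the valid range are clipped, so a large positive delta pushes bright regions to pure white and a large negative delta pushes dark regions to pure black.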

Taking brightness as an example, the code is as follows:

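import tensorflow.compat.v1 as tf
import matplotlib.pyplot as plt

# Needed under TensorFlow 2.x so that tf.Session() and Tensor.eval() work.
tf.disable_eager_execution()

with tf.gfile.GFile("myq.jpg", "rb") as f:
    image = f.read()

with tf.Session() as sess:
    # Decoding
    img_after_decode = tf.image.decode_jpeg(image)
    print("img_after_decode \n", img_after_decode.eval())
    plt.imshow(img_after_decode.eval())
    plt.show()

    # Random brightness adjustment, followed by encoding
    adjusted_brightness = tf.image.random_brightness(img_after_decode, max_delta=0.5)
    myq_encode_image = tf.image.encode_jpeg(adjusted_brightness)

    # Fetch both tensors in one run so that the saved file uses the same
    # random delta as the displayed image.
    adjusted_value, encoded_value = sess.run([adjusted_brightness, myq_encode_image])
    plt.imshow(adjusted_value)
    plt.show()

    with tf.gfile.GFile("myq_adjust_brightness.jpg", "wb") as f:
        f.write(encoded_value)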

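5 Image standardization

What image standardization means: the pixel values of each image are linearly rescaled so that they have mean 0 and variance 1.
API: standardization = tf.image.per_image_standardization(img_after_decode)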
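The computation can be sketched in NumPy. The sketch follows the formula given in the TensorFlow documentation, (x - mean) / max(stddev, 1 / sqrt(N)), where N is the number of elements in the image; the function below is an illustration, not the library code, and the test image is made up.

```python
import numpy as np

def per_image_standardization(image):
    """Rescale one image to zero mean and (normally) unit variance."""
    x = image.astype(np.float32)
    mean = x.mean()
    # Lower bound on the standard deviation, so that a uniform image
    # does not cause a division by zero.
    adjusted_stddev = max(x.std(), 1.0 / np.sqrt(x.size))
    return (x - mean) / adjusted_stddev

image = np.array([[[0, 50, 100]], [[150, 200, 250]]], dtype=np.uint8)
standardized = per_image_standardization(image)
print(standardized.mean(), standardized.std())

# A uniform image stays finite: every value becomes 0.
flat = per_image_standardization(np.full((2, 2, 3), 7, dtype=np.uint8))
```

The result is a float array, so it has to be rescaled before it can be displayed or re-encoded as an ordinary 8-bit image.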

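6 Image resizing

In practice, images collected from different sources come in all sizes, while the input size of a neural network is fixed, so the collected images have to be resized.
API: tf.image.resize_images() (tf.image.resize() in TensorFlow 2.x). The function interpolates so that the new image looks as close as possible to the original, i.e. it preserves as much of the original image's features and information as it can.
Four resizing algorithms are available:

method	Resizing algorithm
0	Bilinear interpolation
1	Nearest-neighbour interpolation
2	Bicubic interpolation
3	Area interpolation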
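To give a feel for what a resizing algorithm does, here is a minimal NumPy sketch of the simplest one in the table, nearest-neighbour interpolation. The index mapping used here (output index i maps to source index floor(i * old / new)) is one common convention and an assumption of this sketch; TensorFlow's kernels may place pixel centres differently.

```python
import numpy as np

def resize_nearest(image, new_height, new_width):
    """Resize an (H, W, C) array by copying the nearest source pixel."""
    height, width = image.shape[:2]
    # For each output row/column, compute the source index it maps back to.
    rows = np.arange(new_height) * height // new_height
    cols = np.arange(new_width) * width // new_width
    return image[rows[:, None], cols[None, :]]

image = np.arange(4, dtype=np.uint8).reshape(2, 2, 1)
enlarged = resize_nearest(image, 4, 4)   # every source pixel becomes a 2x2 block
shrunk = resize_nearest(enlarged, 2, 2)  # keeps every second pixel
print(enlarged[:, :, 0])
```

The other methods differ in how they combine several source pixels instead of copying a single one, which gives smoother results at a higher cost.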

Taking area interpolation as an example, the code is as follows:

import tensorflow.compat.v1 as tf
import matplotlib.pyplot as plt
import numpy as np

# Needed under TensorFlow 2.x so that tf.Session() and Tensor.eval() work.
tf.disable_eager_execution()

with tf.gfile.GFile("myq.jpg", "rb") as f:
    image = f.read()

with tf.Session() as sess:
    # Decoding
    img_after_decode = tf.image.decode_jpeg(image)

    print("img_after_decode \n", img_after_decode.eval())
    plt.imshow(img_after_decode.eval())
    plt.show()

    # Resize the image; method=3 selects area interpolation
    resized = tf.image.resize_images(img_after_decode, [300, 300], method=3)     # result is float32, not uint8
    print("type of resized: ", type(resized))
    print("dtype of resized: ", resized.dtype)
    # resize_images() returns float32 data here, so convert it to uint8 for the image to display correctly.
    resized = np.asarray(resized.eval(), dtype="uint8")
    plt.imshow(resized)
    plt.show()

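At this point the author wanted to compare type and dtype (for a start, they are used differently), so both were printed:
print("type of resized: ", type(resized))
print("dtype of resized: ", resized.dtype)
The output is shown below: type() reports the Python class of the object, while dtype reports the data type of the tensor's elements.
type of resized: <class 'tensorflow.python.framework.ops.Tensor'>
dtype of resized: <dtype: 'float32'>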

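In addition, tf.image.resize_image_with_crop_or_pad() (tf.image.resize_with_crop_or_pad() in TensorFlow 2.x) crops or pads the decoded image to a specified target size. If the input image is smaller than the target, it is padded with zeros on all sides (similar to zero padding in convolution/pooling); if the input image is larger than the target, it is cropped around its center to the target size.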

import tensorflow.compat.v1 as tf
import matplotlib.pyplot as plt
import numpy as np

# Needed under TensorFlow 2.x so that tf.Session() and Tensor.eval() work.
tf.disable_eager_execution()

with tf.gfile.GFile("myq.jpg", "rb") as f:
    image = f.read()

with tf.Session() as sess:
    # Decoding
    img_after_decode = tf.image.decode_jpeg(image)

    print("img_after_decode \n", img_after_decode.eval())
    plt.imshow(img_after_decode.eval())
    plt.show()

    # Resize the image; method=3 selects area interpolation
    resized = tf.image.resize_images(img_after_decode, [300, 300], method=3)     # result is float32, not uint8
    print("type of resized: ", type(resized))
    print("dtype of resized: ", resized.dtype)
    # resize_images() returns float32 data here, so convert it to uint8 for the image to display correctly.
    resized = np.asarray(resized.eval(), dtype="uint8")
    plt.imshow(resized)
    plt.show()

    # Arguments are (image, target_height, target_width):
    # 300 x 300 crops a larger image around its center,
    # 5000 x 7000 zero-pads a smaller image on all sides.
    resize_crop_pad1 = tf.image.resize_image_with_crop_or_pad(img_after_decode, 300, 300)
    resize_crop_pad2 = tf.image.resize_image_with_crop_or_pad(img_after_decode, 5000, 7000)

    plt.imshow(resize_crop_pad1.eval())
    plt.show()
    plt.imshow(resize_crop_pad2.eval())
    plt.show()
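The crop-or-pad behaviour can be sketched in NumPy as below. This is an illustration of the idea (center crop when too large, symmetric zero padding when too small); the helper names are hypothetical, and how an odd size difference is split between the two sides is an assumption of this sketch rather than a statement about TensorFlow.

```python
import numpy as np

def crop_or_pad_axis(image, target, axis):
    """Center-crop or zero-pad `image` along one axis to length `target`."""
    current = image.shape[axis]
    if current > target:
        start = (current - target) // 2
        index = [slice(None)] * image.ndim
        index[axis] = slice(start, start + target)
        return image[tuple(index)]
    before = (target - current) // 2
    pad_width = [(0, 0)] * image.ndim
    pad_width[axis] = (before, target - current - before)
    return np.pad(image, pad_width, mode="constant")  # pads with zeros

def resize_with_crop_or_pad(image, target_height, target_width):
    image = crop_or_pad_axis(image, target_height, axis=0)
    return crop_or_pad_axis(image, target_width, axis=1)

image = np.ones((4, 4, 3), dtype=np.uint8)
cropped = resize_with_crop_or_pad(image, 2, 2)   # larger than target: center crop
padded = resize_with_crop_or_pad(image, 6, 6)    # smaller than target: zero border
print(cropped.shape, padded.shape)
```

Height and width are handled independently, so an image can be cropped along one axis and padded along the other in the same call.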

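Furthermore:
tf.image.central_crop(image, central_fraction) crops the image around its center according to the given fraction, where central_fraction lies in (0, 1].
tf.image.crop_to_bounding_box(image, offset_height, offset_width, target_height, target_width) selects the region to crop through offsets in the height and width directions.
tf.image.pad_to_bounding_box(image, offset_height, offset_width, target_height, target_width) selects the region to pad through offsets in the height and width directions.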
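The role of the offset and target arguments in the two bounding-box functions can be shown with NumPy slicing. This is a sketch of the semantics only; the 6 x 6 test image and the concrete offsets are made-up values.

```python
import numpy as np

image = np.arange(36, dtype=np.uint8).reshape(6, 6, 1)

# crop_to_bounding_box: keep a target_height x target_width window whose
# top-left corner sits at (offset_height, offset_width).
offset_height, offset_width, target_height, target_width = 1, 2, 3, 4
cropped = image[offset_height:offset_height + target_height,
                offset_width:offset_width + target_width]

# pad_to_bounding_box: place the image at (offset_height, offset_width)
# inside a zero-filled canvas of the target size (here 8 x 9).
canvas = np.zeros((8, 9, 1), dtype=image.dtype)
canvas[offset_height:offset_height + image.shape[0],
       offset_width:offset_width + image.shape[1]] = image

print(cropped.shape, canvas.shape)
```

In both cases the offsets locate the top-left corner, measured from the top-left of the larger of the two rectangles.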

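7 Adding bounding boxes to an image

In image-recognition datasets, features or objects of interest usually need to be outlined with bounding boxes.
API: tf.image.draw_bounding_boxes(). Note that the image data passed to the function must be of a floating-point type.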

import tensorflow.compat.v1 as tf
import matplotlib.pyplot as plt

# Needed under TensorFlow 2.x so that tf.Session() and Tensor.eval() work.
tf.disable_eager_execution()

with tf.gfile.GFile("myq.jpg", "rb") as f:
    image = f.read()

with tf.Session() as sess:
    # Decoding
    img_after_decode = tf.image.decode_jpeg(image)

    print("img_after_decode \n", img_after_decode.eval())
    plt.imshow(img_after_decode.eval())
    plt.show()

    # draw_bounding_boxes() works on a batch of images, so tf.expand_dims() adds a leading
    # batch dimension; tf.image.convert_image_dtype() converts the decoded data to floating point.
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_after_decode, tf.float32), 0)

    # Box coordinates [y_min, x_min, y_max, x_max], each value in the range [0, 1]
    boxes = tf.constant([[[0.20, 0.23, 0.95, 0.62], [0.25, 0.4, 0.4, 0.6]]])

    # Draw the boxes
    image_boxed = tf.image.draw_bounding_boxes(batched, boxes)
    plt.imshow(image_boxed[0].eval())
    plt.show()
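Because the box coordinates are normalized to [0, 1], they are independent of the image size. The pure-Python sketch below converts such a box to pixel coordinates by scaling with the image height and width, as in the example given in the TensorFlow documentation; the helper name is hypothetical, and the exact rounding used inside TensorFlow is not claimed here.

```python
def box_to_pixels(box, height, width):
    """Convert a normalized [y_min, x_min, y_max, x_max] box to pixel coordinates."""
    y_min, x_min, y_max, x_max = box
    return (round(y_min * height), round(x_min * width),
            round(y_max * height), round(x_max * width))

# A 100 x 200 (height x width) image with the box [0.1, 0.2, 0.5, 0.9]
pixels = box_to_pixels([0.1, 0.2, 0.5, 0.9], height=100, width=200)
print(pixels)
```

Note the order: y (height) comes before x (width) in every pair, which is easy to mix up with the (x, y) order used by many drawing libraries.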

Adding a random bounding box and random cropping:
API: tf.image.sample_distorted_bounding_box(), tf.image.draw_bounding_boxes() and tf.slice()

import tensorflow.compat.v1 as tf
import matplotlib.pyplot as plt

# Needed under TensorFlow 2.x so that tf.Session() and Tensor.eval() work.
tf.disable_eager_execution()

with tf.gfile.GFile("myq.jpg", "rb") as f:
    image = f.read()

with tf.Session() as sess:
    # Decoding
    img_after_decode = tf.image.decode_jpeg(image)

    print("img_after_decode \n", img_after_decode.eval())
    plt.imshow(img_after_decode.eval())
    plt.show()

    # Box coordinates [y_min, x_min, y_max, x_max], each value in the range [0, 1]
    boxes = tf.constant([[[0.20, 0.23, 0.95, 0.62], [0.25, 0.4, 0.4, 0.6]]])

    begin, size, bounding_box = tf.image.sample_distorted_bounding_box(tf.shape(img_after_decode), bounding_boxes=boxes)
    print("begin is \n", begin)
    print("size is \n", size)
    print("bounding_box is \n", bounding_box)
    # draw_bounding_boxes() works on a batch of images, so tf.expand_dims() adds a leading
    # batch dimension; tf.image.convert_image_dtype() converts the decoded data to floating point.
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_after_decode, tf.float32), 0)

    # Draw the sampled box and cut the same region out of the image
    image_boxed = tf.image.draw_bounding_boxes(batched, bounding_box)
    sliced_image = tf.slice(img_after_decode, begin, size)

    # Each sess.run()/eval() call samples a new random box, so fetch both
    # results in one call to keep the drawn box and the crop consistent.
    boxed_value, sliced_value = sess.run([image_boxed, sliced_image])
    plt.imshow(boxed_value[0])
    plt.show()
    plt.imshow(sliced_value)
    plt.show()

The printed values of begin, size and bounding_box are as follows (they are printed as Tensor objects rather than numbers, because the graph has not been run at that point):
begin is
Tensor("sample_distorted_bounding_box/SampleDistortedBoundingBoxV2:0", shape=(3,), dtype=int32)
size is
Tensor("sample_distorted_bounding_box/SampleDistortedBoundingBoxV2:1", shape=(3,), dtype=int32)
bounding_box is
Tensor("sample_distorted_bounding_box/SampleDistortedBoundingBoxV2:2", shape=(1, 1, 4), dtype=float32)

Because the return value of tf.image.sample_distorted_bounding_box() is random, the result differs on every run.
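How begin and size select the cropped region can be sketched in NumPy. According to the TensorFlow documentation, begin has the form [offset_height, offset_width, 0] and size the form [target_height, target_width, -1], where -1 means "everything that remains on this axis". The helper name and the concrete values below are hypothetical; real values are only known after the graph has been run.

```python
import numpy as np

def slice_like_tf(array, begin, size):
    """Take size[i] elements starting at begin[i] on each axis; -1 means up to the end."""
    index = tuple(
        slice(b, None if s == -1 else b + s)
        for b, s in zip(begin, size)
    )
    return array[index]

image = np.arange(5 * 6 * 3).reshape(5, 6, 3)   # height=5, width=6, channels=3
begin = [1, 2, 0]    # hypothetical [offset_height, offset_width, 0]
size = [3, 2, -1]    # hypothetical [target_height, target_width, -1]
window = slice_like_tf(image, begin, size)
print(window.shape)
```

The -1 in the last position keeps all color channels, so only the spatial extent of the image is cropped.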