[TensorFlow] Image Preprocessing with TensorFlow's Built-in Functions

Article link: https://blog.csdn.net/mao_xiao_feng/article/details/75386477

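This post introduces TensorFlow's built-in image processing functions, including resizing, cropping, and flipping, and shows how to use them for image preprocessing.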


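This post supplements the earlier article "[TensorFlow] A Solution for Very Large Datasets" and covers image preprocessing with TensorFlow's built-in functions. The earlier approaches relied on helper libraries such as skimage, because all images were processed outside the graph and then fed to the network through a Placeholder. With TensorFlow's own input pipeline, however, an image becomes a TensorFlow tensor as soon as it is read. Other helper libraries cannot operate on it in that form, so the preprocessing has to be done with TensorFlow's own image functions.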

Environment: TensorFlow 1.2, Python 2.7

We again use the images copied from the COCO 2014 dataset in the previous post.

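First, image resizing with tf.image.resize_images. I won't go through the parameters in detail; the focus is on comparing the different interpolation methods. The program is as follows: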

import tensorflow as tf
import matplotlib.pyplot as plt


dataset_path = 'train2014/COCO_train2014_000000000025.jpg'


with tf.Session() as sess:

    # Read the raw bytes in binary mode; the file is a JPEG, so decode it as one.
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    # Convert uint8 values in [0, 255] to float32 values in [0, 1].
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    # The method argument selects the interpolation; bilinear is the default.
    resized_image1 = tf.image.resize_images(img_data_jpg, [200, 200])
    resized_image2 = tf.image.resize_images(
        img_data_jpg, [200, 200], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
    resized_image3 = tf.image.resize_images(
        img_data_jpg, [200, 200], method=tf.image.ResizeMethod.BICUBIC)
    resized_image4 = tf.image.resize_images(
        img_data_jpg, [200, 200], method=tf.image.ResizeMethod.AREA)

    plt.figure()

    plt.subplot(221)
    plt.imshow(resized_image1.eval())
    plt.title('Bilinear interpolation')
    plt.subplot(222)
    plt.imshow(resized_image2.eval())
    plt.title('Nearest neighbor interpolation')
    plt.subplot(223)
    plt.imshow(resized_image3.eval())
    plt.title('Bicubic interpolation')
    plt.subplot(224)
    plt.imshow(resized_image4.eval())
    plt.title('Area interpolation')
    plt.show()
    # No explicit sess.close(): the with block closes the session on exit.
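
To make the difference between the interpolation methods more concrete, here is a minimal pure-Python sketch that resamples a single row of pixel values with nearest-neighbor and with linear interpolation. The helper names resize_nearest and resize_linear are made up for this illustration, and the sampling grid (source position = output index * input length / output length) is a simplifying assumption, so the numbers are not guaranteed to match tf.image.resize_images pixel for pixel.

```python
def resize_nearest(row, out_len):
    """Nearest neighbor: copy the source pixel at the floor of the scaled position."""
    scale = len(row) / float(out_len)
    return [row[min(int(i * scale), len(row) - 1)] for i in range(out_len)]


def resize_linear(row, out_len):
    """Linear: blend the two source pixels that surround the scaled position."""
    scale = len(row) / float(out_len)
    out = []
    for i in range(out_len):
        pos = i * scale
        left = min(int(pos), len(row) - 1)
        right = min(left + 1, len(row) - 1)  # clamp at the right border
        frac = pos - left
        out.append(row[left] * (1.0 - frac) + row[right] * frac)
    return out


row = [0.0, 1.0]
print(resize_nearest(row, 4))  # [0.0, 0.0, 1.0, 1.0]
print(resize_linear(row, 4))   # [0.0, 0.5, 1.0, 1.0]
```

Nearest neighbor only repeats existing values, which keeps hard edges but looks blocky, while linear interpolation introduces intermediate values and therefore looks smoother.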

Cropping: tf.image.resize_image_with_crop_or_pad, tf.image.central_crop, tf.image.pad_to_bounding_box, and tf.image.crop_to_bounding_box. Test program:

import tensorflow as tf
import matplotlib.pyplot as plt


dataset_path = 'train2014/COCO_train2014_000000000025.jpg'


with tf.Session() as sess:

    # Read the raw bytes in binary mode; the file is a JPEG, so decode it as one.
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    # Central crop (or zero-pad) to exactly 200 x 200.
    resized_image1 = tf.image.resize_image_with_crop_or_pad(img_data_jpg, 200, 200)
    # Keep the central 60% of the image along each dimension.
    resized_image2 = tf.image.central_crop(img_data_jpg, 0.6)
    # Arguments: offset_height, offset_width, target_height, target_width.
    resized_image3 = tf.image.crop_to_bounding_box(img_data_jpg, 0, 0, 200, 200)
    # A target larger than the image pads it with zeros instead of cropping.
    resized_image4 = tf.image.resize_image_with_crop_or_pad(img_data_jpg, 800, 800)

    plt.figure()

    plt.subplot(221)
    plt.imshow(resized_image1.eval())
    plt.title('crop 200*200')
    plt.subplot(222)
    plt.imshow(resized_image2.eval())
    plt.title('60% of picture')
    plt.subplot(223)
    plt.imshow(resized_image3.eval())
    plt.title('from (0,0) crop 200*200(bounding box)')
    plt.subplot(224)
    plt.imshow(resized_image4.eval())
    plt.title('pad 800*800')
    plt.show()
    # No explicit sess.close(): the with block closes the session on exit.
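
The crop-or-pad behaviour used in the first and last panels can be sketched in plain Python. This is only an illustration on a 2-D list of rows: the function name crop_or_pad and the fill parameter are made up here, and the exact offsets TensorFlow chooses for odd size differences are an assumption rather than something taken from the original post.

```python
def crop_or_pad(image, target_h, target_w, fill=0):
    """Centrally crop and/or pad a 2-D image (list of rows) to the target size."""
    h, w = len(image), len(image[0])

    # Crop step: when a dimension is too large, drop rows/columns on both sides.
    top = max((h - target_h) // 2, 0)
    left = max((w - target_w) // 2, 0)
    cropped = [r[left:left + min(w, target_w)]
               for r in image[top:top + min(h, target_h)]]

    # Pad step: when a dimension is too small, centre the result on a
    # canvas pre-filled with `fill`.
    ch, cw = len(cropped), len(cropped[0])
    pad_top = (target_h - ch) // 2
    pad_left = (target_w - cw) // 2
    out = [[fill] * target_w for _ in range(target_h)]
    for y in range(ch):
        for x in range(cw):
            out[pad_top + y][pad_left + x] = cropped[y][x]
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
print(crop_or_pad(image, 2, 2))  # [[6, 7], [10, 11]]
print(crop_or_pad([[1]], 3, 3))  # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```

Each dimension is handled independently, so one call can crop the width while padding the height.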

Flipping, Rotating and Transposing:

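tf.image.flip_up_down, tf.image.random_flip_up_down, tf.image.flip_left_right, tf.image.random_flip_left_right, and tf.image.transpose_image. Note that random_flip_up_down flips the image with probability 1/2, so its panel in the figure may or may not be flipped on any given run.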

import tensorflow as tf
import matplotlib.pyplot as plt


dataset_path = 'train2014/COCO_train2014_000000000025.jpg'


with tf.Session() as sess:

    # Read the raw bytes in binary mode; the file is a JPEG, so decode it as one.
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    resized_image1 = tf.image.flip_up_down(img_data_jpg)
    # Flipped with probability 1/2 each time the tensor is evaluated.
    resized_image2 = tf.image.random_flip_up_down(img_data_jpg)
    resized_image3 = tf.image.flip_left_right(img_data_jpg)
    # Swaps the height and width dimensions.
    resized_image4 = tf.image.transpose_image(img_data_jpg)

    plt.figure()

    plt.subplot(221)
    plt.imshow(resized_image1.eval())
    plt.title('flip_up_down')
    plt.subplot(222)
    plt.imshow(resized_image2.eval())
    plt.title('random_flip_up_down')
    plt.subplot(223)
    plt.imshow(resized_image3.eval())
    plt.title('flip_left_right')
    plt.subplot(224)
    plt.imshow(resized_image4.eval())
    plt.title('transpose')
    plt.show()
    # No explicit sess.close(): the with block closes the session on exit.
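
For readers who want to see what these operations do to the pixel grid, the following sketch mimics them on a small 2-D list. The function names mirror the TensorFlow ones only for readability; they are plain-Python stand-ins written for this illustration, and the rng parameter is a hypothetical hook that makes the random variant reproducible.

```python
import random


def flip_up_down(image):
    """Reverse the order of the rows."""
    return image[::-1]


def flip_left_right(image):
    """Reverse the order of the pixels inside every row."""
    return [row[::-1] for row in image]


def transpose(image):
    """Swap rows and columns."""
    return [list(col) for col in zip(*image)]


def random_flip_up_down(image, rng=random):
    """Flip vertically with probability 1/2, otherwise return the input."""
    return flip_up_down(image) if rng.random() < 0.5 else image


image = [[1, 2],
         [3, 4]]
print(flip_up_down(image))     # [[3, 4], [1, 2]]
print(flip_left_right(image))  # [[2, 1], [4, 3]]
print(transpose(image))        # [[1, 3], [2, 4]]
print(random_flip_up_down(image, random.Random(0)))
```

Because the random variant decides anew on every call, it is suited to data augmentation during training rather than to deterministic preprocessing.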

Converting Between Colorspaces: tf.image.rgb_to_grayscale, tf.image.grayscale_to_rgb, tf.image.hsv_to_rgb, tf.image.rgb_to_hsv, and tf.image.convert_image_dtype. These color space conversions are fairly simple, so I won't go into detail.
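
As a rough illustration of what these conversions compute for a single pixel, the sketch below uses Python's standard colorsys module together with a weighted sum for grayscale. The weights (0.2989, 0.5870, 0.1140) are the common luma coefficients; that TensorFlow uses exactly these values is an assumption here, and the helper name rgb_to_grayscale is only a stand-in for the TensorFlow function of the same name.

```python
import colorsys

# Common luma weights for R, G, B (assumed, not taken from the original post).
GRAY_WEIGHTS = (0.2989, 0.5870, 0.1140)


def rgb_to_grayscale(rgb):
    """Weighted sum of the three channels of one pixel, values in [0, 1]."""
    r, g, b = rgb
    return GRAY_WEIGHTS[0] * r + GRAY_WEIGHTS[1] * g + GRAY_WEIGHTS[2] * b


pure_red = (1.0, 0.0, 0.0)
print(rgb_to_grayscale(pure_red))       # 0.2989
print(colorsys.rgb_to_hsv(*pure_red))   # (0.0, 1.0, 1.0)
print(colorsys.hsv_to_rgb(0.0, 1.0, 1.0))  # (1.0, 0.0, 0.0)
```

Like tf.image.rgb_to_hsv, colorsys expects floating-point channels in [0, 1], which is one reason to call convert_image_dtype before converting color spaces.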

Image Adjustments: tf.image.adjust_brightness, tf.image.random_brightness, tf.image.adjust_contrast, tf.image.random_contrast, tf.image.adjust_hue, tf.image.random_hue, tf.image.adjust_gamma, tf.image.adjust_saturation, tf.image.random_saturation, and tf.image.per_image_standardization. These are a family of operations for adjusting brightness, contrast, and similar properties. Note that tf.image.per_image_standardization normalizes a single image: it computes (x - mean) / adjusted_stddev, where mean is the mean of all values in the image, adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements())), and stddev is the standard deviation of all values in the image.

import tensorflow as tf
import matplotlib.pyplot as plt


dataset_path = 'train2014/COCO_train2014_000000000025.jpg'


with tf.Session() as sess:

    # Read the raw bytes in binary mode; the file is a JPEG, so decode it as one.
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    # delta is added to every pixel; 0.003 is a barely visible change on a [0, 1] scale.
    resized_image1 = tf.image.adjust_brightness(img_data_jpg, 0.003)
    # A contrast factor below 1 pulls pixels towards the channel mean.
    resized_image2 = tf.image.adjust_contrast(img_data_jpg, 0.3)
    # Random hue shift drawn from [-0.3, 0.3].
    resized_image3 = tf.image.random_hue(img_data_jpg, 0.3)
    # The result has zero mean, so many values fall outside [0, 1]; imshow clips them.
    resized_image4 = tf.image.per_image_standardization(img_data_jpg)

    plt.figure()

    plt.subplot(221)
    plt.imshow(resized_image1.eval())
    plt.title('adjust_brightness')
    plt.subplot(222)
    plt.imshow(resized_image2.eval())
    plt.title('adjust_contrast')
    plt.subplot(223)
    plt.imshow(resized_image3.eval())
    plt.title('random_hue')
    plt.subplot(224)
    plt.imshow(resized_image4.eval())
    plt.title('standardization')
    plt.show()
    # No explicit sess.close(): the with block closes the session on exit.
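
The per_image_standardization formula given above can be checked by hand with a few lines of plain Python. This is a sketch of the formula only: it takes the image as a flat list of all pixel values (an assumption made to keep the example short) and reuses the TensorFlow name purely for readability.

```python
import math


def per_image_standardization(pixels):
    """Apply (x - mean) / adjusted_stddev to a flat list of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / float(n)
    variance = sum((p - mean) ** 2 for p in pixels) / float(n)
    stddev = math.sqrt(variance)
    # The lower bound avoids dividing by zero for a uniform image.
    adjusted_stddev = max(stddev, 1.0 / math.sqrt(n))
    return [(p - mean) / adjusted_stddev for p in pixels]


print(per_image_standardization([0.0, 1.0, 2.0, 3.0]))
print(per_image_standardization([0.5, 0.5, 0.5, 0.5]))  # [0.0, 0.0, 0.0, 0.0]
```

For a non-uniform image the output has zero mean and unit variance. For a uniform image stddev is 0, so the 1.0/sqrt(N) bound takes over and the output is all zeros instead of a division error.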

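Draw Bounding Boxes: tf.image.draw_bounding_boxes. Each box is given as [y_min, x_min, y_max, x_max] in coordinates normalized to [0, 1], and the function expects a batch of images with shape [batch, height, width, channels], which is why the image is expanded with tf.expand_dims first.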

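import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

dataset_path = 'train2014/COCO_train2014_000000000025.jpg'


with tf.Session() as sess:
    # Shape [batch, num_boxes, 4]; each box is [y_min, x_min, y_max, x_max], normalized to [0, 1].
    boxes = tf.constant([[[0.1, 0.2, 0.5, 0.9]]], dtype=tf.float32)
    # Read the raw bytes in binary mode; the file is a JPEG, so decode it as one.
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    # draw_bounding_boxes expects a 4-D batch, so add a leading batch dimension.
    batch_data_jpg = tf.expand_dims(img_data_jpg, 0)
    resized_image1 = tf.image.draw_bounding_boxes(batch_data_jpg, boxes)

    plt.figure()

    # Drop the batch dimension again before plotting.
    plt.imshow(np.squeeze(resized_image1.eval()))
    plt.title('draw_bounding boxes')
    plt.show()
    # No explicit sess.close(): the with block closes the session on exit.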
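
Since the box in the program is given in normalized coordinates, it can help to see where it lands in pixels. The helper box_to_pixels below is hypothetical and simply multiplies by the image height and width; the exact rounding TensorFlow applies when drawing is an assumption and may differ by a pixel.

```python
def box_to_pixels(box, height, width):
    """Convert [y_min, x_min, y_max, x_max] in [0, 1] to integer pixel coordinates."""
    y_min, x_min, y_max, x_max = box
    return (int(round(y_min * height)), int(round(x_min * width)),
            int(round(y_max * height)), int(round(x_max * width)))


# The box from the program, on a hypothetical 100 x 200 (height x width) image.
print(box_to_pixels([0.1, 0.2, 0.5, 0.9], 100, 200))  # (10, 40, 50, 180)
```

Note the order: y comes before x, which is the opposite of the (x, y) convention many plotting and annotation tools use.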

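Total variation: tf.image.total_variation. This computes the total variation of an image, that is, the sum of the absolute differences between neighboring pixels. A common use is to add loss = tf.reduce_sum(tf.image.total_variation(images)) to the optimization objective, which smooths the generated images.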
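
The definition is easy to verify on a tiny example. The sketch below computes the total variation of a single-channel image stored as a 2-D list; the plain-Python function total_variation is written for this illustration, and for a multi-channel image the same sum would additionally run over the channels.

```python
def total_variation(image):
    """Sum of absolute differences between vertically and horizontally adjacent pixels."""
    h, w = len(image), len(image[0])
    vertical = sum(abs(image[y + 1][x] - image[y][x])
                   for y in range(h - 1) for x in range(w))
    horizontal = sum(abs(image[y][x + 1] - image[y][x])
                     for y in range(h) for x in range(w - 1))
    return vertical + horizontal


print(total_variation([[0, 0], [0, 0]]))  # 0
print(total_variation([[0, 1], [1, 0]]))  # 4
```

A flat image scores 0 and a checkerboard scores high, which is why minimizing this quantity pushes a generated image towards smoothness.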

import tensorflow as tf

dataset_path = 'train2014/COCO_train2014_000000000025.jpg'


with tf.Session() as sess:
    # Read the raw bytes in binary mode; the file is a JPEG, so decode it as one.
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    # For a single 3-D image the result is a scalar.
    total_v = tf.image.total_variation(img_data_jpg)

    # The parenthesized form works in both Python 2.7 and Python 3.
    print(total_v.eval())
    # No explicit sess.close(): the with block closes the session on exit.

The value printed is the total variation of this image.