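This post supplements the earlier article "[Tensorflow] Solutions for Very Large Datasets" and introduces Tensorflow's built-in functions for image preprocessing. The earlier methods all relied on helper libraries such as skimage, because every image was processed outside the graph and then fed into the network through a Placeholder. With Tensorflow's internal input pipeline, however, an image becomes a Tensorflow tensor as soon as it is read. In that form other helper tools cannot be applied to it directly, so the preprocessing has to be done with Tensorflow's own image functions.
Environment: Tensorflow 1.2, Python 2.7
We again use the images copied from the COCO 2014 dataset in the previous post.
First, Image Resize: the tf.image.resize_images function. I won't go through its parameters in detail; the focus is on comparing the different interpolation methods. The program is as follows: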
import tensorflow as tf
import matplotlib.pyplot as plt

dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

with tf.Session() as sess:
    # Read the raw bytes in binary mode and decode them as JPEG (the file is a .jpg).
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    # Convert uint8 values in [0, 255] to float32 values in [0, 1].
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    # The third argument selects the interpolation method; the default is bilinear.
    resized_image1 = tf.image.resize_images(img_data_jpg, [200, 200])
    resized_image2 = tf.image.resize_images(img_data_jpg, [200, 200], tf.image.ResizeMethod.NEAREST_NEIGHBOR)
    resized_image3 = tf.image.resize_images(img_data_jpg, [200, 200], tf.image.ResizeMethod.BICUBIC)
    resized_image4 = tf.image.resize_images(img_data_jpg, [200, 200], tf.image.ResizeMethod.AREA)
    plt.figure()
    plt.subplot(221)
    plt.imshow(resized_image1.eval())
    plt.title('Bilinear interpolation')
    plt.subplot(222)
    plt.imshow(resized_image2.eval())
    plt.title('Nearest neighbor interpolation')
    plt.subplot(223)
    plt.imshow(resized_image3.eval())
    plt.title('Bicubic interpolation')
    plt.subplot(224)
    plt.imshow(resized_image4.eval())
    plt.title('Area interpolation')
    plt.show()
# The with block closes the session, so no explicit sess.close() is needed.
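To make the difference between the interpolation methods concrete, here is a minimal plain-Python sketch of nearest-neighbor and bilinear resizing on a tiny grayscale grid. The helper names resize_nearest and resize_bilinear are my own, and the sampling convention (source coordinate = output coordinate * input size / output size) is an assumption; Tensorflow's kernels may differ in details such as corner alignment.

```python
def resize_nearest(img, out_h, out_w):
    """Nearest-neighbor resize of a 2-D list: copy the closest source pixel."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for y in range(out_h):
        src_y = min(int(y * in_h / float(out_h)), in_h - 1)
        row = []
        for x in range(out_w):
            src_x = min(int(x * in_w / float(out_w)), in_w - 1)
            row.append(img[src_y][src_x])
        out.append(row)
    return out


def resize_bilinear(img, out_h, out_w):
    """Bilinear resize of a 2-D list: blend the four surrounding source pixels."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for y in range(out_h):
        fy = y * in_h / float(out_h)
        y0 = min(int(fy), in_h - 1)
        y1 = min(y0 + 1, in_h - 1)   # clamp at the bottom edge
        wy = fy - y0                 # vertical blend weight
        row = []
        for x in range(out_w):
            fx = x * in_w / float(out_w)
            x0 = min(int(fx), in_w - 1)
            x1 = min(x0 + 1, in_w - 1)   # clamp at the right edge
            wx = fx - x0                 # horizontal blend weight
            top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
            bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


tiny = [[0.0, 1.0],
        [1.0, 0.0]]
nearest = resize_nearest(tiny, 4, 4)
bilinear = resize_bilinear(tiny, 4, 4)
for row in nearest:
    print(row)
for row in bilinear:
    print(row)
```

Nearest neighbor only ever copies existing values, which is why it looks blocky when enlarging, while bilinear produces intermediate values and therefore smoother edges.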
Cropping: tf.image.resize_image_with_crop_or_pad, tf.image.central_crop, tf.image.pad_to_bounding_box, tf.image.crop_to_bounding_box. Test program:
import tensorflow as tf
import matplotlib.pyplot as plt

dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

with tf.Session() as sess:
    # Read the raw bytes in binary mode and decode them as JPEG (the file is a .jpg).
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    # Target smaller than the image: crop around the center.
    resized_image1 = tf.image.resize_image_with_crop_or_pad(img_data_jpg, 200, 200)
    # Keep the central 60% of the image along each dimension.
    resized_image2 = tf.image.central_crop(img_data_jpg, 0.6)
    # Crop a 200x200 box whose top-left corner is at (0, 0).
    resized_image3 = tf.image.crop_to_bounding_box(img_data_jpg, 0, 0, 200, 200)
    # Target larger than the image: pad with zeros around it.
    resized_image4 = tf.image.resize_image_with_crop_or_pad(img_data_jpg, 800, 800)
    plt.figure()
    plt.subplot(221)
    plt.imshow(resized_image1.eval())
    plt.title('crop 200*200')
    plt.subplot(222)
    plt.imshow(resized_image2.eval())
    plt.title('60% of picture')
    plt.subplot(223)
    plt.imshow(resized_image3.eval())
    plt.title('from (0,0) crop 200*200(bounding box)')
    plt.subplot(224)
    plt.imshow(resized_image4.eval())
    plt.title('pad 800*800')
    plt.show()
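The centered crop-or-pad behavior seen in the first and last panels can be sketched in plain Python as below. center_crop_or_pad is a hypothetical helper of mine, not a Tensorflow function, and which side receives the extra pixel when the size difference is odd is an assumption here.

```python
def center_crop_or_pad(img, target_h, target_w, fill=0.0):
    """Crop around the center if the image is too large, zero-pad evenly if too small."""
    in_h, in_w = len(img), len(img[0])
    # Crop step: drop an equal margin on both sides where the input is larger.
    top = max((in_h - target_h) // 2, 0)
    left = max((in_w - target_w) // 2, 0)
    cropped = [row[left:left + min(in_w, target_w)]
               for row in img[top:top + min(in_h, target_h)]]
    # Pad step: place the cropped image in the middle of a canvas filled with `fill`.
    c_h, c_w = len(cropped), len(cropped[0])
    pad_top = (target_h - c_h) // 2
    pad_left = (target_w - c_w) // 2
    out = [[fill] * target_w for _ in range(target_h)]
    for y in range(c_h):
        for x in range(c_w):
            out[pad_top + y][pad_left + x] = cropped[y][x]
    return out


# A 4x4 image whose pixel values are 0..15, row by row.
big = [[4 * r + c for c in range(4)] for r in range(4)]
small = [[1, 2],
         [3, 4]]
cropped = center_crop_or_pad(big, 2, 2)
padded = center_crop_or_pad(small, 4, 4, fill=0)
print(cropped)
print(padded)
```

The same function handles both directions, which mirrors why a single call to resize_image_with_crop_or_pad is used for both the 200*200 and the 800*800 panel above.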
Flipping, Rotating and Transposing:
tf.image.flip_up_down, tf.image.random_flip_up_down, tf.image.flip_left_right, tf.image.random_flip_left_right, tf.image.transpose_image. Note that random_flip_up_down flips the image with probability 1/2.
import tensorflow as tf
import matplotlib.pyplot as plt

dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

with tf.Session() as sess:
    # Read the raw bytes in binary mode and decode them as JPEG (the file is a .jpg).
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    resized_image1 = tf.image.flip_up_down(img_data_jpg)
    # The random op is re-sampled on every eval(), so this panel may or may not be flipped.
    resized_image2 = tf.image.random_flip_up_down(img_data_jpg)
    resized_image3 = tf.image.flip_left_right(img_data_jpg)
    # Swap the height and width dimensions.
    resized_image4 = tf.image.transpose_image(img_data_jpg)
    plt.figure()
    plt.subplot(221)
    plt.imshow(resized_image1.eval())
    plt.title('flip_up_down')
    plt.subplot(222)
    plt.imshow(resized_image2.eval())
    plt.title('random_flip_up_down')
    plt.subplot(223)
    plt.imshow(resized_image3.eval())
    plt.title('flip_left_right')
    plt.subplot(224)
    plt.imshow(resized_image4.eval())
    plt.title('transpose')
    plt.show()
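On a tiny grid these operations are just index reshuffles, which the following plain-Python sketch shows. The function names mirror the Tensorflow ones for readability only; they are my own stand-ins that work on nested lists, and the rng parameter is an illustrative addition so the coin flip can be seeded.

```python
import random


def flip_up_down(img):
    """Reverse the order of the rows."""
    return img[::-1]


def flip_left_right(img):
    """Reverse the pixels inside every row."""
    return [row[::-1] for row in img]


def transpose_image(img):
    """Swap rows and columns."""
    return [list(col) for col in zip(*img)]


def random_flip_up_down(img, rng=random):
    """Flip vertically with probability 1/2, otherwise return the image unchanged."""
    return flip_up_down(img) if rng.random() < 0.5 else img


img = [[1, 2, 3],
       [4, 5, 6]]
print(flip_up_down(img))
print(flip_left_right(img))
print(transpose_image(img))

# Count how often the random variant actually flips over 1000 seeded trials.
rng = random.Random(0)
flips = sum(random_flip_up_down(img, rng) != img for _ in range(1000))
print(flips)
```

Over many trials the flip count should land near half of the draws, which is the behavior you want when using the random variants for data augmentation.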
Converting Between Colorspaces: tf.image.rgb_to_grayscale, tf.image.grayscale_to_rgb, tf.image.hsv_to_rgb, tf.image.rgb_to_hsv, tf.image.convert_image_dtype. Color space conversion is fairly simple, so it is not covered in detail here.
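For readers who want to see what these conversions do to a single pixel, here is a small sketch using Python's standard colorsys module instead of Tensorflow. The grayscale weights below are the commonly used luma weights; that tf.image.rgb_to_grayscale uses exactly these values is an assumption on my part.

```python
import colorsys

# One RGB pixel with float channels in [0, 1], like the output of convert_image_dtype.
r, g, b = 0.2, 0.4, 0.6

# RGB -> HSV: hue, saturation and value are all expressed in [0, 1].
h, s, v = colorsys.rgb_to_hsv(r, g, b)
print(h, s, v)

# HSV -> RGB recovers the original pixel up to floating-point error.
back = colorsys.hsv_to_rgb(h, s, v)
print(back)

# RGB -> grayscale as a weighted sum of the channels (assumed luma weights).
gray = 0.2989 * r + 0.5870 * g + 0.1140 * b
print(gray)
```

Working in HSV is what makes hue and saturation adjustments simple: they change one channel and convert back.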
Image Adjustments: tf.image.adjust_brightness, tf.image.random_brightness, tf.image.adjust_contrast, tf.image.random_contrast, tf.image.adjust_hue, tf.image.random_hue, tf.image.adjust_gamma, tf.image.adjust_saturation, tf.image.random_saturation, tf.image.per_image_standardization. These are a series of operations for adjusting brightness, contrast and so on. Note that tf.image.per_image_standardization standardizes a single image: it computes (x - mean) / adjusted_stddev, where mean is the mean of all values in the image, adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements())), and stddev is the standard deviation of all values in the image.
import tensorflow as tf
import matplotlib.pyplot as plt

dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

with tf.Session() as sess:
    # Read the raw bytes in binary mode and decode them as JPEG (the file is a .jpg).
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    # Add a constant delta to every pixel value.
    resized_image1 = tf.image.adjust_brightness(img_data_jpg, 0.003)
    # Scale the contrast by a factor of 0.3.
    resized_image2 = tf.image.adjust_contrast(img_data_jpg, 0.3)
    # Shift the hue by a random delta drawn from [-0.3, 0.3].
    resized_image3 = tf.image.random_hue(img_data_jpg, 0.3)
    # The result has zero mean, so values outside [0, 1] may be clipped by imshow.
    resized_image4 = tf.image.per_image_standardization(img_data_jpg)
    plt.figure()
    plt.subplot(221)
    plt.imshow(resized_image1.eval())
    plt.title('adjust_brightness')
    plt.subplot(222)
    plt.imshow(resized_image2.eval())
    plt.title('adjust_contrast')
    plt.subplot(223)
    plt.imshow(resized_image3.eval())
    plt.title('random_hue')
    plt.subplot(224)
    plt.imshow(resized_image4.eval())
    plt.title('standardization')
    plt.show()
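The standardization formula (x - mean) / adjusted_stddev is easy to check by hand. Below is a plain-Python sketch on a flattened list of pixel values; per_image_standardize is a hypothetical helper of mine that follows the formula quoted above, using the population standard deviation, and is not Tensorflow's implementation.

```python
import math


def per_image_standardize(pixels):
    """Apply (x - mean) / adjusted_stddev to a flat list of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / float(n)
    variance = sum((p - mean) ** 2 for p in pixels) / float(n)
    stddev = math.sqrt(variance)
    # The lower bound avoids division by zero for constant images.
    adjusted_stddev = max(stddev, 1.0 / math.sqrt(n))
    return [(p - mean) / adjusted_stddev for p in pixels]


# 16 pixels alternating between black and white: mean 0.5, stddev 0.5.
striped = [0.0, 1.0] * 8
standardized = per_image_standardize(striped)
print(standardized)

# A constant image has stddev 0, so the 1/sqrt(N) floor is what gets used.
constant = per_image_standardize([0.3] * 16)
print(constant)
```

The striped image maps to values of -1 and 1, and the constant image maps to values at or very near zero instead of raising a division error, which is the whole point of the adjusted_stddev term.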
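Draw Bounding Boxes: tf.image.draw_bounding_boxes. Each box is given as [y_min, x_min, y_max, x_max], with coordinates normalized to [0, 1] relative to the image height and width, and the function expects a batch of images.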
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

with tf.Session() as sess:
    # One image with one box; the shape is [batch, num_boxes, 4].
    boxes = tf.constant([[[0.1, 0.2, 0.5, 0.9]]], dtype=tf.float32)
    # Read the raw bytes in binary mode and decode them as JPEG (the file is a .jpg).
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    # Add a batch dimension: [height, width, channels] -> [1, height, width, channels].
    batch_data_jpg = tf.expand_dims(img_data_jpg, 0)
    resized_image1 = tf.image.draw_bounding_boxes(batch_data_jpg, boxes)
    plt.figure()
    # Drop the batch dimension again before plotting.
    plt.imshow(np.squeeze(resized_image1.eval()))
    plt.title('draw_bounding boxes')
    plt.show()
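Because the box [0.1, 0.2, 0.5, 0.9] is given in normalized [y_min, x_min, y_max, x_max] order, it helps to convert it to pixel coordinates once by hand. The sketch below does that in plain Python; box_to_pixels is a hypothetical helper, and rounding to the nearest pixel is my assumption rather than documented Tensorflow behavior.

```python
def box_to_pixels(box, height, width):
    """Convert a normalized [y_min, x_min, y_max, x_max] box to pixel coordinates."""
    y_min, x_min, y_max, x_max = box
    # y values scale with the image height, x values with the image width.
    return (int(round(y_min * height)),
            int(round(x_min * width)),
            int(round(y_max * height)),
            int(round(x_max * width)))


# The box from the program above, applied to an assumed 100 x 200 (height x width) image.
pixel_box = box_to_pixels([0.1, 0.2, 0.5, 0.9], 100, 200)
print(pixel_box)
```

Note the y-before-x order, which is the opposite of the (x, y) convention many annotation formats use; mixing the two up is the most common reason a drawn box ends up in the wrong place.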
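Total variation: tf.image.total_variation. This computes the total variation of an image, that is, how much neighboring pixels differ from each other. A common practice is to add loss = tf.reduce_sum(tf.image.total_variation(images)) to the optimization objective, which smooths the generated image.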
import tensorflow as tf

dataset_path = 'train2014/COCO_train2014_000000000025.jpg'

with tf.Session() as sess:
    # Read the raw bytes in binary mode and decode them as JPEG (the file is a .jpg).
    image_raw_data_jpg = tf.gfile.FastGFile(dataset_path, 'rb').read()
    img_data_jpg = tf.image.decode_jpeg(image_raw_data_jpg)
    img_data_jpg = tf.image.convert_image_dtype(img_data_jpg, dtype=tf.float32)
    # Sum of absolute differences between neighboring pixels.
    total_v = tf.image.total_variation(img_data_jpg)
    print(total_v.eval())
The total variation of this image is:
135254.0
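To see where such a number comes from, here is a plain-Python sketch of total variation for a single-channel image: the sum of absolute differences between vertically and horizontally adjacent pixels. The helper below is my own simplified version for nested lists; for a color image the same sum would also run over the channels.

```python
def total_variation(img):
    """Sum of absolute differences between neighboring pixels of a 2-D list."""
    h, w = len(img), len(img[0])
    # Differences between each pixel and the one directly below it.
    vertical = sum(abs(img[y + 1][x] - img[y][x])
                   for y in range(h - 1) for x in range(w))
    # Differences between each pixel and the one directly to its right.
    horizontal = sum(abs(img[y][x + 1] - img[y][x])
                     for y in range(h) for x in range(w - 1))
    return vertical + horizontal


flat = [[1, 1],
        [1, 1]]
checker = [[0, 1],
           [1, 0]]
tv_flat = total_variation(flat)
tv_checker = total_variation(checker)
print(tv_flat)
print(tv_checker)
```

A perfectly flat image has zero total variation while a checkerboard maximizes it, which is why penalizing this quantity in a loss pushes generated images toward smoother results.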