Basic TFRecords Techniques

This post shows how to convert a dataset into a .tfrecords file and then use that file as part of a computational graph.

Before We Begin

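This exercise uses the skimage.io module, which requires scikit-image. Before installing scikit-image, install mkl and numpy first; both can be installed directly with pip install. scikit-image itself may fail to install that way. In that case, download the build you need from www.lfd.uci.edu/~gohlke/pyt… (for example, the one matching a 64-bit machine and Python 3.6) and install the downloaded file directly.

Once the installation finishes, you are ready to go.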

Introduction

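The first part shows how to use numpy to get the raw data bytes of any image, which is in a sense what happens when a dataset is converted to a binary format. The second part shows how to convert a dataset into a .tfrecords file and read it back without defining a computational graph, using only a few built-in functions. The third part explains how to define a model that reads data from the created binary file and batches it in random order, which is very important when training machine learning models.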

Getting the raw data bytes with numpy

import numpy as np
import skimage.io as io
import pylab
import matplotlib.pyplot as plt
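cat_img = io.imread('cat_img.png')  # add as_gray=True (as_grey in older scikit-image releases) to read the image as grayscale

io.imshow(cat_img)
io.show()  # displays the image, same as the plt.show() call below
#plt.show()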

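# First convert the image to a byte string with
# np.ndarray.tobytes (the current name for the deprecated tostring)
cat_string = cat_img.tobytes()

# Convert the byte string back into an image
# Note: dtype must be specified,
# otherwise the reconstruction will be wrong
# The result is 1D; the image shape is needed to rebuild it fully
reconstructed_cat_1d = np.frombuffer(cat_string, dtype=np.uint8)

# Reshape back into an image
# This is why the image shape has to be stored
# alongside the serialized image
reconstructed_cat_img = reconstructed_cat_1d.reshape(cat_img.shape)

#io.imshow(reconstructed_cat_img)
#pylab.show()

# Check that the reconstructed image matches the original
result = np.allclose(cat_img, reconstructed_cat_img)
print(result)
#True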
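To see why the dtype has to be given explicitly, here is a minimal sketch on a small synthetic array (a stand-in for an image, not the cat picture used above). It uses tobytes and np.frombuffer, the current NumPy names for this byte round trip. Decoding the same bytes as 16-bit values silently produces half as many elements, so the later reshape can no longer work.

```python
import numpy as np

# Synthetic stand-in for an image: 2 x 3 pixels, 3 channels, 8 bits each.
img = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)
raw = img.tobytes()

# Correct dtype: one array element per byte.
ok = np.frombuffer(raw, dtype=np.uint8)

# Wrong dtype: every two bytes are fused into one 16-bit value.
wrong = np.frombuffer(raw, dtype=np.uint16)

print(len(raw), ok.size, wrong.size)  # 18 18 9

# Only the correctly typed array can be reshaped back to the image.
restored = ok.reshape(img.shape)
print(np.array_equal(img, restored))  # True
```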

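Creating a .tfrecord file and reading it without defining a computational graph

Here we show how to write a small dataset (three image/annotation pairs) to a .tfrecord file and read it back without defining a computational graph. We also make sure that the images read back from the .tfrecord file are identical to the originals. Note that the image size is written alongside the raw image bytes, which illustrates why the previous section needed the image size to be stored.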
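The need for the stored sizes can be sketched without TensorFlow. In the example below a plain dict stands in for one serialized record; the dict, the synthetic 4 x 6 image, and the variable names are illustrative assumptions, not part of the original code.

```python
import numpy as np

# Hypothetical 4 x 6 RGB image used only for illustration.
img = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)

# A plain dict stands in for one serialized record.
record = {
    "height": img.shape[0],
    "width": img.shape[1],
    "image_raw": img.tobytes(),
}

flat = np.frombuffer(record["image_raw"], dtype=np.uint8)

# The same 72 bytes fit several layouts, so the bytes alone are ambiguous.
as_4x6 = flat.reshape(4, 6, 3)
as_6x4 = flat.reshape(6, 4, 3)

# With the stored sizes there is only one possible layout.
restored = flat.reshape(record["height"], record["width"], -1)
print(restored.shape)  # (4, 6, 3)
```

Both reshapes succeed without any error, which is exactly the problem: nothing in the byte string says which one is the real image.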

from PIL import Image
import numpy as np
import skimage.io as io
import tensorflow as tf

# Three example pairs of images and their annotation masks
filename_pairs = [
('/zzzzhong/1.jpg','/zzzzhong/1.png'), 
('/zzzzhong/2.jpg','/zzzzhong/2.png'),
('/zzzzhong/3.jpg','/zzzzhong/3.png'),
]

# Helpers that wrap a value in a tf.train.Feature, to keep the Example definition below short
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

tfrecords_filename = 'pascal_voc_segmentation.tfrecords'

writer = tf.python_io.TFRecordWriter(tfrecords_filename)

# Keep the original images so they can be compared with the reconstructed ones
original_images = []

for img_path, annotation_path in filename_pairs:
    
    img = np.array(Image.open(img_path))
    annotation = np.array(Image.open(annotation_path))
    
    # The reason to store image sizes was demonstrated
    # in the previous example -- we have to know sizes
    # of images to later read raw serialized string,
    # convert to 1d array and convert to respective
    # shape that image used to have.
    height = img.shape[0]
    width = img.shape[1]
    
    # Put in the original images into array
    # Just for future check for correctness
    original_images.append((img, annotation))
    
    img_raw = img.tobytes()
    annotation_raw = annotation.tobytes()
    
    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(height),
        'width': _int64_feature(width),
        'image_raw': _bytes_feature(img_raw),
        'mask_raw': _bytes_feature(annotation_raw)}))
    
    writer.write(example.SerializeToString())

writer.close()


Running the code above creates a file named pascal_voc_segmentation.tfrecords in the same directory as your script.

# Reconstruct the images
reconstructed_images = []

record_iterator = tf.python_io.tf_record_iterator(path=tfrecords_filename)

for string_record in record_iterator:
    
    example = tf.train.Example()
    example.ParseFromString(string_record)
    
    height = int(example.features.feature['height']
                                 .int64_list
                                 .value[0])
    
    width = int(example.features.feature['width']
                                .int64_list
                                .value[0])
    
    img_string = (example.features.feature['image_raw']
                                  .bytes_list
                                  .value[0])
    
    annotation_string = (example.features.feature['mask_raw']
                                .bytes_list
                                .value[0])
    
    img_1d = np.frombuffer(img_string, dtype=np.uint8)
    reconstructed_img = img_1d.reshape((height, width, -1))
    
    annotation_1d = np.frombuffer(annotation_string, dtype=np.uint8)
    
    # -1 lets NumPy infer the channel count; a mask without depth
    # (no 3rd dimension) comes back with shape (height, width, 1)
    reconstructed_annotation = annotation_1d.reshape((height, width, -1))
    
    reconstructed_images.append((reconstructed_img, reconstructed_annotation))
    
## Let's check if the reconstructed images match
# the original images

for original_pair, reconstructed_pair in zip(original_images, reconstructed_images):
    
    img_pair_to_compare, annotation_pair_to_compare = zip(original_pair,
                                                          reconstructed_pair)
    print(np.allclose(*img_pair_to_compare))
    print(np.allclose(*annotation_pair_to_compare)) 



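The code above checks whether the original and reconstructed images differ. It should print True six times: three image/annotation pairs, with two comparisons per pair.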
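The comparison loop relies on zip to regroup an (image, annotation) pair from the originals and one from the reconstructions into an image pair and an annotation pair. A minimal sketch of that regrouping, using small synthetic arrays as stand-ins for the real data:

```python
import numpy as np

# Synthetic stand-ins for one image and its annotation mask.
img = np.zeros((2, 2, 3), dtype=np.uint8)
mask = np.ones((2, 2), dtype=np.uint8)

original_pair = (img, mask)
reconstructed_pair = (img.copy(), mask.copy())

# zip groups the first elements together and the second elements together:
# images == (img, img copy), masks == (mask, mask copy)
images, masks = zip(original_pair, reconstructed_pair)

images_match = np.allclose(*images)
masks_match = np.allclose(*masks)
print(images_match, masks_match)  # True True
```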

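Defining a graph that reads from .tfrecords and batches the images

Here we define a graph that reads images from the file created earlier and groups them into batches. Shuffling the images randomly during training is very important, and the batch size has to be chosen to suit the application.

It is important to point out that batching requires the image size to be defined in advance. This may sound like a limitation, but in practice training in both image classification and image segmentation is performed on images of the same size.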
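The idea behind shuffled batching can be sketched in plain Python. The helper below is a hypothetical illustration, not TensorFlow's implementation: it keeps a buffer of pending items, draws from it at random only while at least min_after_dequeue items would remain, and drains the buffer once the input is exhausted. It ignores threads and queue capacity entirely.

```python
import random


def shuffled_batches(items, batch_size, min_after_dequeue, seed=0):
    """Yield randomly drawn batches from a buffer (hypothetical helper)."""
    rng = random.Random(seed)
    buffer = []
    batch = []
    for item in items:
        buffer.append(item)
        # Draw only when at least min_after_dequeue items stay behind,
        # so each draw picks from a reasonably mixed pool.
        if len(buffer) > min_after_dequeue:
            batch.append(buffer.pop(rng.randrange(len(buffer))))
            if len(batch) == batch_size:
                yield batch
                batch = []
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    for item in buffer:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch, if any


batches = list(shuffled_batches(range(10), batch_size=2, min_after_dequeue=4))
print(len(batches))  # 5
```

A larger min_after_dequeue mixes the data more thoroughly at the cost of holding more items in memory before the first batch can be produced.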
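Why a fixed size is needed becomes clear once images are stacked into a single array: every element of a batch must have the same shape. The sketch below uses NumPy only; crop_or_pad is a hypothetical helper that center-crops or zero-pads an image to a target size, in the spirit of a crop-or-pad resize, and the two synthetic images are assumptions for illustration.

```python
import numpy as np


def crop_or_pad(image, target_height, target_width):
    """Center-crop or zero-pad an H x W x C array (hypothetical helper)."""
    height, width = image.shape[:2]

    # Crop centrally along any axis that is larger than the target.
    top = max((height - target_height) // 2, 0)
    left = max((width - target_width) // 2, 0)
    image = image[top:top + target_height, left:left + target_width]

    # Zero-pad evenly along any axis that is smaller than the target.
    pad_h = target_height - image.shape[0]
    pad_w = target_width - image.shape[1]
    padding = ((pad_h // 2, pad_h - pad_h // 2),
               (pad_w // 2, pad_w - pad_w // 2),
               (0, 0))
    return np.pad(image, padding, mode="constant")


# Two synthetic images with different sizes.
small = np.ones((100, 300, 3), dtype=np.uint8)
large = np.ones((500, 240, 3), dtype=np.uint8)

# After resizing, both fit into one batch array.
batch = np.stack([crop_or_pad(im, 240, 240) for im in (small, large)])
print(batch.shape)  # (2, 240, 240, 3)
```

Calling np.stack on the two original arrays would raise an error because their shapes differ.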

import tensorflow as tf
import skimage.io as io

IMAGE_HEIGHT = 240
IMAGE_WIDTH = 240

tfrecords_filename = 'pascal_voc_segmentation.tfrecords'

def read_and_decode(filename_queue):
    
    reader = tf.TFRecordReader()

    _, serialized_example = reader.read(filename_queue)

    features = tf.parse_single_example(
      serialized_example,
      # Defaults are not specified since both keys are required.
      features={
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'image_raw': tf.FixedLenFeature([], tf.string),
        'mask_raw': tf.FixedLenFeature([], tf.string)
        })

    # Decode the raw byte strings into flat uint8 tensors;
    # they are reshaped to (height, width, channels) below.
    image = tf.decode_raw(features['image_raw'], tf.uint8)
    annotation = tf.decode_raw(features['mask_raw'], tf.uint8)
    
    height = tf.cast(features['height'], tf.int32)
    width = tf.cast(features['width'], tf.int32)

    image_shape = tf.stack([height, width, 3])
    annotation_shape = tf.stack([height, width, 3])  # use 1 instead of 3 for single-channel masks; my annotation images have 3 channels, like the images
    
    image = tf.reshape(image, image_shape)
    annotation = tf.reshape(annotation, annotation_shape)
    
    image_size_const = tf.constant((IMAGE_HEIGHT, IMAGE_WIDTH, 3), dtype=tf.int32)
    annotation_size_const = tf.constant((IMAGE_HEIGHT, IMAGE_WIDTH, 1), dtype=tf.int32)
    
    # Random transformations can be put here: right before you crop images
    # to predefined size. To get more information look at the stackoverflow
    # question linked above.
    
    resized_image = tf.image.resize_image_with_crop_or_pad(image=image,
                                           target_height=IMAGE_HEIGHT,
                                           target_width=IMAGE_WIDTH)
    
    resized_annotation = tf.image.resize_image_with_crop_or_pad(image=annotation,
                                           target_height=IMAGE_HEIGHT,
                                           target_width=IMAGE_WIDTH)
    
    
    images, annotations = tf.train.shuffle_batch( [resized_image, resized_annotation],
                                                 batch_size=2,
                                                 capacity=30,
                                                 num_threads=2,
                                                 min_after_dequeue=10)
    
    return images, annotations


########################################################################################################

filename_queue = tf.train.string_input_producer(
    [tfrecords_filename], num_epochs=10)

# Even when reading in multiple threads, share the filename
# queue.
image, annotation = read_and_decode(filename_queue)

# The op for initializing the variables.
init_op = tf.group(tf.global_variables_initializer(),
                   tf.local_variables_initializer())

with tf.Session()  as sess:
    
    sess.run(init_op)
    
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    
    # Let's read off 3 batches just for example
    for i in range(3):
    
        img, anno = sess.run([image, annotation])
        print(img[0, :, :, :].shape)
        
        print('current batch')
        
        # We selected the batch size of two
        # So we should get two image pairs in each batch
        # Let's make sure it is random

        io.imshow(img[0, :, :, :])
        io.show()

        io.imshow(anno[0, :, :, 0])
        io.show()
        
        io.imshow(img[1, :, :, :])
        io.show()

        io.imshow(anno[1, :, :, 0])
        io.show()
        
    
    coord.request_stop()
    coord.join(threads)

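Note: I tested this on Windows 7 with Python 3.6. The code uses the TensorFlow 1.x API (tf.python_io, tf.Session, and queue runners).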