TFRecord Usage Summary

Latest recommended article published on 2024-08-13 16:25:59

loovelj

Categories: python tensorflow

Article link: https://blog.csdn.net/loovelj/article/details/86482700

TFRecord

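1. Introduction
TFRecord is a data format used by TensorFlow. It packs many training images, together with their associated information, into a single file that is stored and read in a dedicated binary format. Records are written with tf.python_io.TFRecordWriter and read back efficiently through the tf.data API. For details, see the official tutorial and the official documentation, which describe the usage step by step. A more advanced guide has also appeared recently, and I plan to use it for further optimization once the basic pipeline works: "tf.data API: Building High-Performance TensorFlow Input Pipelines" from the TensorFlow WeChat official account.
2. Use tf.gfile.GFile() when reading images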

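import tensorflow as tf  # TensorFlow 1.x API

# Reading the raw bytes with GFile is fast and keeps the image in its encoded (compressed) form.
# Reading with OpenCV decodes it into a raw pixel array, which makes the TFRecord file much larger.
with tf.gfile.GFile(img_path, 'rb') as fid:
    encoded_png = fid.read()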
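
To make the size difference concrete, here is a minimal sketch that uses only the Python standard library, not TensorFlow or OpenCV. It builds a small PNG by hand and compares the encoded size with the size of the decoded pixel array. The helpers `png_chunk` and `encode_solid_png` are hypothetical names introduced for this illustration, and a single-colour image is the best case for compression, so real photos will shrink less.

```python
import struct
import zlib


def png_chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC-32 over type + data."""
    body = kind + data
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + body + struct.pack(">I", crc)


def encode_solid_png(width, height, rgb):
    """Encode a single-colour 8-bit RGB image as PNG bytes (illustrative helper)."""
    # Each scanline is a filter byte (0 = no filter) followed by the RGB pixels.
    scanline = b"\x00" + bytes(rgb) * width
    raw = scanline * height
    # IHDR: width, height, bit depth 8, colour type 2 (RGB), then three zero flags.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n"
            + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", zlib.compress(raw))
            + png_chunk(b"IEND", b""))


encoded_png = encode_solid_png(32, 32, (200, 30, 30))
decoded_size = 32 * 32 * 3  # bytes in the decoded uint8 RGB array

print("encoded bytes:", len(encoded_png))
print("decoded bytes:", decoded_size)
```

Storing `encoded_png` in the record corresponds to the GFile approach above; storing the decoded array corresponds to reading the image with OpenCV first.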

3. Compressing and decompressing the file

# Compression is enabled here, so the same option must be given when reading.
writer_options = tf.python_io.TFRecordOptions(
    tf.python_io.TFRecordCompressionType.ZLIB)
writer = tf.python_io.TFRecordWriter(
    path=save_path, options=writer_options)

# The reader must also decompress with ZLIB.
# Load the records into a dataset.
dataset = tf.data.TFRecordDataset(filenames, compression_type='ZLIB')
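
Why the reader has to be told the compression type can be illustrated with an analogy in plain Python. This sketch uses the standard `zlib` and `gzip` modules, not TensorFlow's reader, and the `payload` bytes are a made-up stand-in for serialized records. Data compressed one way is only recovered by the matching decompression step, and a mismatched one fails.

```python
import gzip
import zlib

# Stand-in for the bytes of many serialized records (illustrative only).
payload = b"serialized example " * 100

zlib_bytes = zlib.compress(payload)

# The matching decompression step recovers the original bytes.
restored = zlib.decompress(zlib_bytes)

# A mismatched step (gzip reader on zlib data) raises an error instead.
try:
    gzip.decompress(zlib_bytes)
    mismatch_detected = False
except OSError:
    mismatch_detected = True

print(restored == payload, mismatch_detected)
```

In the same way, a file written with the ZLIB writer option should be opened with `compression_type='ZLIB'`, as in the snippet above.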

4. Specify the channel count when decoding images

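# Specify channels explicitly when decoding. With the default, some of my images were decoded as single-channel, which broke the reshape below.
image = tf.image.decode_image(parsed["img"], channels=3)
# Restore the original shape. Check beforehand that every image in the dataset has the same size, otherwise this step fails.
image = tf.reshape(image, [32, 32, 3])
# Convert to grayscale before adding it to the training set.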
gray = tf.image.rgb_to_grayscale(image)
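
The size check mentioned in the comment above can be done before any records are written. Below is a minimal sketch, assuming the images are PNG files (as the `encoded_png` variable name suggests) and that 32x32 is the expected size. `png_size`, `find_wrong_sizes` and `fake_png_header` are hypothetical helpers introduced here, not TensorFlow functions. The sketch reads width and height straight from the PNG header, so no image has to be fully decoded.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(encoded):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if encoded[:8] != PNG_SIGNATURE or encoded[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", encoded[16:24])
    return width, height


def find_wrong_sizes(encoded_images, expected=(32, 32)):
    """Return the names of images whose size differs from `expected`."""
    return [name for name, data in encoded_images.items()
            if png_size(data) != expected]


def fake_png_header(width, height):
    """Build only the first 24 bytes of a PNG, enough for this demo."""
    return (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
            + struct.pack(">II", width, height))


# In real use the values would be the bytes read from each image file.
images = {"a.png": fake_png_header(32, 32), "b.png": fake_png_header(28, 28)}
bad = find_wrong_sizes(images)
print(bad)  # ['b.png']
```

Any file reported here would need to be resized or dropped before it reaches the fixed `tf.reshape(image, [32, 32, 3])` step.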

5. Encode Chinese strings with encode('utf-8')

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


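def img_to_feature(img_path, index, label):
    # Read the encoded image bytes as-is.
    with tf.gfile.GFile(img_path, 'rb') as fid:
        encoded_png = fid.read()
    image_format = img_path.split('.')[-1]  # file extension; not stored below

    feature = tf.train.Features(feature={
        'img': _bytes_feature(encoded_png),
        'index': _int64_feature(index),
        # Assumption: label is an integer class id.
        'label': _int64_feature(label),
        # A str path (possibly containing Chinese) must be encoded to bytes.
        'img_path': _bytes_feature(img_path.encode('utf-8'))
    })
    return feature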

# When reading, decode the bytes back into a string.
with tf.Session() as sess:
    sess.run(iterator.initializer)
    while True:
        try:
            label,img_path,image=sess.run(next_element)
            print(img_path.decode('utf-8'))
        except tf.errors.OutOfRangeError:
            break
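
The round trip between `str` and `bytes` can be tried on its own, without TensorFlow. A minimal sketch follows; the path below is a made-up example containing Chinese characters. A bytes feature stores raw bytes, so the string is encoded before writing and decoded after reading.

```python
# Hypothetical path that contains Chinese characters.
img_path = "data/训练集/猫_001.png"

# str -> bytes: the form that a bytes feature can store.
encoded = img_path.encode("utf-8")

# bytes -> str: the form you want to print or use as a path again.
decoded = encoded.decode("utf-8")

print(type(encoded).__name__, type(decoded).__name__)
print(decoded == img_path)
```

Each Chinese character takes several bytes in UTF-8, so `len(encoded)` is larger than `len(img_path)`; the two should not be mixed up when slicing.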