I'm having trouble using the "new" (TensorFlow v1.4) Dataset API to read image data in TFRecord format. I believe the problem is that, when I try to read the data, I am somehow consuming the entire dataset instead of a single batch. I have a working example of doing this with the batch/file-queue API here: https://github.com/gnperdue/TFExperiments/tree/master/conv (in the example I run a classifier, but the code that reads the TFRecord images is in the DataReaders.py class).
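For reference, that reader roughly follows the classic queue-based pattern (a simplified sketch of the idea, not the exact DataReaders.py code):

import tensorflow as tf

# Old-style (pre-tf.data) input pipeline, condensed:
filename_queue = tf.train.string_input_producer(
    filenames_list, num_epochs=1
)
reader = tf.TFRecordReader(
    options=tf.python_io.TFRecordOptions(
        tf.python_io.TFRecordCompressionType.GZIP
    )
)
_, tfrecord = reader.read(filename_queue)   # one serialized Example
features, targets = parse_mnist_tfrec(tfrecord, features_shape)
batch_features, batch_labels = tf.train.batch(
    [features, targets], batch_size=100, capacity=1000
)

That pattern hands back one record at a time, which is the behavior I'm trying to reproduce with tf.data.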
I believe the problem functions are:

def parse_mnist_tfrec(tfrecord, features_shape):
    tfrecord_features = tf.parse_single_example(
        tfrecord,
        features={
            'features': tf.FixedLenFeature([], tf.string),
            'targets': tf.FixedLenFeature([], tf.string)
        }
    )
    features = tf.decode_raw(tfrecord_features['features'], tf.uint8)
    features = tf.reshape(features, features_shape)
    features = tf.cast(features, tf.float32)
    targets = tf.decode_raw(tfrecord_features['targets'], tf.uint8)
    targets = tf.one_hot(indices=targets, depth=10, on_value=1, off_value=0)
    targets = tf.cast(targets, tf.float32)
    return features, targets
class MNISTDataReaderDset:
    def __init__(self, data_reader_dict):
        # doesn't matter here
        pass

    def batch_generator(self, num_epochs=1):
        def parse_fn(tfrecord):
            return parse_mnist_tfrec(
                tfrecord, self.features_shape
            )
        dataset = tf.data.TFRecordDataset(
            self.filenames_list, compression_type=self.compression_type
        )
        dataset = dataset.map(parse_fn)
        dataset = dataset.repeat(num_epochs)
        dataset = dataset.batch(self.batch_size)
        iterator = dataset.make_one_shot_iterator()
        batch_features, batch_labels = iterator.get_next()
        return batch_features, batch_labels
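For context, parse_mnist_tfrec assumes each Example stores the image and the label as raw byte strings, i.e. records written roughly like this (an illustrative sketch, not the exact code that produced my files):

def make_mnist_example(image, label):
    # image: uint8 numpy array of shape (28, 28, 1); label: uint8 scalar
    return tf.train.Example(features=tf.train.Features(feature={
        'features': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image.tobytes()])
        ),
        'targets': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[label.tobytes()])
        )
    }))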
Then, in use:
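Roughly like this (a minimal sketch of the call site; data_reader_dict and the print loop are placeholders, the real loop feeds a classifier):

reader = MNISTDataReaderDset(data_reader_dict)
batch_features, batch_labels = reader.batch_generator(num_epochs=1)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    try:
        while True:
            feats, labs = sess.run([batch_features, batch_labels])
            print(feats.shape, labs.shape)
    except tf.errors.OutOfRangeError:
        pass  # the one-shot iterator is exhausted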
This produces an error like:

Input to reshape is a tensor with 50000 values, but the requested shape has 1
    [[Node: Reshape_1 = Reshape[T=DT_UINT8, Tshape=DT_INT32](DecodeRaw_1, Reshape_1/shape)]]
    [[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,28,28,1], [?,10]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
Does anyone have any ideas?
I have a gist with the full code from the reader example, along with a link to the TFRecord files (our old friend MNIST, in TFRecord form), below:
Thanks!
Edit - I also tried flat_map, e.g.:

def batch_generator(self, num_epochs=1):
    """
    TODO - we can use placeholders for the list of file names and
    init with a feed_dict when we call `sess.run` - give this a
    try with one list for training and one for validation
    """
    def parse_fn(tfrecord):
        return parse_mnist_tfrec(
            tfrecord, self.features_shape
        )
    dataset = tf.data.Dataset.from_tensor_slices(self.filenames_list)
    dataset = dataset.flat_map(
        lambda filename: (
            tf.data.TFRecordDataset(
                filename, compression_type=self.compression_type
            ).map(parse_fn).batch(self.batch_size)
        )
    )
    dataset = dataset.repeat(num_epochs)
    iterator = dataset.make_one_shot_iterator()
    batch_features, batch_labels = iterator.get_next()
    return batch_features, batch_labels
I also tried using just one file instead of a list (that was how I first attacked this problem). Either way, TF always seems to want to eat the entire file into the TFRecordDataset and won't operate on single records.
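For what it's worth, this is the sort of single-record probe I use to check that (a debugging sketch; the file name is a placeholder and my files are gzipped):

# Probe one parsed record, with no repeat/batch in the pipeline:
dataset = tf.data.TFRecordDataset(
    ['mnist_train.tfrecord.gz'], compression_type='GZIP'
)
dataset = dataset.map(parse_fn)
iterator = dataset.make_one_shot_iterator()
feats, targs = iterator.get_next()

with tf.Session() as sess:
    f, t = sess.run([feats, targs])
    print(f.shape, t.shape)  # expecting (28, 28, 1) and (1, 10)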