Updates to global_step, histogram/image/scalar_summary, filename_queue, record_reader, and batch operations in TensorFlow

渡水葫芦喵

Category: tensorflow. Tags: tensorflow, operation updates

Article link: https://blog.csdn.net/weixin_44374595/article/details/88764852

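I am currently using a fairly recent TensorFlow build, 1.12 (GPU). While reading source code recently, I noticed that the syntax of many routine TensorFlow operations and functions has changed as the library has been updated. I have summarized these changes below for reference.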

global_step = tf.contrib.framework.get_or_create_global_step()
This creates the global_step variable, which tracks the current training step. In newer versions the call still works, but it emits a warning:

 get_or_create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
  Instructions for updating:
  Please switch to tf.train.get_or_create_global_step

tf.contrib.framework.get_or_create_global_step() is deprecated in newer versions in favor of tf.train.get_or_create_global_step(), which is shorter and easier to read.
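As a rough illustration of the replacement call, the sketch below creates the global step with tf.train.get_or_create_global_step() and increments it three times. The tf1 alias and the assign_add stand-in for a real training op are assumptions of this example and do not come from the original code. The alias is only there so the same lines can run under TensorFlow 1.x and under the tf.compat.v1 namespace of 2.x.

```python
import tensorflow as tf

# Compatibility alias (an assumption of this sketch): under TensorFlow 2.x the
# 1.x graph-mode API lives in tf.compat.v1; under 1.12 it is tf itself.
tf1 = getattr(tf.compat, "v1", tf)

with tf1.Graph().as_default():
    # Replacement for tf.contrib.framework.get_or_create_global_step()
    global_step = tf1.train.get_or_create_global_step()

    # Stand-in for a training op; a real optimizer would usually receive
    # global_step through minimize(loss, global_step=global_step).
    increment = tf1.assign_add(global_step, 1)

    with tf1.Session() as sess:
        sess.run(tf1.global_variables_initializer())
        for _ in range(3):
            sess.run(increment)
        final_step = int(sess.run(global_step))

print("global_step after 3 updates:", final_step)
```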

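Writing logs for TensorBoard visualization
tf.histogram_summary, tf.image_summary, tf.scalar_summary and related functions record training logs so that they can be visualized in TensorBoard. In the 1.x releases these are replaced by tf.summary.histogram, tf.summary.image, and tf.summary.scalar (the old names were removed in TensorFlow 1.0).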
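A minimal sketch of the newer summary calls follows. The tensors (a constant loss, a small weight vector, a blank image) and the summary names are placeholders invented for illustration, and the tf1 alias is an assumption that lets the code run on both 1.x and the tf.compat.v1 namespace of 2.x.

```python
import tensorflow as tf

# Compatibility alias (an assumption of this sketch): tf.compat.v1 on
# TensorFlow 2.x, plain tf on 1.12.
tf1 = getattr(tf.compat, "v1", tf)

with tf1.Graph().as_default():
    # Placeholder tensors standing in for real model values.
    loss = tf1.constant(0.25)
    weights = tf1.constant([0.1, -0.2, 0.3])
    images = tf1.zeros([1, 8, 8, 1])  # [batch, height, width, channels]

    tf1.summary.scalar("loss", loss)            # was tf.scalar_summary
    tf1.summary.histogram("weights", weights)   # was tf.histogram_summary
    tf1.summary.image("inputs", images, max_outputs=1)  # was tf.image_summary

    # Merge every summary op in the graph into a single op.
    merged = tf1.summary.merge_all()

    with tf1.Session() as sess:
        serialized = sess.run(merged)  # serialized Summary protocol buffer

print("serialized summary size in bytes:", len(serialized))
```

In a training loop the serialized value would typically be passed to the add_summary method of a tf.summary.FileWriter, together with the current step, so that TensorBoard can read it from the log directory.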

filename_queue = tf.train.string_input_producer(filenames)
This creates a filename queue. In newer versions it emits the following warning:
string_input_producer (from tensorflow.python.training.input) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.Dataset.from_tensor_slices(string_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs)`. If `shuffle=False`, omit the `.shuffle(…)`.

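The filename-queue functions are all superseded by tf.data. A filename queue is now built with tf.data.Dataset.from_tensor_slices(string_tensor).shuffle(tf.shape(input_tensor, out_type=tf.int64)[0]).repeat(num_epochs). The shuffle argument of the old function controlled whether filenames were emitted in random order; if shuffling is not needed, drop the .shuffle() call and apply .repeat() with the desired number of epochs directly.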
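The replacement described above can be sketched as follows. The filenames are hypothetical, len(filenames) is used as the shuffle buffer size in place of the tf.shape expression from the warning, and the tf1 alias is an assumption that keeps the code runnable on both 1.x and 2.x. The iterator is obtained with make_one_shot_iterator(), which is the 1.x graph-mode idiom.

```python
import tensorflow as tf

# Compatibility alias (an assumption of this sketch): tf.compat.v1 on
# TensorFlow 2.x, plain tf on 1.12.
tf1 = getattr(tf.compat, "v1", tf)

# Hypothetical filenames; nothing is read from disk in this sketch.
filenames = ["data_batch_%d.bin" % i for i in range(1, 6)]
num_epochs = 2

with tf1.Graph().as_default():
    dataset = tf1.data.Dataset.from_tensor_slices(filenames)
    # A buffer as large as the list gives a full shuffle within each epoch;
    # remove .shuffle(...) to keep the original order.
    dataset = dataset.shuffle(buffer_size=len(filenames)).repeat(num_epochs)
    next_name = dataset.make_one_shot_iterator().get_next()

    seen = []
    with tf1.Session() as sess:
        try:
            while True:
                seen.append(sess.run(next_name).decode("utf-8"))
        except tf.errors.OutOfRangeError:
            pass  # raised once all epochs have been consumed

print(seen)
```

Because shuffle is applied before repeat, every filename is emitted once per epoch before the next epoch begins.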

reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
For binary dataset files whose records are stored back to back, a fixed-length record reader is used to read them.
FixedLengthRecordReader.__init__ (from tensorflow.python.ops.io_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Queue-based input pipelines have been replaced by `tf.data`. Use `tf.data.FixedLengthRecordDataset`.
Dataset reading is now handled by the dedicated tf.data module, so tf.FixedLengthRecordReader should be replaced with tf.data.FixedLengthRecordDataset().
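To make the idea of a fixed-length record concrete, here is a plain-Python sketch (no TensorFlow involved) of how a binary blob is cut into records of record_bytes each. The helper name read_fixed_length_records and the toy layout of one label byte plus four pixel bytes are invented for illustration. The sketch also simply drops a trailing partial record, which may differ from what TensorFlow does.

```python
import io


def read_fixed_length_records(stream, record_bytes, header_bytes=0, footer_bytes=0):
    """Split a binary stream into consecutive records of record_bytes each."""
    data = stream.read()
    body = data[header_bytes:len(data) - footer_bytes]
    count = len(body) // record_bytes  # a trailing partial record is dropped
    return [body[i * record_bytes:(i + 1) * record_bytes] for i in range(count)]


# Toy layout (hypothetical): 1 label byte followed by 4 pixel bytes.
label_bytes, image_bytes = 1, 4
record_bytes = label_bytes + image_bytes

raw = bytes([7, 10, 11, 12, 13,
             2, 20, 21, 22, 23])
records = read_fixed_length_records(io.BytesIO(raw), record_bytes)

# Decode each record into (label, pixels).
examples = [(rec[0], list(rec[label_bytes:])) for rec in records]
print(examples)
```

tf.data.FixedLengthRecordDataset takes the same kind of record_bytes, header_bytes, and footer_bytes arguments, and each element it yields is one such raw record that still has to be decoded.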

shuffle_batch
images, label_batch = tf.train.shuffle_batch(
    [image, label],
    batch_size=batch_size,
    num_threads=num_preprocess_threads,
    capacity=min_queue_examples + 3 * batch_size,
    min_after_dequeue=min_queue_examples)

This produces shuffled batches; it involves multi-threaded data processing and determines how batches are assembled (see https://blog.csdn.net/ying86615791/article/details/73864381 and https://blog.csdn.net/qq_32023541/article/details/81170282).

   shuffle_batch (from tensorflow.python.training.input) is deprecated and will be removed in a future version.    
    Instructions for updating:
    Queue-based input pipelines have been replaced by `tf.data`.
    Use `tf.data.Dataset.shuffle(min_after_dequeue).batch(batch_size)`.

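tf.train.shuffle_batch likewise moves to tf.data. It becomes the following, where shuffle and batch are methods called on a Dataset instance: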
tf.data.Dataset.shuffle(min_after_dequeue).batch(batch_size)
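The warning maps min_after_dequeue onto the buffer size of Dataset.shuffle. The following plain-Python sketch (no TensorFlow involved) shows the general idea of a buffered shuffle followed by batching. The function names buffered_shuffle and batch are hypothetical, and the code is a simplified model of the behavior rather than TensorFlow's actual implementation.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in random order using a bounded buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)  # fill the buffer first
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]        # emit a random buffered element
        buffer[idx] = item       # and replace it with the new one
    rng.shuffle(buffer)          # drain what is left at the end
    for item in buffer:
        yield item


def batch(items, batch_size):
    """Group items into lists of batch_size; the last batch may be smaller."""
    chunk = []
    for item in items:
        chunk.append(item)
        if len(chunk) == batch_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


batches = list(batch(buffered_shuffle(range(10), buffer_size=4, seed=0), batch_size=3))
print(batches)
```

A larger buffer gives a more thorough shuffle at the cost of memory, which is the same trade-off that min_after_dequeue controlled in the queue-based API.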

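In short, newer TensorFlow versions introduce a dedicated tf.data module for reading datasets, and the batching operations that used to live in tf.train have moved there as well, which makes the layout clearer. Many related queue-based helpers, such as the queue runners, are likewise superseded by tf.data.
For now, the old modules and functions only produce warnings rather than errors, but the TensorFlow team has stated that they will be removed in a future release (TensorFlow 2.0 is coming soon). It is therefore worth keeping code up to date, which eases later model maintenance and improves compatibility.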