I ran into a problem while running distributed TensorFlow training:
W tensorflow/core/framework/op_kernel.cc:975] Invalid argument: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /tmp/train_logs/model.ckpt-1: Not found: /tmp/train_logs
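The warning says the /tmp/train_logs directory was not found, so the checkpoint file could not be loaded. But the path and file name are correct, and when I look in that directory the files are there.
The line that triggers the error is: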
sv = tf.train.Supervisor(is_chief=(FLAGS.task_index == 0), logdir="/tmp/train_logs", init_op=init_op, summary_op=summary_op, saver=saver, global_step=glo
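
To double-check that the checkpoint really is visible from the process that prints the warning, a small standard-library script like the one below could be run on each machine in the cluster. This is only a diagnostic sketch: the helper names `checkpoint_files` and `describe_checkpoint` are made up for illustration, and the idea that each task might see a different local /tmp is an assumption, not something confirmed by the log.

```python
import glob
import os


def checkpoint_files(prefix):
    """Return the sorted list of files whose paths start with the checkpoint prefix."""
    # glob.escape keeps special characters in the prefix from being treated as wildcards.
    return sorted(glob.glob(glob.escape(prefix) + "*"))


def describe_checkpoint(prefix):
    """Report whether the log directory exists and which files match the prefix."""
    logdir = os.path.dirname(prefix)
    return {
        "logdir_exists": os.path.isdir(logdir),
        "matching_files": checkpoint_files(prefix),
    }


if __name__ == "__main__":
    # Prefix copied from the warning message; run this on every host in the cluster.
    print(describe_checkpoint("/tmp/train_logs/model.ckpt-1"))
```

If the result differs between hosts (for example, the directory exists only on the machine running the chief task), that would suggest the log directory is not shared across machines.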