RuntimeError: Coordinator stopped with threads still running: Thread-2 Thread-3 Thread-1 Thread-4

Reference

https://github.com/tensorflow/tensorflow/issues/2130

  • Cause of the error: when you create a tf.FIFOQueue and start several threads that operate on the same queue, the threads are normally stopped through the three tf.Coordinator methods should_stop, request_stop, and join.
  • When a thread exits, should_stop returns True and that thread stops.
  • With a first-in, first-out tf.FIFOQueue, should_stop does not automatically return True and close the current thread. Only after request_stop has been called to stop the other threads will the thread check the value of should_stop at its next step; by then, however, its enqueue op is still running and waiting for data, so the coordinator reports Coordinator stopped with threads still running: Thread-2 Thread-3 Thread-1 Thread-4.
  • Fix: close the queue with sess.run(queue.close()) before calling coord.request_stop(). Sometimes the queue must be closed with the argument cancel_pending_enqueues=True, i.e. sess.run(queue.close(cancel_pending_enqueues=True)). A minimal sketch of this shutdown order follows the list.
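
Below is a minimal sketch of that shutdown order, assuming TF 1.x and a single hand-written enqueue thread; the queue size, data, and feed() loop are illustrative and not from the original post:

import threading
import tensorflow as tf

q = tf.FIFOQueue(capacity=10, dtypes=tf.float32)
enqueue_val = tf.placeholder(tf.float32)
enqueue_op = q.enqueue(enqueue_val)
close_op = q.close(cancel_pending_enqueues=True)

sess = tf.Session()
coord = tf.train.Coordinator()

def feed():
    i = 0.0
    while not coord.should_stop():
        try:
            # Blocks inside sess.run when the queue is full, so the thread
            # cannot reach the should_stop() check again on its own.
            sess.run(enqueue_op, feed_dict={enqueue_val: i})
            i += 1.0
        except tf.errors.CancelledError:
            return  # the queue was closed, leave the loop

t = threading.Thread(target=feed)
t.start()

# ... consume elements with sess.run(q.dequeue()) elsewhere ...

# Shutdown order: close the queue FIRST so the blocked enqueue is cancelled,
# then request the stop and join the thread.
sess.run(close_op)
coord.request_stop()
coord.join([t], stop_grace_period_secs=5)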

Calling “queue.close()” is not enough on its own: close() actually returns an op which needs to be run to do anything, so you need sess.run(q.close()). Since the first queue is never closed, the “batch” queue waits forever for something to be added to the first queue.
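
A quick, self-contained illustration of that point (the queue here is hypothetical, mirroring the one used in the reproduction below):

import tensorflow as tf

q = tf.FIFOQueue(4, tf.string)
close_op = q.close(cancel_pending_enqueues=True)  # only builds a graph op; nothing is closed yet

with tf.Session() as sess:
    sess.run(close_op)  # the queue is actually closed only when the op runs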

Furthermore, this wait happens inside a C++ mutex, so stop_grace_period_secs is useless: the queue-runner thread checks for “stop_requested” between session.run calls, but because the dequeue op never returns, it is stuck inside session.run forever.
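This is why the reproduction below sets operation_timeout_in_ms in the session config. A small sketch, assuming TF 1.x, of how that setting turns an otherwise-infinite hang inside session.run into an error you can catch:

import tensorflow as tf

config = tf.ConfigProto()
config.operation_timeout_in_ms = 2000  # abort any blocking run call after 2 s

q = tf.FIFOQueue(1, tf.float32)
dequeue_op = q.dequeue()

with tf.Session(config=config) as sess:
    try:
        sess.run(dequeue_op)  # the queue is empty, so this would block forever
    except tf.errors.DeadlineExceededError:
        print("dequeue timed out instead of hanging")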

import tensorflow as tf

def create_session():
  """Resets the local session, returns a new InteractiveSession."""
  config = tf.ConfigProto(log_device_placement=True)
  config.gpu_options.per_process_gpu_memory_fraction = 0.3  # don't hog all vRAM
  config.operation_timeout_in_ms = 5000  # terminate on long hangs
  sess = tf.InteractiveSession("", config=config)
  return sess

tf.reset_default_graph()
q = tf.FIFOQueue(4, tf.string)
enqueue_val = tf.placeholder(dtype=tf.string)
enqueue_op = q.enqueue(enqueue_val)
size_op = q.size()
dequeue_op = q.dequeue()
sess = create_session()

def enqueueit(val):
  sess.run([enqueue_op], feed_dict={enqueue_val: val})
  print("queue1 size: ", sess.run(size_op))

enqueueit("1")
enqueueit("2")
enqueueit("3")

# tf.train.batch creates a second queue ("batch") that a queue-runner thread
# fills by repeatedly running dequeue_op on queue1.
dequeue_op.set_shape([])
queue2 = tf.train.batch([dequeue_op], batch_size=1, num_threads=1, capacity=1)
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)

def dequeueit():
  print("queue1 size: ", sess.run(size_op))
  print("queue2 size before: ", sess.run("batch/fifo_queue_Size:0"))
  print("result: ", sess.run(queue2))
  print("queue2 size after: ", sess.run("batch/fifo_queue_Size:0"))

dequeueit()
dequeueit()
dequeueit()

# Solution: close queue1 (cancelling pending enqueues) BEFORE asking the
# threads to stop, then wait until they actually exit.
sess.run(q.close(cancel_pending_enqueues=True))
coord.request_stop()
coord.join(threads, stop_grace_period_secs=5)
# sess.close()