1 Speeding up data reading in TensorFlow
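Training a model usually means processing a large amount of data, and reading that data is the first step of training, so read speed directly affects training speed. Each batch is processed in memory, which is fast, but the rate at which data is fetched limits how quickly it can be fed to the model. To speed up fetching, TensorFlow uses a threads + queue model: background threads fill a queue while the training code reads from it.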
The function that starts these threads (TensorFlow 1.x API):
tf.train.start_queue_runners()
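The threads + queue idea can be sketched with the Python standard library alone. This is only an illustration of the pattern, not TensorFlow's implementation; the names `producer`, `consume_all`, `read_with_queue` and the `None` end-of-data marker are assumptions made for this example.

```python
import queue
import threading


def producer(q, items):
    """Enqueue every item, then a sentinel that marks the end of the data."""
    for item in items:
        q.put(item)  # blocks while the queue is full
    q.put(None)      # hypothetical end-of-data marker


def consume_all(q):
    """Dequeue until the sentinel arrives and return everything that was read."""
    out = []
    while True:
        item = q.get()  # blocks while the queue is empty
        if item is None:
            break
        out.append(item)
    return out


def read_with_queue(items, capacity=4):
    """Read items through a bounded queue that a background thread fills."""
    q = queue.Queue(maxsize=capacity)
    t = threading.Thread(target=producer, args=(q, items), daemon=True)
    t.start()
    result = consume_all(q)
    t.join()
    return result


if __name__ == "__main__":
    print(read_with_queue(list(range(10))))
```

The consumer never waits for the whole data set to load: it starts reading as soon as the first item is in the queue, which is the benefit the paragraph above describes.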
2 The function tf.train.start_queue_runners()
2.1 Parameters
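No. | Parameter | Description |
---|---|---|
1 | sess | Default None. The Session used to run the queue operations; if None, the default session is used |
2 | coord | Default None. A Coordinator used to coordinate the started threads: signal them to stop and wait for them to finish |
3 | daemon | Default True. Whether the threads are daemon (background) threads. Daemon threads do not keep the program alive and end when the main thread exits; with False the program waits for them, so they must be stopped explicitly |
4 | start | Default True. Start the threads immediately; with False the threads are created but not started |
5 | collection | Default tf.GraphKeys.QUEUE_RUNNERS. The graph collection key from which the queue runners are taken |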
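What `daemon` and `start` mean for a thread can be tried out with plain `threading`, without TensorFlow. The helper `make_threads` below is hypothetical: it only mirrors those two parameters and is not how TensorFlow creates its threads.

```python
import threading


def make_threads(targets, daemon=True, start=True):
    """Create one thread per callable and optionally start them (hypothetical helper)."""
    threads = [threading.Thread(target=fn, daemon=daemon) for fn in targets]
    if start:
        for t in threads:
            t.start()
    return threads


if __name__ == "__main__":
    done = threading.Event()
    # start=False: the Thread objects exist but are not running yet.
    threads = make_threads([done.wait, done.wait], daemon=True, start=False)
    print([t.is_alive() for t in threads])
    for t in threads:
        t.start()
    print([t.daemon for t in threads])
    done.set()  # let the waiting threads finish
    for t in threads:
        t.join()
```

With `daemon=False` the interpreter would not exit until every such thread has returned, which is why non-daemon threads have to be stopped explicitly.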
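2.2 Analysis
2.2.1 Number of threads
After this function is called, the run below has 18 active threads in total, 13 of which were started by the function. This is verified as follows: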
- Demo
    import threading

    import tensorflow as tf  # TensorFlow 1.x API (tf.Session, queue runners)


    def tensor_thread():
        data = tf.zeros([2, 3], dtype=tf.int32)
        with tf.Session() as sess:
            # Start the threads of every queue runner registered in the graph;
            # no Coordinator is passed in this run.
            threads = tf.train.start_queue_runners(sess=sess)
            # Snapshot of the Thread objects and of the active thread count.
            enum_threads = threading.enumerate()
            counts_threads = threading.active_count()
            data = sess.run(data)
            print("data: {}".format(data))
            print("Threads of number: {}".format(enum_threads))
            print("Counts of thread: {}".format(counts_threads))


    if __name__ == "__main__":
        tensor_thread()
        print("starting")
- Result
data: [[0 0 0]
 [0 0 0]]
Threads of number: [<_MainThread(MainThread, started 140423660799808)>, <Thread(Thread-2, started daemon 140423418869504)>, <Heartbeat(Thread-3, started daemon 140423410476800)>, <HistorySavingThread(IPythonHistorySavingThread, started 140423385298688)>, <ParentPollerUnix(Thread-1, started daemon 140423376905984)>, <Thread(QueueRunnerThread-input_producer-input_producer/input_producer_EnqueueMany, started daemon 140421523035904)>, <Thread(QueueRunnerThread-input_producer_1-input_producer_1/input_producer_1_EnqueueMany, started daemon 140421539821312)>, <Thread(QueueRunnerThread-input_producer_2-input_producer_2/input_producer_2_EnqueueMany, started daemon 140420969379584)>, <Thread(QueueRunnerThread-input_producer_3-input_producer_3/input_producer_3_EnqueueMany, started daemon 140422120498944)>, <Thread(QueueRunnerThread-input_producer_4-input_producer_4/input_producer_4_EnqueueMany, started daemon 140422112106240)>, <Thread(QueueRunnerThread-input_producer_5-input_producer_5/input_producer_5_EnqueueMany, started daemon 140422103713536)>, <Thread(QueueRunnerThread-input_producer_6-input_producer_6/input_producer_6_EnqueueMany, started daemon 140422095320832)>, <Thread(QueueRunnerThread-input_producer_7-input_producer_7/input_producer_7_EnqueueMany, started daemon 140422086928128)>, <Thread(QueueRunnerThread-input_producer_8-input_producer_8/input_producer_8_EnqueueMany, started daemon 140421531428608)>, <Thread(QueueRunnerThread-input_producer_9-input_producer_9/input_producer_9_EnqueueMany, started daemon 140421514643200)>, <Thread(QueueRunnerThread-input_producer_10-input_producer_10/input_producer_10_EnqueueMany, started daemon 140421497857792)>, <Thread(QueueRunnerThread-input_producer_11-input_producer_11/input_producer_11_EnqueueMany, started daemon 140421002950400)>, <Thread(QueueRunnerThread-input_producer_12-input_producer_12/input_producer_12_EnqueueMany, started daemon 140421506250496)>]
Counts of thread: 18
- Analysis
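(1) The program above does not use a thread coordinator: it only starts the threads, which run the enqueue operations.
(2) 18 threads are active in total. In a name such as QueueRunnerThread-input_producer_12-input_producer_12/input_producer_12_EnqueueMany, the EnqueueMany part is the operation that puts data into the queue. Of the 18 threads, 5 already existed before the call (MainThread plus four threads of the IPython environment) and 13 are enqueue threads, so the function started 13 threads. The demo itself defines no queue, so the 13 input_producer queue runners were presumably already registered in the default graph by earlier code in the same session; the function starts one thread per enqueue operation of each registered queue runner, not a fixed number. For the threads a Python program has by default, see section 3, "Default threads of a Python program", of the post "Python threading" (Python之线程threading).
(3) The status started daemon 140421506250496 shows that the enqueue threads were still running when they were listed. This run only starts enqueueing; nothing stops the threads or closes the queues, so they can still be running when the Session is closed, which can lead to errors or inconsistent data.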
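The classification used in this analysis can be reproduced from thread names alone. The sketch below assumes the naming seen in the results (a `QueueRunnerThread-` prefix, and a `close_on_stop` suffix for the extra threads that appear once a Coordinator is passed); the function name `count_queue_runner_threads` is made up for this example.

```python
import threading

PREFIX = "QueueRunnerThread-"


def count_queue_runner_threads(threads):
    """Return (enqueue, close_on_stop, other) counts based on thread names."""
    enqueue = close = other = 0
    for t in threads:
        if not t.name.startswith(PREFIX):
            other += 1      # MainThread, IPython threads, ...
        elif t.name.endswith("close_on_stop"):
            close += 1      # only present when a Coordinator is used
        else:
            enqueue += 1    # e.g. ..._EnqueueMany
    return enqueue, close, other


if __name__ == "__main__":
    # In a plain script without queue runners only the "other" count is non-zero.
    print(count_queue_runner_threads(threading.enumerate()))
```

Calling it on `threading.enumerate()` right after `tf.train.start_queue_runners()` should separate the threads the function started from those that already existed.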
2.2.2 Total number of threads
- Demo
    import threading

    import tensorflow as tf  # TensorFlow 1.x API (tf.Session, queue runners)


    def tensor_thread():
        data = tf.zeros([2, 3], dtype=tf.int32)
        with tf.Session() as sess:
            # The Coordinator lets the main thread stop and wait for the workers.
            coord = tf.train.Coordinator()
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)
            # Snapshot taken while the threads are still running.
            enum_threads = threading.enumerate()
            counts_threads = threading.active_count()
            # Ask every thread to stop, then block until all have finished.
            coord.request_stop()
            coord.join(threads)
            data = sess.run(data)
            print("data: {}".format(data))
            print("Threads of number: {}".format(enum_threads))
            print("Counts of thread: {}".format(counts_threads))


    if __name__ == "__main__":
        tensor_thread()
        print("starting")
- Result
data: [[0 0 0]
 [0 0 0]]
Threads of number: [<_MainThread(MainThread, started 140423660799808)>, <Thread(Thread-2, started daemon 140423418869504)>, <Heartbeat(Thread-3, started daemon 140423410476800)>, <HistorySavingThread(IPythonHistorySavingThread, started 140423385298688)>, <ParentPollerUnix(Thread-1, started daemon 140423376905984)>, <Thread(QueueRunnerThread-input_producer-input_producer/input_producer_EnqueueMany, stopped daemon 140421523035904)>, <Thread(QueueRunnerThread-input_producer-close_on_stop, stopped daemon 140422112106240)>, <Thread(QueueRunnerThread-input_producer_1-input_producer_1/input_producer_1_EnqueueMany, stopped daemon 140422103713536)>, <Thread(QueueRunnerThread-input_producer_1-close_on_stop, stopped daemon 140422120498944)>, <Thread(QueueRunnerThread-input_producer_2-input_producer_2/input_producer_2_EnqueueMany, stopped daemon 140422095320832)>, <Thread(QueueRunnerThread-input_producer_2-close_on_stop, stopped daemon 140422086928128)>, <Thread(QueueRunnerThread-input_producer_3-input_producer_3/input_producer_3_EnqueueMany, stopped daemon 140421531428608)>, <Thread(QueueRunnerThread-input_producer_3-close_on_stop, stopped daemon 140421514643200)>, <Thread(QueueRunnerThread-input_producer_4-input_producer_4/input_producer_4_EnqueueMany, stopped daemon 140421506250496)>, <Thread(QueueRunnerThread-input_producer_4-close_on_stop, stopped daemon 140421497857792)>, <Thread(QueueRunnerThread-input_producer_5-input_producer_5/input_producer_5_EnqueueMany, stopped daemon 140421489465088)>, <Thread(QueueRunnerThread-input_producer_5-close_on_stop, stopped daemon 140421002950400)>, <Thread(QueueRunnerThread-input_producer_6-input_producer_6/input_producer_6_EnqueueMany, stopped daemon 140420994557696)>, <Thread(QueueRunnerThread-input_producer_6-close_on_stop, stopped daemon 140420986164992)>, <Thread(QueueRunnerThread-input_producer_7-input_producer_7/input_producer_7_EnqueueMany, stopped daemon 140420977772288)>, 
<Thread(QueueRunnerThread-input_producer_7-close_on_stop, stopped daemon 140420969379584)>, <Thread(QueueRunnerThread-input_producer_8-input_producer_8/input_producer_8_EnqueueMany, stopped daemon 140420960986880)>, <Thread(QueueRunnerThread-input_producer_8-close_on_stop, stopped daemon 140420952594176)>, <Thread(QueueRunnerThread-input_producer_9-input_producer_9/input_producer_9_EnqueueMany, stopped daemon 140420944201472)>, <Thread(QueueRunnerThread-input_producer_9-close_on_stop, stopped daemon 140420935808768)>, <Thread(QueueRunnerThread-input_producer_10-input_producer_10/input_producer_10_EnqueueMany, stopped daemon 140420927416064)>, <Thread(QueueRunnerThread-input_producer_10-close_on_stop, stopped daemon 140420919023360)>, <Thread(QueueRunnerThread-input_producer_11-input_producer_11/input_producer_11_EnqueueMany, stopped daemon 140420910630656)>, <Thread(QueueRunnerThread-input_producer_11-close_on_stop, stopped daemon 140420893845248)>, <Thread(QueueRunnerThread-input_producer_12-input_producer_12/input_producer_12_EnqueueMany, stopped daemon 140420902237952)>, <Thread(QueueRunnerThread-input_producer_12-close_on_stop, stopped daemon 140420877059840)>]
Counts of thread: 31
starting
- Analysis
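(1) This run uses a Coordinator to manage the life cycle of the threads: coord.request_stop() asks all of them to stop, and coord.join(threads) blocks the main thread until they have finished. The Coordinator does not serialize the threads; they all run concurrently until a stop is requested.
(2) There are 31 threads in total, 26 of them new. Each of the 13 queue runners has two threads: Thread(QueueRunnerThread-input_producer_12-input_producer_12/input_producer_12_EnqueueMany enqueues data, while Thread(QueueRunnerThread-input_producer_12-close_on_stop is not a second enqueue thread; it waits for the Coordinator's stop request and then closes the queue. That gives 13 enqueue threads plus 13 close_on_stop threads.
(3) stopped daemon 140420902237952 and stopped daemon 140420877059840 show that both kinds of thread have ended: after request_stop() and join() the threads finish on their own. The count of 31 was taken before the stop request, while the list shows stopped because the Thread objects are only formatted when they are printed, after join().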
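The stop-and-join life cycle can be imitated with a `threading.Event`. The class `MiniCoordinator` below is a simplified stand-in written for this illustration, not the real `tf.train.Coordinator`; it borrows the method names `request_stop`, `should_stop`, `wait_for_stop` and `join`, and the two worker functions are likewise assumptions that mimic an enqueue thread and a close_on_stop thread.

```python
import threading
import time


class MiniCoordinator:
    """Simplified stand-in for tf.train.Coordinator (not the real class)."""

    def __init__(self):
        self._stop_event = threading.Event()

    def request_stop(self):
        self._stop_event.set()

    def should_stop(self):
        return self._stop_event.is_set()

    def wait_for_stop(self, timeout=None):
        return self._stop_event.wait(timeout)

    def join(self, threads):
        for t in threads:
            t.join()


def enqueue_loop(coord, produced):
    """Worker: keep producing until the coordinator asks to stop."""
    while not coord.should_stop():
        produced.append(len(produced))
        time.sleep(0.001)


def close_on_stop(coord, closed):
    """Watcher: wait for the stop request, then mark the queue as closed."""
    coord.wait_for_stop()
    closed.set()


def run_demo():
    coord = MiniCoordinator()
    produced, closed = [], threading.Event()
    threads = [
        threading.Thread(target=enqueue_loop, args=(coord, produced), daemon=True),
        threading.Thread(target=close_on_stop, args=(coord, closed), daemon=True),
    ]
    for t in threads:
        t.start()
    time.sleep(0.05)                  # both threads are running concurrently
    alive_before = sum(t.is_alive() for t in threads)
    coord.request_stop()              # signal every thread to finish
    coord.join(threads)               # block until they have
    alive_after = sum(t.is_alive() for t in threads)
    return alive_before, alive_after, closed.is_set()


if __name__ == "__main__":
    print(run_demo())
```

Both threads are alive before the stop request and none afterwards, which matches the started/stopped difference between the two results in this section.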
Further reading on threads and queues:
"Python threading" (Python之线程threading)
"Python multiprocessing" (Python之多进程multiprocessing)
3 Summary
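- In this experiment TensorFlow started 26 threads when a Coordinator was used: 13 enqueue threads feeding 13 queues, plus 13 close_on_stop threads (13 threads without a Coordinator). The number depends on the queue runners registered in the graph.
- A Coordinator manages the threads' life cycle: request_stop() signals the threads to finish and join() blocks until they have, so the queues are shut down cleanly.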
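The thread counts reported in this article follow from simple arithmetic. The sketch below assumes one thread per enqueue operation, one extra close_on_stop thread per queue runner when a Coordinator is passed, one enqueue operation per runner (as the thread names in the results suggest), and the 5 threads that existed beforehand in the author's IPython session. The function name `expected_thread_count` is made up for this example.

```python
def expected_thread_count(enqueue_ops_per_runner, with_coordinator):
    """Threads started for the given queue runners (illustrative arithmetic only)."""
    enqueue_threads = sum(enqueue_ops_per_runner)
    close_threads = len(enqueue_ops_per_runner) if with_coordinator else 0
    return enqueue_threads + close_threads


if __name__ == "__main__":
    runners = [1] * 13       # 13 queue runners, one enqueue op each
    base_threads = 5         # MainThread + four IPython threads
    print(expected_thread_count(runners, False))                # 13
    print(expected_thread_count(runners, True))                 # 26
    print(base_threads + expected_thread_count(runners, False)) # 18
    print(base_threads + expected_thread_count(runners, True))  # 31
```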
[References]
[1]https://tensorflow.google.cn/api_docs/python/tf/train/start_queue_runners
[2]https://tensorflow.google.cn/api_docs/python/tf/train/QueueRunner
[3]https://tensorflow.google.cn/api_docs/python/tf/train/Coordinator
[4]https://blog.csdn.net/Xin_101/article/details/86593151
[5]https://blog.csdn.net/Xin_101/article/details/86517938