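In Python, queue.Queue exists mainly for communication between threads; acting as a "queue" is a secondary feature. collections.deque, on the other hand, is just a container, like dict and list.
If all you want is a simple queue, the name "Queue" may look like the better fit. It does work, but Queue has one drawback compared with deque: it is a lot slower.
Here we only look at the simplest operations: putting items in and taking items out.
Queue: put and get
deque: append and popleft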
import timeit
from queue import Queue
from collections import deque


def test_queue():
    q = Queue()
    for i in range(1000):
        q.put(i)
    for i in range(1000):
        q.get()


def test_deque():
    q = deque()
    for i in range(1000):
        q.append(i)
    for i in range(1000):
        q.popleft()


if __name__ == '__main__':
    t_queue = timeit.timeit('test_queue()', setup='from __main__ import test_queue', number=100)
    t_deque = timeit.timeit('test_deque()', setup='from __main__ import test_deque', number=100)
    print('t_queue', t_queue, 't_deque', t_deque)
    print('faster', t_queue / t_deque)
Running it gives the following result:
t_queue 0.5356368 t_deque 0.017948600000000203
faster 29.84281782423108
As you can see, Queue takes nearly 30 times as long as deque in this test.
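Queue is a fairly high-level synchronization facility. It has interfaces meant for synchronization, such as get_nowait and join, and it blocks when it should block and returns when it should return. deque is just a container. The class names hint at this too: Queue starts with an uppercase letter, while deque starts with a lowercase letter, like list and dict.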
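To make the difference in interfaces concrete, here is a small sketch of the non-blocking variants next to the plain container behaviour. The variable names are made up for illustration; the calls themselves (get_nowait, put_nowait, queue.Empty, queue.Full) are part of the standard queue module.

```python
import queue
from collections import deque

# An empty Queue: get() would block forever here, get_nowait() raises instead.
q = queue.Queue()
try:
    q.get_nowait()
    queue_result = "got an item"
except queue.Empty:
    queue_result = "queue.Empty"

# A bounded Queue: put() would block when full, put_nowait() raises instead.
bounded = queue.Queue(maxsize=1)
bounded.put_nowait("first")
try:
    bounded.put_nowait("second")
    bounded_result = "stored"
except queue.Full:
    bounded_result = "queue.Full"

# A deque never blocks; popping from an empty one is just an IndexError.
d = deque()
try:
    d.popleft()
    deque_result = "got an item"
except IndexError:
    deque_result = "IndexError"

print(queue_result, bounded_result, deque_result)
```

The deque has no notion of waiting at all, which is part of why it can be so much cheaper per operation.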
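You can take a look at how Queue is implemented. Open IDLE, then
enter queue (for example in the File > Open Module dialog) and press Enter,
and the source code of Queue, queue.py, opens. Let's go through it; it is short.
In fact, Queue uses a deque underneath. When a Queue is constructed, it first calls this _init method to build the underlying container: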
# Override these methods to implement other queue organizations
# (e.g. stack or priority queue).
# These will only be called with appropriate locks held
# Initialize the queue representation
def _init(self, maxsize):
    self.queue = deque()
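Going further, as the comment above says, the underlying container can be replaced. For example, LifoQueue (last in, first out) uses a list instead of the original deque: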
class LifoQueue(Queue):
    '''Variant of Queue that retrieves most recently added entries first.'''

    def _init(self, maxsize):
        self.queue = []

    def _qsize(self):
        return len(self.queue)

    def _put(self, item):
        self.queue.append(item)

    def _get(self):
        return self.queue.pop()
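A quick usage sketch, with made-up variable names, shows the effect of swapping the container: the public put and get stay the same, only the order in which items come back changes.

```python
from queue import LifoQueue, Queue

fifo = Queue()
lifo = LifoQueue()
for i in range(3):
    fifo.put(i)
    lifo.put(i)

# Same put/get interface, different retrieval order.
fifo_order = [fifo.get() for _ in range(3)]
lifo_order = [lifo.get() for _ in range(3)]

print(fifo_order)  # [0, 1, 2]
print(lifo_order)  # [2, 1, 0]
```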
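PriorityQueue goes one step further: it not only switches to a list, it also replaces the plain append and pop with heappush and heappop, so PriorityQueue is implemented with a heap: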
class PriorityQueue(Queue):
    '''Variant of Queue that retrieves open entries in priority order (lowest first).

    Entries are typically tuples of the form: (priority number, data).
    '''

    def _init(self, maxsize):
        self.queue = []

    def _qsize(self):
        return len(self.queue)

    def _put(self, item):
        heappush(self.queue, item)

    def _get(self):
        return heappop(self.queue)
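As a small illustration of the (priority number, data) form mentioned in the docstring, the following sketch puts three entries in arbitrary order and reads them back. The task names are invented for the example.

```python
from queue import PriorityQueue

pq = PriorityQueue()
# Entries are (priority number, data); the lowest priority number comes out first.
pq.put((3, "low priority task"))
pq.put((1, "urgent task"))
pq.put((2, "normal task"))

order = [pq.get() for _ in range(3)]
print(order)
# [(1, 'urgent task'), (2, 'normal task'), (3, 'low priority task')]
```

The priorities here are all distinct, so the comparison never has to fall back to the data part of the tuple.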
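As for the inter-thread communication part, it is not long, but it is a bit complicated.
A Queue is simply an underlying container + one Lock + three Conditions.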
# mutex must be held whenever the queue is mutating. All methods
# that acquire mutex must release it before returning. mutex
# is shared between the three conditions, so acquiring and
# releasing the conditions also acquires and releases mutex.
self.mutex = threading.Lock()
# Notify not_empty whenever an item is added to the queue; a
# thread waiting to get is notified then.
self.not_empty = threading.Condition(self.mutex)
# Notify not_full whenever an item is removed from the queue;
# a thread waiting to put is notified then.
self.not_full = threading.Condition(self.mutex)
# Notify all_tasks_done whenever the number of unfinished tasks
# drops to zero; thread waiting to join() is notified to resume
self.all_tasks_done = threading.Condition(self.mutex)
self.unfinished_tasks = 0
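Note that a Condition in Python wraps its associated lock internally, unlike in C++. When a Python Condition is constructed, you can pass in an external Lock; if you do not, it creates a lock of its own (an RLock by default).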
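The relationship between a Condition and an externally supplied Lock can be checked with a few lines. This is only a sketch with illustrative variable names; it builds two Conditions on one Lock and shows that holding one of them means the shared Lock is held.

```python
import threading

lock = threading.Lock()
cond_a = threading.Condition(lock)  # both Conditions are built on the same Lock
cond_b = threading.Condition(lock)

with cond_a:
    # Entering cond_a acquired the shared lock...
    held_inside = lock.locked()
    # ...so cond_b cannot be acquired at the same time (non-blocking attempt).
    got_b = cond_b.acquire(blocking=False)
    if got_b:  # not expected; release again just to stay safe
        cond_b.release()

held_after = lock.locked()
print(held_inside, got_b, held_after)  # True False False
```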
As you can see, all three Conditions in Queue use the same Lock, self.mutex.
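To get a feel for how one Lock shared by several Conditions coordinates put and get, here is a stripped-down sketch. TinyBoundedQueue is a hypothetical class written for this illustration, not the code in queue.py: it has no timeouts, no non-blocking mode, and no task_done/join tracking, and it only mirrors the general wait/notify pattern.

```python
import threading
from collections import deque


class TinyBoundedQueue:
    """Hypothetical, simplified bounded FIFO: one Lock shared by two Conditions."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.items = deque()
        self.mutex = threading.Lock()
        self.not_empty = threading.Condition(self.mutex)
        self.not_full = threading.Condition(self.mutex)

    def put(self, item):
        with self.not_full:                      # acquires self.mutex
            while len(self.items) >= self.maxsize:
                self.not_full.wait()             # releases mutex while waiting
            self.items.append(item)
            self.not_empty.notify()              # wake one waiting getter

    def get(self):
        with self.not_empty:                     # acquires the same self.mutex
            while not self.items:
                self.not_empty.wait()
            item = self.items.popleft()
            self.not_full.notify()               # wake one waiting putter
            return item


buf = TinyBoundedQueue(maxsize=2)
received = []


def consume():
    for _ in range(10):
        received.append(buf.get())


consumer = threading.Thread(target=consume)
consumer.start()
for i in range(10):
    buf.put(i)          # blocks whenever two items are already waiting
consumer.join()
print(received)         # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Because both Conditions share the mutex, notifying not_empty from inside `with self.not_full` is legal: the calling thread already holds the lock that not_empty is built on.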
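With code like Queue's, you can understand it if you force yourself to read through it, but writing it yourself in a way that is provably free of race conditions would likely be very hard.
In short: if you only want a queue container, use deque; if you need synchronization between threads, producer-consumer setups and the like, use Queue.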
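For the second case, a minimal producer-consumer sketch might look like the following. The worker function, the None sentinel, and the squaring "work" are all assumptions made for the example; put, get, task_done and join are the standard Queue methods.

```python
import threading
from queue import Queue


def worker(q, results):
    """Consume items until the None sentinel arrives (sentinel is an example convention)."""
    while True:
        item = q.get()            # blocks until an item is available
        if item is None:
            q.task_done()
            break
        results.append(item * item)
        q.task_done()             # tell the queue this item is fully processed


q = Queue()
results = []
t = threading.Thread(target=worker, args=(q, results))
t.start()

for i in range(5):                # the main thread acts as the producer
    q.put(i)
q.put(None)                       # ask the worker to stop

q.join()                          # returns once every put item has been task_done()
t.join()
print(results)                    # [0, 1, 4, 9, 16]
```

With a single worker the results come out in insertion order; with several workers the order would depend on scheduling.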