Thread synchronization with queue
Sharing resources or data between threads can get complicated. As we have seen, Python's threading module provides many synchronization primitives, including semaphores, condition variables, events, and locks. But rather than reaching for those primitives, consider using the queue module wherever possible: queues are easier to work with and make multithreaded programming safer, because a queue funnels all access to a resource through a single point of control and encourages cleaner, more readable design patterns.

The queue module provides the queue classes; a queue is the most common form of data exchange between threads.
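As a minimal sketch of that exchange (the worker function and the computed value here are just placeholders), one thread can hand a result to another through a queue, with no explicit lock:

```python
import threading
from queue import Queue

q = Queue()

def worker():
    # Compute something in a background thread and hand it over via the queue.
    q.put(21 * 2)

t = threading.Thread(target=worker)
t.start()
result = q.get()   # blocks until the worker has put an item
t.join()
print(result)      # 42
```

The `get()` call does the waiting for us; there is no shared variable to guard.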
>>> import queue
>>> pdir(queue)
class:
LifoQueue: Variant of Queue that retrieves most recently added entries first.
PriorityQueue: Variant of Queue that retrieves open entries in priority order (lowest first).
Queue: Create a queue object with a given maximum size.
SimpleQueue: Simple, unbounded, reentrant FIFO queue.
_PySimpleQueue: Simple, unbounded FIFO queue.
deque: deque([iterable[, maxlen]]) --> deque object
function:
heappop: Pop the smallest item off the heap, maintaining the heap invariant.
heappush: heappush(heap, item) -> None. Push item onto heap, maintaining the heap invariant.
time: monotonic() -> float
exception:
Empty: Exception raised by Queue.get(block=0)/get_nowait().
Full: Exception raised by Queue.put(block=0)/put_nowait().
The queue module provides the main queue classes:

- LifoQueue(maxsize): last in, first out; maxsize is the maximum size of the queue, and 0 or a negative number means an unbounded queue
- PriorityQueue(maxsize): a priority queue (lowest entries are retrieved first)
- Queue(maxsize): first in, first out
- deque: a double-ended queue (pdir also shows deque, heappush, heappop, and time because queue imports them internally from collections, heapq, and time)

The FIFO Queue class has the following methods:
>>> pdir(queue.Queue)
function:
_get:
_init:
_put:
_qsize:
empty: Return True if the queue is empty, False otherwise (not reliable!).
full: Return True if the queue is full, False otherwise (not reliable!).
get: Remove and return an item from the queue.
get_nowait: Remove and return an item from the queue without blocking.
join: Blocks until all items in the Queue have been gotten and processed.
put: Put an item into the queue.
put_nowait: Put an item into the queue without blocking.
qsize: Return the approximate size of the queue (not reliable!).
task_done: Indicate that a formerly enqueued task is complete.
- empty(): returns True if the queue is empty (not reliable under concurrency)
- full(): returns True if the queue is full; only meaningful when a maximum size was set, otherwise it always returns False
- get([block[, timeout]]): removes an item from the queue and returns it; if timeout is a positive number, it blocks for at most timeout seconds and raises the Empty exception if no item becomes available within that time
- get_nowait(): equivalent to get(block=False)
- put(item[, block[, timeout]]): adds item to the queue; with block=False, raises the Full exception if the queue is full; with block=True and timeout=None, waits until a slot frees up; otherwise raises Full once the timeout expires
- put_nowait(item): equivalent to put(item, block=False)
- join(): blocks until all items in the queue have been gotten and processed
- task_done(): signals that a previously enqueued task is complete; typically called in consumer threads
- qsize(): returns the approximate size of the queue. Why "approximate"? Because a value greater than 0 does not guarantee that a concurrent get() will not block, and likewise a value below maxsize does not guarantee that put() will not block
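The different retrieval orders of the three classes, and the Empty exception raised by get_nowait(), can be seen in a few lines:

```python
from queue import Queue, LifoQueue, PriorityQueue, Empty

fifo, lifo, prio = Queue(), LifoQueue(), PriorityQueue()
for q in (fifo, lifo, prio):
    for item in (3, 1, 2):
        q.put(item)

fifo_order = [fifo.get() for _ in range(3)]   # first in, first out
lifo_order = [lifo.get() for _ in range(3)]   # last in, first out
prio_order = [prio.get() for _ in range(3)]   # lowest first

print(fifo_order)   # [3, 1, 2]
print(lifo_order)   # [2, 1, 3]
print(prio_order)   # [1, 2, 3]

try:
    fifo.get_nowait()        # the queue is empty now, so this raises Empty
except Empty:
    print('fifo is empty')
```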
The producer-consumer pattern

In the producer-consumer pattern, a Queue acts as a warehouse: producers keep storing data in it, and consumers simply take data out. For example, build the following setup:

- one thread keeps putting random numbers into a queue
- one thread keeps taking odd numbers out of that queue
- another thread keeps taking even numbers out of that queue
```python
#!/usr/bin/env python
# encoding: utf-8
"""
One thread keeps putting random numbers into a queue;
one thread keeps taking odd numbers out of the queue;
another thread keeps taking even numbers out of the queue.
"""
import threading
import time
import random
from queue import Queue


class Producer(threading.Thread):
    def __init__(self, queue):
        self.queue = queue
        super(Producer, self).__init__()

    def run(self):
        while True:
            # Keep at most a handful of items buffered in the queue.
            if self.queue.qsize() <= 7:
                random_int = random.randint(0, 256)
                print('{} put {} in queue'.format(self.name, random_int))
                self.queue.put(random_int)
                print(self.queue.qsize())


class Consumer(threading.Thread):
    def __init__(self, queue):
        self.queue = queue
        super(Consumer, self).__init__()

    def run(self):
        while True:
            if self.queue.qsize() > 7:
                val = self.queue.get()
                if val % 2 != 0:
                    print('{}:{} got odd number {}'.format(
                        self.name, time.ctime(), val))
                else:
                    # Not ours: put even numbers back for the other consumer.
                    self.queue.put(val)
                time.sleep(2)


class Consumer1(threading.Thread):
    def __init__(self, queue):
        self.queue = queue
        super(Consumer1, self).__init__()

    def run(self):
        while True:
            if self.queue.qsize() > 7:
                val = self.queue.get()
                if val % 2 == 0:
                    print('{}:{} got even number {}'.format(
                        self.name, time.ctime(), val))
                else:
                    # Not ours: put odd numbers back for the other consumer.
                    self.queue.put(val)
                time.sleep(2)


def main():
    queue = Queue()
    producer = Producer(queue)
    consumer = Consumer(queue)
    consumer1 = Consumer1(queue)
    producer.start()
    consumer1.start()
    consumer.start()
    producer.join()
    consumer1.join()
    consumer.join()


if __name__ == '__main__':
    main()
```
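The example above polls qsize() in a busy loop, which burns CPU and, as noted earlier, qsize() is only approximate. A common alternative sketch (the worker function and sentinel here are illustrative, not part of the example above) blocks on get() and uses the task_done()/join() pair from the method list:

```python
import threading
from queue import Queue

q = Queue()
results = []

def worker():
    while True:
        item = q.get()          # blocks until an item is available
        if item is None:        # sentinel value: tells the worker to exit
            q.task_done()
            break
        results.append(item * 2)
        q.task_done()           # signal that this item has been processed

t = threading.Thread(target=worker)
t.start()
for i in range(5):
    q.put(i)
q.join()                        # blocks until every queued item is task_done
q.put(None)                     # shut the worker down
t.join()
print(results)                  # [0, 2, 4, 6, 8]
```

Because get() blocks, the worker sleeps while the queue is empty instead of spinning, and join() gives the main thread a clean way to wait for completion.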
Thread pools

When handling tasks with multiple threads, more threads is not always better: every thread switch requires a context switch, which still imposes a significant CPU cost. The thread pool was proposed to solve this problem: create a reasonably sized set of threads ahead of time, so that incoming tasks can be picked up immediately. The Python standard library does ship a pool (concurrent.futures.ThreadPoolExecutor, available since Python 3.2), but implementing one yourself, or using a third-party module, is a good way to understand how a pool works.
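For reference, a minimal sketch of the same idea using the standard library's ThreadPoolExecutor (the action function here is illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def action(i):
    # A trivial task to run in the pool.
    return i * i

# The pool keeps at most 5 worker threads alive and feeds them tasks.
with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(action, i) for i in range(10)]
    results = [f.result() for f in futures]   # blocks until each task is done

print(results)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```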
```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Author: ningyanke
# @Date: 2018-01-07 02:03:14
# @Last Modified by: ningyanke
# @Last Modified time: 2018-01-07 02:26:51
import queue
import threading
import contextlib
import time

StopEvent = object()  # sentinel object that tells a worker thread to stop


class ThreadPool:

    def __init__(self, max_num, max_task_num=None):
        if max_task_num:
            self.q = queue.Queue(max_task_num)
        else:
            self.q = queue.Queue()
        self.max_num = max_num     # maximum number of worker threads
        self.cancel = False
        self.terminal = False
        self.generate_list = []    # all worker threads created so far
        self.free_list = []        # worker threads currently idle

    def run(self, func, args, callback=None):
        """
        Submit a task to the pool.
        func: the task function
        args: arguments for the task function
        callback: called after the task succeeds or fails, with 2 arguments:
            1. the execution status of the task function
            2. the return value of the task function
            (the default None means no callback is executed)
        If the pool has been closed, the task is dropped.
        """
        if self.cancel:
            return
        if len(self.free_list) == 0 and len(self.generate_list) < self.max_num:
            self.generate_thread()
        w = (func, args, callback)
        self.q.put(w)

    def generate_thread(self):
        """Create one worker thread."""
        t = threading.Thread(target=self.call)
        t.start()

    def call(self):
        """Loop: fetch a task function from the queue and execute it."""
        current_thread = threading.current_thread()
        self.generate_list.append(current_thread)
        event = self.q.get()
        while event != StopEvent:
            func, arguments, callback = event
            try:
                result = func(*arguments)
                success = True
            except Exception:
                success = False
                result = None
            if callback is not None:
                try:
                    callback(success, result)
                except Exception:
                    pass
            # While waiting for the next task, mark this thread as idle.
            with self.worker_state(self.free_list, current_thread):
                if self.terminal:
                    event = StopEvent
                else:
                    event = self.q.get()
        else:
            self.generate_list.remove(current_thread)

    def close(self):
        """Let every thread finish the remaining tasks, then stop them all."""
        self.cancel = True
        full_size = len(self.generate_list)
        while full_size:
            self.q.put(StopEvent)
            full_size -= 1

    def terminate(self):
        """Stop the threads whether or not tasks remain."""
        self.terminal = True
        while self.generate_list:
            self.q.put(StopEvent)
        self.q.queue.clear()  # drop any unprocessed tasks

    @contextlib.contextmanager
    def worker_state(self, state_list, worker_thread):
        """Track the threads that are currently idle, waiting for a task."""
        state_list.append(worker_thread)
        try:
            yield
        finally:
            state_list.remove(worker_thread)
```
```python
# How to use
pool = ThreadPool(5)


def callback(status, result):
    # status: whether the task function succeeded
    # result: the task function's return value
    pass


def action(i):
    print(i)


for i in range(30):
    ret = pool.run(action, (i,), callback)

time.sleep(5)
print(len(pool.generate_list), len(pool.free_list))
print(len(pool.generate_list), len(pool.free_list))
# pool.close()
# pool.terminate()
```