Process Pools and Thread Pools in Python

0. The concurrent.futures library

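Until now we have used multithreading (threading) and multiprocessing (multiprocessing) to handle routine requirements.

With both, steps such as start and join cannot be skipped at startup, and more complex requirements also need one or two queues.

As requirements grow more complex, code that lacks a good design and a clean abstraction for this layer becomes harder to debug the larger it gets.

Is there a good way to abstract these steps away, so that we can stop worrying about the details and travel light?

The answer is yes.

Starting with Python 3.2, a module called concurrent.futures is part of the standard library.

In Python 2 it is available as the third-party futures package, which has to be installed manually: pip install futures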

The concurrent.futures module provides a high-level interface for asynchronously executing callables.

The asynchronous execution can be performed by threads using ThreadPoolExecutor or separate processes using ProcessPoolExecutor. Both implement the same interface, which is defined by the abstract Executor class.
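To make the "same interface" point concrete, here is a minimal sketch. The names square and run_all are hypothetical helpers introduced only for illustration; the calls submit() and result() are the ones defined by Executor and Future.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor


def square(x):
    """A trivial stand-in task (hypothetical, for illustration only)."""
    return x * x


def run_all(executor_cls, values):
    """Run square() over values using any Executor subclass."""
    with executor_cls(max_workers=2) as executor:
        # submit() returns a Future immediately; result() blocks until done.
        pending = [executor.submit(square, v) for v in values]
        return [f.result() for f in pending]


if __name__ == '__main__':
    # Both pool classes derive from the same abstract base class.
    print(issubclass(ThreadPoolExecutor, Executor))
    print(issubclass(ProcessPoolExecutor, Executor))
    print(run_all(ThreadPoolExecutor, [1, 2, 3]))
    # run_all(ProcessPoolExecutor, [1, 2, 3]) is called in exactly the same way.
```

Because run_all only touches the Executor interface, switching between threads and processes is a one-argument change.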

1. Process pools

- Serial execution:

import math
import time

PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

def is_prime(n):
    # Trial division by odd numbers up to sqrt(n).
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def main():
    # Check the numbers one after another in a single process.
    for num in PRIMES:
        print('%d is prime: %s' % (num, is_prime(num)))

if __name__ == '__main__':
    start_time = time.time()
    main()
    end_time = time.time()
    print('Run time is %s' % (end_time - start_time))

--- Output ---
112272535095293 is prime: True
112582705942171 is prime: True
112272535095293 is prime: True
115280095190773 is prime: True
115797848077099 is prime: True
1099726899285419 is prime: False
Run time is 3.9570000171661377

- Using multiprocessing.Pool:

import math
import time
from multiprocessing import Pool

PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

def is_prime(n):
    # Trial division by odd numbers up to sqrt(n).
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def main():
    pool = Pool()
    res_l = []
    # Submit every number without blocking; each call returns an AsyncResult.
    for prime in PRIMES:
        res = pool.apply_async(func=is_prime, args=(prime,))
        res_l.append(res)
    # No more tasks will be submitted; wait for the workers to finish.
    pool.close()
    pool.join()
    for number, prime in zip(PRIMES, res_l):
        print('%d is prime: %s' % (number, prime.get()))

if __name__ == '__main__':
    start_time = time.time()
    main()
    end_time = time.time()
    print('Run time is %s' % (end_time - start_time))

--- Output ---
112272535095293 is prime: True
112582705942171 is prime: True
112272535095293 is prime: True
115280095190773 is prime: True
115797848077099 is prime: True
1099726899285419 is prime: False
Run time is 2.687000036239624

- Using the process pool concurrent.futures.ProcessPoolExecutor:

ProcessPoolExecutor uses the multiprocessing module, which allows it to side-step the Global Interpreter Lock but also means that only picklable objects can be executed and returned.

class concurrent.futures.ProcessPoolExecutor(max_workers=None)

Executes calls asynchronously using a pool of at most max_workers processes.

If max_workers is None or not given then as many worker processes will be created as the machine has processors.
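The picklability constraint can be checked without starting any worker process. The sketch below uses a hypothetical helper, can_pickle, to contrast a function that is importable by name with a lambda. It also prints os.cpu_count(), the processor count that the default pool size is based on; the exact default may differ between Python versions and platforms.

```python
import math
import os
import pickle


def can_pickle(obj):
    """Return True if obj survives pickle.dumps(), False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == '__main__':
    # A function importable by name is pickled by reference, so it can be
    # sent to a worker process.
    print(can_pickle(math.sqrt))
    # A lambda has no importable name, so it cannot be sent to a worker.
    print(can_pickle(lambda x: x * 2))
    # With max_workers=None the pool size is derived from the processor count.
    print(os.cpu_count())
```

In practice this means the callable passed to a ProcessPoolExecutor should be a module-level function, as is_prime is in the examples in this section.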

- Under the hood, ProcessPoolExecutor also relies on the multiprocessing module:

import math
import time
from concurrent import futures

PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

def is_prime(n):
    # Trial division by odd numbers up to sqrt(n).
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def main():
    # The with block shuts the pool down once all results are in.
    with futures.ProcessPoolExecutor() as executor:
        # executor.map() yields results in the same order as PRIMES.
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
            print('%d is prime: %s' % (number, prime))

if __name__ == '__main__':
    start_time = time.time()
    main()
    end_time = time.time()
    print('Run time is %s' % (end_time - start_time))

--- Output ---
112272535095293 is prime: True
112582705942171 is prime: True
112272535095293 is prime: True
115280095190773 is prime: True
115797848077099 is prime: True
1099726899285419 is prime: False
Run time is 2.482999801635742

2. Thread pools

The ThreadPoolExecutor class is an Executor subclass that uses a pool of threads to execute calls asynchronously.

class concurrent.futures.ThreadPoolExecutor(max_workers)

Executes calls asynchronously using a pool of at most max_workers threads.
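Before turning to real network requests, here is a minimal sketch that runs without network access. It uses time.sleep as a stand-in for blocking I/O; slow_echo and run_in_pool are hypothetical names chosen for this illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_echo(delay):
    """Sleep for delay seconds to mimic a blocking I/O call, then return delay."""
    time.sleep(delay)
    return delay


def run_in_pool(delays, max_workers=3):
    """Run slow_echo over delays in a thread pool; return (results, elapsed)."""
    start = time.time()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in input order, regardless of completion order.
        results = list(executor.map(slow_echo, delays))
    return results, time.time() - start


if __name__ == '__main__':
    results, elapsed = run_in_pool([0.3, 0.1, 0.2])
    print(results)
    print('elapsed: %.2f s' % elapsed)
```

With three workers the waits overlap, so the elapsed time should be close to the longest single delay rather than the sum of all three.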

- Serial execution:

import urllib.request
import time

URLS = [
    'http://www.foxnews.com/',
    'https://www.stanford.edu/',
    'http://www.mit.edu/',
    'https://www.python.org/',
    'https://www.yahoo.com/',
    'http://www.ox.ac.uk/'
]

def load_url(url, timeout):
    # Fetch the page and return its body as bytes.
    return urllib.request.urlopen(url, timeout=timeout).read()

start_time = time.time()
# Download the pages one after another.
for url in URLS:
    print('%r page is %d bytes' % (url, len(load_url(url, 60))))
end_time = time.time()
print("Run time is %s" % (end_time - start_time))

--- Output ---
'http://www.foxnews.com/' page is 71131 bytes
'https://www.stanford.edu/' page is 68595 bytes
'http://www.mit.edu/' page is 21405 bytes
'https://www.python.org/' page is 47701 bytes
'https://www.yahoo.com/' page is 434510 bytes
'http://www.ox.ac.uk/' page is 93411 bytes
Run time is 5.068000078201294

- Using multiple threads:

import urllib.request
import time
from threading import Thread

URLS = [
    'http://www.foxnews.com/',
    'https://www.stanford.edu/',
    'http://www.mit.edu/',
    'https://www.python.org/',
    'https://www.yahoo.com/',
    'http://www.ox.ac.uk/'
]

def load_url(url, timeout):
    # Fetch the page and print its size from inside the worker thread.
    res = urllib.request.urlopen(url, timeout=timeout).read()
    print('%r page is %d bytes' % (url, len(res)))

t_l = []
start_time = time.time()
# Start one thread per URL.
for url in URLS:
    t = Thread(target=load_url, args=(url, 60))
    t_l.append(t)
    t.start()
# Wait for every thread to finish before stopping the clock.
for t in t_l:
    t.join()
end_time = time.time()
print("Run time is %s" % (end_time - start_time))

--- Output ---
'http://www.mit.edu/' page is 21403 bytes
'http://www.foxnews.com/' page is 71735 bytes
'https://www.python.org/' page is 47701 bytes
'https://www.stanford.edu/' page is 69130 bytes
'http://www.ox.ac.uk/' page is 93411 bytes
'https://www.yahoo.com/' page is 446715 bytes
Run time is 2.6540000438690186
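The threaded version prints from inside each worker. When the caller needs the return values instead, one common approach is to pass them back through a queue.Queue, which is the kind of extra plumbing mentioned at the start of this article. The following is only a sketch: fake_fetch is a hypothetical stand-in for load_url that does no network I/O, and worker and fetch_sizes are likewise illustrative names.

```python
import queue
import threading


def fake_fetch(url):
    """Hypothetical stand-in for load_url: returns a fake body, no network."""
    return b'x' * len(url)


def worker(url, results):
    # Put (url, size) on the queue instead of printing inside the thread.
    results.put((url, len(fake_fetch(url))))


def fetch_sizes(urls):
    """Start one thread per URL and collect the sizes via a queue."""
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(u, results)) for u in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # All workers have finished, so the queue now holds every result.
    sizes = {}
    while not results.empty():
        url, size = results.get()
        sizes[url] = size
    return sizes


if __name__ == '__main__':
    print(fetch_sizes(['http://www.mit.edu/', 'https://www.python.org/']))
```

The start, join, and queue handling here is the boilerplate that an executor hides behind submit() and result().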

- Using the thread pool concurrent.futures.ThreadPoolExecutor:

from concurrent import futures
import urllib.request
import time

URLS = [
    'http://www.foxnews.com/',
    'https://www.stanford.edu/',
    'http://www.mit.edu/',
    'https://www.python.org/',
    'https://www.yahoo.com/',
    'http://www.ox.ac.uk/'
]

def load_url(url, timeout):
    # Fetch the page and return its body as bytes.
    return urllib.request.urlopen(url, timeout=timeout).read()

start_time = time.time()
with futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Map each Future back to the URL it was created for.
    future_to_url = dict((executor.submit(load_url, url, 60), url) for url in URLS)
    # as_completed() yields futures as they finish, not in submission order.
    for future in futures.as_completed(future_to_url):
        url = future_to_url[future]
        if future.exception() is not None:
            print('%r generated an exception: %s' % (url, future.exception()))
        else:
            print('%r page is %d bytes' % (url, len(future.result())))
end_time = time.time()
print("Run time is %s" % (end_time - start_time))

--- Output ---
'http://www.mit.edu/' page is 21405 bytes
'http://www.foxnews.com/' page is 71197 bytes
'https://www.python.org/' page is 47701 bytes
'http://www.ox.ac.uk/' page is 93411 bytes
'https://www.yahoo.com/' page is 444854 bytes
'https://www.stanford.edu/' page is 68595 bytes
Run time is 2.497999906539917

Note: because network conditions fluctuate, the run times are only reference values.
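The thread pool example checks future.exception() before reading the result. An equivalent pattern is to call future.result() inside try/except, since result() re-raises whatever the task raised. The sketch below shows that variant with a task that needs no network; parse_port and collect are hypothetical names introduced only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def parse_port(text):
    """Hypothetical task: convert text to int, raising ValueError on bad input."""
    return int(text)


def collect(inputs, max_workers=3):
    """Return (results, errors) dicts keyed by the original input."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_input = {executor.submit(parse_port, s): s for s in inputs}
        for future in as_completed(future_to_input):
            key = future_to_input[future]
            try:
                # result() re-raises whatever the task raised.
                results[key] = future.result()
            except ValueError as exc:
                errors[key] = exc
    return results, errors


if __name__ == '__main__':
    ok, failed = collect(['80', '443', 'http'])
    print(ok)
    print(sorted(failed))
```

Catching a specific exception type keeps unexpected errors visible instead of silently recording them.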
