Contents
- Example
- A few questions
- Link: concurrent.futures
Example
# -*- coding: utf-8 -*-
'''
py 3.6
Sublime
'''
import concurrent.futures
import requests
import time

# High-resolution timer used to measure elapsed time.
now = time.perf_counter
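def download_one(url):
    try:
        req = requests.get(url)
        req.raise_for_status()
        # print(req.status_code)
        # Let requests guess the encoding from the body so req.text decodes correctly.
        req.encoding = req.apparent_encoding
        print('Read {} from {}'.format(len(req.text), url))
    except requests.RequestException as exc:
        # Report the actual failure instead of a hard-coded 404.
        print('Failed to read {}: {}'.format(url, exc))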
def download_all(sites):
    # with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:  # concurrency
    # with concurrent.futures.ProcessPoolExecutor() as executor:  # parallelism
    #     executor.map(download_one, sites)
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        to_do = []
        for site in sites:
            # submit() schedules the call and immediately returns a Future.
            future = executor.submit(download_one, site)
            to_do.append(future)
        # as_completed() yields each Future as soon as it finishes.
        for future in concurrent.futures.as_completed(to_do):
            future.result()
def main():
    sites = [
        'https://www.baidu.com/',
        'https://pypi.org/',
        'https://www.sina.com.cn/',
        'https://www.163.com/',
        'https://news.qq.com/',
        'http://www.ifeng.com/',
        'http://www.ce.cn/',
        'https://news.baidu.com/',
        'http://www.people.com.cn/',
        'http://www.ce.cn/',
        'https://news.163.com/',
        'http://news.sohu.com/'
    ]
    start = now()
    download_all(sites)
    print('Download {} sites in {} s'.format(len(sites), now() - start))


if __name__ == '__main__':
    main()
# Read 2349 from https://www.baidu.com/
# Read 6013 from https://news.qq.com/
# Read 19054 from https://pypi.org/
# Read 70152 from https://news.baidu.com/
# Read 101414 from http://www.ce.cn/
# Read 101414 from http://www.ce.cn/
# Read 146906 from http://news.sohu.com/
# Read 217489 from http://www.ifeng.com/
# Read 155854 from http://www.people.com.cn/
# Read 205149 from https://news.163.com/
# Read 541508 from https://www.sina.com.cn/
# Read 677498 from https://www.163.com/
# Download 12 sites in 23.067600960415174 s
# [Finished in 23.8s]
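The commented-out executor.map variant in download_all can be sketched roughly as follows. To keep it runnable offline, fake_download is a hypothetical stand-in for download_one that sleeps instead of issuing a request, and download_all_with_map is likewise an illustrative name. Unlike as_completed, executor.map hands results back in the order of its inputs.

```python
import concurrent.futures
import time


def fake_download(url):
    # Hypothetical stand-in for download_one: sleep instead of doing real I/O.
    time.sleep(0.1)
    return len(url)


def download_all_with_map(sites):
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        # map() returns results in the same order as `sites`,
        # regardless of which call finishes first.
        return list(executor.map(fake_download, sites))


if __name__ == '__main__':
    sites = ['https://pypi.org/', 'https://www.baidu.com/', 'http://www.ce.cn/']
    print(download_all_with_map(sites))
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor keeps the same map call, which is why the original listing shows both as alternatives.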
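Each call to .submit() returns a Future, which is appended to to_do; to_do is then passed to .as_completed(). Does that call return another iterator of Future instances? Yes: it returns an iterator that yields the same Future objects held in to_do, each as soon as it completes, so the order follows completion rather than submission.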
eg.1. Print the result of each future yielded by .as_completed() (the snippets in the output suggest that download_one returned a short slice of req.text in this run):
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    to_do = []
    for site in sites:
        future = executor.submit(download_one, site)
        to_do.append(future)
    for future in concurrent.futures.as_completed(to_do):
        print(future.result())
# Read 2349 from https://www.baidu.com/
# t=utf-8><m
# Read 19132 from https://pypi.org/
# -UA-Compat
# Read 6013 from https://news.qq.com/
# ntent="IE=
# Read 70155 from https://news.baidu.com/
# nt="text/h
# Read 101415 from http://www.ce.cn/
# l1-transit
# Read 101415 from http://www.ce.cn/
# l1-transit
# Read 146961 from http://news.sohu.com/
# keywords"
# Read 217277 from http://www.ifeng.com/
# ta http-eq
# Read 155854 from http://www.people.com.cn/
# text/html;
# Read 205149 from https://news.163.com/
# -[if IE 7
# Read 541723 from https://www.sina.com.cn/
# ntent-type
# Read 677310 from https://www.163.com/
# -[if IE 7
# Download 12 sites in 23.09518421943929 s
# [Finished in 23.8s]
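The completion-order behaviour of as_completed can be seen without any network access. The following is a minimal sketch, assuming a hypothetical slow_echo task that just sleeps; the delays are spaced far enough apart that the shortest task should finish first.

```python
import concurrent.futures
import time


def slow_echo(delay):
    # Hypothetical task: sleep for `delay` seconds, then return it.
    time.sleep(delay)
    return delay


def completion_order(delays):
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(delays)) as executor:
        to_do = [executor.submit(slow_echo, d) for d in delays]
        # as_completed() yields the same Future objects held in to_do,
        # each one as soon as it has finished.
        return [future.result() for future in concurrent.futures.as_completed(to_do)]


if __name__ == '__main__':
    # Submitted as 0.6, 0.2, 0.4; results arrive shortest delay first.
    print(completion_order([0.6, 0.2, 0.4]))
```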
eg.2. Call .result() directly after .submit(). Because .result() blocks until that task has finished, the next task is not submitted until then, so the sites are read one at a time and the output follows the order of the sites list:
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    for site in sites:
        future = executor.submit(download_one, site)
        print(future.result())
# Read 2349 from https://www.baidu.com/
# t=utf-8><m
# Read 19132 from https://pypi.org/
# -UA-Compat
# Read 541723 from https://www.sina.com.cn/
# ntent-type
# Read 677295 from https://www.163.com/
# -[if IE 7
# Read 6013 from https://news.qq.com/
# ntent="IE=
# Read 217277 from http://www.ifeng.com/
# ta http-eq
# Read 101415 from http://www.ce.cn/
# l1-transit
# Read 70160 from https://news.baidu.com/
# nt="text/h
# Read 155854 from http://www.people.com.cn/
# text/html;
# Read 101415 from http://www.ce.cn/
# l1-transit
# Read 205149 from https://news.163.com/
# -[if IE 7
# Read 146899 from http://news.sohu.com/
# keywords"
# Download 12 sites in 24.93912943206427 s
# [Finished in 25.6s]
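To make the difference between the two patterns measurable offline, here is a small sketch that times both with a hypothetical slow_task that only sleeps. The function names run_blocking and run_concurrent are illustrative, not from the original post.

```python
import concurrent.futures
import time


def slow_task(delay):
    # Hypothetical task standing in for a slow download.
    time.sleep(delay)
    return delay


def run_blocking(delays):
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        for d in delays:
            # result() blocks until this task is done, so the next
            # submit() does not happen until then.
            executor.submit(slow_task, d).result()
    return time.perf_counter() - start


def run_concurrent(delays):
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
        # Submit everything first, then wait for the results.
        to_do = [executor.submit(slow_task, d) for d in delays]
        for future in concurrent.futures.as_completed(to_do):
            future.result()
    return time.perf_counter() - start


if __name__ == '__main__':
    delays = [0.2, 0.2, 0.2, 0.2]
    print('blocking:   {:.2f} s'.format(run_blocking(delays)))
    print('concurrent: {:.2f} s'.format(run_concurrent(delays)))
```

With equal delays, the blocking version takes roughly the sum of the delays, while the concurrent version takes roughly the longest single delay. With real downloads the gap depends on how much the slowest site dominates the total.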
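Q:
- What is concurrency? What is parallelism?
Concurrency: multiple tasks take turns making progress;
Parallelism: multiple tasks run at the same time;
- In coroutine programming, what is the difference between the Future in concurrent.futures and the Future in asyncio?
A (reference): Basically, if you are using ThreadPoolExecutor or ProcessPoolExecutor, or want to use a Future directly with thread-based or process-based concurrency, use concurrent.futures.Future. If you are using asyncio, use asyncio.Future.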
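- What is thread safety?
An object is thread-safe if, when multiple threads access it at the same time, it behaves correctly no matter how the threads are interleaved, and the calling code does not need to do any synchronization itself.
- What's a "race condition"?
Multiple threads, processes, or clients compete for a resource at the same time without using locks correctly, so the result becomes inconsistent. For how to resolve it, see the reference (for reference only).
- How do you decide when to use concurrency versus parallelism? And coroutines versus multithreading?
1. Concurrency is usually used for I/O-heavy workloads, where I/O typically takes far longer than CPU processing; parallelism is used for CPU-heavy workloads such as MapReduce, where the amount of computation is large and the work is usually spread across multiple machines or servers at once;
2. Within concurrency, asyncio coroutines are usually used for tasks with relatively long I/O waits, while multithreading (threading) is usually used for tasks with relatively short I/O waits, where it is more efficient.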
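The race condition and thread-safety answers above can be illustrated with a small sketch. It assumes a hypothetical shared counter and a helper named run_increments; the short sleep between the read and the write is there only to widen the window in which another thread can interleave, and threading.Lock is one common way to close that window.

```python
import concurrent.futures
import threading
import time


def run_increments(n_tasks, use_lock):
    counter = {'value': 0}
    lock = threading.Lock()

    def increment():
        current = counter['value']      # read
        time.sleep(0.001)               # widen the window for a thread switch
        counter['value'] = current + 1  # write back a possibly stale value

    def safe_increment():
        with lock:  # only one thread at a time may read-modify-write
            increment()

    task = safe_increment if use_lock else increment
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:
        futures = [executor.submit(task) for _ in range(n_tasks)]
        for future in concurrent.futures.as_completed(futures):
            future.result()
    return counter['value']


if __name__ == '__main__':
    print('without lock:', run_increments(40, use_lock=False))
    print('with lock:   ', run_increments(40, use_lock=True))
```

With the lock, every increment is applied and the total equals the number of tasks. Without it, threads can overwrite each other's updates, so the total may come out lower and can vary between runs.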
Reference links