Coroutines
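A coroutine, also called a micro-thread or fiber, is a lightweight thread that runs in user space.
A coroutine has its own register context and stack. When the scheduler switches away from a coroutine, it saves that register context and stack elsewhere; when it switches back, it restores them.
As a result, a coroutine keeps the state of its previous invocation (that is, a particular combination of all its local state). Each time it is re-entered, it resumes in that state, at the point in the logic flow where it last left off.
Coroutines resemble threads: each coroutine is a unit of execution that has its own local data and shares global data and other resources with other coroutines.
Coroutines live inside a thread, and their scheduling logic is written in user code. The operating system and CPU are unaware of them, so neither has to decide how coroutines are scheduled or how their contexts are switched.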
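To make the idea of user-written scheduling concrete, here is a minimal sketch built on plain Python generators. The names `countdown` and `run_round_robin` are invented for this illustration; the point is only that each task keeps its local variable `n` across switches and that the scheduling policy is ordinary user code.

```python
from collections import deque


def countdown(name, n, log):
    """Count down from n, giving up control after every step."""
    while n > 0:
        log.append((name, n))  # the local variable n survives every switch
        yield                  # hand control back to the scheduler
        n -= 1


def run_round_robin(tasks):
    """Hypothetical user-level scheduler: resume each task in turn."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)         # resume the task where it last stopped
        except StopIteration:
            continue           # the task has finished, so drop it
        ready.append(task)     # not finished yet, so queue it again


log = []
run_round_robin([countdown("a", 2, log), countdown("b", 3, log)])
print(log)
# [('a', 2), ('b', 3), ('a', 1), ('b', 2), ('b', 1)]
```

The operating system sees a single thread throughout; the interleaving comes entirely from the `deque` in `run_round_robin`.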
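Benefits of coroutines:
- No thread context-switch overhead.
- No overhead for locking around atomic operations or for synchronization.
- High concurrency, high scalability, and low cost.
An atomic operation is one that the thread scheduler cannot interrupt: once it starts, it runs to completion with no context switch (to another thread) in between.
An atomic operation may be a single step or several steps, but the steps cannot be reordered, and it cannot be cut short so that only part of it executes. Being treated as an indivisible whole is the essence of atomicity.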
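The claim about not needing locks can be illustrated with a small sketch, again using generators as stand-in coroutines. The names `state`, `worker`, and `run_all` are assumptions made for this example. Because a switch can only happen at `yield`, the read-then-write on the shared counter behaves as one atomic step without any lock.

```python
state = {"counter": 0}  # shared by every coroutine


def worker(times):
    for _ in range(times):
        current = state["counter"]      # read ...
        state["counter"] = current + 1  # ... and write back; no switch can occur in between
        yield                           # the only place where another coroutine can run


def run_all(tasks):
    """Resume the tasks in turn until every one has finished."""
    pending = list(tasks)
    while pending:
        for task in list(pending):
            try:
                next(task)
            except StopIteration:
                pending.remove(task)


run_all([worker(1000), worker(1000)])
print(state["counter"])  # 2000
```

With preemptive threads the same read-then-write pattern would need a lock, since a thread switch could land between the two statements.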
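Drawbacks of coroutines:
- They cannot use multiple cores on their own: the coroutines of a program all run inside a single thread, so they cannot occupy several cores of a CPU at once. To run on multiple cores, coroutines must be combined with multiple processes.
- A blocking operation (such as blocking I/O) blocks the entire program.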
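The second drawback is easy to reproduce. In the sketch below (the task names are made up for illustration), one generator-based task calls the ordinary blocking `time.sleep` before it yields, and the other task cannot start until that call returns.

```python
import time


def slow_task(events):
    events.append("slow: start")
    time.sleep(0.2)  # blocking call: the only thread stops here
    events.append("slow: done")
    yield            # control is given up only now


def quick_task(events):
    events.append("quick: ran")
    yield


events = []
start = time.monotonic()
for task in [slow_task(events), quick_task(events)]:
    next(task)       # run each task up to its first yield
waited = time.monotonic() - start

print(events)
print("quick task had to wait about %.2f s" % waited)
```

The quick task does no waiting of its own, yet it finishes only after the slow task's blocking call has completed.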
yield
Python offers basic coroutine support through yield (generators), but that support is incomplete.
```python
def consumer(name):
    print("--->starting eating baozi...")
    while True:
        new_baozi = yield  # pause here until the producer sends a value
        print("[%s] is eating baozi %s" % (name, new_baozi))


def producer(consumers):
    for con in consumers:
        next(con)  # prime each generator so it runs up to its first yield
    n = 0
    while n < 5:
        n += 1
        for con in consumers:
            con.send(n)  # resume the consumer and hand it the new baozi
        print("\033[32;1m[producer]\033[0m is making baozi %s" % n)


if __name__ == '__main__':
    con = consumer("c1")
    con2 = consumer("c2")
    producer([con, con2])
```
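The producer/consumer example only pushes values into the generator. As a complementary sketch, the well-known running-average pattern shows that `send()` also returns whatever the generator yields next, so data can flow in both directions. The function name `averager` is chosen for this illustration.

```python
def averager():
    """Receive numbers through send() and yield the running average."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # hand back the average, then wait for a value
        total += value
        count += 1
        average = total / count


avg = averager()
next(avg)  # prime the generator: run it to the first yield
results = [avg.send(v) for v in (10, 20, 60)]
print(results)  # [10.0, 15.0, 30.0]
avg.close()     # stop the generator explicitly
```

The local variables `total` and `count` persist between calls, which is the state-keeping behavior that makes generators usable as simple coroutines.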
greenlet and gevent
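greenlet is a coroutine module implemented in C. Compared with Python's built-in yield, it lets you switch freely between arbitrary functions without first declaring them as generators.
gevent's coroutine support is built on greenlet, which does the actual switching.
How greenlet is typically used: when code blocks on network I/O, the program explicitly switches to another piece of code that is not blocked, and switches back to the original code once the blocking condition has cleared. greenlet is therefore a way of arranging serial execution sensibly, not a form of parallelism.
Calling switch() on a greenlet object switches to that coroutine. greenlet never switches automatically.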
```python
from greenlet import greenlet


def test1():
    print(12)
    gr2.switch()  # jump to test2
    print(34)
    gr2.switch()  # resume test2 where it left off


def test2():
    print(56)
    gr1.switch()  # jump back into test1
    print(78)


gr1 = greenlet(test1)  # create a coroutine; it starts on the first switch()
gr2 = greenlet(test2)
gr1.switch()

# Output:
# 12
# 56
# 34
# 78
```
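Besides transferring control, `switch()` can also carry values: arguments passed to `switch()` arrive either as the function's arguments (on the first switch) or as the return value of the `switch()` call that suspended the greenlet. The following sketch assumes the greenlet package is installed; the function name `doubler` is invented for the example.

```python
from greenlet import greenlet, getcurrent


def doubler(first):
    """Double every value received until None arrives."""
    value = first
    while value is not None:
        # Send the doubled value to the parent, then wait for the next one.
        value = getcurrent().parent.switch(value * 2)
    return "done"  # a finished greenlet hands its return value to the parent


g = greenlet(doubler)
results = [g.switch(1), g.switch(5), g.switch(None)]
print(results)  # [2, 10, 'done']
print(g.dead)   # True
```

Every transfer is still explicit: nothing runs in `doubler` unless the main code calls `g.switch(...)`.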
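gevent is a third-party library that makes concurrent programming, in either a synchronous or an asynchronous style, straightforward. It wraps greenlet and switches between coroutines automatically.
The example below uses gevent.sleep to imitate an I/O operation and trigger a switch between coroutines. It relies on three functions:
- gevent.spawn creates a coroutine (a Greenlet) and schedules it to run.
- gevent.joinall takes a list of greenlets and blocks the current flow until all of them have finished, after which the program continues.
- gevent.sleep simulates an I/O operation that takes the given number of seconds.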
```python
import gevent


def foo():
    print('Running in foo')
    gevent.sleep(2)  # simulated I/O: yields to other greenlets
    print('Explicit context switch to foo again')


def bar():
    print('Explicit context to bar')
    gevent.sleep(1)
    print('Implicit context switch back to bar')


def func3():
    print("running func3 ")
    gevent.sleep(0)  # yield once without waiting
    print("running func3 again ")


gevent.joinall([
    gevent.spawn(foo),  # spawn: create and schedule a greenlet
    gevent.spawn(bar),
    gevent.spawn(func3),
])

# Output:
# Running in foo
# Explicit context to bar
# running func3
# running func3 again
# Implicit context switch back to bar
# Explicit context switch to foo again
```
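The key part of the program above is gevent.spawn, which wraps each task function in a Greenlet.
The list of newly created greenlets is passed to gevent.joinall, which blocks the current flow and runs all the given greenlets.
Execution continues past that call only after every greenlet has finished.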
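The tasks above only print. When a task returns something, the result can be read from the greenlet's `value` attribute once `gevent.joinall` has returned. The sketch below assumes gevent is installed; `square` is a made-up task that uses `gevent.sleep` as a stand-in for I/O.

```python
import gevent


def square(n):
    gevent.sleep(0.01 * n)  # stand-in for I/O; other greenlets run meanwhile
    return n * n


jobs = [gevent.spawn(square, n) for n in (3, 1, 2)]
gevent.joinall(jobs, timeout=5)  # wait for all of them, at most 5 seconds

values = [job.value for job in jobs]
print(values)  # [9, 1, 4]
```

The values follow the order of the `jobs` list, not the order in which the greenlets finished.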
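gevent switches coroutines automatically on I/O, but by itself it cannot tell that a call into ordinary blocking code is an I/O operation, so no switch happens there. Calling monkey.patch_all() at startup patches common blocking modules such as socket and select, so that code built on them (urllib, for example) yields to other coroutines while it waits.
In effect, monkey.patch_all() marks the program's I/O operations as switch points: it replaces the blocking standard-library functions with cooperative versions, so switching happens automatically.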
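A quick way to observe the effect of patching without any network access is to use `time.sleep`, which `monkey.patch_all()` also replaces by default. This is a minimal sketch assuming gevent is installed; the helper name `wait` is invented here.

```python
from gevent import monkey
monkey.patch_all()  # patch early, before other modules grab the blocking originals

import time
import gevent


def wait(seconds):
    time.sleep(seconds)  # after patching, this yields to other greenlets


start = time.time()
gevent.joinall([gevent.spawn(wait, 0.2) for _ in range(3)])
elapsed = time.time() - start

print("time module patched:", monkey.is_module_patched("time"))
print("three 0.2 s sleeps took about %.1f s" % elapsed)  # roughly 0.2 s rather than 0.6 s
```

Without the `patch_all()` call, each `time.sleep` would block the whole thread and the three sleeps would run back to back.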
Without monkey.patch_all(), the requests run one at a time:

```python
import gevent
import requests


def run_task(url):
    print('Visit --> %s' % url)
    try:
        res = requests.get(url)
        data = res.text
        print('%s bytes received from %s' % (len(data), url))
    except Exception as value:
        print(value)


if __name__ == '__main__':
    urls = ['https://www.baidu.com/', 'https://github.com/', 'https://www.python.org/']
    gevents = [gevent.spawn(run_task, url) for url in urls]
    gevent.joinall(gevents)

# Output:
# Visit --> https://www.baidu.com/
# 2443 bytes received from https://www.baidu.com/
# Visit --> https://github.com/
# 54833 bytes received from https://github.com/
# Visit --> https://www.python.org/
# 48703 bytes received from https://www.python.org/
```
With monkey.patch_all() called first, the same program overlaps the requests:

```python
from gevent import monkey; monkey.patch_all()  # patch before importing requests
import gevent
import requests


def run_task(url):
    print('Visit --> %s' % url)
    try:
        res = requests.get(url)
        data = res.text
        print('%s bytes received from %s' % (len(data), url))
    except Exception as value:
        print(value)


if __name__ == '__main__':
    urls = ['https://www.baidu.com/', 'https://github.com/', 'https://www.python.org/']
    gevents = [gevent.spawn(run_task, url) for url in urls]
    gevent.joinall(gevents)

# Output:
# Visit --> https://www.baidu.com/
# Visit --> https://github.com/
# Visit --> https://www.python.org/
# 2443 bytes received from https://www.baidu.com/
# 54833 bytes received from https://github.com/
# 48703 bytes received from https://www.python.org/
```
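As the output shows, without monkey.patch_all() there is no switching and the requests run one after another. After patching, each I/O operation triggers an automatic switch, so the three network requests run concurrently and may finish in a different order, even though there is still only one thread.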
Concurrent sockets with gevent
Server:

```python
from gevent import monkey; monkey.patch_all()  # patch before importing socket
import socket
import gevent


def handler(conn):
    try:
        while True:
            data = conn.recv(1024).decode()
            if not data:  # empty read: the client closed the connection
                break
            print("receive: %s" % data)
            conn.sendall(data.upper().encode())
    except Exception as value:
        print(value)
    finally:
        conn.close()


HOST = ('0.0.0.0', 9999)
server = socket.socket()
server.bind(HOST)
server.listen()
print('server start...')
while True:
    conn, addr = server.accept()
    print('receive connection:%s' % conn)
    gevent.spawn(handler, conn)  # one greenlet per client
```
Client:

```python
import socket

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('localhost', 9999))
while True:
    data = input('>>>').strip()
    if not data:  # sending nothing would leave recv() waiting forever
        continue
    client.send(data.encode())
    data = client.recv(1024)
    print('receive: %s' % data.decode())
```
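gevent also supports pools. When a dynamic number of greenlets has to be managed with a cap on concurrency, use a pool; this matters a great deal when handling large numbers of network or other I/O operations.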
```python
from gevent import monkey; monkey.patch_all()
from gevent.pool import Pool
import requests


def run_task(url):
    print('Visit --> %s' % url)
    try:
        res = requests.get(url)
        data = res.text
        print('%s bytes received from %s' % (len(data), url))
    except Exception as value:
        print(value)
    return 'url:%s --> finished' % url


if __name__ == '__main__':
    urls = ['https://www.baidu.com/', 'https://github.com/', 'https://www.python.org/']
    pool = Pool(2)  # at most two greenlets run at the same time
    result = pool.map(run_task, urls)
    print(result)

# Output:
# Visit --> https://www.baidu.com/
# Visit --> https://github.com/
# 2443 bytes received from https://www.baidu.com/
# Visit --> https://www.python.org/
# 54833 bytes received from https://github.com/
# 48703 bytes received from https://www.python.org/
# ['url:https://www.baidu.com/ --> finished', 'url:https://github.com/ --> finished', 'url:https://www.python.org/ --> finished']
```
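The concurrency cap can be checked without touching the network. The sketch below assumes gevent is installed and replaces the HTTP request with `gevent.sleep`; the `stats` dictionary and the `task` function are invented for this illustration and simply record how many greenlets are inside the task at once.

```python
import gevent
from gevent.pool import Pool

stats = {"active": 0, "peak": 0}  # bookkeeping for this demo only


def task(n):
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    gevent.sleep(0.01)  # simulated I/O; other greenlets in the pool run here
    stats["active"] -= 1
    return n * 10


pool = Pool(2)  # never more than two greenlets at a time
results = pool.map(task, range(5))

print(results)        # [0, 10, 20, 30, 40]
print(stats["peak"])  # 2
```

`pool.map` returns results in input order, and a new greenlet starts only when one of the two running ones has finished, which matches the interleaving seen in the output above.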