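The earlier posts 《通过实例认识Python的GIL》 ("Understanding Python's GIL Through Examples"), 《再谈Python的GIL》 ("Python's GIL Revisited") and 《再谈Python的GIL(续)》 ("Python's GIL Revisited, Continued") showed that Python threads perform poorly on multi-core machines, while the problem does not appear on a single core. Is there a way to make Python threads perform as well on a multi-core machine as they do on a single core? There is: restrict the Python process to run on a specified CPU.
On Windows, the SetProcessAffinityMask function restricts a process to the specified CPUs: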
BOOL WINAPI SetProcessAffinityMask(
    _In_ HANDLE hProcess,
    _In_ DWORD_PTR dwProcessAffinityMask
);
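dwProcessAffinityMask is a bit mask that selects the CPUs the process may run on, where bit n corresponds to CPU n. For example, 1 means CPU 0, 2 means CPU 1, and 3 means CPU 0 and CPU 1.
Here is how to implement it: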
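The mask arithmetic can be illustrated with a small sketch. The helper names affinity_mask and cpus_from_mask are hypothetical and are not part of the Windows API; they only show how CPU indices map to bits.

```python
def affinity_mask(cpus):
    """Build an affinity bit mask from an iterable of CPU indices."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu  # bit n selects CPU n
    return mask


def cpus_from_mask(mask):
    """Return the sorted CPU indices selected by an affinity bit mask."""
    return [n for n in range(mask.bit_length()) if (mask >> n) & 1]
```

With these helpers, affinity_mask([0, 1]) gives the value 3 used in the example above, and cpus_from_mask(2) gives [1].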
utility.pyx
cdef extern from "Windows.h":
    ctypedef int BOOL
    ctypedef void * HANDLE
    ctypedef unsigned long DWORD_PTR
    int SetProcessAffinityMask(HANDLE hProcess, DWORD_PTR dwProcessAffinityMask) nogil
    HANDLE GetCurrentProcess() nogil

def SetAffinity(int mask):
    # Restrict the current process to the CPUs selected by mask.
    with nogil:
        SetProcessAffinityMask(<HANDLE>GetCurrentProcess(), mask)
Setup.py
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

ext = Extension("utility",
                define_macros=[('MAJOR_VERSION', '1'),
                               ('MINOR_VERSION', '0')],
                sources=["utility.pyx", ])

setup(
    name='callback',
    version='1.0',
    description='This is a callback demo package',
    author='',
    author_email='shi19@163.com',
    url='',
    long_description='',
    ext_modules=cythonize([ext, ]),
)
Build utility.pyd:
python Setup.py build_ext --inplace
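If building the Cython extension is inconvenient, the same Win32 call can probably be made through ctypes. This is only a sketch of an alternative, not the approach used in this post: the function name set_process_affinity is hypothetical, and the choice of ctypes.c_size_t for DWORD_PTR assumes a pointer-sized unsigned integer.

```python
import sys


def set_process_affinity(mask):
    """Pin the current process to the CPUs selected by mask (Windows only)."""
    if mask <= 0:
        raise ValueError("mask must select at least one CPU")
    if sys.platform != "win32":
        raise OSError("SetProcessAffinityMask is only available on Windows")

    # Imported here so the module still loads on other platforms.
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentProcess.restype = wintypes.HANDLE
    # DWORD_PTR is pointer-sized; c_size_t is assumed to match it.
    kernel32.SetProcessAffinityMask.argtypes = [wintypes.HANDLE, ctypes.c_size_t]
    kernel32.SetProcessAffinityMask.restype = wintypes.BOOL

    handle = kernel32.GetCurrentProcess()  # pseudo-handle, no need to close
    if not kernel32.SetProcessAffinityMask(handle, mask):
        raise ctypes.WinError(ctypes.get_last_error())
```

Unlike SetAffinity in utility.pyx, this version checks the return value and raises when the call fails, so an invalid mask does not go unnoticed.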
Now the test case:
count.py
from threading import Thread
from threading import Event as TEvent
from multiprocessing import Process
from multiprocessing import Event as PEvent
import utility
utility.SetAffinity(1)  # pin this process to CPU 0
from timeit import Timer
import sys
sys.setcheckinterval(100) #(100000)

def countdown(n, event):
    # CPU-bound work; signal the event once the countdown finishes.
    while n > 0:
        n -= 1
    event.set()

def io_op(n, event, filename):
    # I/O-bound work; keep writing until the countdown signals completion.
    f = open(filename, 'w')
    while not event.is_set():
        f.write('hello,world')
    f.close()

def t1():
    COUNT = 100000000
    event = TEvent()
    thread1 = Thread(target=countdown, args=(COUNT, event))
    thread1.start()
    thread1.join()

def t2():
    COUNT = 100000000
    event = TEvent()
    thread1 = Thread(target=countdown, args=(COUNT//2, event))
    thread2 = Thread(target=countdown, args=(COUNT//2, event))
    thread1.start(); thread2.start()
    thread1.join(); thread2.join()

def t3():
    COUNT = 100000000
    event = PEvent()
    p1 = Process(target=countdown, args=(COUNT//2, event))
    p2 = Process(target=countdown, args=(COUNT//2, event))
    p1.start(); p2.start()
    p1.join(); p2.join()

def t4():
    COUNT = 100000000
    event = TEvent()
    thread1 = Thread(target=countdown, args=(COUNT, event))
    thread2 = Thread(target=io_op, args=(COUNT, event, 'thread.txt'))
    thread1.start(); thread2.start()
    thread1.join(); thread2.join()

def t5():
    COUNT = 100000000
    event = PEvent()
    p1 = Process(target=countdown, args=(COUNT, event))
    p2 = Process(target=io_op, args=(COUNT, event, 'process.txt'))
    p1.start(); p2.start()
    p1.join(); p2.join()

if __name__ == '__main__':
    t = Timer(t1)
    print('countdown in one thread:%f' % (t.timeit(1),))
    t = Timer(t2)
    print('countdown use two thread:%f' % (t.timeit(1),))
    t = Timer(t3)
    print('countdown use two Process:%f' % (t.timeit(1),))
    t = Timer(t4)
    print('countdown in one thread with io op in another thread:%f' % (t.timeit(1),))
    t = Timer(t5)
    print('countdown in one process with io op in another process:%f' % (t.timeit(1),))
Compared with the earlier test case, two lines were added:
import utility
utility.SetAffinity(1)
Here is the output of the test case:
countdown in one thread:7.005823
countdown use two thread:4.790538
countdown use two Process:4.936478
countdown in one thread with io op in another thread:9.526901
countdown in one process with io op in another process:9.262508
For comparison, here is the earlier output from the single-core run:
countdown in one thread: 5.9650638561501195
countdown use two thread: 5.8188333656781595
countdown use two Process: 6.197559396296269
countdown in one thread with io op in another thread: 11.369204522553051
countdown in one process with io op in another process: 11.79234388645473
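More programs were running during this test, so the numbers differ somewhat from the earlier run, but they are broadly consistent. The earlier advice to avoid Thread on multi-core machines now looks wrong: it is enough to restrict the CPU the process runs on.
On Linux, the taskset command can likewise set the CPUs a process runs on; that will be discussed later.
The conclusion from this post: the impact of Python's GIL on multi-core CPUs is not as bad as previously thought. There are always more solutions than problems; it comes down to whether you keep at it.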
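Besides taskset, Python 3.3 and later expose the affinity calls directly on Linux through os.sched_setaffinity. The following is a minimal sketch, assuming a Linux host; the function name pin_current_process is hypothetical, and on platforms without os.sched_setaffinity it simply returns None.

```python
import os


def pin_current_process(cpus):
    """Restrict the caller to the given CPU indices.

    Returns the resulting affinity set, or None when the platform does not
    provide os.sched_setaffinity (for example Windows or macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, cpus)  # pid 0 refers to the caller
    return os.sched_getaffinity(0)
```

As with utility.SetAffinity(1) in count.py, this would be called once near the top of the script, before any threads or processes are started, since on Linux they are expected to inherit the affinity of their creator.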