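Modern CPUs have multiple cores, and a process can have multiple threads; running threads on several cores at once can speed a program up. (In standard CPython the global interpreter lock lets only one thread execute Python bytecode at a time, so the gain comes mainly when threads spend their time waiting, as with the time.sleep() calls in the examples below.)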
1. threading module
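Python's Thread class lives in the threading library:
class Thread:
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs=None, *, daemon=None):
        ...
Here, target is the callable the thread runs. It defaults to None, so it is not strictly required, but a thread with no target does nothing unless run() is overridden in a subclass. args is a tuple of positional arguments passed to target, and name is the thread name, which defaults to “Thread-N” (Python 3.10 and later also append the target's name, for example “Thread-1 (func)”).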
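The constructor parameters described above can be tried out in a small sketch. The function greet, the results dictionary, and the thread name "Greeter" are made up for this illustration; join() is covered in section 3 and is used here only so the result is ready before it is printed.

```python
import threading

results = {}  # filled in by the worker thread (hypothetical helper)

def greet(greeting, who="world"):
    # Record which thread ran the function and what it produced.
    results["thread"] = threading.current_thread().name
    results["message"] = f"{greeting}, {who}!"

# args supplies positional arguments, kwargs supplies keyword arguments.
worker = threading.Thread(
    target=greet,
    name="Greeter",
    args=("Hello",),
    kwargs={"who": "threads"},
)
worker.start()
worker.join()  # wait for the worker so that results is populated

print(results["thread"])   # Greeter
print(results["message"])  # Hello, threads!
```

Note the trailing comma in args=("Hello",): without it, Python would treat the parentheses as plain grouping and pass a string instead of a one-element tuple.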
2. Creating a thread
import time
import threading

def func(n):
    for i in range(1, n):
        # print current thread name
        print(f'{threading.current_thread().name} is running...')
        time.sleep(1)

time1 = time.time()
thread = threading.Thread(target=func, name='Sub-thread', args=(5,))
thread.start()
func(5)
time2 = time.time()
print(f'duration: {time2-time1}')
The code above creates a thread named “Sub-thread”. Creating the object only initializes the thread; it starts running once start() is called. The output is:
Sub-thread is running...
MainThread is running...
MainThread is running...
Sub-thread is running...
MainThread is running...
Sub-thread is running...
MainThread is running...
Sub-thread is running...
duration: 4.002989768981934
As the output shows, the child thread and the main thread run concurrently, and the total time is about 4 s.
3. JOIN
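join() blocks the calling thread until the thread whose join() method is called has finished. It does not block any other threads. Its signature and docstring are: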
def join(self, timeout=None):
    """Wait until the thread terminates.

    This blocks the calling thread until the thread whose join() method is
    called terminates -- either normally or through an unhandled exception
    or until the optional timeout occurs.

    When the timeout argument is present and not None, it should be a
    floating point number specifying a timeout for the operation in seconds.

    When the timeout argument is not present or None, the operation will
    block until the thread terminates.
    """
For example:
import time
import threading

def func(n):
    for i in range(1, n):
        # print current thread name
        print(f'{threading.current_thread().name} is running...')
        time.sleep(1)

time1 = time.time()
thread = threading.Thread(target=func, name='Sub-thread', args=(5,))
thread.start()
thread.join()  # blocks the calling thread
func(5)
time2 = time.time()
print(f'duration: {time2-time1}')
Here the main thread calls thread.join(), which blocks it until the child thread has finished. Only then does the main thread run func(5) itself, so the output (about 8 s in total) is:
Sub-thread is running...
Sub-thread is running...
Sub-thread is running...
Sub-thread is running...
MainThread is running...
MainThread is running...
MainThread is running...
MainThread is running...
duration: 8.005128383636475
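The timeout parameter from the docstring is not exercised by the example above. The following is a minimal sketch of how it behaves; slow_task and the chosen delays are made up for this illustration. Because join() always returns None, is_alive() is used afterwards to tell whether the wait timed out.

```python
import threading
import time

def slow_task():
    time.sleep(0.5)  # stand-in for slow work

worker = threading.Thread(target=slow_task, name="Slow-worker")
worker.start()

worker.join(timeout=0.1)          # give up waiting after about 0.1 s
timed_out = worker.is_alive()     # True: the worker is still running

worker.join()                     # no timeout: block until it finishes
finished = not worker.is_alive()  # True

print(timed_out, finished)
```

A timed-out join() does not stop the thread; it only returns control to the caller, which can then decide whether to keep waiting.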
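4. Shared data and conflicts
4.1 Data conflicts
The following simulates several threads depositing money into one bank account.
The account starts with a balance of zero. Several threads are created; each one independently reads the balance and deposits 1 yuan, which takes 0.5 s.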
import time
import threading
from threading import Thread

bank = {
    'karl': 0
}

def deposit(money):
    amount = bank['karl']
    time.sleep(0.5)
    bank['karl'] = amount + money
    print(f'{threading.current_thread().name} saved {money}RMB.')

time1 = time.time()
threads = []
for i in range(0, 5):
    thread = Thread(target=deposit, args=(1,))
    thread.start()
    threads.append(thread)
for thread in threads:
    thread.join()
time2 = time.time()
print(f'duration: {time2-time1}')
print(f"Current money: {bank['karl']}")
The output is:
Thread-1 saved 1RMB.
Thread-3 saved 1RMB.
Thread-2 saved 1RMB.
Thread-5 saved 1RMB.
Thread-4 saved 1RMB.
duration: 0.504030704498291
Current money: 1
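As the output shows, all five threads read the balance before any of them wrote it back, because each one sleeps between its read and its write. Every thread therefore read 0 and wrote 0 + 1, and each write simply overwrote the previous one. Only the last write survives, so the final balance is 1 instead of 5.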
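The lost update described above depends on timing, but the same interleaving can be forced deterministically. The sketch below is not the author's code: it swaps the sleep for a threading.Barrier (named all_have_read here, a name chosen for this illustration) so that no thread writes until every thread has read.

```python
import threading

balance = {"karl": 0}
N_THREADS = 5
# Hypothetical helper for this sketch: the barrier makes every thread wait
# until all of them have read the balance, forcing the bad interleaving.
all_have_read = threading.Barrier(N_THREADS)

def deposit(money):
    amount = balance["karl"]          # every thread reads 0
    all_have_read.wait()              # nobody writes until all have read
    balance["karl"] = amount + money  # every thread writes 0 + 1

threads = [threading.Thread(target=deposit, args=(1,)) for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balance["karl"])  # 1, not 5: four deposits were lost
```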
Where to place join()
You might wonder why join() gets a loop of its own. Could it not go in the same loop as start()?
for i in range(0, 5):
    thread = Thread(target=deposit, args=(1,))
    thread.start()
    threads.append(thread)
    thread.join()
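No. If join() and start() are in the same loop, then in each iteration the main thread starts a thread and join() immediately blocks the main thread until that thread finishes. The next thread is created only after the current one is done, so the threads run one after another and nothing is gained from multithreading.
In contrast, when join() has its own loop, all threads are created and started at almost the same time. By the time the first one finishes, the others are nearly done too, and the join() calls return one after another.
As for the thread names being printed out of order: the threads wake up at nearly the same moment, and the order in which the scheduler then runs them is not deterministic.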
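The difference between the two placements can be measured directly. The following is a rough sketch, with a made-up helper run() and short delays so it finishes quickly; exact timings will vary by machine.

```python
import threading
import time

DELAY = 0.05    # seconds each worker sleeps
N_THREADS = 4

def work():
    time.sleep(DELAY)

def run(join_inside_loop):
    """Start N_THREADS workers and return the elapsed wall-clock time."""
    start = time.perf_counter()
    threads = []
    for _ in range(N_THREADS):
        t = threading.Thread(target=work)
        t.start()
        threads.append(t)
        if join_inside_loop:
            t.join()  # wait here, so the next thread starts only afterwards
    for t in threads:
        t.join()      # returns immediately for threads that already finished
    return time.perf_counter() - start

serial = run(join_inside_loop=True)        # roughly N_THREADS * DELAY
concurrent = run(join_inside_loop=False)   # roughly DELAY
print(f"join inside loop: {serial:.2f}s, separate loop: {concurrent:.2f}s")
```

With join() inside the loop the elapsed time grows with the number of threads, while with a separate loop it stays close to the time of a single worker.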
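4.2 Resolving the conflict with a lock
A lock ensures that only one thread at a time reads and writes the shared data. A thread that acquires the Lock may access the shared data and releases the Lock when it is done; threads that have not acquired it must wait.
Syntax: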
from threading import Lock

# Create a lock object
lock = Lock()

def method(arg1, arg2):
    lock.acquire()  # acquire the lock
    # ... read and write the shared data here ...
    lock.release()  # release the lock
Adding a lock to the bank deposit example above:
import time
import threading
from threading import Thread, Lock

bank = {
    'karl': 0
}
bank_lock = Lock()

def deposit(money):
    # Acquire the lock before touching the shared data
    bank_lock.acquire()
    amount = bank['karl']
    time.sleep(0.1)
    bank['karl'] = amount + money
    print(f'{threading.current_thread().name} saved {money}RMB.')
    # Release the lock
    bank_lock.release()

time1 = time.time()
threads = []
for i in range(0, 5):
    thread = Thread(target=deposit, args=(1,))
    thread.start()
    threads.append(thread)
for thread in threads:
    thread.join()
time2 = time.time()
print(f'duration: {time2-time1}')
print(f"Current money: {bank['karl']}")
The output is:
Thread-1 saved 1RMB.
Thread-2 saved 1RMB.
Thread-3 saved 1RMB.
Thread-4 saved 1RMB.
Thread-5 saved 1RMB.
duration: 0.5050632953643799
Current money: 5
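Problem solved, at the cost of some speed: the lock makes the five deposits run one after another, so this run takes about 5 × 0.1 s = 0.5 s, whereas without the lock the same 0.1 s sleeps would overlap and finish in roughly 0.1 s. (This listing sleeps for 0.1 s while the unlocked listing in 4.1 sleeps for 0.5 s, so the two printed durations are not directly comparable.)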
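A Lock can also be used with the with statement, which acquires the lock on entry and releases it on exit even if the body raises an exception. The sketch below is a variation on the example above, not a replacement for it; the shorter sleep is chosen only to keep the run quick.

```python
import threading
import time

bank = {"karl": 0}
bank_lock = threading.Lock()

def deposit(money):
    # "with" acquires the lock on entry and releases it on exit,
    # even if the body raises an exception.
    with bank_lock:
        amount = bank["karl"]
        time.sleep(0.01)  # simulate slow work while holding the lock
        bank["karl"] = amount + money

threads = [threading.Thread(target=deposit, args=(1,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(bank["karl"])  # 5
```

With explicit acquire() and release() calls, an exception between the two would leave the lock held forever and block every other thread; the with form avoids that without needing try/finally.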