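Multithreading Basics
Parallelism and Concurrency
Parallelism: multiple tasks are processed at the same instant, which requires a multi-core environment
Concurrency: multiple tasks are processed within the same period of time, which is possible even on a single core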
Ways to Achieve Concurrency
Threads: scheduled in kernel space
Processes: scheduled in kernel space
Coroutines: scheduled in user space
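Threads allow a program to run multiple operations concurrently within a single process space. This article mainly covers threading, the multithreading module in the Python standard library.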
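As a rough illustration of concurrency, the sketch below starts five threads that each sleep briefly and measures the total elapsed time. The names nap and run_concurrently and the 0.2-second delay are assumptions made for this example; join(), used here to wait for the threads, is covered later in this article.

```python
import threading
import time


def nap(delay):
    # Stand-in for a blocking operation such as I/O
    time.sleep(delay)


def run_concurrently(count=5, delay=0.2):
    """Start `count` sleeping threads and return the total elapsed seconds."""
    threads = [threading.Thread(target=nap, args=(delay,)) for _ in range(count)]
    started = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish
    return time.perf_counter() - started


if __name__ == '__main__':
    elapsed = run_concurrently()
    # The sleeps overlap, so the total is typically much less than 5 * 0.2 seconds
    print('elapsed: {:.2f}s'.format(elapsed))
```

Because the threads wait at the same time, the total is normally close to a single delay rather than the sum of all five.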
The threading Module
Creating Threads
Create an object from the Thread class in the threading module, then call its start method to launch the thread.
import threading
import time

def worker(num):
    time.sleep(1)
    print('worker-{}'.format(num))

# Create the thread objects. target is a function that holds the logic the thread will run.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
# start() launches a thread. When the thread's logic finishes, the thread exits on its own.
# Python provides no method for forcibly terminating a thread.
# Output (varies between runs)
worker-0worker-1worker-2worker-3
worker-4
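The strings and newlines printed by the five threads come out interleaved unpredictably. The threads are competing for a shared resource, in this case standard output.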
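One possible way to keep each printed line intact is to serialize access to standard output with a lock. The sketch below is not part of the original example: threading.Lock is a standard-library class, while print_lock, run_workers, the shorter sleep, and the join() calls are additions made for illustration.

```python
import threading
import time

# Hypothetical name: a lock shared by all workers to guard standard output
print_lock = threading.Lock()


def worker(num):
    time.sleep(0.1)
    # Only one thread at a time can hold the lock, so the text and its
    # trailing newline are written together
    with print_lock:
        print('worker-{}'.format(num))


def run_workers(count=5):
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for all workers before returning


if __name__ == '__main__':
    run_workers()
```

The order of the lines can still differ between runs; the lock only guarantees that each line is written as a whole.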
Passing Arguments to a Thread
import threading
import time

def worker(*args, **kwargs):
    time.sleep(1)
    print(args)
    print(kwargs)

# start() returns None, so keep a reference to the Thread object before starting it
t = threading.Thread(target=worker, args=(1, 2, 3), kwargs={'a': 'b'})
t.start()
# Output
(1, 2, 3)
{'a': 'b'}
args passes positional arguments; kwargs passes keyword arguments.
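The same mechanism works for a target with ordinary named parameters. In the sketch below, the function greet and the list results are hypothetical names introduced only to show how args and kwargs map onto a signature.

```python
import threading

results = []  # hypothetical list recording what the worker received


def greet(greeting, name, punctuation='.'):
    results.append('{}, {}{}'.format(greeting, name, punctuation))


# Equivalent to calling greet('Hello', 'thread', punctuation='!') in a new thread
t = threading.Thread(target=greet, args=('Hello', 'thread'),
                     kwargs={'punctuation': '!'})
t.start()
t.join()  # wait for the worker so that results has been filled in
print(results)  # ['Hello, thread!']
```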
Common Thread Parameters and Methods
>>> help(threading.Thread)
The help output lists the parameters of the Thread class's initializer:
| __init__(self, group=None, target=None, name=None, args=(), kwargs=None, *, daemon=None)
| This constructor should always be called with keyword arguments. Arguments are:
|
| *group* should be None; reserved for future extension when a ThreadGroup
| class is implemented.
|
| *target* is the callable object to be invoked by the run()
| method. Defaults to None, meaning nothing is called.
|
| *name* is the thread name. By default, a unique name is constructed of
| the form "Thread-N" where N is a small decimal number.
|
| *args* is the argument tuple for the target invocation. Defaults to ().
|
| *kwargs* is a dictionary of keyword arguments for the target
| invocation. Defaults to {}.
name
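The name of the thread. By default the name has the form Thread-N, where N is a small decimal number. Pass the name argument to control the thread's name.
The following example uses the logging module to show details such as the thread name.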
import threading
import time
import logging

logging.basicConfig(level=logging.DEBUG, format='%(asctime)s %(levelname)s [%(threadName)s] %(message)s')

def worker(num):
    time.sleep(1)
    logging.info('worker-{}'.format(num))

threads = [threading.Thread(target=worker, args=(i,), name='workerthread-{}'.format(i)) for i in range(5)]
for t in threads:
    t.start()
# Output
2017-03-20 21:39:29,339 INFO [workerthread-0] worker-0
2017-03-20 21:39:29,340 INFO [workerthread-1] worker-1
2017-03-20 21:39:29,340 INFO [workerthread-2] worker-2
2017-03-20 21:39:29,340 INFO [workerthread-3] worker-3
2017-03-20 21:39:29,346 INFO [workerthread-4] worker-4
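The %(threadName)s placeholder in the format argument of logging.basicConfig outputs the name of the current thread.
Thread names may be duplicated, so a name is not a unique identifier for a thread. Duplicate names are still best avoided; the usual approach is to add a prefix.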
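The sketch below illustrates that two threads can share a name while remaining distinguishable. It relies on threading.current_thread() and the ident attribute from the standard library; the names seen, release, and report are hypothetical. Both threads are kept alive at the same time because an identifier may be reused once a thread has exited.

```python
import threading

seen = []                    # hypothetical list of (name, ident) pairs
release = threading.Event()  # keeps both workers alive until the main thread is ready


def report():
    current = threading.current_thread()
    seen.append((current.name, current.ident))
    release.wait()  # stay alive so the two identifiers cannot be recycled


# Two threads deliberately given the same name
threads = [threading.Thread(target=report, name='worker') for _ in range(2)]
for t in threads:
    t.start()
release.set()
for t in threads:
    t.join()

for name, ident in seen:
    print(name, ident)  # same name on both lines, different identifiers
```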
daemon
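A daemon thread is a background thread. Its counterpart is the non-daemon thread, and the daemon parameter of the Thread initializer specifies which kind a thread is.
Daemon thread: does not keep the program alive. It is terminated when the program exits, that is, once the main thread and all other non-daemon threads have finished.
Non-daemon thread: does not end when the main thread finishes. The program waits for every non-daemon thread to complete before it exits.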
import logging
import time
import threading

logging.basicConfig(level=logging.DEBUG, format='%(asctime)s %(levelname)s [%(threadName)s] %(message)s')

def worker():
    logging.info('starting')
    time.sleep(2)
    logging.info('stopping')

if __name__ == '__main__':
    logging.info('starting')
    t1 = threading.Thread(target=worker, name='worker1', daemon=False)
    t1.start()
    time.sleep(1)
    t2 = threading.Thread(target=worker, name='worker2', daemon=True)
    t2.start()
    logging.info('stopping')
# Output
2017-03-20 23:28:06,404 INFO [MainThread] starting
2017-03-20 23:28:06,436 INFO [worker1] starting
2017-03-20 23:28:07,492 INFO [worker2] starting
2017-03-20 23:28:07,492 INFO [MainThread] stopping # the main thread has finished
2017-03-20 23:28:08,439 INFO [worker1] stopping # the program waits for the non-daemon thread but not for the daemon thread; worker2 is discarded at exit, so its 'stopping' line never appears
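The daemon flag can also be read back from a Thread object's daemon attribute. The short sketch below is an illustration rather than part of the original example (noop and the variable names are hypothetical); it also shows that a thread created without the daemon argument inherits the flag from the thread that creates it.

```python
import threading


def noop():
    pass


explicit_daemon = threading.Thread(target=noop, daemon=True)
explicit_non_daemon = threading.Thread(target=noop, daemon=False)
# With daemon=None (the default), the flag is inherited from the creating thread
inherited = threading.Thread(target=noop)

print(explicit_daemon.daemon)      # True
print(explicit_non_daemon.daemon)  # False
print(inherited.daemon == threading.current_thread().daemon)  # True
```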
Thread.join()
To make the main thread wait for a daemon thread to finish before exiting, call the thread object's join() method.
import logging
import time
import threading

logging.basicConfig(level=logging.DEBUG, format='%(asctime)s %(levelname)s [%(threadName)s] %(message)s')

def worker():
    logging.info('starting')
    time.sleep(2)
    logging.info('stopping')

if __name__ == '__main__':
    logging.info('starting')
    t1 = threading.Thread(target=worker, name='worker1', daemon=False)
    t1.start()
    time.sleep(1)
    t2 = threading.Thread(target=worker, name='worker2', daemon=True)
    t2.start()
    logging.info('stopping')
    # Block the main thread until each worker has finished
    t1.join()
    t2.join()
# Output
2017-03-20 23:41:07,217 INFO [MainThread] starting
2017-03-20 23:41:07,243 INFO [worker1] starting
2017-03-20 23:41:08,245 INFO [worker2] starting
2017-03-20 23:41:08,246 INFO [MainThread] stopping
2017-03-20 23:41:09,243 INFO [worker1] stopping
2017-03-20 23:41:10,248 INFO [worker2] stopping
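Once join() is called, the main thread waits for the daemon thread to finish before exiting.
Signature of join: join(self, timeout=None)
join blocks until the thread exits or the timeout expires. timeout is optional and is given in seconds. Without a timeout, join waits indefinitely for the thread to exit. With a timeout, join returns as soon as the thread finishes or the timeout elapses, whichever comes first.
Because join always returns None, its return value cannot tell you whether the call timed out. To find out whether the thread is still running after join returns, call is_alive().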
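A minimal sketch of the timeout behaviour described above. The Event named finish and the 0.1-second timeout are assumptions chosen to make the outcome predictable; they are not part of the original example.

```python
import threading

finish = threading.Event()  # hypothetical signal that lets the worker end


def worker():
    finish.wait()  # block until the main thread sets the event


t = threading.Thread(target=worker, daemon=True)
t.start()

result = t.join(timeout=0.1)          # gives up after about 0.1s; the worker is still blocked
alive_after_timeout = t.is_alive()
print(result, alive_after_timeout)    # None True

finish.set()                          # let the worker finish
t.join()                              # no timeout: wait until the thread exits
print(t.is_alive())                   # False
```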
Common threading Functions
enumerate