Reflection: while you are enjoying a full day, take a quiet moment to ask yourself how much you actually gained today, or whether you were merely busy for the sake of being busy.
Why do multithreading and multiprocessing exist? Normally a program executes along a single line of control; when one step blocks, everything after it must wait for the blocking step to finish, which greatly reduces efficiency. So today we study multithreading and multiprocessing.
CPU cores: think of them as the electricity a factory can supply.
Process: think of it as a workshop that completes one task; processes keep their resources independent of each other. The biggest advantage of the multiprocess model is stability: if a child process crashes, it does not affect the main process or the other child processes.
Thread: think of it as a worker inside the workshop; threads share resources and divide up one job. The multithreaded model is usually a bit faster than the multiprocess one, but not by much, and its fatal weakness is that any one thread crashing can bring down the whole process, because all threads share the process's memory.
I. Multithreading
An ordinary single thread of execution takes 4 seconds:
```python
import threading
import time

def seq(string):
    print(string,"star :%.f s"%(time.time()))
    time.sleep(2)
    print(string,"end :%.f s"%(time.time()))

t1=time.time()
seq(1)
seq(2)
print("all finished :%.f s"%(time.time()-t1))
```
Result:
```
PS C:\Users\TianJian\Desktop\python> & C:/Users/TianJian/AppData/Local/Microsoft/WindowsApps/python.exe c:/Users/TianJian/Desktop/python/多线程,多进程/多线程.py
1 star :1574427731 s
1 end :1574427733 s
2 star :1574427733 s
2 end :1574427735 s
all finished :4 s
```
1.start
After adding threads, music and movie together take only 2 seconds, while the main thread takes 0 seconds. We have effectively opened three lines of execution, the main thread plus two worker threads, running at the same time. Nothing blocks the main thread, so it prints the elapsed time immediately, which is why it reports 0 seconds. This is what is meant by concurrent execution.
```python
import threading
import time

def seq(string,s):
    print(string,"star :%.f s"%(time.time()))
    time.sleep(s)
    print(string,"end :%.f s"%(time.time()))

th1=threading.Thread(target=seq,args=("music",2,))
th2=threading.Thread(target=seq,args=("movie",2,))
print("start :%.f s"%(time.time()))
t1=time.time()
th1.start()
th2.start()
print("all finished :%.f s"%(time.time()-t1))
```
Result:
```
start :1574428266 s
music star :1574428266 s
movie star :1574428266 s
all finished :0 s
music end :1574428268 s
movie end :1574428268 s
```
2.join
How do we make the main thread finish last, wrapping up only after the other threads are done?
join() adds a blocking statement for thread th1: the code after th1.join() cannot run until th1 has finished, which is why the main thread's final print now comes after th1 but still before th2. When the main thread must wait for everything, we can join every thread we started.
```python
import threading
import time

def seq(string,s):
    print(string,"star :%.f s"%(time.time()))
    time.sleep(s)
    print(string,"end :%.f s"%(time.time()))

th1=threading.Thread(target=seq,args=("music",2,))
th2=threading.Thread(target=seq,args=("movie",3,))
print("start :%.f s"%(time.time()))
t1=time.time()
th1.start()
th2.start()
th1.join()
print("all finished :%.f s"%(time.time()-t1))
```
Result:
```
start :1574428631 s
music star :1574428631 s
movie star :1574428631 s
music end :1574428633 s
all finished :2 s
movie end :1574428634 s
```
3.setDaemon
Marking a thread as a daemon roughly means it stands guard beside the main thread: when the main thread ends, its daemon threads are killed along with it. So if worker threads are in the middle of reading or writing a file when the main thread exits, the file can be left incomplete; such threads must not be daemons (leave the flag False, which is the default). The daemon flag must be set before start(). Note that setDaemon(True) is deprecated in current Python; assign th.daemon = True instead.
```python
import threading
import time

def seq(string,s):
    print(string,"star :%.f s"%(time.time()))
    time.sleep(s)
    print(string,"end :%.f s"%(time.time()))

th1=threading.Thread(target=seq,args=("music",2,))
th2=threading.Thread(target=seq,args=("movie",3,))
print("start :%.f s"%(time.time()))
t1=time.time()
th1.daemon=True  # older code writes th1.setDaemon(True); must come before start()
th2.daemon=True
th1.start()
th2.start()
time.sleep(1)
print("all finished :%.f s"%(time.time()-t1))
```
Result:
```
start :1574429027 s
music star :1574429027 s
movie star :1574429027 s
all finished :1 s
```
4. Locks
Why do we need a thread lock?
When several threads modify the same piece of data at the same time, they race one another, the result misses our expectations and comes out wrong. Hence the idea of a lock: only the thread currently holding the lock is entitled to modify the data.
How to use a thread lock:
1 lock = threading.Lock()  instantiate a lock object
2 lock.acquire()  take the lock before touching the shared variable
3 lock.release()  release the lock after touching it
Two threads modifying the same value: with enough loop iterations, another thread changing the value in the middle of a read-modify-write easily corrupts the result.
```python
import threading
import time

def seq(x):
    global num
    num+=x
    time.sleep(0.1)
    num-=x
    print("num :",num)

def fun(n):
    for i in range(10000):
        seq(n)

num=0
th1=threading.Thread(target=fun,args=(3,))
th2=threading.Thread(target=fun,args=(8,))
th1.start()
th2.start()
th1.join()
th2.join()
print("finished",num)
```
With the lock in place, the problem no longer occurs:
```python
import threading
import time

lock=threading.Lock() # instantiate a lock

def seq(x):
    global num
    lock.acquire() # take the lock
    num+=x
    time.sleep(0.1)
    num-=x
    lock.release() # release the lock

num=0
i_obj=[]
for i in range(100):
    t=threading.Thread(target=seq,args=(i,)) # create the thread
    t.start() # start the thread
    i_obj.append(t)
for t in i_obj: # join them all
    t.join()
print("finished",num)
```
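acquire() and release() can also be written as a context manager: `with lock:` takes the lock on entry and releases it on exit, even if the body raises, so the release cannot be forgotten. A sketch of the same counter demo in that style:

```python
import threading

lock = threading.Lock()
num = 0

def seq(x):
    global num
    with lock:   # acquire on entry, release on exit, exception-safe
        num += x
        num -= x

threads = [threading.Thread(target=seq, args=(i,)) for i in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("finished", num)  # 0
```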
5. Semaphores
A mutex allows only one thread at a time to change the data, whereas a Semaphore allows a fixed number of threads to change it at once.
```python
import threading
import time

def run(n):
    semaphore.acquire() # take the semaphore; it can hold several "locks" at once
    time.sleep(1) # wait one second
    print("run the thread: %s\n" % n)
    semaphore.release() # release the semaphore

t_objs = []
if __name__ == '__main__':
    semaphore = threading.BoundedSemaphore(5) # allow at most 5 threads to run at a time
    for i in range(20): # launch 20 threads
        t = threading.Thread(target=run, args=(i,)) # create the thread
        t.start() # start the thread
        t_objs.append(t)
    for t in t_objs: # join them all
        t.join()
    print('>>>>>>>>>>>>>')
```
6. Thread pool usage
Instantiate a thread pool, then dispatch the work with its map method.
```python
import time
from multiprocessing.dummy import Pool

def seq(string):
    print(string,"star :%.f s"%(time.time()))
    time.sleep(2)
    print(string,"end :%.f s"%(time.time()))

t1=time.time()
lis=["A","B","C","D"]
pool=Pool(4)
pool.map(seq,lis)
print("all finished :%.f s"%(time.time()-t1))
```
Result:
```
A star :1574563019 s
B star :1574563019 s
C star :1574563019 s
D star :1574563019 s
A end :1574563021 s
D end :1574563021 s
B end :1574563021 s
C end :1574563021 s
all finished :2 s
```
7. Running threads and collecting their return values
```python
import threading

# check whether a value is even
def is_even(value):
    if value % 2 == 0:
        return True
    else:
        return False

class MyThread(threading.Thread):
    def __init__(self, func, args=()):
        super(MyThread, self).__init__()
        self.func = func
        self.args = args

    def run(self):
        # run the function and keep its result on the thread object,
        # to be fetched later through get_result
        self.result = self.func(*self.args)

    def get_result(self):
        try:
            return self.result
        except AttributeError:  # run() has not finished yet
            return None

result = []
threads = []
for i in range(10):
    t = MyThread(is_even, args=(i,))
    t.start()
    threads.append(t)
for t in threads:
    t.join() # always join first: wait for the worker thread before reading its result
    result.append(t.get_result())
```
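The standard library offers the same result collection with less machinery: `concurrent.futures.ThreadPoolExecutor`'s map runs the function in worker threads and hands the return values back in input order. A sketch with the same is_even check:

```python
from concurrent.futures import ThreadPoolExecutor

def is_even(value):
    return value % 2 == 0

# map() dispatches each input to a worker thread and yields
# the return values in the same order as the inputs.
with ThreadPoolExecutor(max_workers=4) as pool:
    result = list(pool.map(is_even, range(10)))
print(result)  # [True, False, True, False, True, False, True, False, True, False]
```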
II. Multiprocessing
Processes must be created under `if __name__ == '__main__':`. A Python file is used in two ways: executed directly as a script, or imported into another script for reuse as a module. When the file is imported, only the code outside that guard runs; on Windows in particular, child processes re-import the main module, so unguarded process creation would spawn new processes endlessly.
When multiprocessing is used, the target function is serialized (pickled) before being shipped to the child process, so the function, along with any decorators applied to it, must be picklable.
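For instance, a lambda cannot be pickled because it has no importable name, while a plain module-level function can; a small sketch of the difference (`top_level` is a made-up name):

```python
import pickle

def top_level(x):
    return x + 1

# A module-level function pickles by reference to its qualified name,
# so it can be shipped to a child process.
data = pickle.dumps(top_level)

# A lambda has no importable name, so pickling it raises an error.
try:
    pickle.dumps(lambda x: x + 1)
    print("pickled a lambda?")
except Exception as e:
    print("cannot pickle a lambda:", type(e).__name__)
```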
1. Creating processes
```python
from multiprocessing import Process
import time
import os

def f(name):
    time.sleep(2)
    print('hello', name)
    print("PID",os.getpid())

if __name__ == '__main__':
    ps = []
    for n in range(10): # create a process
        p = Process(target=f, args=('bob %s' % (n),))
        # start it
        p.start()
        ps.append(p)
    # wait for the processes to finish
    for p in ps:
        p.join()
```
Result:
```
hello bob 0
PID 69500
hello bob 1
PID 69548
hello bob 2
PID 69612
hello bob 3
PID 78400
hello bob 4
PID 78448
hello bob 5
PID 69744
hello bob 6
PID 69808
hello bob 7
PID 78480
hello bob 8
PID 69980
hello bob 9
PID 70000
```
2. Process pools
2.1 To launch a large number of child processes, create them in batches with a process pool:
```python
from multiprocessing import Pool
import os, time, random

def long_time_task(name):
    print('Run task %s (%s)...' % (name, os.getpid()))
    start = time.time()
    time.sleep(random.random() * 3)
    end = time.time()
    print('Task %s runs %0.2f seconds.' % (name, (end - start)))

if __name__=='__main__':
    print('Parent process %s.' % os.getpid())
    p = Pool(4) # instantiate a pool of 4 worker processes, sized to what the job needs
    for i in range(5):
        p.apply_async(long_time_task, args=(i,)) # submit the task; at most 4 run at once
    print('Waiting for all subprocesses done...')
    p.close()
    p.join()
    print('All subprocesses done.')
```
Result:
```
Waiting for all subprocesses done...
Run task 0 (68132)...
Run task 1 (68152)...
Run task 2 (68200)...
Run task 3 (68264)...
Task 0 runs 0.75 seconds.
Run task 4 (68132)...
Task 4 runs 0.07 seconds.
Task 1 runs 1.69 seconds.
Task 2 runs 1.96 seconds.
Task 3 runs 2.18 seconds.
All subprocesses done.
```
2.2 Using map to run work across processes
```python
import time
from multiprocessing import Pool

def seq(string):
    print(string, "star :%.f s" % (time.time()))
    time.sleep(2)
    print(string, "end :%.f s" % (time.time()))
    return string,string

if __name__ == '__main__':
    t1 = time.time()
    lis = ["A", "B", "C", "D"]
    pool = Pool(4)
    ret = pool.map(seq, lis)
    print("all finished :%.f s" % (time.time() - t1))
    print(ret)
```
Result:
```
A star :1598882546 s
B star :1598882546 s
C star :1598882546 s
D star :1598882546 s
A end :1598882548 s
D end :1598882548 s
B end :1598882548 s
C end :1598882548 s
all finished :2 s
[('A', 'A'), ('B', 'B'), ('C', 'C'), ('D', 'D')]
```
If pool.map needs to pass in more than one argument:
```python
from functools import partial
import time
from multiprocessing import Pool

def seq(string,name,age):
    return string+name+str(age)

if __name__ == '__main__':
    t1 = time.time()
    lis = ["A", "B", "C", "D"]
    pool = Pool(4)
    # use a partial to bind the extra parameters
    fun = partial(seq,name="tian",age=18)
    ret = pool.map(fun, lis)
    print("all finished :%.f s" % (time.time() - t1))
    print(ret)
```
3. Getting results from processes
3.1 Inter-process communication
```python
from multiprocessing import Process, Queue
import os, time, random

# code run by the writer process:
def write(q):
    print('Process to write: %s' % os.getpid())
    for value in ['A', 'B', 'C']:
        print('Put %s to queue...' % value)
        q.put(value) # send the data through q
        time.sleep(random.random())

# code run by the reader process:
def read(q):
    print('Process to read: %s' % os.getpid())
    while True:
        value = q.get(True) # receive data from q and bind it to value
        print('Get %s from queue.' % value)

if __name__=='__main__':
    # the parent creates the Queue and hands it to each child:
    q = Queue() # the q object is the channel the processes share
    pw = Process(target=write, args=(q,))
    pr = Process(target=read, args=(q,))
    # start the writer pw:
    pw.start()
    # start the reader pr:
    pr.start()
    # wait for pw to finish:
    pw.join()
    # pr loops forever and cannot be joined; terminate it instead:
    pr.terminate()
```
Result:
```
Process to write: 44176
Put A to queue...
Put B to queue...
Process to read: 44632
Get A from queue.
Get B from queue.
Put C to queue...
Get C from queue.
```
3.2 Getting return values from multiple processes
https://blog.csdn.net/springlustre/article/details/88703947
Using multiprocessing in Python and collecting the return values of the child processes.
Python's multiprocessing package is a process-management package that can be used to create processes.
Queue under multiprocessing is a process-safe queue, which we can use to pass data between processes.
The code below demonstrates this: each worker puts its result on the queue, and everything is printed together at the end.
```python
import random
import time
import multiprocessing

def worker(name, q):
    t = 0
    for i in range(10):
        print(name + " " + str(i))
        x = random.randint(1, 3)
        t += x
        time.sleep(x * 0.1)
    # put the result on the queue
    q.put(t)

if __name__ == '__main__':
    q = multiprocessing.Queue()
    jobs = []
    for i in range(10):
        p = multiprocessing.Process(target=worker, args=(str(i), q))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
    # retrieve the results with q.get()
    results = [q.get() for j in jobs]
    print(results)
```
Result:
```
......
6 8
9 8
2 7
1 9
4 9
7 9
6 9
2 8
5 9
9 9
2 9
[17, 18, 18, 18, 18, 19, 20, 20, 20, 22]
```
4. concurrent.futures for multiprocessing
```python
from concurrent.futures import ProcessPoolExecutor
import time, os
from functools import partial

def piao(n, name):
    print('%s is piaoing %s' % (name, os.getpid()))
    time.sleep(1)
    return n ** 2

def muiti_fun():
    p = ProcessPoolExecutor(5)
    objs = []
    start = time.time()
    for i in range(5):
        obj = p.submit(piao, i, 'safly %s' % i) # asynchronous submission
        objs.append(obj)
    p.shutdown(wait=True)
    print('主', os.getpid())
    for obj in objs:
        print(obj.result())
    stop = time.time()
    print(stop - start)

def muiti_func_map():
    p = ProcessPoolExecutor(5)
    start = time.time()
    f = partial(piao, name="tian")
    res = p.map(f, [1, 2, 3, 4, 5])
    print(list(res))
    p.shutdown(wait=True)
    stop = time.time()
    print(stop - start)

if __name__ == '__main__':
    muiti_fun()
    muiti_func_map()
```
Result:
```
safly 0 is piaoing 35608
safly 1 is piaoing 35609
safly 2 is piaoing 35610
safly 3 is piaoing 35611
safly 4 is piaoing 35612
主 35607
0
1
4
9
16
1.0186498165130615
tian is piaoing 35613
tian is piaoing 35614
tian is piaoing 35615
tian is piaoing 35616
tian is piaoing 35617
[1, 4, 9, 16, 25]
1.0142982006072998
```