Python Network and Concurrent Programming 2 (Semaphores, Queues, Processes)

EdVzAs

Tags: queue, python, concurrent programming, multithreading, multiprocessing

Article link: https://blog.csdn.net/weixin_46131409/article/details/104360786

I. Event (synchronization condition object): synchronizing threads

An event is a simple synchronization object; the event represents an internal flag, and threads can wait for the flag to be set, or set or clear the flag themselves.

#Create an Event object:
event = threading.Event()   

#The Event object manages an internal flag, initially False
#Block until the flag is True:
event.wait()
#Set the flag to True:
event.set()
#Set the flag to False:
event.clear()

If the flag is set, the wait method doesn’t do anything.
If the flag is cleared, wait will block until it becomes set again.
Any number of threads may wait for the same event.
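
As a small sketch of these flag semantics (the 0.1 s timeout is an arbitrary value chosen for illustration): wait() also accepts an optional timeout and returns the state of the flag, so a caller can tell a timeout apart from a real signal.

```python
import threading

event = threading.Event()

# The flag starts out False, so a wait with a timeout gives up and returns False.
timed_out_result = event.wait(timeout=0.1)

# Once the flag is set, wait() returns True immediately without blocking.
event.set()
set_result = event.wait()

# clear() resets the flag; is_set() reports it without blocking.
event.clear()
cleared_state = event.is_set()

print(timed_out_result, set_result, cleared_state)
```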

import threading,time

class Boss(threading.Thread):
    def run(self):
        print("BOSS: Everyone has to work overtime until 22:00 tonight.")
        print(event.is_set())
        event.set()
        time.sleep(5)
        print("BOSS: <22:00> You can all go home now.")
        print(event.is_set())
        event.set()
        
class Worker(threading.Thread):
    def run(self):
        event.wait()
        print("Worker: Sigh... what a hard life!")
        time.sleep(1)
        event.clear()
        event.wait()
        print("Worker: Oh yeah!")
        
if __name__=="__main__":
    event=threading.Event()
    threads=[]
    for i in range(5):
        threads.append(Worker())
    threads.append(Boss())
    for t in threads:
        t.start()
    for t in threads:
        t.join()
        
#Result:
#BOSS: Everyone has to work overtime until 22:00 tonight.
#False
#Worker: Sigh... what a hard life!
#Worker: Sigh... what a hard life!
#Worker: Sigh... what a hard life!
#Worker: Sigh... what a hard life!
#Worker: Sigh... what a hard life!
#BOSS: <22:00> You can all go home now.
#False
#Worker: Oh yeah!
#Worker: Oh yeah!
#Worker: Oh yeah!
#Worker: Oh yeah!
#Worker: Oh yeah!

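II. Semaphore: limits how many threads run concurrently (a kind of lock)
BoundedSemaphore and Semaphore each manage an internal counter: every acquire() call decrements it by 1 and every release() call increments it by 1

semaphore=threading.Semaphore(value=1)   #or threading.BoundedSemaphore(value=1)
#Creates a semaphore whose counter starts at value (at most value threads may hold it at the same time)
semaphore.acquire()   #take a slot (counter - 1)
semaphore.release()   #give the slot back (counter + 1)

The counter can never drop below 0; when it is 0, acquire() blocks the calling thread until another thread calls release() (much like waiting for a free parking space)
The only difference between BoundedSemaphore and Semaphore is that BoundedSemaphore checks on every release() whether the counter would exceed its initial value, and raises a ValueError if it would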
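
The difference between the two classes can be seen in a short single-threaded sketch (the variable names are illustrative only). An extra release() on a plain Semaphore silently enlarges the counter, while the same call on a BoundedSemaphore raises ValueError:

```python
import threading

plain = threading.Semaphore(1)
bounded = threading.BoundedSemaphore(1)

# A plain Semaphore lets the counter grow past its initial value.
plain.release()                          # counter is now 2
first = plain.acquire(blocking=False)    # succeeds, counter 1
second = plain.acquire(blocking=False)   # succeeds, counter 0
third = plain.acquire(blocking=False)    # counter is 0, so this fails

# A BoundedSemaphore refuses a release() that has no matching acquire().
try:
    bounded.release()
    over_release_error = None
except ValueError as exc:
    over_release_error = exc

print(first, second, third, type(over_release_error).__name__)
```

Using BoundedSemaphore is a cheap way to catch a stray release() early, since with a plain Semaphore the same mistake just quietly raises the concurrency limit.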

import threading,time

class myThread(threading.Thread):
    def run(self):
        if semaphore.acquire():
            print(self.name)
            time.sleep(5)
            semaphore.release()
            
if __name__=="__main__":
    semaphore=threading.Semaphore(5)
    thrs=[]
    for i in range(100):
        thrs.append(myThread())
    for t in thrs:
        t.start()
        
#Result:
#Thread-1
#Thread-2
#...
#Thread-99
#Thread-100   #5 names appear every 5 s

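III. Queues (queue): a data structure that is a key tool for multithreading; in single-threaded code a list is enough
Reference: blog.csdn.net/yeyazhishang/article/details/82353846

1. A list is not safe to share between threads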

import threading,time

li=[1,2,3,4,5]

def pri():
    while li:
        a=li[-1]
        print(a)
        time.sleep(1)
        try:   #without this exception handling, both threads try to remove the same item (e.g. 5) and the second remove() raises an error
            li.remove(a)
        except Exception as e:
            print('----',a,e)

t1=threading.Thread(target=pri,args=())
t1.start()
t2=threading.Thread(target=pri,args=())
t2.start()

#Result:
#5
#5
#4
#---- 5 list.remove(x): x not in list
#4
#3
#---- 4 list.remove(x): x not in list
#3
#2
#---- 3 list.remove(x): x not in list
#2
#1
#---- 2 list.remove(x): x not in list
#1
#---- 1 list.remove(x): x not in list
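
One way to repair the example above is sketched here with a threading.Lock (the names `take_last` and `consumed` are made up for this illustration). The trouble is that reading the last element and removing it are two separate steps; guarding both with one lock turns them into a single atomic step, so each element is taken exactly once.

```python
import threading

li = [1, 2, 3, 4, 5]
lock = threading.Lock()
consumed = []   # hypothetical record of (thread name, item) pairs

def take_last():
    while True:
        # The emptiness check and the removal both happen under the lock,
        # so two threads can never pick the same element.
        with lock:
            if not li:
                return
            item = li.pop()
            consumed.append((threading.current_thread().name, item))

threads = [threading.Thread(target=take_last) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(item for _, item in consumed))
```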

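2. queue.Queue: can be shared by different threads within the same process

queue is especially useful in threaded programming when information must be exchanged safely between multiple threads.

Methods of the queue classes:

import queue

#Create a queue object
q = queue.Queue(maxsize=10)
#queue.Queue is a synchronized queue implementation; this one is FIFO
#A queue can be bounded or unbounded
#The optional constructor argument maxsize sets the capacity
#If maxsize is less than 1, the queue is unbounded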
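#Put a value into the queue
q.put(item,block=True,timeout=None)
#put() inserts an item at the tail of the queue
#item (required): the value to insert
#block (optional): defaults to True
#If the queue is full and block is True, put() suspends the calling thread until a slot becomes free
#If the queue is full and block is False, put() raises queue.Full
#timeout (optional): defaults to None
#If timeout is a positive number, put() blocks for at most timeout seconds and raises queue.Full if no slot becomes free in that time (a blocking call with a timeout)

#Take a value out of the queue
q.get(block=True,timeout=None)
#get() removes and returns the item at the head of the queue (the one that entered first)
#block: defaults to True
#If the queue is empty and block is True, get() suspends the calling thread until an item is available
#If the queue is empty and block is False, get() raises queue.Empty
#timeout: defaults to None
#If timeout is a positive number, get() blocks for at most timeout seconds and raises queue.Empty if no item becomes available in that time (a blocking call with a timeout)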

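#The queue module provides three queue classes and constructors:
·FIFO (the default), first in, first out: q=queue.Queue(maxsize)
·LIFO, stack-like, last in, first out: q=queue.LifoQueue(maxsize)
·Priority queue, the lowest rank comes out first: q=queue.PriorityQueue(maxsize)
#q.put([rank,value])   #rank is the priority, value is the item being inserted
#get() returns [rank,value]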
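
The three orderings can be compared side by side in a short single-threaded sketch. The helper `drain` and the sample data are assumptions made for this illustration, and tuples are used for the priority entries, which compare by rank first just as the lists above do.

```python
import queue

def drain(q):
    """Remove and return every item currently in q, in retrieval order."""
    items = []
    while not q.empty():   # reliable here because only one thread uses q
        items.append(q.get())
    return items

fifo, lifo, prio = queue.Queue(), queue.LifoQueue(), queue.PriorityQueue()
for rank, value in [(3, "c"), (1, "a"), (2, "b")]:
    fifo.put(value)
    lifo.put(value)
    prio.put((rank, value))   # entries are ordered by rank

fifo_order = drain(fifo)
lifo_order = drain(lifo)
prio_order = drain(prio)
print(fifo_order)   # ['c', 'a', 'b']
print(lifo_order)   # ['b', 'a', 'c']
print(prio_order)   # [(1, 'a'), (2, 'b'), (3, 'c')]
```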

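#Common methods:
q.qsize() #returns the (approximate) number of items in the queue
q.empty() #returns True if the queue is empty, otherwise False
q.full() #returns True if the queue is full, otherwise False
q.get_nowait() #equivalent to q.get(block=False)
q.put_nowait(item) #equivalent to q.put(item, block=False)
q.task_done() #tells the queue that one item previously fetched with get() has been fully processed
q.join() #blocks until every item that was put into the queue has been fetched and marked with task_done()
#join() and task_done() are used as a pair
#Each put() adds one unfinished task and each task_done() removes one; join() blocks the caller until the count of unfinished tasks reaches 0, and returns immediately if it is already 0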
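
The pairing of task_done() and join() is easier to see in a running example. The following is a minimal sketch, assuming three worker threads and using None as a stop signal (both choices, and the names `worker` and `results`, are illustrative):

```python
import queue
import threading

q = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        item = q.get()
        if item is None:        # None is used here as a stop signal
            q.task_done()
            return
        with results_lock:
            results.append(item * item)
        q.task_done()           # exactly one task_done() per get()

workers = [threading.Thread(target=worker) for _ in range(3)]
for w in workers:
    w.start()

for n in range(10):
    q.put(n)
q.join()                        # returns once all 10 items are marked done

for _ in workers:
    q.put(None)                 # one stop signal per worker
for w in workers:
    w.join()

print(sorted(results))
```

Note that q.join() waits for the work to be finished, whereas w.join() waits for the worker threads themselves to exit; the two are unrelated calls that happen to share a name.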

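3. The producer-consumer pattern
(1) Why use it: in threaded code, a producer is a thread that generates data and a consumer is a thread that processes it. If one side works faster than the other, it has to wait for the slower side. The producer-consumer pattern was introduced to deal with this.

(2) What it is: a container placed between the two sides removes the tight coupling between producers and consumers. They never communicate directly; they communicate through a blocking queue. A producer hands its finished data to the queue without waiting for a consumer to process it, and a consumer takes data from the queue instead of asking a producer for it. The blocking queue acts as a buffer that balances the processing speed of the two sides.

It is like a restaurant: the chef does not hand dishes to the customers directly but passes them to the front counter,
and customers pick up their food at the counter instead of going to the chef.
This is also a form of decoupling.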

import time,random,queue,threading

q = queue.Queue()

def Producer(name):
    count = 0
    while count <10:
        print("making........")
        time.sleep(random.randrange(3))   #sleep 0, 1 or 2 s
        #q.join()
        #q.task_done()
        q.put(count)
        print('Producer %s has produced %s baozi..' %(name, count))
        count +=1
        print("ok......")
        
def Consumer(name):
    count = 0
    while count <10:
        time.sleep(random.randrange(4))
        if not q.empty():
            #q.join()
            #q.task_done()
            data = q.get()
            print(data)
            print('\033[32;1mConsumer %s has eaten %s baozi...\033[0m' %(name, data))
        else:
            print("-----no baozi anymore----")
        count +=1

p1 = threading.Thread(target=Producer, args=('A',))
c1 = threading.Thread(target=Consumer, args=('B',))
#c2 = threading.Thread(target=Consumer, args=('C',))
#c3 = threading.Thread(target=Consumer, args=('D',))
p1.start()
c1.start()
# c2.start()
# c3.start()
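
The consumer above polls q.empty() and stops after ten attempts, so items can be left in the queue when it exits. A common alternative is sketched below, under the assumption that None can serve as an end-of-stream marker (the names `SENTINEL`, `producer`, `consumer` and `eaten` are illustrative). The consumer simply blocks in get() and stops only when the producer signals that it is done:

```python
import queue
import threading

SENTINEL = None              # assumed end-of-stream marker, never a real item
q = queue.Queue(maxsize=3)   # small buffer: the producer blocks when it is full
eaten = []

def producer(count):
    for i in range(count):
        q.put(i)             # blocks while the queue is full
    q.put(SENTINEL)          # tell the consumer nothing more is coming

def consumer():
    while True:
        item = q.get()       # blocks while the queue is empty
        if item is SENTINEL:
            break
        eaten.append(item)

p = threading.Thread(target=producer, args=(10,))
c = threading.Thread(target=consumer)
p.start()
c.start()
p.join()
c.join()

print(eaten)
```

With several consumers, one sentinel per consumer would be needed so that each of them sees its own stop signal.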

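IV. The multiprocessing module

Multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.

1. Purpose: because of the GIL, threads in Python cannot execute Python code in parallel on several cores, so a program that wants to make full use of a multi-core CPU generally uses multiple processes

The multiprocessing package is Python's package for managing multiple processes, and it resembles threading.Thread
A multiprocessing.Process object creates a process that can run a function written in the Python program; a Process object is used in the same way as a Thread object and also has start(), run() and join() methods
The package also provides Lock/Event/Semaphore/Condition classes for synchronizing processes (these objects can be passed to each process as arguments, just as with threads), and they are used in the same way as the classes of the same name in threading
So a large part of multiprocessing shares the same API as threading, only applied to multiple processes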

2. Starting processes

#Method 1: passing a target function directly
from multiprocessing import Process
import time

def f(name):
    time.sleep(1)
    print('hello', name,time.ctime())

if __name__ == '__main__':
    p_list=[]
    for i in range(3):
        p = Process(target=f, args=('alvin',))
        p_list.append(p)
        p.start()   #note: processes do not share memory with each other!
    for p in p_list:
        p.join()
    print('end')

#Method 2: subclassing Process
from multiprocessing import Process
import time

class MyProcess(Process):
    def __init__(self):
        super(MyProcess, self).__init__()
        #call the parent class initializer to set up the inherited attributes
        #self.name = name
    def run(self):
        time.sleep(1)
        print ('hello', self.name,time.ctime())

if __name__ == '__main__':
    p_list=[]
    for i in range(3):
        p = MyProcess()
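        #p.daemon=True   #make this a daemon process; daemon is an attribute (threads also have the older setDaemon() method)
        #daemon children are terminated when the main process exits, whether or not they have finished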
        p.start()
        p_list.append(p)
    for p in p_list:
        p.join()
    print('end')

To show the individual process IDs involved, here is an expanded example:

from multiprocessing import Process
import os,time

def info(title):
    print("title:",title)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())

def f(name):
    info('function f')
    print('hello', name)

if __name__ == '__main__':
    info('main process line')
    time.sleep(1)
    print("------------------")
    p = Process(target=info, args=('yuan',))
    p.start()
    p.join()
    
#Result:
#title: main process line
#parent process: 17228
#process id: 19536
#------------------
#title: yuan
#parent process: 19536
#process id: 15752

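3. The Process class
(1) Constructor:

Process(group=None,target=None,name=None,args=(),kwargs={})
group: thread group; not implemented, and the library reference says it must be None
target: the callable to run
name: process name; defaults to ClassName-n
args/kwargs: the positional and keyword arguments passed to target

(2) Instance methods:

is_alive(): returns whether the process is running
join(timeout): blocks the calling process until the process whose join() was called terminates or the optional timeout elapses
start(): starts the process; it becomes ready and waits to be scheduled on a CPU
run(): the method that start() arranges to be run in the new process
#If no target was passed when the process was created, start() runs the default run() method
terminate(): stops the worker process immediately, whether or not its task has finished
#Used with process pools

(3) Attributes:

daemon: same role as a thread's setDaemon()
name: the process name
pid: the process ID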
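
The methods and attributes listed above can be watched changing over a process's lifetime. This is a minimal sketch: the helper names `nap` and `inspect_process` are made up, and `exitcode` is an additional Process attribute not covered in the list above (it is None while the process runs and 0 after a normal exit).

```python
import multiprocessing
import time

def nap(seconds):
    time.sleep(seconds)

def inspect_process():
    p = multiprocessing.Process(target=nap, args=(0.5,), name="napper")
    before = (p.is_alive(), p.pid)        # not started yet: (False, None)
    p.start()
    running = p.is_alive()                # the child is sleeping, so True
    pid_is_int = isinstance(p.pid, int)   # pid is assigned by start()
    p.join()                              # wait for the child to exit
    after = (p.is_alive(), p.exitcode)    # finished normally: (False, 0)
    return p.name, before, running, pid_is_int, after

if __name__ == "__main__":
    print(inspect_process())
```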

import time
from  multiprocessing import Process

class MyProcess(Process):
    def __init__(self,num):
        super(MyProcess,self).__init__()
        self.num=num
    def run(self):
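        #note: is_alive is not called here, so the bound method object itself is printed; write self.is_alive() to get a bool
        print(self.num,self.is_alive,self.pid)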
      
if __name__ == '__main__':
    p_list=[]
    for i in range(10):
        p = MyProcess(i)
        #p.daemon=True
        p_list.append(p)

    for p in p_list:
        p.start()
        #for p in p_list:
        #    p.join()

    print('main process end')
    
#Result:
#main process end
#1 <bound method BaseProcess.is_alive of <MyProcess(MyProcess-2, started)>> 22068
#2 <bound method BaseProcess.is_alive of <MyProcess(MyProcess-3, started)>> 20544
#7 <bound method BaseProcess.is_alive of <MyProcess(MyProcess-8, started)>> 1240
#5 <bound method BaseProcess.is_alive of <MyProcess(MyProcess-6, started)>> 17828
#3 <bound method BaseProcess.is_alive of <MyProcess(MyProcess-4, started)>> 18496
#0 <bound method BaseProcess.is_alive of <MyProcess(MyProcess-1, started)>> 2368
#8 <bound method BaseProcess.is_alive of <MyProcess(MyProcess-9, started)>> 7612
#6 <bound method BaseProcess.is_alive of <MyProcess(MyProcess-7, started)>> 24280
#4 <bound method BaseProcess.is_alive of <MyProcess(MyProcess-5, started)>> 25440
#9 <bound method BaseProcess.is_alive of <MyProcess(MyProcess-10, started)>> 13224

4. Inter-process communication: sharing data
(1) Process queue (Queue):

from multiprocessing import Process, Queue
import queue

def f(q,n):
    #q.put([123, 456, 'hello'])
    q.put(n*n+1)
    print("son process",id(q))

if __name__ == '__main__':
    q = Queue()  #q=queue.Queue() would be a thread queue, which cannot be used between processes
    print("main process",id(q))

    for i in range(3):
        p = Process(target=f, args=(q,i))   #pass the queue in, otherwise the child process cannot put data into it
        p.start()

    print(q.get())
    print(q.get())
    print(q.get())
    
#Result:
#main process 2030024234504   #on Linux the ids are all the same
#son process 2364223883208
#2
#son process 2442119452552
#son process 2731530815432
#5
#1

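The queue passed to a child process is a copy of the original queue object: the two do not point to the same block of memory (as the different ids show), but both are mapped onto the same underlying channel, which keeps their data consistent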

(2) Pipes (Pipe):

The Pipe() function returns a pair of connection objects connected by a pipe which by default is duplex (two-way).

from multiprocessing import Process, Pipe

def f(conn):
    conn.send([12, {"name":"yuan"}, 'hello'])
    response=conn.recv()
    print("response",response)
    conn.close()
    print("q_ID2:",id(conn))

if __name__ == '__main__':
    #Pipe() returns two connection objects, one for each end of a (two-way) pipe   #the holder of an end can both receive and send
    parent_conn, child_conn = Pipe()   #at this point both ends belong to the main process
    print("q_ID1:",id(child_conn))
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())   # prints "[12, {'name': 'yuan'}, 'hello']"
    parent_conn.send("Hello, son!")
    p.join()
    
#Result:
#q_ID1: 2750677559240
#[12, {'name': 'yuan'}, 'hello']
#response Hello, son!
#q_ID2: 1447312178184

The two connection objects returned by Pipe() represent the two ends of the pipe. Each connection object has send() and recv() methods (among others). Note that data in a pipe may become corrupted if two processes (or threads) try to read from or write to the same end of the pipe at the same time. Of course there is no risk of corruption from processes using different ends of the pipe at the same time.

(3)Managers:

Queue and Pipe only exchange data (sending and receiving); they do not share data, that is, they do not let one process modify another process's data
A manager object returned by Manager() controls a server process which holds Python objects and allows other processes to manipulate them using proxies.
A manager returned by Manager() will support types list, dict, Namespace (a container of named variables), Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Barrier, Queue, Value and Array.

from multiprocessing import Process, Manager

def f(d, l,n):
    d[n] = '1'
    d['2'] = 2
    d[0.25] = None
    l.append(n)
    #print(l)
    print("son process:",id(d),id(l))

if __name__ == '__main__':
    with Manager() as manager:   #with a plain manager=Manager() you have to shut the manager down yourself
        d = manager.dict()   #create a (shareable) dict, wrapped by the Manager
        l = manager.list(range(5))
        print("main process:",id(d),id(l))
        p_list = []
        for i in range(10):
            p = Process(target=f, args=(d,l,i))
            p.start()
            p_list.append(p)
        for res in p_list:
            res.join()
        print(d)
        print(l)
        
#Result:
#main process: 3113766402440 3113767635336
#son process: 2220250913224 2220250974344
#son process: 2519730510344 2519730571464
#son process: 1710022810248 1710022871368
#son process: 1255400888968 1255400950088
#son process: 2246107617800 2246107678920
#son process: 1965141351048 1965141412168
#son process: 1730165561800 1730165622920
#son process: 1704584108552 1704584168456
#son process: 2476917152392 2476917213512
#son process: 2749181680200 2749181741320
#{2: '1', '2': 2, 0.25: None, 3: '1', 1: '1', 0: '1', 7: '1', 8: '1', 5: '1', 4: '1', 6: '1', 9: '1'}
#[0, 1, 2, 3, 4, 2, 3, 1, 0, 7, 8, 5, 4, 6, 9]

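5. Process synchronization: using a Lock

Without using the lock, output from the different processes is liable to get all mixed up.

from multiprocessing import Process, Lock

def f(l, i):
    #Option 1: use the lock as a context manager
    #with l:
    #    print('hello world %s'%i)
    #Option 2: acquire and release explicitly
    l.acquire()   #acquire the lock
    try:
        print('hello world %s'%i)
    finally:
        l.release()   #release the lock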

if __name__ == '__main__':
    lock = Lock()
    for num in range(10):
        Process(target=f, args=(lock, num)).start()

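6. Process pools: a pool maintains a set of worker processes internally; work is handed to a process taken from the pool, and if no process in the pool is available the program waits until one becomes free

Methods:
Pool(x): creates a process pool with a maximum of x worker processes; defaults to the number of CPU cores (physical and logical)
apply(func=,args=,kwds=): synchronous; it blocks until the result is ready, so the pool brings no benefit with this method
apply_async(func=,args=,kwds=,callback=): asynchronous
#func is the function to call; args/kwds are the arguments passed to it
#callback is a function that runs after func has completed successfully; it is called in the main process
#With callback=bar, bar is called with func's return value as its argument; writing callback=bar() instead calls bar immediately and passes its result as the callback
#callback is typically used for shared work such as logging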
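
Besides the callback, apply_async also returns a result handle whose get() method blocks until that one task has finished. The following is a minimal sketch of both ways of collecting results (the names `square`, `collected` and `run_pool` are made up for the illustration, and the pool size of 2 is arbitrary):

```python
from multiprocessing import Pool

def square(x):
    return x * x

collected = []   # filled in by the callback, inside the main process

def run_pool():
    with Pool(2) as pool:
        # apply_async returns a handle immediately; the callback receives
        # the return value of square once a worker has produced it.
        handles = [pool.apply_async(square, args=(i,), callback=collected.append)
                   for i in range(5)]
        # get() blocks until that particular result is ready.
        return [h.get() for h in handles]

if __name__ == "__main__":
    print(run_pool())          # results in submission order
    print(sorted(collected))   # callback order is not guaranteed, so sort
```

The handles give results back in the order the tasks were submitted, while the callback sees them in whatever order the workers finish.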

from  multiprocessing import Process,Pool
import time,os

def Foo(i):
    time.sleep(1)
    print(i)
    return i+100

def Bar(arg):
    print(os.getpid())
    print(os.getppid())
    print('logger:',arg)
    
if __name__=='__main__':
    pool = Pool(5)   #create a process pool with at most 5 workers; tasks are spread across these 5 processes
    Bar(1)
    print("----------------")

    for i in range(2):
        #pool.apply(func=Foo, args=(i,))   #variant 1
        #pool.apply_async(func=Foo, args=(i,))   #variant 2
        pool.apply_async(func=Foo, args=(i,),callback=Bar)   #variant 3

    pool.close()   #close() must be called before join(), and both calls are required
    pool.join()
    print('end')
