python process&thread

bgape002

Published 2024-02-25 15:51:35


Article link: https://blog.csdn.net/bgape002/article/details/136283352


title: python process&thread
top: 43
date: 2022-07-05 11:29:57
tags:
- process
- thread
categories:
- python

GIL

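GIL stands for Global Interpreter Lock. It is not a feature of the Python language itself but a concept introduced by an implementation of the interpreter. The GIL belongs to CPython, the reference interpreter (PyPy has a similar lock); implementations such as Jython and IronPython do not have one.

When a mutex is used to resolve a race over shared resources, the running thread locks the shared resource, and other threads can use it only after that thread finishes and releases the lock. The GIL works much like a mutex: it resolves contention among threads over the interpreter's internal state.

Effects of the GIL on programs

Because of the GIL, only one thread executes Python bytecode at any given moment.
Threads live inside a process and are the basic unit of CPU scheduling and dispatch, yet because of the GIL, Python threads running pure-Python code cannot make use of multiple CPU cores.
A thread releases the GIL when it blocks on I/O, and CPython also forces a switch periodically (every 5 ms by default in Python 3, see `sys.getswitchinterval()`). Multithreading therefore brings no speedup to CPU-bound programs and is mainly suited to I/O-bound ones.
The GIL guarantees that a single bytecode instruction executes without interference from other threads, but a thread switch can happen between any two bytecode instructions.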
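
To make the point about bytecode boundaries concrete, the sketch below disassembles a one-line increment with the standard `dis` module. The function `increment` and the global `counter` are illustrative names, not part of the original article. Opcode names differ between CPython versions, but the statement always compiles to several instructions (load, add, store), and a thread switch may fall between any two of them, which is why `counter += 1` on shared data needs a lock.

```python
import dis
import sys

counter = 0


def increment():
    # Hypothetical helper: one source line, several bytecode instructions.
    global counter
    counter += 1


# Collect the opcode names; they vary across CPython versions
# (for example INPLACE_ADD in 3.10, BINARY_OP in 3.11 and later).
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)

# Interval, in seconds, after which CPython asks the running thread
# to give up the GIL so another thread can run.
print(sys.getswitchinterval())
```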

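Parallelism and concurrency

Almost every operating system can run multiple tasks at once. Each task is usually a program, and each running program is a process; in other words, a process is an executing instance of an application.

Parallelism

Parallelism means that multiple instructions execute on multiple processors at the same instant.

Concurrency

Concurrency means that only one instruction executes at any instant, but the instructions of several processes are rotated so quickly that, at the macro level, the processes appear to run simultaneously.

Threads and processes

The relationship between processes and threads is as follows: the operating system can run several tasks at once, and each of those tasks is a process; a process can in turn run several tasks at once, and each of those is a thread.

Thread

A thread is the smallest unit of program execution and is part of a process; one process can own many threads. In a multithreaded program, the main thread runs from the start of the process to its end, while other threads are created and exit during the main thread's lifetime.

**Thread safety:** keeping data consistent and intact when multiple threads compete for the same resource, usually by using a lock to enforce mutual exclusion around the critical section.
**Where it matters:** when multiple threads can write to the same global or static variables, data can become inconsistent or corrupted; locking is the usual way to prevent this.

Process

A process is the smallest unit of resource allocation: the operating system allocates resources (CPU, memory, disk, network) to it and schedules it to run. A running program is a process.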
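
As a small illustration of threads living inside one process, the sketch below starts three worker threads that each record the process ID they run in. The names `report`, `results` and `worker-N` are assumptions made for this example. Every worker reports the same PID as the main thread, and all of them append to the same list because threads share the memory of their process.

```python
import os
import threading

results = []                      # shared by every thread in this process
results_lock = threading.Lock()


def report():
    # Each worker records the process ID it runs in and its own thread name.
    with results_lock:
        results.append((os.getpid(), threading.current_thread().name))


workers = [threading.Thread(target=report, name=f"worker-{i}") for i in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print("main thread:", threading.main_thread().name, "pid:", os.getpid())
for pid, name in results:
    print(pid, name)
```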

Multithreading

`threading`

import time
import threading


def func1(p1, p2):
    print(p1)
    time.sleep(2)
    print(p2)


def func2():
    print('print 2-1th')
    time.sleep(4)
    print('print 2-2th')


if __name__ == '__main__':
    thd1 = threading.Thread(target=func1, args=('1-1th', '1-2th',))
    thd2 = threading.Thread(target=func2)

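    # Mark thd2 as a daemon thread: it is killed as soon as the main thread exits
    thd2.daemon = True  # setDaemon() is deprecated since Python 3.10

    thd1.start()
    thd2.start()

    tm1 = time.time()
    # Block the main thread until thd1 finishes
    thd1.join()
    tm2 = time.time()
    print('Done; main thread was blocked for {} s'.format(round(tm2-tm1, 2)))

"""output
1-1th
print 2-1th
1-2th
Done; main thread was blocked for 2.02 s
"""

Because thd2 is a daemon thread and the main thread finishes after about 2 s, thd2 is terminated before its 4 s sleep ends, so "print 2-2th" is never printed.
Because the main thread calls thd1.join(), it blocks until thd1 finishes (about 2 s), so the final print reports a blocking time of 2.02 s.

"""Multithreading pattern 2:
subclass threading.Thread
---
Suited to programs whose thread logic is more complex
"""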
import time
import threading


class Thread1(threading.Thread):
    def __init__(self, args, name):
        super().__init__(name=name)  # set the thread name
        self.p1 = args[0]
        self.p2 = args[1]

    def run(self):
        print(self.p1)
        time.sleep(2)
        print(self.p2)


class Thread2(threading.Thread):
    def run(self):
        print('print 2-1th')
        time.sleep(4)
        print('print 2-2th')


if __name__ == '__main__':
    thd1 = Thread1(('1-1th', '2-1th'), 'thread1')
    thd2 = Thread2()

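    # Mark thd2 as a daemon thread: it is killed once all non-daemon threads have finished
    thd2.daemon = True  # setDaemon() is deprecated since Python 3.10

    thd1.start()
    thd2.start()

    tm1 = time.time()
    # Block the main thread until thd1 finishes
    thd1.join()
    tm2 = time.time()
    print('Done; main thread was blocked for {} s'.format(round(tm2 - tm1, 2)))

"""
1-1th
print 2-1th
2-1th
Done; main thread was blocked for 2.0 s
"""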

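Thread communication

Ways for threads to communicate

A shared variable: a global variable visible to all threads (not thread-safe on its own during communication)
A queue: q = queue.Queue(); commonly used methods:

Queue.qsize() returns the approximate size of the queue
Queue.empty() returns True if the queue is empty, otherwise False
Queue.full() returns True if the queue is full, otherwise False
Queue.full() is judged against maxsize, the capacity passed to the constructor
Queue.get([block[, timeout]]) removes and returns an item; timeout is how long to wait
Queue.get_nowait() is equivalent to Queue.get(False)
Queue.put(item[, block[, timeout]]) adds an item to the queue; timeout is how long to wait
Queue.put_nowait(item) is equivalent to Queue.put(item, False)
Queue.task_done() is called by a consumer after it finishes a piece of work, to tell the queue that the item it fetched has been processed
Queue.join() blocks until every item put into the queue has been fetched and marked with task_done()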
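
The interplay of `task_done()` and `join()` is easier to see in a small producer/consumer sketch. The worker function, the `None` sentinel used to stop workers, and the upper-casing "work" are all assumptions made for this example. `queue.Queue` does its own locking, so no extra lock is needed around `get()` and `put()`.

```python
import queue
import threading

q = queue.Queue(maxsize=10)
processed = []
processed_lock = threading.Lock()


def worker():
    while True:
        item = q.get()              # blocks until an item is available
        if item is None:            # sentinel (an assumption of this sketch): stop
            q.task_done()
            break
        with processed_lock:
            processed.append(item.upper())
        q.task_done()               # mark this item as fully processed


threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

for word in ["one", "two", "three", "four", "five"]:
    q.put(word)

q.join()                            # returns once every item has been task_done()

for _ in threads:
    q.put(None)                     # one sentinel per worker
for t in threads:
    t.join()

print(sorted(processed))
```

Unlike polling `empty()` in a loop, `q.join()` waits without burning CPU and only returns after the work on each item is finished, not merely after the item has been taken out of the queue.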

import queue
import threading
import time

exitFlag = 0

class myThread (threading.Thread):
    def __init__(self, threadID, name, q):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.q = q
    def run(self):
        print ("Starting thread: " + self.name)
        process_data(self.name, self.q)
        print ("Exiting thread: " + self.name)

def process_data(threadName, q):
    while not exitFlag:
        # Hold the lock so the empty() check and the get() happen as one step
        queueLock.acquire()
        if not q.empty():
            data = q.get()
            queueLock.release()
            print ("%s processing %s" % (threadName, data))
        else:
            queueLock.release()
        time.sleep(1)

threadList = ["Thread-1", "Thread-2", "Thread-3"]
nameList = ["One", "Two", "Three", "Four", "Five"]
queueLock = threading.Lock()
workQueue = queue.Queue(10)
threads = []
threadID = 1

# Create the worker threads
for tName in threadList:
    thread = myThread(threadID, tName, workQueue)
    thread.start()
    threads.append(thread)
    threadID += 1

# Fill the queue
queueLock.acquire()
for word in nameList:
    workQueue.put(word)
queueLock.release()

# Wait for the queue to drain
while not workQueue.empty():
    time.sleep(0.1)  # sleep briefly instead of spinning at full CPU

# Tell the threads it is time to exit
exitFlag = 1

# Wait for all threads to finish
for t in threads:
    t.join()
print ("Exiting main thread")


"""output (order varies between runs; unsynchronized prints may interleave)
Starting thread: Thread-1
Starting thread: Thread-2
Starting thread: Thread-3
Thread-2 processing One
Thread-3 processing Two
Thread-1 processing Three
Thread-1 processing FourThread-3 processing Five

Exiting thread: Thread-3Exiting thread: Thread-1
Exiting thread: Thread-2
"""

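Thread synchronization

Thread lock: Lock

A Lock costs performance, and a thread that acquires a Lock must release it; otherwise the other threads waiting for it block forever
Locks can cause deadlock (a lock that is never released, or threads waiting on each other in a cycle)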

"""
 Locking ensures that one compute-and-assign step completes before the lock is handed over
 ---
 A thread lock is somewhat like a transaction lock in a database: it makes a potentially unsafe operation atomic
"""
import threading
lock=threading.Lock()  # create a lock
number=0
def addNumber():
    global number
    for i in range(1000000):
        with lock:  # acquire the lock; it is released on leaving the block, even on an exception
            number+=1

def downNumber():
    global number
    for i in range(1000000):
        with lock:
            number-=1

if __name__ == '__main__':
    print('start')
    t=threading.Thread(target=addNumber)
    t2=threading.Thread(target=downNumber)
    t.start()
    t2.start()
    t.join()
    t2.join()
    print(number)  # always 0, because every update runs under the lock
    print('stop')

RLock is a re-entrant lock: the same thread may call acquire() several times, but it must call release() the same number of times

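from threading import Thread, RLock
from time import sleep

lock = RLock()
fridges = [0 for _ in range(10)]    # 10 fridges

def e2f():
    # Fill the first empty fridge; return its index, or None if all are full
    for x in range(10):
        if fridges[x]==0:
            sleep(0.1)      # loading a fridge takes a moment
            fridges[x] += 1
            return x
    return None

def E2FwithLock(L):
    while True:
        if L.acquire(False):    # open the fridge door (non-blocking acquire)
            try:
                filled = e2f()
            finally:
                L.release()     # close the fridge door
            if filled is None:  # every fridge is full: stop (the original loop never ended)
                break
        else:
            sleep(0.01)         # lock is busy; back off briefly instead of spinning

ths = [Thread(target=E2FwithLock,args=[lock]) for _ in range(10)]
for t in ths: t.start()
for t in ths: t.join()
print(fridges)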
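
The re-entrant property itself is easiest to see when one function that holds the lock calls another that takes the same lock. The following is a minimal sketch; `outer`, `inner` and `log` are illustrative names. With a plain `Lock` the nested acquire in `inner` would block forever, while an `RLock` lets the owning thread through and is fully released only after the matching number of releases.

```python
import threading

rlock = threading.RLock()
log = []


def inner():
    with rlock:                 # second acquire by the same thread: allowed with RLock
        log.append("inner")


def outer():
    with rlock:                 # first acquire
        log.append("outer")
        inner()                 # a plain Lock would deadlock at this call


t = threading.Thread(target=outer)
t.start()
t.join()
print(log)

# Both acquires were released, so another thread (here the main thread)
# can take the lock without blocking.
free_again = rlock.acquire(blocking=False)
if free_again:
    rlock.release()
print(free_again)
```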

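Condition variable: Condition
- The class implements __enter__(self) and __exit__(self), so it can be used as a context manager in a with statement
Semaphore
- A lock that limits how many threads may enter at once, i.e. the number of threads allowed to run the guarded section concurrently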
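
Both primitives can be sketched in a few lines. In the example below, which uses made-up names throughout (`limited_task`, `consumer`, `producer`), a `Semaphore(2)` lets at most two of six threads into the guarded section at once, and a `Condition` lets a consumer sleep until a producer publishes an item. The 0.05 s sleep only simulates work.

```python
import threading
import time

# --- Semaphore: at most 2 threads inside the guarded section at once ---
sem = threading.Semaphore(2)
state_lock = threading.Lock()
active = 0
max_active = 0


def limited_task():
    global active, max_active
    with sem:
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.05)                # simulate work while holding a slot
        with state_lock:
            active -= 1


# --- Condition: a consumer waits until a producer publishes an item ---
cond = threading.Condition()
items = []
received = []


def consumer():
    with cond:                          # Condition works with the with statement
        cond.wait_for(lambda: items)    # releases the lock while waiting
        received.append(items.pop())


def producer():
    with cond:
        items.append("ready")
        cond.notify()                   # wake one waiting thread


threads = [threading.Thread(target=limited_task) for _ in range(6)]
threads += [threading.Thread(target=consumer), threading.Thread(target=producer)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("max concurrent:", max_active)
print("received:", received)
```

`wait_for` re-checks its predicate before sleeping, so the sketch works whether the producer or the consumer happens to run first.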

Thread pool

ThreadPoolExecutor: `from concurrent import futures` or `from concurrent.futures import ThreadPoolExecutor`
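
A minimal usage sketch of `ThreadPoolExecutor` follows. The `fetch` function is a stand-in for an I/O-bound task and is simulated with `time.sleep`; the worker count of 4 is an arbitrary choice for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch(task_id):
    # Hypothetical I/O-bound task, simulated with sleep.
    time.sleep(0.1)
    return task_id * 2


with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in the order of the inputs.
    ordered = list(pool.map(fetch, range(4)))

    # submit() returns Future objects; as_completed() yields them as they finish.
    futures = {pool.submit(fetch, i): i for i in range(4)}
    finished = {futures[f]: f.result() for f in as_completed(futures)}

print(ordered)
print(finished)
```

Leaving the `with` block waits for outstanding tasks and shuts the pool down, so no explicit `join()` calls are needed.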

Multiprocessing

`multiprocessing`

Process pool

ProcessPoolExecutor: `from concurrent.futures import ProcessPoolExecutor`
Pool: `from multiprocessing import Pool`
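
For CPU-bound work, a process pool sidesteps the GIL because each worker process runs its own interpreter. Below is a minimal sketch with `ProcessPoolExecutor`; `cpu_task` and `run_pool` are hypothetical names, and the worker count of 2 is arbitrary.

```python
from concurrent.futures import ProcessPoolExecutor


def cpu_task(n):
    # Hypothetical CPU-bound work: sum of squares below n.
    return sum(i * i for i in range(n))


def run_pool():
    # Each worker process has its own interpreter and its own GIL.
    with ProcessPoolExecutor(max_workers=2) as pool:
        return list(pool.map(cpu_task, [10, 100, 1000]))


if __name__ == "__main__":      # needed where workers start in a fresh interpreter
    print(run_pool())
```

The task function is defined at module level so that it can be pickled and sent to the workers; lambdas and nested functions would fail here.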

Inter-process communication

Data in different processes is isolated; each process works on its own independent copy
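
The isolation can be observed directly. In the sketch below (the names `bump`, `demo` and `counter` are made up for this example), a child process increments a module-level variable and sends back what it sees through a `multiprocessing.Queue`; the parent's copy of the variable stays unchanged.

```python
from multiprocessing import Process, Queue

counter = 0


def bump(result_queue):
    # Runs in the child process and changes only the child's copy of counter.
    global counter
    counter += 100
    result_queue.put(counter)


def demo():
    result_queue = Queue()
    child = Process(target=bump, args=(result_queue,))
    child.start()
    child_value = result_queue.get()    # read before join() so the child can flush its data
    child.join()
    return child_value, counter         # (value seen by the child, value seen by the parent)


if __name__ == "__main__":
    print(demo())
```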

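Queue

queue.Queue cannot be used between processes started with multiprocessing.Process; use multiprocessing.Queue instead
multiprocessing.Queue cannot be passed to the workers of a Pool; processes in a pool can communicate through the `Queue` method of a `multiprocessing.Manager`
```
# threads within one process
from queue import Queue

# processes started with multiprocessing.Process
from multiprocessing import Queue

# workers of a multiprocessing Pool
from multiprocessing import Manager
Manager().Queue()
```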
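
The pool case deserves a short sketch, since passing a plain `multiprocessing.Queue` as a task argument to pool workers raises a `RuntimeError`. A queue created by a `Manager` is a picklable proxy and can be passed along with each task. The function names `square_into` and `collect_squares` are hypothetical.

```python
from multiprocessing import Manager, Pool


def square_into(args):
    # Pool workers receive a picklable proxy to the manager's queue.
    q, n = args
    q.put(n * n)
    return n


def collect_squares():
    with Manager() as manager:
        q = manager.Queue()
        with Pool(processes=2) as pool:
            pool.map(square_into, [(q, n) for n in range(5)])
        # map() has returned, so exactly five results are waiting in the queue.
        return sorted(q.get() for _ in range(5))


if __name__ == "__main__":
    print(collect_squares())
```

The results are sorted because the order in which workers put items on the queue is not deterministic.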

Communicating through a Pipe: a Pipe has only two endpoints, so it connects just two processes

from multiprocessing import Process, Pipe


def producer(pipe):
    pipe.send("bgape")


def consumer(pipe):
    print(pipe.recv())


if __name__ == "__main__":
    recvPipe, sendPipe = Pipe()
    prod = Process(target=producer, args=(sendPipe,))
    cons = Process(target=consumer, args=(recvPipe,))
    prod.start()
    cons.start()
    prod.join()
    cons.join()
    
    
"""output
bgape

"""

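Using a Manager to share state between processes (the shared objects live in the manager's server process and are accessed through proxies)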

from multiprocessing import Process, Manager


def producer(pDict, key, value):
    pDict[key] = value


if __name__ == "__main__":
    procDict = Manager().dict()
    proc1 = Process(target=producer, args=(procDict, 1, 'bgape002'))
    proc2 = Process(target=producer, args=(procDict, 2, 'bgape003'))

    proc1.start()
    proc2.start()
    proc1.join()
    proc2.join()

    print(procDict)

"""output
{1: 'bgape002', 2: 'bgape003'}

"""

Types a Manager can provide
- BoundedSemaphore(self, value: Any = ...)
- Condition(self, lock: Any = ...)
- Event(self)
- Lock(self)
- Namespace(self)
- Queue(self, maxsize: int = ...)
- RLock(self)
- Semaphore(self, value: Any = ...)
- Array(self, typecode: Any, sequence: Sequence[_T])
- Value(self, typecode: Any, value: _T)
- dict(self, sequence: Mapping[_KT, _VT] = ...)
- list(self, sequence)