[Advanced Python] Concurrent Programming

可口的冰可乐

Published on 2024-09-03 00:50:37

Article link: https://blog.csdn.net/weixin_44745770/article/details/141833602

Concurrent Programming

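Concurrent programming in Python means handling multiple tasks within the same period of time. It can speed up a program, particularly for I/O-bound work such as file operations and network requests, because it makes better use of system resources. Python offers several approaches to concurrency, including multithreading, multiprocessing, and coroutines.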

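Threads and processes:

A thread is the smallest unit that the CPU can schedule.
A process is the smallest unit of resource allocation. A process can contain multiple threads, and the threads within one process share that process's resources.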

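Use cases:

CPU-bound tasks: suited to multiprocessing. The GIL prevents threads from running CPU-bound Python code in parallel, whereas multiple processes can make full use of a multi-core CPU.
I/O-bound tasks: suited to multithreading, because the CPU can switch to another thread while one thread waits on I/O such as a network request or file operation.
High-concurrency I/O-bound tasks: suited to coroutines, which manage many concurrent tasks in a single thread and so save system resources.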
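
To make the I/O-bound case concrete, here is a minimal sketch that compares running five blocking calls one after another with running them in threads. The function fake_io is a hypothetical stand-in for a real network or file call (it only sleeps), and the exact timings will vary by machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id, delay=0.2):
    """Hypothetical stand-in for a blocking I/O call; time.sleep releases the GIL."""
    time.sleep(delay)
    return task_id

def run_sequential(n):
    # Each call waits for the previous one to finish.
    return [fake_io(i) for i in range(n)]

def run_threaded(n):
    # One worker per task, so all the waits overlap.
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(fake_io, range(n)))

start = time.perf_counter()
sequential_result = run_sequential(5)
sequential_time = time.perf_counter() - start

start = time.perf_counter()
threaded_result = run_threaded(5)
threaded_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

Because the threads spend their time waiting rather than computing, the threaded run should take roughly as long as a single call instead of the sum of all five.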

1. Multithreading (Threading)

Implementing multithreading:

Python's threading module provides functions for creating and managing threads.

import threading

def print_numbers():
    for i in range(5):
        print(i)

# Create the threads
thread1 = threading.Thread(target=print_numbers)
thread2 = threading.Thread(target=print_numbers)

# Start the threads
thread1.start()
thread2.start()

# Wait for the threads to finish
thread1.join()
thread2.join()

print("Threads finished executing")

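GIL (Global Interpreter Lock):

The GIL is a lock specific to the CPython interpreter. It ensures that, within one process, only one thread executes Python bytecode at any given moment.
Impact: for CPU-bound tasks the GIL limits the speedup multithreading can actually deliver. For I/O-bound tasks the impact is small, because a thread releases the GIL while it waits on I/O.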
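
The effect on CPU-bound code can be observed with a rough timing sketch like the one below. The loop size N and the function count_down are illustrative choices, not from the original article, and the numbers depend on the machine and interpreter build. On a standard CPython build with the GIL, the threaded run is typically no faster than the sequential one.

```python
import threading
import time

def count_down(n):
    """Pure-Python busy loop: CPU-bound, never waits on I/O."""
    while n > 0:
        n -= 1

N = 2_000_000  # illustrative workload size

# Run the loop twice in the main thread.
start = time.perf_counter()
count_down(N)
count_down(N)
sequential_time = time.perf_counter() - start

# Run the same two loops in two threads.
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_time = time.perf_counter() - start

print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```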

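Thread safety:

In a multithreaded program, concurrent access to shared data can cause data races, so access must be protected with a lock.

Mutex lock: lock = threading.Lock()
Reentrant (recursive) lock: lock = threading.RLock()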
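
The difference between the two lock types shows up when the same thread tries to acquire a lock it already holds. The sketch below uses non-blocking acquires so that it reports the outcome instead of hanging. The variable names are illustrative.

```python
import threading

plain = threading.Lock()
reentrant = threading.RLock()

# A plain Lock cannot be acquired again by the thread that already holds it.
plain.acquire()
second_plain = plain.acquire(blocking=False)  # False; a blocking acquire would hang here
plain.release()

# An RLock can be re-acquired by its owning thread.
reentrant.acquire()
second_reentrant = reentrant.acquire(blocking=False)  # True
reentrant.release()
reentrant.release()  # one release per successful acquire

print(second_plain, second_reentrant)  # False True
```

An RLock is therefore the safer choice when a function that holds the lock may call another function that takes the same lock.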

Usage example:

import threading

lock = threading.Lock()
counter = 0

def increment():
    global counter
    with lock:
        counter += 1

threads = [threading.Thread(target=increment) for _ in range(100)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(counter)  # expected output: 100

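Deadlock: several threads or processes each wait for another to release a resource, so none of them can proceed. Ways to avoid it include not nesting lock acquisitions and acquiring locks with a timeout.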
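
Two of these avoidance techniques can be sketched as follows. The first part makes every thread take the two locks in one agreed order (sorting by id() is just one possible convention, assumed here for illustration). The second part uses the timeout parameter of acquire so that a thread gives up instead of waiting forever. The helper names transfer and try_with_timeout are hypothetical.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
completed = []

def transfer(name, first, second):
    # Acquire in a fixed global order, regardless of the order the caller passed.
    low, high = sorted((first, second), key=id)
    with low:
        with high:
            completed.append(name)

# The two threads request the locks in opposite orders, which could deadlock
# without the ordering step above.
t1 = threading.Thread(target=transfer, args=("t1", lock_a, lock_b))
t2 = threading.Thread(target=transfer, args=("t2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()

def try_with_timeout(lock, seconds):
    """Return True if the lock was obtained within the timeout, else False."""
    if lock.acquire(timeout=seconds):
        lock.release()
        return True
    return False

lock_a.acquire()                       # simulate the lock being held elsewhere
got_it = try_with_timeout(lock_a, 0.1)  # gives up after 0.1 s
lock_a.release()

print(sorted(completed), got_it)  # ['t1', 't2'] False
```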

Thread pool:

A thread pool caps the number of threads, which avoids the resource waste of creating too many.

import time
import random
from concurrent.futures import ThreadPoolExecutor

def task(url):
    print(f"Starting task {url}")
    time.sleep(5)
    return random.randint(0, 10)

def done(response):
    # The callback receives the finished Future object
    print(f"Running done with the result of task: {response.result()}")

# Create a thread pool that maintains at most 10 threads.
pool = ThreadPoolExecutor(10)
url_list = [f"www.example.com/{i}" for i in range(300)]

future_list = []
for url in url_list:
    future = pool.submit(task, url)
    future.add_done_callback(done)  # callback: run done once the task finishes
    future_list.append(future)  # keep the Future so its result can be read later

pool.shutdown(wait=True)  # wait for all tasks to finish
for fut in future_list:
    print(fut.result())
print("END")

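Singleton pattern:

With the singleton pattern, instantiating the class always returns the object that was created first, so every reference points to the same object (same address, shared attributes). In multithreaded code, a lock prevents two threads from each creating an instance at the same time.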

import threading

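class Singleton:
    instance = None
    lock = threading.RLock()

    def __init__(self, name):
        # __init__ still runs on every call, so the most recent name wins
        self.name = name

    # __new__ is the hook that creates the object. It needs two leading
    # underscores and takes no @classmethod decorator.
    def __new__(cls, *args, **kwargs):
        with cls.lock:
            if cls.instance is None:
                cls.instance = object.__new__(cls)
            return cls.instance

obj1 = Singleton('alex')  # first call: creates and stores the instance
obj2 = Singleton('alex')  # later calls: reuse the stored instance
print(obj1 is obj2)  # True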

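2. Multiprocessing

Processes and inter-process communication:

A process is the smallest unit of resource allocation, and each process has its own independent memory space.
Inter-process communication (IPC): because processes do not share memory, they exchange data through IPC mechanisms such as Queue and Pipe.

Implementing multiprocessing:

Python's multiprocessing module provides functions for creating and managing processes.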

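import multiprocessing

def print_numbers():
    for i in range(5):
        print(i)

# The __main__ guard is required when children are started with spawn
# (the default on Windows and macOS), because each child re-imports this module.
if __name__ == '__main__':
    # Create the processes
    process1 = multiprocessing.Process(target=print_numbers)
    process2 = multiprocessing.Process(target=print_numbers)

    # Start the processes
    process1.start()
    process2.start()

    # Wait for the processes to finish
    process1.join()
    process2.join()

    print("Processes finished executing")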

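Process start methods:

fork: copy mode. The child starts as a copy of the parent process. Available only on POSIX systems, and long the default on Linux.
spawn: fresh-interpreter mode. A new Python interpreter is started to run the task. Available on every platform, and the default on Windows and macOS.
forkserver: process-template mode. A server process is started first and then forks a child on request. Available on POSIX platforms that support passing file descriptors over Unix pipes.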
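
The start method can be inspected and chosen explicitly. The sketch below only queries the available methods and builds a context. It does not launch a process, so it behaves the same on every platform.

```python
import multiprocessing as mp

# Which start methods does this platform support?
available = mp.get_all_start_methods()
print(available)

# get_context returns an object that exposes the same API as the
# multiprocessing module (Process, Queue, Pool, ...) bound to one start method.
ctx = mp.get_context("spawn")
print(ctx.get_start_method())  # spawn
```

Using a context (for example ctx.Process or ctx.Queue) keeps the choice local to one part of the program, whereas multiprocessing.set_start_method changes the default for the whole program and should be called at most once.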

Inter-process communication:

Passing data between processes with a Queue:

from multiprocessing import Process, Queue

def worker(q):
    q.put("Hello from worker")

if __name__ == '__main__':
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    print(q.get())
    p.join()

Passing data between processes with a Pipe:

from multiprocessing import Process, Pipe

def worker(conn):
    conn.send("Hello from worker")
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    print(parent_conn.recv())
    p.join()

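Process pool:

A process pool caps the number of processes, which avoids the resource waste of creating too many. Pool runs tasks concurrently across its worker processes.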

from multiprocessing import Pool
import os

def task(n):
    print(f"Process ID: {os.getpid()}, Task: {n}")
    return n * n

if __name__ == '__main__':
    with Pool(4) as pool:
        results = pool.map(task, range(10))
    print(results)

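3. Coroutines

A coroutine is a lightweight unit of concurrency: a function that can suspend itself at an asynchronous operation instead of blocking the thread. Coroutines suit large numbers of I/O-bound tasks, such as high-concurrency network requests and web crawlers. They are written with the async and await keywords.

Creating coroutines with the asyncio module: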

import asyncio

async def print_numbers():
    for i in range(5):
        print(i)
        await asyncio.sleep(1)

async def main():
    await asyncio.gather(print_numbers(), print_numbers())

# Run the event loop
asyncio.run(main())

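Event loop:

The event loop schedules coroutines. When the running coroutine reaches an await that has to wait, the loop suspends it and switches to another coroutine that is ready to run, and it continues this way until all coroutines have finished.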
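
The switching can be made visible by recording the order in which two coroutines run. In this sketch, worker and events are illustrative names, and asyncio.sleep(0) is used purely as a way to hand control back to the event loop.

```python
import asyncio

events = []

async def worker(name, steps):
    for i in range(steps):
        events.append(f"{name}{i}")
        await asyncio.sleep(0)  # suspend here and let the loop run another task

async def main():
    await asyncio.gather(worker("a", 2), worker("b", 2))

asyncio.run(main())
print(events)  # ['a0', 'b0', 'a1', 'b1']
```

Both workers run in a single thread. The interleaved output comes entirely from each coroutine yielding at its await.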

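4. Underlying Mechanisms of Python Concurrency

Underlying mechanisms of threads and processes

Threads: a thread is an execution unit that shares its process's resources and is scheduled by the operating system. On POSIX systems, CPython's threading module creates native threads through pthread_create. The GIL allows only one thread at a time to execute Python bytecode, but it matters less during I/O because a waiting thread releases it.
Processes: a process is the operating system's basic unit of resource allocation, and each process has its own memory space. The multiprocessing module creates processes with a start method such as fork (the fork system call) or spawn (a new interpreter process), and processes communicate through IPC mechanisms.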
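
That threads are separate OS-level execution units inside one process can be checked directly. This sketch assumes Python 3.8 or later for threading.get_native_id. The barrier keeps all threads alive until each has recorded its IDs, since the OS may reuse a thread ID after a thread exits.

```python
import os
import threading

NUM_THREADS = 3
barrier = threading.Barrier(NUM_THREADS)
records = []
records_lock = threading.Lock()

def report():
    # Record (process ID, OS thread ID) for this thread.
    ident = (os.getpid(), threading.get_native_id())
    with records_lock:
        records.append(ident)
    barrier.wait()  # stay alive until every thread has recorded its IDs

threads = [threading.Thread(target=report) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

pids = {pid for pid, _ in records}
thread_ids = {tid for _, tid in records}
print(f"distinct process IDs: {len(pids)}, distinct thread IDs: {len(thread_ids)}")
```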

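GIL (Global Interpreter Lock)

Role of the GIL: in CPython, the GIL ensures that only one thread executes Python bytecode at a time. This protects interpreter state from corruption by concurrently running threads, but it also limits the parallelism of multithreading.
Workaround: for CPU-bound tasks, use multiple processes to bypass the GIL. Each process created with the multiprocessing module runs its own interpreter with its own GIL.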

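Memory management

Memory allocation: CPython ultimately obtains and returns memory through malloc and free. For small objects it layers a memory pool on top, which reduces the cost of frequent allocation and release.
Garbage collection: Python manages memory with reference counting plus a garbage collector. Reference counting frees an object as soon as nothing refers to it, and the garbage collector detects and reclaims objects caught in reference cycles.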
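
Both mechanisms can be observed with weak references, which report whether an object is still alive without keeping it alive. This is a sketch of CPython behaviour (other interpreters may free objects at different times), and the Node class is illustrative. Automatic collection is disabled for the duration so that only the explicit gc.collect() call reclaims the cycle.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

# Reference counting: the object is freed as soon as its last reference goes away.
single = Node()
single_ref = weakref.ref(single)
del single
freed_by_refcount = single_ref() is None

# Reference cycle: the two objects keep each other's count above zero.
gc.disable()
a, b = Node(), Node()
a.other, b.other = b, a
cycle_ref = weakref.ref(a)
del a, b
alive_before_collect = cycle_ref() is not None
gc.collect()  # the cycle detector finds and frees the unreachable pair
freed_by_gc = cycle_ref() is None
gc.enable()

print(freed_by_refcount, alive_before_collect, freed_by_gc)  # True True True
```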
