Python: Concurrency, Parallelism, Multithreading, and Multiprocessing

yichudu

Last modified: 2023-06-15 16:49:57


First published: 2021-11-17 16:25:16


Article link: https://blog.csdn.net/chuchus/article/details/121381433


1. Multithreading

1.1 Thread API

In this article, "thread" means threading.Thread.

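Thread#__init__(self, group=None, target=None, name=None, args=(), kwargs=None, *, daemon=None)
Constructor.
- target: a callable object.
- name: the thread's name.
- args: the argument tuple for the target invocation. With a single argument it must be (x,) rather than (x), because (x) is just x in parentheses and not a tuple.
- kwargs: a dictionary of keyword arguments for the target invocation.
Thread#run(self)
Synchronous. By default it calls target(*args, **kwargs); subclasses usually override it to do the actual work.
Thread#start(self)
Asynchronous. Starts a new thread and executes run() in it. The process does not exit when the main thread finishes; it exits only after all non-daemon threads have finished.
Thread#join(self, timeout=None)
Synchronous. Blocks until the thread terminates or the timeout expires. timeout is in seconds. join() always returns None, so call is_alive() afterwards to tell whether it timed out.

Note: start() may be called at most once per thread object. A second call raises RuntimeError, so a Thread object cannot be reused.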
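The behaviour of join(timeout) and the one-shot nature of start() can be checked with a minimal sketch like the one below. The names sleeper and demo_join_and_restart are made up for illustration; only the threading calls are real API.

```python
import threading
import time


def sleeper(seconds):
    """Hypothetical task: just sleep for the given number of seconds."""
    time.sleep(seconds)


def demo_join_and_restart():
    t = threading.Thread(target=sleeper, args=(0.3,))  # note the one-element tuple
    t.start()

    t.join(timeout=0.05)            # gives up after ~0.05 s; the thread keeps running
    alive_after_timeout = t.is_alive()

    t.join()                        # no timeout: wait until the thread has finished
    alive_after_join = t.is_alive()

    try:
        t.start()                   # a thread object cannot be started a second time
        restart_error = None
    except RuntimeError as exc:
        restart_error = exc
    return alive_after_timeout, alive_after_join, restart_error
```

Because join() returns None in both cases, is_alive() is the only way to distinguish "finished" from "timed out".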

Usage in practice

Pick one of the following:

1. Create a Thread object by hand, pass target to the constructor, and call start().
2. Define a Thread subclass, override run(), and call start().
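A minimal sketch of approach 1 might look as follows. The worker square_into and the helper run_with_target are hypothetical names; each thread writes into its own slot of a shared list, so no lock is needed for this particular pattern.

```python
import threading


def square_into(results, index, value, verbose=False):
    """Hypothetical worker: store value**2 in its own slot of a shared list."""
    results[index] = value * value
    if verbose:
        print(f"{threading.current_thread().name}: {value}^2 = {results[index]}")


def run_with_target(values):
    results = [None] * len(values)
    threads = [
        threading.Thread(
            target=square_into,
            name=f"worker-{i}",
            args=(results, i, v),          # positional arguments as a tuple
            kwargs={"verbose": False},     # keyword arguments as a dict
        )
        for i, v in enumerate(values)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run_with_target([1, 2, 3]) returns the squares in input order, because every thread owns a fixed index.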

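Below is an example of approach 2. The result of run() is stored on the thread object, so the caller can read it after join().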

import threading
import time

import numpy as np


class YichuThread(threading.Thread):

    def __init__(self, object_mode=False):
        self.object_mode = object_mode
        self.result = []  # filled in by run(); read it after join()
        super().__init__()

    def run(self) -> None:
        if self.object_mode:
            # Each thread owns a RandomState seeded with 0, so every thread draws the same sequence.
            random_obj = np.random.RandomState(seed=0)
            for _ in range(3):
                self.result.append(random_obj.choice(a=range(10), size=1))
        else:
            # All threads draw from NumPy's shared global random state.
            for _ in range(3):
                self.result.append(np.random.choice(a=range(10), size=1))

def multi_thread_safe_eval(object_mode):
    thread_arr = []
    for _ in range(3):
        thread_arr.append(YichuThread(object_mode=object_mode))
        thread_arr[-1].start()
    for t in thread_arr:
        t.join()
    for t in thread_arr:
        print(t.result)

print('multi_thread_safe_eval(True)')
multi_thread_safe_eval(True)
time.sleep(3)
print('\nmulti_thread_safe_eval(False)')
multi_thread_safe_eval(False)

"""
multi_thread_safe_eval(True)
[array([5]), array([0]), array([3])]
[array([5]), array([0]), array([3])]
[array([5]), array([0]), array([3])]

multi_thread_safe_eval(False)
[array([9]), array([4]), array([1])]
[array([1]), array([5]), array([4])]
[array([2]), array([2]), array([1])]
"""

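1.2 Thread pool

Creating and destroying threads frequently has a cost, so a pool keeps a set of threads alive.
Open question: a thread object cannot be reused, so how does the pool reuse its threads?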
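One way to see how reuse is possible: each worker thread is started exactly once and then loops, pulling tasks from a queue until it is told to stop. The toy TinyPool below is only an illustration written for this article, not the standard library's code; CPython's ThreadPoolExecutor follows the same broad idea (long-lived workers fed from a work queue) but differs in detail, for example in how it handles exceptions and futures.

```python
import queue
import threading


class TinyPool:
    """Toy pool for illustration only; tasks that raise would kill their worker."""

    _STOP = object()  # sentinel telling a worker to leave its loop

    def __init__(self, num_workers):
        self._tasks = queue.Queue()
        self._threads = [
            threading.Thread(target=self._worker_loop, daemon=True)
            for _ in range(num_workers)
        ]
        for t in self._threads:
            t.start()  # the only start() call each thread ever gets

    def _worker_loop(self):
        while True:
            item = self._tasks.get()
            if item is self._STOP:
                break
            fn, args, out = item
            out.append((fn(*args), threading.current_thread().name))

    def submit(self, fn, *args):
        out = []  # filled in by whichever worker runs the task
        self._tasks.put((fn, args, out))
        return out

    def shutdown(self):
        # The stop sentinels are queued behind all pending tasks (FIFO).
        for _ in self._threads:
            self._tasks.put(self._STOP)
        for t in self._threads:
            t.join()


def demo_tiny_pool():
    pool = TinyPool(num_workers=2)
    outs = [pool.submit(pow, 2, n) for n in range(6)]
    pool.shutdown()
    values = [out[0][0] for out in outs]
    thread_names = {out[0][1] for out in outs}
    return values, thread_names
```

Six tasks are served by at most two thread names, which shows that the threads, not the Thread.start() calls, are what gets reused.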

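concurrent.futures.ThreadPoolExecutor(Executor) class
The thread pool class; it inherits from Executor.
- __init__(self, max_workers=None, thread_name_prefix='', ...)
  Lets you set the maximum number of threads and a prefix for the thread names.
- Executor.submit(self, fn, *args, **kwargs)
  Takes a callable and its arguments. Non-blocking: it returns a Future object immediately.
  Note that submit() does not block even when more tasks are submitted than max_workers; the extra tasks queue up in the background.
concurrent.futures.Future class
- Future.result(self, timeout=None)
  Returns the return value of the submitted function, blocking until it is available.

Example: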

import random
import time
from concurrent.futures import ThreadPoolExecutor, Future
from typing import Any


def do_something(name):
    """Simulate work: draw a random integer and sleep for that many seconds."""
    res = random.randint(0, 3)
    time.sleep(res)
    return f'name={name}, res={res}'


executor = ThreadPoolExecutor(max_workers=10)
task_list = []
for i in range(6):
    x: Future = executor.submit(do_something, f'task_{i}')
    task_list.append(x)

for task in task_list:
    # result() returns exactly what the function returned, with no wrapper type
    task_result_str: Any = task.result()
    print(task_result_str)
"""
name=task_0, res=0
name=task_1, res=3
name=task_2, res=0
name=task_3, res=1
name=task_4, res=0
name=task_5, res=3

"""
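Two further properties of Future.result() are worth knowing: it raises concurrent.futures.TimeoutError if the result is not ready within timeout, and it re-raises any exception that the task raised. The sketch below exercises both; slow_ok, broken, and demo_result_behaviour are illustrative names.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def slow_ok():
    time.sleep(0.3)
    return "done"


def broken():
    raise ValueError("boom")


def demo_result_behaviour():
    with ThreadPoolExecutor(max_workers=2) as executor:
        slow = executor.submit(slow_ok)
        bad = executor.submit(broken)

        try:
            slow.result(timeout=0.05)   # the task needs ~0.3 s, so this times out
            timed_out = False
        except FutureTimeout:
            timed_out = True

        try:
            bad.result()                # re-raises the ValueError from the worker thread
            error = None
        except ValueError as exc:
            error = str(exc)

        # A timeout does not cancel the task; a later result() still succeeds.
        return timed_out, slow.result(), error
```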

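1.3 Sustained, steady concurrency

Scenario: warm up a remote service with 20 threads sending requests non-stop.
Approach: submit() never blocks, so a semaphore is used to keep 20 requests in flight at any moment.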

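# coding:utf-8
import json
from concurrent.futures import ThreadPoolExecutor
from threading import Semaphore

import requests


url = 'http://abc.com/webservice/hello'
CONCURRENCY = 20

# One permit per in-flight request.
semaphore = Semaphore(value=CONCURRENCY)

def do_rpc():
    try:
        api_res: requests.models.Response = requests.get(url)
        res_dict = json.loads(api_res.text)
        print("res_dict['body']", res_dict['body'])
    except Exception as e:
        print(e)
    finally:
        # Free the permit so the main loop can submit the next request.
        semaphore.release()

# The pool needs as many workers as there are permits; with fewer workers
# the surplus tasks would only wait in the pool's internal queue.
executor = ThreadPoolExecutor(max_workers=CONCURRENCY)
i = 0
while True:
    # Block here while CONCURRENCY requests are already outstanding.
    semaphore.acquire()
    i += 1
    print(i)
    executor.submit(do_rpc)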
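The pattern can be checked offline by replacing the HTTP call with a short sleep and recording how many tasks are outstanding (submitted but not yet finished). This is a sketch under that assumption; run_bounded and fake_rpc are made-up names, and the loop is finite so that it terminates.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_bounded(total_tasks, limit, task_seconds=0.01):
    """Run total_tasks fake requests; return the peak number outstanding."""
    permits = threading.Semaphore(limit)
    lock = threading.Lock()
    outstanding = 0
    peak = 0

    def fake_rpc():
        nonlocal outstanding
        try:
            time.sleep(task_seconds)      # stand-in for the network call
        finally:
            with lock:
                outstanding -= 1
            permits.release()             # release last, after the bookkeeping

    with ThreadPoolExecutor(max_workers=limit) as executor:
        for _ in range(total_tasks):
            permits.acquire()             # blocks while `limit` tasks are outstanding
            with lock:
                outstanding += 1
                peak = max(peak, outstanding)
            executor.submit(fake_rpc)
    return peak
```

Without the semaphore the loop would enqueue tasks as fast as it can; with it, the number of outstanding tasks never exceeds limit.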

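1.4 Getting a thread task's return value

Without a thread pool
Define a Thread subclass by hand and assign the result to a field of the object.
With a thread pool
The object returned by concurrent.futures.Future.result() is the object returned by the submitted function.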

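2. Multiprocessing

2.1 Process API

class Process(process.BaseProcess) represents a single running process. We focus on the parent class's API.

BaseProcess.__init__(self, group=None, target=None, name=None, args=(), kwargs=None, *, daemon=None)
Same as the Thread constructor.
run(), start(), join(): similar to their Thread counterparts, so not repeated here.
is_alive(): whether the process is still running.
terminate(): forcibly stops the process (SIGTERM on POSIX, TerminateProcess() on Windows); exit handlers and finally clauses in the child do not run.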
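The calls above can be exercised with a short sketch. It assumes a POSIX system and explicitly selects the "fork" start method so that it needs no `if __name__ == '__main__':` guard; loop_forever and demo_terminate are illustrative names.

```python
import multiprocessing
import signal
import time

# Assumption: POSIX, where the "fork" start method is available.
ctx = multiprocessing.get_context("fork")


def loop_forever():
    while True:
        time.sleep(0.1)


def demo_terminate():
    p = ctx.Process(target=loop_forever)
    p.start()
    alive_before = p.is_alive()
    p.terminate()          # sends SIGTERM on POSIX
    p.join(timeout=5)      # reap the child so that exitcode is populated
    return alive_before, p.is_alive(), p.exitcode
```

With default signal handling, exitcode is the negated signal number, i.e. -signal.SIGTERM, which is how a terminated child can be told apart from one that returned normally (exitcode 0).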

OS API

os.getpid(): the id of the current process.
os.getppid(): the parent process id. If the parent process has already exited, Windows machines will still return its id; other systems will return the id of the 'init' process (1).
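A small sketch of the relationship between the two calls: the child reports its own ids through a queue, and the parent compares them with what it knows. It assumes POSIX with the "fork" start method; report_ids and demo_pids are illustrative names.

```python
import multiprocessing
import os

# Assumption: POSIX, where the "fork" start method is available.
ctx = multiprocessing.get_context("fork")


def report_ids(out_queue):
    out_queue.put((os.getpid(), os.getppid()))


def demo_pids():
    q = ctx.Queue()
    p = ctx.Process(target=report_ids, args=(q,))
    p.start()
    child_pid, child_ppid = q.get(timeout=5)  # read before join()
    p.join()
    return os.getpid(), p.pid, child_pid, child_ppid
```

The child's getpid() matches Process.pid as seen by the parent, and the child's getppid() matches the parent's own getpid().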

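2.2 Daemon processes

A daemon is a background process: it does not interact with the keyboard or mouse and stays resident in the background. In multiprocessing, daemon=True additionally means that the parent tries to terminate the child when the parent itself exits.
Articles online say that a daemon process exits automatically once the main process exits. The experiments in section 4 show that this holds only for a normal exit: the cleanup runs as part of the parent's normal interpreter shutdown, so a parent that is killed leaves its daemon children running.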

2.3 Process pool

multiprocessing.Pool
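A minimal sketch of Pool usage, assuming POSIX with the "fork" start method. Functions handed to a pool are pickled by reference, so they must be importable at module top level; built-ins (pow, len) are used here only to keep the example short. demo_pool is an illustrative name.

```python
import multiprocessing

# Assumption: POSIX, where the "fork" start method is available.
ctx = multiprocessing.get_context("fork")


def demo_pool():
    with ctx.Pool(processes=2) as pool:
        # apply_async() returns immediately with an AsyncResult.
        async_result = pool.apply_async(pow, args=(3, 3))
        # map() blocks and returns the results in input order.
        lengths = pool.map(len, ["a", "bb", "ccc"])
        single = async_result.get(timeout=10)
    return single, lengths
```

apply_async() plays the role that submit() plays for ThreadPoolExecutor, and AsyncResult.get() the role of Future.result().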

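3. Inter-process communication

Demo task:
The three stages {read data, model inference, persist results} run fully in parallel as separate processes to exploit multiple cores. Design:

1. Several processes read data and put it into in_queue.
2. The main process reads from in_queue, runs model inference on multiple GPUs, and writes the results to out_queue.
3. A single process reads the predictions from out_queue and writes them to the database.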

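"""
Demo task:
The three stages {read data, model inference, persist results} run fully in parallel
as separate processes to exploit multiple cores. Design:
1. Several processes read data and put it into in_queue.
2. The main process reads from in_queue, runs model inference on multiple GPUs, and writes the results to out_queue.
3. A single process reads the predictions from out_queue and writes them to the database.
"""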
from multiprocessing import Pool, Manager, Queue
import logging

logger = logging.getLogger(__name__)
parallel_pool_size = 10
parallel_queue_size = 500
reader_in_parallel_cnt = 8
GENERATE_TASK_ALL_FINISH_MESSAGE = "generate_task_all_finish_message"
GENERATE_TASK_PARTIAL_FINISH_MESSAGE = "generate_task_partial_finish_message"


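def write_to_in_queue_task(table_name, slice_id, slice_count, in_queue):
    """
    Read prediction samples from the database and put them into in_queue.
    """
    # foo is a placeholder for the table reader; it is not defined in this article.
    generator = foo(table_name=table_name, batch_size=2048,
                    slice_id=slice_id, slice_count=slice_count)
    try:
        for data in generator:
            in_queue.put(data)
    except Exception:
        logger.exception('reader for slice %d failed', slice_id)
    finally:
        # This reader is done (exhausted or failed): tell the consumer.
        in_queue.put(GENERATE_TASK_PARTIAL_FINISH_MESSAGE)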


def __queue_generator(upstream_producer_cnt: int, in_queue: Queue):
    """
    Read prediction samples from the queue until every producer has finished.
    """
    partial_finish_cnt = 0
    while True:
        # features{'valid_token_ids'}
        data = in_queue.get()
        # Check the type first: a data batch (for example an array) may not compare to a str with a plain bool.
        if isinstance(data, str) and data == GENERATE_TASK_PARTIAL_FINISH_MESSAGE:
            partial_finish_cnt += 1
            logger.info('__queue_generator depends on %d producers, %d have finished', upstream_producer_cnt, partial_finish_cnt)
            if partial_finish_cnt == upstream_producer_cnt:
                logger.info('__queue_generator depends on %d producers, all have finished', upstream_producer_cnt)
                break
        else:
            yield data


def write_from_queue_task(out_table_name, out_partition, out_queue: Queue):
    """
    Persist the prediction results.
    """
    # foo is a placeholder for the table writer; it is not defined in this article.
    with foo(out_table_name) as writer:
        while True:
            values = out_queue.get()
            if isinstance(values, str) and values == GENERATE_TASK_ALL_FINISH_MESSAGE:
                logger.info('received GENERATE_TASK_ALL_FINISH_MESSAGE; the with block closes odps_writer automatically')
                break
            else:
                writer.write(values=values)


if __name__ == '__main__':
    pool = Pool(parallel_pool_size)
    manager = Manager()
    # episode-in
    in_queue = manager.Queue(parallel_queue_size)
    out_queue = manager.Queue(parallel_queue_size)
    for i in range(reader_in_parallel_cnt):
        pool.apply_async(func=write_to_in_queue_task, args=('table_name', i, reader_in_parallel_cnt, in_queue))

    # episode-out
    pool.apply_async(func=write_from_queue_task, args=('out_table_name', 'out_partition', out_queue))
    logger.info('mc_write.write_from_queue_task process submitted')

    # model infers here
    # do_predict is a placeholder (not defined in this article); it is assumed to consume the
    # generator and to put GENERATE_TASK_ALL_FINISH_MESSAGE into out_queue when it is done.
    do_predict(__queue_generator(reader_in_parallel_cnt, in_queue), out_queue)
    # gracefully
    pool.close()
    logger.info('prediction finished; the writer may still be writing, waiting in join()')
    pool.join()
    logger.info('pool.join() returned, main process exits')
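Since foo and do_predict are placeholders, the listing above cannot be run as is. The following self-contained miniature keeps the same sentinel protocol (one "partial done" message per producer, one "all done" message for the writer) but swaps in simple stand-ins: producers emit integers, "inference" squares them, and the writer collects results in memory. It assumes POSIX with the "fork" start method and uses plain Process objects with multiprocessing queues rather than Pool with Manager queues; all names in it are illustrative.

```python
import multiprocessing

# Assumption: POSIX, where the "fork" start method is available.
ctx = multiprocessing.get_context("fork")
PARTIAL_DONE = "partial_done"   # sent once by every producer
ALL_DONE = "all_done"           # sent once to the writer


def producer(slice_id, slice_count, in_queue):
    """Hypothetical reader: emits its slice of the integers 0..19."""
    for n in range(slice_id, 20, slice_count):
        in_queue.put(n)
    in_queue.put(PARTIAL_DONE)


def consume(in_queue, producer_count):
    """Yield items until every producer has sent PARTIAL_DONE."""
    finished = 0
    while finished < producer_count:
        item = in_queue.get()
        if isinstance(item, str) and item == PARTIAL_DONE:
            finished += 1
        else:
            yield item


def writer(out_queue, done_queue):
    """Hypothetical sink: collects results instead of writing to a database."""
    collected = []
    while True:
        item = out_queue.get()
        if isinstance(item, str) and item == ALL_DONE:
            break
        collected.append(item)
    done_queue.put(sorted(collected))


def run_pipeline(producer_count=3):
    in_queue, out_queue, done_queue = ctx.Queue(8), ctx.Queue(8), ctx.Queue()
    procs = [ctx.Process(target=producer, args=(i, producer_count, in_queue))
             for i in range(producer_count)]
    procs.append(ctx.Process(target=writer, args=(out_queue, done_queue)))
    for p in procs:
        p.start()
    for n in consume(in_queue, producer_count):
        out_queue.put(n * n)               # stand-in for model inference
    out_queue.put(ALL_DONE)
    result = done_queue.get(timeout=10)    # read before join()
    for p in procs:
        p.join()
    return result
```

The bounded queues (size 8) give back-pressure: a fast producer blocks in put() until the main process has caught up, just as parallel_queue_size does in the listing above.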

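4. Exit relationship between parent and child processes

Processes have parent-child relationships.
On Linux, the ps command shows that the process with pid=18460 was created by the parent process with pid=18314 (see the PPID column).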

$ps -u yichu.dyc -lf
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
0 S yichu.d+ 18314     1  0  80   0 - 1160548 poll_s 17:42 ?      00:00:12 python local_entry.py
1 S yichu.d+ 18460 18314  1  80   0 - 1161772 poll_s 17:42 ?      00:00:32 python local_entry.py

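Child process exits

There is a notification mechanism that tells the parent process: on POSIX systems the kernel sends SIGCHLD to the parent, and the parent collects the child's exit status with wait()/waitpid(). Process.join() does this for multiprocessing children.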
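At the lowest level this can be observed with os.fork() and os.waitpid(). The sketch below is POSIX-only, and fork_and_wait is an illustrative name.

```python
import os


def fork_and_wait(exit_code):
    """Fork a child that exits with exit_code; return what the parent observes."""
    pid = os.fork()
    if pid == 0:
        # Child: leave immediately, skipping any cleanup inherited from the parent.
        os._exit(exit_code)
    # Parent: waitpid() blocks until the child has exited, then reaps it.
    reaped_pid, status = os.waitpid(pid, 0)
    return pid == reaped_pid, os.WIFEXITED(status), os.WEXITSTATUS(status)
```

Until the parent calls waitpid(), an exited child stays in the process table as a zombie, which is why long-running parents should always join or wait for their children.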

Parent process exits

Normal exit

import multiprocessing, os, time


def worker_loop_fn(worker_id):
    while True:
        print(f"worker_id={worker_id}, pid={os.getpid()}, ppid={os.getppid()}")
        time.sleep(1)


class MultiProgressPractice:
    @staticmethod
    def _clean_up_worker(w: multiprocessing.Process):
        try:
            print(f'enter _clean_up_worker, pid is {os.getpid()}')
            w.join(timeout=1)
        finally:
            if w.is_alive():
                w.terminate()

    def __init__(self, num_workers):
        self._workers = []
        print(f"main process, pid={os.getpid()}")
        for i in range(num_workers):
            w = multiprocessing.Process(target=worker_loop_fn, args=(i,))
            w.daemon = True
            w.start()
            self._workers.append(w)
        import atexit
        # atexit handlers run only on a normal interpreter exit.
        for w in self._workers:
            atexit.register(MultiProgressPractice._clean_up_worker, w)
        time.sleep(1)
        print("main process is about to exit")


if __name__ == '__main__':
    # pid = os.fork()
    MultiProgressPractice(2)

"""
main process, pid=30928
worker_id=0, pid=32008, ppid=30928
worker_id=1, pid=30932, ppid=30928
main process is about to exit
enter _clean_up_worker, pid is 30928
worker_id=0, pid=32008, ppid=30928
worker_id=1, pid=30932, ppid=30928
enter _clean_up_worker, pid is 30928
worker_id=0, pid=32008, ppid=30928
"""

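Killed unexpectedly

When the main process is killed, the child processes do not exit automatically. Handlers registered with atexit are not called when the program is killed by a signal that Python does not handle, so neither the cleanup above nor multiprocessing's own cleanup of daemon children runs. The children are adopted by init, the process with pid=1, as the output below shows (PPID is now 1).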

$ps -u yichu.dyc -lf
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
1 S yichu.d+ 18460     1  1  80   0 - 1161772 poll_s 17:42 ?      00:00:38 python local_entry.py
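The adoption of an orphan can be reproduced in a few lines with os.fork(). This is a POSIX-only sketch with illustrative names; the middle process simply exits instead of being killed, which leads to the same reparenting. On some systems the new parent is a "subreaper" process rather than pid 1, so the sketch only reports the new ppid without assuming its value.

```python
import os
import time


def observe_reparenting(timeout=5.0):
    """Return (pid of the exited parent, ppid the orphan saw afterwards)."""
    read_fd, write_fd = os.pipe()
    middle_pid = os.fork()
    if middle_pid == 0:
        # Middle process: plays the role of the parent that goes away.
        try:
            os.close(read_fd)
            my_pid = os.getpid()
            if os.fork() == 0:
                # Orphan-to-be: wait until the kernel has assigned a new parent.
                try:
                    deadline = time.monotonic() + timeout
                    while os.getppid() == my_pid and time.monotonic() < deadline:
                        time.sleep(0.01)
                    os.write(write_fd, str(os.getppid()).encode())
                finally:
                    os._exit(0)
        finally:
            os._exit(0)  # exit without waiting for the child
    # Original process: reap the middle process, then read the orphan's report.
    os.close(write_fd)
    os.waitpid(middle_pid, 0)
    report = os.read(read_fd, 64)
    os.close(read_fd)
    return middle_pid, int(report)
```

The second value differs from the first: once its parent is gone, the orphan's getppid() no longer returns the pid of the process that created it.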