Parallelism, Concurrency, and AsyncIO in Python - by example

A translation of the following tutorial:

This tutorial looks at how to speed up CPU-bound and IO-bound operations with multiprocessing, threading, and AsyncIO.

Concurrency vs Parallelism

Concurrency and parallelism are similar terms, but they are not the same thing.

Concurrency is the ability to have multiple tasks in progress on the CPU over the same period. Tasks can start, run, and complete in overlapping time periods, even if only one of them is executing at any given instant. On a single CPU, multiple tasks are run with the help of context switching: the state of a process is saved so that the process can be resumed and executed later.
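To make the idea of overlapping tasks concrete, here is a minimal sketch that interleaves two tasks on a single thread. Note the swap: it uses cooperative task switching in `asyncio` as a stand-in for the operating system's context switching described above, and the names `worker`, `main`, and `log` are made up for this illustration.

```python
import asyncio


async def worker(name, steps, log):
    """Record each step, then hand control back to the event loop."""
    for step in range(steps):
        log.append(f"{name}{step}")
        # Yield control so the event loop can switch to another task.
        await asyncio.sleep(0)


async def main():
    log = []
    # Both workers are in progress over the same period on one thread.
    await asyncio.gather(worker("A", 3, log), worker("B", 3, log))
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The log alternates between the two workers, which shows that neither task has to finish before the other one starts, even though only one of them runs at any instant.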

Parallelism, meanwhile, is the ability to run multiple tasks at the same time across multiple CPU cores.

Though they can increase the speed of your application, concurrency and parallelism should not be used everywhere. Which one to use, if any, depends on whether the task is CPU-bound or IO-bound.

Tasks that are limited by the CPU are CPU-bound. For example, mathematical computations are CPU-bound, since they finish faster as the available processing power increases. Parallelism is for CPU-bound tasks.

In theory, if a task is divided into n subtasks, each of these n subtasks can run in parallel, effectively reducing the running time to 1/n of the original non-parallel task. Concurrency is preferred for IO-bound tasks, as you can do something else while the IO resources are being fetched.
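As a rough worked version of the 1/n estimate above, assume the n subtasks are equal in size and fully independent, and that starting workers and combining results costs nothing. The symbols T_1 (time without parallelism) and T_n (time with n parallel subtasks) and the numbers are introduced here only for illustration:

```latex
T_n = \frac{T_1}{n},
\qquad \text{e.g. } T_1 = 60\,\text{s},\; n = 4
\;\Rightarrow\; T_4 = \frac{60\,\text{s}}{4} = 15\,\text{s}
```

In practice, process start-up, data transfer, and any part of the work that cannot be split keep the real speedup below this ideal.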

A good example of CPU-bound tasks is in data science. Data scientists deal with huge chunks of data. For data preprocessing, they can split the data into multiple batches and run them in parallel, effectively decreasing the total processing time. Increasing the number of cores results in faster processing.
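The batch-splitting approach described above might look roughly like the following sketch. It assumes the standard library's `concurrent.futures.ProcessPoolExecutor`; the helpers `split_into_batches`, `preprocess`, and `preprocess_parallel` are hypothetical names, and squaring each value is only a placeholder for a real preprocessing step.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_batches(data, n_batches):
    """Split data into n_batches contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        # The first `extra` batches each take one additional item.
        end = start + size + (1 if i < extra else 0)
        batches.append(data[start:end])
        start = end
    return batches


def preprocess(batch):
    """Placeholder CPU-bound step: square every value in the batch."""
    return [value * value for value in batch]


def preprocess_parallel(data, n_workers=4):
    """Process each batch in a separate worker process, keeping the order."""
    batches = split_into_batches(data, n_workers)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(preprocess, batches))
    # Flatten the per-batch results back into a single list.
    return [value for batch in results for value in batch]


if __name__ == "__main__":
    print(preprocess_parallel(list(range(10))))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start worker processes by re-importing the main module, it prevents each worker from launching its own pool.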

Web scraping is IO-bound: the task puts little load on the CPU, since most of the time is spent reading from and writing to the network. Other common IO-bound tasks include database calls and reading and writing files to disk. Web applications, such as those built with Django and Flask, are IO-bound applications.
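To illustrate why concurrency helps with this kind of workload, here is a small sketch that uses a thread pool to overlap several waits. To stay runnable offline it does not touch the network: `fake_fetch` is a hypothetical stand-in that sleeps instead of waiting on a socket, and the `example.com` URLs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url, delay=0.2):
    """Stand-in for a network call: sleep instead of waiting on a socket."""
    time.sleep(delay)
    return f"response from {url}"


def fetch_all(urls, max_workers=5):
    """Run the simulated requests in a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_fetch, urls))


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(5)]
    start = time.perf_counter()
    responses = fetch_all(urls)
    elapsed = time.perf_counter() - start
    print(f"{len(responses)} responses in {elapsed:.2f}s")
```

Because the threads spend their time waiting rather than computing, the waits overlap, and the total time should be close to a single delay rather than the sum of all of them.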

If you’re interested in learning more about the differences between threads, multiprocessing, and async in Python, check out the "Speeding Up Python with Concurrency, Parallelism, and asyncio" article.

Scenario

With that, let’s take a look at how to speed up the following tasks:

# tasks.py

import os
from multiprocessing import current_process
from threading import current_thread

import requests

def make_request(num):
    # io-bound

    pid = os.getpid()
    thread_name = current_thread().name
    process_name = current_process().name
    print(f"{pid} - {process_name} - {thread_name}")

    requests.get("https://httpbin.org/ip")

async def make_request_async(num, client):
    # io-bound

    pid = os.getpid()
    thread_name = current_thread().name
    process_name = current_process().name
    print(f"{pid} - {process_name} - {thread_name}")

    await client.get("https://httpbin.org/ip")

def get_prime_numbers(num):
    # cpu-bound

    pid = os.getpid()
    thread_name = current_thread().name
    process_name = current_process().name
    print(f"{pid} - {process_name} - {thread_name}")

    numbers = []

    prime = [True