python 并行处理
The idea of creating a practical guide for Python parallel processing with examples is actually not that old for me. We know that this is not really one of the main contents for Python. However, when the time comes to work with big data, we cannot be very patient about the time. And at this point, it’s no secret that we need new equipment to take on big tasks.
创建带有示例的Python并行处理实用指南的想法实际上对我来说并不那么古老。 我们知道,这实际上并不是Python的主要内容之一。 但是,当需要处理大数据时,我们不能对时间非常耐心。 在这一点上,我们需要新设备来承担重大任务已不是什么秘密。
This article will help you understand:
本文将帮助您了解:
Why is parallel processing and what is parallel processing?
为什么要进行并行处理,什么是并行处理?
Which function is used and how many processors can be used?
使用哪个功能以及可以使用多少个处理器?
What should I know before starting parallelization?
开始并行化之前我应该知道些什么?
How is any function parallelization?
函数如何并行化?
How to parallelize Pandas DataFrame?
如何并行化Pandas DataFrame?
1.为什么和什么? (1. Why and What?)
为什么需要并行处理? (Why do i need parallel processing?)
- A single process covers a separately executable piece of code 单个过程涵盖了单独的可执行代码段
- Some sections of code can be run simultaneously and allow, in principle, parallelization 某些代码段可以同时运行,并且原则上允许并行化
- Using the features of modern processors and operating systems, we can shorten the total execution time of a program, for example by using each core of a processor. 利用现代处理器和操作系统的功能,我们可以缩短程序的总执行时间,例如通过使用处理器的每个内核。
- You may need this to reduce the complexity of your program / code and outsource the workpieces to specialist agents who act as sub-processes. 您可能需要这样做,以减少程序/代码的复杂性,并将工件外包给充当子流程的专业代理商。
什么是并行处理? (What is the Parallel Processing?)
- Parallel processing is a mode of operation in which instructions are executed simultaneously on multiple processors on the same computer to reduce overall processing time. 并行处理是一种操作模式,其中指令在同一台计算机上的多个处理器上同时执行,以减少总体处理时间。
- It allows you to take advantage of multiple processors in one machine. In this way, your processes can be run in completely separate memory locations. 它使您可以在一台计算机上利用多个处理器。 这样,您的进程可以在完全独立的内存位置中运行。
2.使用什么功能? (2. What function is used?)
Using the standard multiprocessing
module, we can effi