Python Processes and Threads

熊航

Article link: https://blog.csdn.net/qq_33320337/article/details/86438090

Multiple processes

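Unix/Linux operating systems provide a fork() system call, which is unusual: it is called once but returns twice. The operating system copies the current process (the parent) to create a new one (the child), and the call then returns in both. In the child it always returns 0, while in the parent it returns the child's process ID. The reason is that a parent can fork many children, so it has to record each child's ID, whereas a child only needs to call getppid() to get its parent's ID.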

import os

print('Process (%s) start...' % os.getpid())
pid = os.fork()
if pid == 0:
    print('I am child process (%s) and my parent is %s.' % (os.getpid(), os.getppid()))
else:
    print('I (%s) just created a child process (%s).' % (os.getpid(), pid))

Output:

Process (10076) start...
I (10076) just created a child process (10077).
I am child process (10077) and my parent is 10076.

Windows has no fork call, so the code above cannot run on Windows, but it runs without problems on a Mac.
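The point about the parent recording each child's ID can be sketched as follows. This is a minimal illustration for Unix-like systems only; the helper name fork_children and the use of distinct exit codes are assumptions made for the example, not part of the original article.

```python
import os

def fork_children(count):
    """Fork `count` children and return {child_pid: exit_code} (Unix only)."""
    child_pids = []
    for index in range(count):
        pid = os.fork()
        if pid == 0:
            # Child branch: fork() returned 0. Exit at once with a distinct
            # code so the child never runs the rest of the parent's loop.
            os._exit(index)
        # Parent branch: fork() returned the new child's PID, so record it.
        child_pids.append(pid)

    exit_codes = {}
    for pid in child_pids:
        # waitpid() blocks until that specific child has exited.
        _, status = os.waitpid(pid, 0)
        exit_codes[pid] = os.WEXITSTATUS(status)
    return exit_codes

if __name__ == '__main__':
    for pid, code in fork_children(3).items():
        print('child %s exited with code %s' % (pid, code))
```

Without the recorded PIDs the parent would have no way to wait for a particular child, which is why fork() hands them back in the parent.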

multiprocessing

The multiprocessing module is the cross-platform way to work with multiple processes.

import os
from multiprocessing import Process
def run(name):
    print('Run child process %s (%s)...' % (name, os.getpid()))

if __name__ == '__main__':
    print('Parent process %s.' % os.getpid())
    p = Process(target = run, args = ('test',) )
    print('child process will start')
    p.start()
    p.join()
    print('child process end')

Output:

Parent process 10138.
child process will start
Run child process test (10139)...
child process end

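To create a child process, pass in the function to run and its arguments, then call start() to launch it. join() waits for the child process to finish before the parent continues.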
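To see what start() and join() do to a Process object, the sketch below inspects it before and after the child runs. The names worker and run_and_report are hypothetical, and having the child call sys.exit() with a chosen code is only a convenient way to make the result visible.

```python
import sys
from multiprocessing import Process

def worker(code):
    # Hypothetical task: finish immediately with the given exit status.
    sys.exit(code)

def run_and_report(code):
    """Run worker in a child process and report its state around join()."""
    p = Process(target=worker, args=(code,))
    before = p.exitcode      # None: the process has not been started yet
    p.start()
    p.join()                 # blocks here until the child has finished
    return before, p.is_alive(), p.exitcode

if __name__ == '__main__':
    print(run_and_report(3))
```

After join() returns, is_alive() is False and exitcode holds the child's exit status, so the parent can tell whether the child succeeded.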

Pool

A Pool creates child processes in batches:

import os, time, random
from multiprocessing import Pool

def time_task(name):
    print('run task %s (%s)' % (name, os.getpid()))
    start = time.time()
    time.sleep(random.random() * 5)
    end = time.time()
    print('Task %s runs %0.2f seconds' % (name, end - start))


if __name__ == '__main__':
    print('Parent process is %s' % os.getpid())
    p = Pool(3)
    for i in range(4):
        p.apply_async(time_task, args=(i,))
    print('waiting for all processes......')
    p.close()
    p.join()
    print('all done')

Output:

Parent process is 10230
waiting for all processes......
run task 0 (10231)
run task 1 (10232)
run task 2 (10233)
Task 1 runs 0.12 seconds
run task 3 (10232)
Task 2 runs 0.64 seconds
Task 3 runs 1.48 seconds
Task 0 runs 3.68 seconds
all done

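Calling join() on a Pool object waits for all child processes to finish. close() must be called before join(), and once close() has been called no new tasks can be submitted to the pool.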
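The tasks above only print, but apply_async() also returns a result object whose get() method yields the task's return value. The sketch below combines that with the close()/join() sequence; square and squares_in_pool are hypothetical names chosen for illustration.

```python
from multiprocessing import Pool

def square(x):
    # Hypothetical CPU task standing in for real work.
    return x * x

def squares_in_pool(values, workers=3):
    """Submit one task per value and collect the results in order."""
    pool = Pool(workers)
    pending = [pool.apply_async(square, args=(v,)) for v in values]
    pool.close()   # no further tasks may be submitted after this
    pool.join()    # wait until every worker process has exited
    # All tasks are finished by now, so get() returns without waiting.
    return [result.get() for result in pending]

if __name__ == '__main__':
    print(squares_in_pool([1, 2, 3, 4]))
```

The results come back in submission order because they are read from the list of result objects, not in the order the workers happened to finish.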

Inter-process communication

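Python's multiprocessing module wraps the underlying mechanisms and offers several ways to exchange data, including Queue and Pipe. Taking Queue as an example, the parent process creates two child processes: one writes data to the Queue and the other reads data from it: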

import os, time, random
from multiprocessing import Process, Queue

# Writer process: puts items on the queue
def write(q):
    print('Process to write: %s' % os.getpid())
    for i in ['A', 'B', 'C']:
        print('Put %s to queue....' % i)
        q.put(i)
        time.sleep(random.random())

# Reader process: takes items off the queue
def read(q):
    print('Process to read: %s' % os.getpid())
    while True:
        value = q.get(True)
        print('Get %s from queue' % value)

if __name__ == '__main__':
    # The parent creates the Queue and passes it to both children:
    q = Queue()
    pw = Process(target=write, args=(q,))
    pr = Process(target=read, args=(q,))
    # Start the writer child pw:
    pw.start()
    # Start the reader child pr:
    pr.start()
    # Wait for pw to finish:
    pw.join()
    # pr loops forever, so it cannot be joined; terminate it instead:
    pr.terminate()

Output:

Process to write: 10362
Put A to queue....
Process to read: 10363
Get A from queue
Put B to queue....
Get B from queue
Put C to queue....
Get C from queue
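Forcibly terminating the reader works here, but one common alternative is to send an end-of-stream marker so the reader can exit on its own. The sketch below is one way to do that under stated assumptions: SENTINEL, producer, consumer and pipeline are hypothetical names, and the second queue is only there to hand results back to the parent.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # hypothetical end-of-stream marker; real items must not be None

def producer(q, items):
    for item in items:
        q.put(item)
    q.put(SENTINEL)              # tell the consumer there is nothing more

def consumer(q, out):
    while True:
        value = q.get()          # blocks until an item is available
        if value is SENTINEL:
            break                # leave the loop cleanly, no terminate() needed
        out.put(value.lower())

def pipeline(items):
    """Send items through a producer and a consumer process."""
    q, out = Queue(), Queue()
    pw = Process(target=producer, args=(q, items))
    pr = Process(target=consumer, args=(q, out))
    pw.start()
    pr.start()
    # Drain the result queue before joining, so the consumer is never
    # stuck flushing data that nobody reads.
    results = [out.get() for _ in items]
    pw.join()
    pr.join()
    return results

if __name__ == '__main__':
    print(pipeline(['A', 'B', 'C']))
```

Because both processes end normally, join() can be used on each of them and no data is lost to an abrupt shutdown.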

Multithreading

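Python's standard library provides two modules: _thread and threading. _thread is the low-level module, and threading is the high-level module that wraps it. In the vast majority of cases you only need the high-level threading module: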

import time, threading

# Code executed by the new thread:
def loop():
    print('thread %s is running...' % threading.current_thread().name)
    n = 0
    while n < 5:
        n = n + 1
        print('thread %s >>> %s' % (threading.current_thread().name, n))
        time.sleep(1)
    print('thread %s ended.' % threading.current_thread().name)

print('thread %s is running...' % threading.current_thread().name)
t = threading.Thread(target=loop, name='LoopThread')
t.start()
t.join()
print('thread %s ended.' % threading.current_thread().name)

Output:

thread MainThread is running...
thread LoopThread is running...
thread LoopThread >>> 1
thread LoopThread >>> 2
thread LoopThread >>> 3
thread LoopThread >>> 4
thread LoopThread >>> 5
thread LoopThread ended.
thread MainThread ended.

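Every process starts one thread by default, called the main thread, and the main thread can in turn start new threads. The threading module has a current_thread() function that always returns the instance of the currently running thread. The main thread's instance is named MainThread.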
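As a small illustration of thread names and current_thread(), the sketch below starts several named threads and has each one report its own name. The helper names record_name and run_named_threads are assumptions made for this example.

```python
import threading

def record_name(names):
    # Each thread appends its own name; current_thread() returns the
    # Thread object for whichever thread is executing this line.
    names.append(threading.current_thread().name)

def run_named_threads(thread_names):
    """Start one thread per name and return the names they reported."""
    names = []
    threads = [
        threading.Thread(target=record_name, args=(names,), name=n)
        for n in thread_names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Threads may finish in any order, so sort for a stable result.
    return sorted(names)

if __name__ == '__main__':
    print(threading.main_thread().name)
    print(run_named_threads(['worker-b', 'worker-a']))
```

If no name is passed, Python assigns one automatically, so naming threads is optional but makes log output easier to follow.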

Lock

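The biggest difference between multithreading and multiprocessing is this: with multiple processes, each process has its own copy of every variable and the copies do not affect one another; with multiple threads, all variables are shared by all threads, so any variable can be modified by any thread. The greatest danger in sharing data between threads is therefore that several threads change the same variable at the same time and corrupt its contents.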

import time, threading

balance = 0
def change_it(n):
    # Add first, then subtract; the result should be 0:
    global balance
    balance = balance + n
    balance = balance - n

def run_thread(n):
    for i in range(100000):
        change_it(n)

t1 = threading.Thread(target=run_thread, args=(5,))
t2 = threading.Thread(target=run_thread, args=(8,))
t1.start()
t2.start()
t1.join()
t2.join()
print(balance)

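We define a shared variable balance with an initial value of 0 and start two threads that each deposit and then withdraw. In theory the result should be 0. However, a statement such as balance = balance + n is not a single step: the value is read, the sum is computed, and the result is stored back. Because thread scheduling is decided by the operating system, t1 and t2 can interleave between those steps, and with enough loop iterations balance may not end up as 0.
To make sure balance is computed correctly, change_it() needs a lock. When a thread starts executing change_it(), it holds the lock, so other threads cannot execute change_it() at the same time; they have to wait until the lock is released and they acquire it themselves. Since there is only one lock, at most one thread holds it at any moment no matter how many threads there are, so the modifications cannot conflict. A lock is created with threading.Lock():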

lock = threading.Lock()

def run_thread(n):
    for i in range(100000):
        # Acquire the lock first:
        lock.acquire()
        try:
            # Safe to modify now:
            change_it(n)
        finally:
            # Always release the lock when done:
            lock.release()
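A Lock can also be used as a context manager, which acquires it on entry and releases it on exit, even if an exception is raised. The sketch below is a self-contained variant of the example in that style; the run_demo helper and the iterations parameter are additions made for illustration.

```python
import threading

balance = 0
lock = threading.Lock()

def change_it(n):
    # Add first, then subtract; the net effect should be 0.
    global balance
    balance = balance + n
    balance = balance - n

def run_thread(n, iterations=100000):
    for _ in range(iterations):
        # "with lock" is equivalent to acquire() followed by
        # try/finally release().
        with lock:
            change_it(n)

def run_demo():
    """Run two threads against the shared balance and return its final value."""
    t1 = threading.Thread(target=run_thread, args=(5,))
    t2 = threading.Thread(target=run_thread, args=(8,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return balance

if __name__ == '__main__':
    print(run_demo())
```

Since every call to change_it() now runs while the lock is held, the two threads can no longer interleave inside it and the final balance stays at 0.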
