[Python] Understanding the Basics of Multithreading

Latest recommended article published on 2023-04-10 13:33:29

__淡墨青衫__

Category: 爬虫小白 (Crawler Beginner) | Tags: multithreading, GIL

Article link: https://blog.csdn.net/qq_38934189/article/details/109997976

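What multithreading means

A thread is the smallest unit that the operating system schedules for execution, and the smallest unit of execution inside a process.

Multithreading means running several threads within one process at the same time. Take a browser as a simplified picture: opening the browser starts a process, and inside it you can open several pages, some playing music and others playing video. They run at the same time without interfering with one another, which is multithreading at work.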

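Parallelism and concurrency

Concurrency

Concurrency means that only one instruction executes at any given instant, but the instructions of several threads are executed in rapid rotation. For example, a single processor runs thread A for a while, switches to thread B for a while, then switches back to thread A. Because the processor executes instructions and switches context so quickly, a person cannot perceive the switching, and at the macro level the threads appear to run simultaneously. At any given instant, however, only one thread is actually executing.

Parallelism

Parallelism means that at the same instant several instructions execute simultaneously on several processors. It depends on having more than one processor (or core), and the threads really do run at the same moment.

*Parallelism can exist only on a multi-processor system: with a single core it is impossible. Concurrency can exist on both single-processor and multi-processor systems, because one core is enough to achieve it.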
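The rotation described under concurrency can be imitated without any real threads. The sketch below is only an illustration: task and round_robin are hypothetical names, and generators stand in for threads that get switched out after each step on a single "processor".

```python
from collections import deque


def task(name, steps):
    """Hypothetical task: yields after every step, like a thread being switched out."""
    for i in range(1, steps + 1):
        yield f"{name}{i}"


def round_robin(tasks):
    """Run all tasks on one 'processor', one step at a time, in turn."""
    queue = deque(tasks)
    trace = []
    while queue:
        current = queue.popleft()
        try:
            trace.append(next(current))  # execute one step of the current task
            queue.append(current)        # then send it to the back of the queue
        except StopIteration:
            pass                         # the task has finished, so drop it
    return trace


trace = round_robin([task("A", 3), task("B", 2)])
print(trace)  # ['A1', 'B1', 'A2', 'B2', 'A3']
```

Only one step runs at a time, yet the steps of A and B are interleaved, which is what makes the tasks look simultaneous from the outside.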

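Implementing multithreading in Python

In Python, multithreading is provided by the threading module, which ships with the standard library. The following sections show how to use it.

Creating a child thread, method 1

Creating a child thread directly with Thread

First, a thread can be created with the Thread class. When creating it, set the target parameter to the function the thread should run. If that function needs extra arguments, pass them through the args parameter of Thread.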

import threading
import time

def target(second):
    print(f'Threading {threading.current_thread().name} is running')
    print(f'Threading {threading.current_thread().name} sleep {second}s')
    time.sleep(second)
    print(f'Threading {threading.current_thread().name} is ended')

# threading.current_thread().name returns the thread's name: MainThread for the
# main thread, and a name of the form Thread-* for a child thread
print(f'Threading {threading.current_thread().name} is running')

for i in [1, 5]:
    thread = threading.Thread(target=target, args=[i])
    thread.start()
print(f'Threading {threading.current_thread().name} is ended')

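Running this code shows that the main thread prints its final message first and the child threads finish afterwards. In other words, the main thread does not wait for the child threads: it runs straight through to its last statement.

To make the main thread wait until the child threads have finished, call the join method on each child thread object, as follows: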

# Method 1 (recommended): start every thread first, then join them all
threads = []
for i in [1, 5]:
    thread = threading.Thread(target=target, args=[i])
    threads.append(thread)
    thread.start()

for thread in threads:
    thread.join()
# Method 2: join inside the loop, so each thread finishes before the next one starts
for i in [1, 5]:
    thread = threading.Thread(target=target, args=[i])
    threads.append(thread)
    thread.start()
    thread.join()

This way the main thread has to wait for the child threads to finish before it continues and ends.
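The two join placements behave differently, which a small sketch can make visible. The names worker, join_after_starting_all and join_inside_loop are hypothetical, and the 0.1 second sleep is an arbitrary stand-in for real work.

```python
import threading
import time


def worker(name, events):
    """Record when this thread starts and ends."""
    events.append(f"start {name}")
    time.sleep(0.1)  # stand-in for real work
    events.append(f"end {name}")


def join_after_starting_all():
    """Method 1: start every thread, then join them all."""
    events = []
    threads = [threading.Thread(target=worker, args=(n, events)) for n in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


def join_inside_loop():
    """Method 2: join right after start, so the threads run one after another."""
    events = []
    for n in ("a", "b"):
        t = threading.Thread(target=worker, args=(n, events))
        t.start()
        t.join()  # blocks here until this thread has ended
    return events


print(join_after_starting_all())
print(join_inside_loop())
```

With method 1 both threads start before either ends, so their sleeps overlap. With method 2 the second thread cannot start until the first has ended, which gives up the benefit of running them concurrently.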

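Creating a child thread, method 2

Creating a child thread by subclassing Thread

A thread can also be created by subclassing Thread and putting the code the thread should execute in the run method of the class. The example above can be rewritten as follows: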

import threading

import time

class MyThread(threading.Thread):

    def __init__(self, second):
        threading.Thread.__init__(self)
        self.second = second

    def run(self):
        print(f'Threading {threading.current_thread().name} is running')
        print(f'Threading {threading.current_thread().name} sleep {self.second}s')
        time.sleep(self.second)
        print(f'Threading {threading.current_thread().name} is ended')

print(f'Threading {threading.current_thread().name} is running')

threads = []

for i in [1, 5]:
    thread = MyThread(i)
    threads.append(thread)
    thread.start()

for thread in threads:
    thread.join()

print(f'Threading {threading.current_thread().name} is ended')

When run, the two approaches produce the same result.
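One situation where subclassing can be convenient is when the thread should keep some state, such as a computed result. The following is a minimal sketch under that assumption; SquareThread and its result attribute are hypothetical names, not part of the threading module.

```python
import threading


class SquareThread(threading.Thread):
    """Hypothetical Thread subclass that stores its result on the instance."""

    def __init__(self, number):
        super().__init__()
        self.number = number
        self.result = None  # filled in by run()

    def run(self):
        self.result = self.number ** 2


threads = [SquareThread(n) for n in (2, 3, 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # after join() returns, run() has completed and result is set

results = [t.result for t in threads]
print(results)  # [4, 9, 16]
```

Reading result only after join guarantees that run has finished writing it.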

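Daemon threads

Threads have a concept called a daemon thread. Marking a thread as a daemon means it is "not important": if the main thread ends while the daemon thread is still running, the daemon thread is forcibly terminated. In Python a thread can be made a daemon with the setDaemon method. Since Python 3.10 that method is deprecated in favor of setting the daemon attribute, or passing daemon=True when creating the Thread.

[Example]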

import threading
import time

def target(second):
    print(f'Threading {threading.current_thread().name} is running')
    print(f'Threading {threading.current_thread().name} sleep {second}s')
    time.sleep(second)
    print(f'Threading {threading.current_thread().name} is ended')

print(f'Threading {threading.current_thread().name} is running')
t1 = threading.Thread(target=target, args=[2])
t1.start()
t2 = threading.Thread(target=target, args=[5])
t2.daemon = True  # mark t2 as a daemon thread (setDaemon() is deprecated since Python 3.10)
t2.start()

print(f'Threading {threading.current_thread().name} is ended')

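Here t2 is marked as a daemon thread, so once the main thread and the non-daemon thread t1 have finished, the program exits and t2 is terminated before it can print its final message. Note that join is not called here. If join were called on every thread, the main thread would still wait for each child thread to finish before exiting, whether or not it is a daemon thread.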
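The last point, that join waits even for a daemon thread, can be checked with a short sketch. It passes daemon=True to the Thread constructor; background and finished are hypothetical names used only for this illustration.

```python
import threading
import time

finished = []  # filled in by the background thread when it completes


def background(second):
    time.sleep(second)
    finished.append(second)


# daemon=True marks the thread as a daemon at creation time
t = threading.Thread(target=background, args=(0.1,), daemon=True)
t.start()
print(t.daemon)   # True

t.join()          # join() waits for the thread even though it is a daemon
print(finished)   # [0.1]
```

Without the call to join, the program could exit while the daemon thread is still sleeping, and finished would never be updated.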

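Mutex locks

Threads in the same process share that process's resources, including global variables. Suppose a process has a global variable count used as a counter, and every thread adds 1 to count when it runs. The code is as follows: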

'''
Protecting a shared counter with a lock (threading.Lock, which is not the GIL)
'''
import threading
import time

count = 0

class MyThread(threading.Thread):

    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        global count
        with lock:  # acquire the lock; it is released automatically when the block exits
            temp = count + 1
            time.sleep(0.001)
            count = temp


lock = threading.Lock()
threads = []

for _ in range(1000):
    thread = MyThread()
    thread.start()
    threads.append(thread)

for thread in threads:
    thread.join()

print(f'Final count: {count}')

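What does protecting with a lock mean? Before a thread operates on the data it must first acquire the lock. Any other thread that finds the lock held cannot continue and keeps waiting until the lock is released; only then can another thread acquire it and modify the data. This guarantees that only one thread operates on the data at a time, so multiple threads never read or modify the same data simultaneously, and the result is correct.

Without the lock, the final result is wrong, and it differs from run to run. count is shared and every thread reads its current value, but some of the threads run concurrently, so different threads may read the same value of count. Some of the increments are then lost and the final result is too small. (Try it and see.)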
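To compare the two cases side by side, the counter can be wrapped in a function that takes a flag. This is a sketch only: run_counter and use_lock are hypothetical names, and 50 threads are used instead of 1000 just to keep the run short.

```python
import threading
import time


def run_counter(n_threads, use_lock):
    """Increment a shared counter from n_threads threads, with or without a lock."""
    count = 0
    lock = threading.Lock()

    def increment():
        nonlocal count
        if use_lock:
            with lock:  # only one thread at a time may read-modify-write
                temp = count + 1
                time.sleep(0.001)
                count = temp
        else:
            temp = count + 1   # several threads may read the same value here
            time.sleep(0.001)  # widen the window so lost updates are likely
            count = temp

    threads = [threading.Thread(target=increment) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return count


print("with lock:   ", run_counter(50, use_lock=True))   # always 50
print("without lock:", run_counter(50, use_lock=False))  # usually far less than 50
```

The locked run always returns the number of threads. The unlocked result varies between runs and is not predictable, so no exact value is shown for it.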

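Problems with Python multithreading

Because of the GIL (Global Interpreter Lock) in standard CPython, only one thread can execute Python bytecode at any given moment, whether on a single core or on multiple cores, so Python multithreading cannot take advantage of multi-core parallelism. Threads are still useful for I/O-bound work such as web crawling, because the GIL is released while a thread waits on blocking I/O.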
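A rough way to see that waiting threads still overlap despite the GIL is to time a few threads that only sleep. In this sketch fake_io and timed_run are hypothetical names, and time.sleep stands in for waiting on a network response; exact timings depend on the machine.

```python
import threading
import time


def fake_io(second):
    time.sleep(second)  # stand-in for blocking I/O; the GIL is released while sleeping


def timed_run(n_threads, second):
    """Start n_threads threads that each wait `second` seconds; return the elapsed time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(second,)) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = timed_run(4, 0.2)
print(f"4 threads, 0.2s wait each: {elapsed:.2f}s elapsed")
```

If the waits ran one after another the total would be about 0.8 seconds; because they overlap, the elapsed time stays close to a single wait. The same would not hold for CPU-bound functions, which contend for the GIL.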