Python Multithreaded Programming: Inter-Thread Communication

Latest recommended article published on 2024-04-28 19:05:10

小宇不内向

Column: Python Concurrency, Parallelism, and Asynchronous Programming. Tags: Python multithreading, Queue, inter-thread communication

Article link: https://blog.csdn.net/xiaoyu_wu/article/details/102784565

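There are two common ways for Python threads to communicate:

1. Shared variables

Define a global variable, then declare it with the global keyword inside each thread function that uses it. (Strictly speaking, global is only required when a function rebinds the name; the append() and pop() calls below would also work without it.)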

import time

detail_url_list = []    # global variable shared by both thread functions

def get_detail_html():
    # Crawl the article detail pages
    global detail_url_list    # declare the name as global
    while True:
        if len(detail_url_list):
            url = detail_url_list.pop()
            print("get detail html started")
            time.sleep(2)
            print("get detail html end")

def get_detail_url():
    global detail_url_list    # declare the name as global
    while True:
        # Crawl the article list page
        print("get url started")
        time.sleep(2)
        for i in range(20):
            detail_url_list.append("http://projectstedu.com/{id}".format(id=i))
        print("get detail url end")

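With shared variables you have to take care of thread safety yourself. For example, with several consumer threads, the len(detail_url_list) check and the pop() that follows are two separate steps, so another thread can empty the list in between and pop() then raises IndexError.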
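One way to address the thread-safety concern is to guard the shared list with a threading.Lock. The sketch below is only an illustration under assumptions: the names list_lock, add_urls, take_url and consume are hypothetical and not part of the original code, and the crawling itself is left out.

```python
import threading

detail_url_list = []             # shared by all threads
list_lock = threading.Lock()     # hypothetical name; guards detail_url_list


def add_urls(count):
    # Producer: append URLs while holding the lock.
    for i in range(count):
        url = "http://projectstedu.com/{id}".format(id=i)
        with list_lock:
            detail_url_list.append(url)


def take_url():
    # Check and pop under one lock, so no other thread can empty
    # the list between the two steps.
    with list_lock:
        if detail_url_list:
            return detail_url_list.pop()
    return None


def consume(taken):
    # Consumer: drain the list, recording what this thread took.
    while True:
        url = take_url()
        if url is None:
            break
        taken.append(url)


add_urls(20)
results = [[] for _ in range(4)]
consumers = [threading.Thread(target=consume, args=(r,)) for r in results]
for t in consumers:
    t.start()
for t in consumers:
    t.join()

all_taken = [url for r in results for url in r]
print(len(all_taken), len(set(all_taken)))
```

Because the emptiness check and the pop() happen inside the same with block, every URL is handed to exactly one consumer, however the threads interleave.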

2. Synchronizing threads through a Queue

from queue import Queue

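Queue is thread-safe on its own: it does the necessary locking internally, so several threads can call put() and get() without any extra lock.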

from queue import Queue
import time
import threading


def get_detail_html(queue):
    # Crawl the article detail pages
    while True:
        url = queue.get()    # blocks until an item is available
        print("get detail html started")
        time.sleep(2)
        print("get detail html end")
        queue.task_done()    # mark this item as fully processed


def get_detail_url(queue):
    # Crawl the article list page
    while True:
        print("get detail url started")
        time.sleep(4)
        for i in range(20):
            queue.put("http://projectsedu.com/{id}".format(id=i))
        print("get detail url end")

detail_url_queue = Queue(maxsize=1000)    # maximum number of items the queue can hold

thread_detail_url = threading.Thread(target=get_detail_url, args=(detail_url_queue,))
thread_detail_url.start()

for i in range(10):    # start 10 consumer threads
    html_thread = threading.Thread(target=get_detail_html, args=(detail_url_queue,))
    html_thread.start()

# task_done() belongs in the consumer, once per get(). Calling it here in the
# main thread, before any item has been fetched, raises ValueError.
# join() blocks until every item put so far has been marked done.
detail_url_queue.join()
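The loops above never exit, so the program never finishes. As a rough sketch of how a finite run could shut down cleanly, the variant below has the producer put one sentinel per worker after the last URL. The STOP sentinel, NUM_WORKERS, the fetched list and the extra function parameters are assumptions added for illustration, and the actual crawling is replaced by recording the URL.

```python
import threading
from queue import Queue

STOP = None        # hypothetical sentinel that tells a worker to exit
NUM_WORKERS = 3    # assumed worker count for this sketch


def get_detail_url(queue, count):
    # Producer: queue a fixed number of URLs, then one sentinel per worker.
    for i in range(count):
        queue.put("http://projectsedu.com/{id}".format(id=i))
    for _ in range(NUM_WORKERS):
        queue.put(STOP)


def get_detail_html(queue, fetched, fetched_lock):
    # Consumer: handle items until the sentinel arrives.
    while True:
        url = queue.get()
        try:
            if url is STOP:
                return
            with fetched_lock:
                fetched.append(url)    # stand-in for fetching the page
        finally:
            queue.task_done()          # exactly one task_done() per get()


detail_url_queue = Queue(maxsize=1000)
fetched = []
fetched_lock = threading.Lock()

workers = [
    threading.Thread(target=get_detail_html,
                     args=(detail_url_queue, fetched, fetched_lock))
    for _ in range(NUM_WORKERS)
]
for w in workers:
    w.start()

producer = threading.Thread(target=get_detail_url, args=(detail_url_queue, 20))
producer.start()
producer.join()            # all URLs and sentinels are now queued
detail_url_queue.join()    # every queued item has been marked done
for w in workers:
    w.join()               # each worker has seen its sentinel and returned

print(len(fetched))
```

Putting task_done() in a finally clause keeps the count of unfinished tasks correct even for the sentinel, so detail_url_queue.join() can return.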

Understanding the relationship between join() and task_done() on a Queue.

Articles online usually explain it like this:

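If a thread takes an item from the queue but never calls task_done(), join() cannot tell that the work is finished, so a final join() never returns and the caller hangs. A convenient mental model is that each task_done() crosses one item off the queue, and join() lets the main thread continue once nothing is left. Strictly speaking, though, get() is what removes the item. task_done() only decrements the queue's internal count of unfinished tasks, which every put() increments, and join() waits for that count to reach zero, not for the queue length to reach zero.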

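My own understanding: task_done() signals join() that one fetched item has been fully processed. join() then checks whether the count of unfinished tasks has dropped to zero; if it has, join() stops blocking and control returns to the main thread. join() does not terminate the worker threads.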
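The difference between "the queue is empty" and "all tasks are done" can be checked with a small experiment. This is a sketch, not a definitive test: the names joined, wait_for_queue and the result flags are hypothetical, and the timeouts are arbitrary values chosen for the demonstration.

```python
import threading
from queue import Queue

q = Queue()
for i in range(3):
    q.put(i)

joined = threading.Event()    # set once q.join() has returned


def wait_for_queue():
    q.join()                  # waits for a task_done() per put()
    joined.set()


waiter = threading.Thread(target=wait_for_queue, daemon=True)
waiter.start()

# Take every item out: the queue is now empty ...
items = [q.get() for _ in range(3)]
empty_after_get = q.empty()

# ... yet join() is still blocked, because nothing was marked done.
still_blocked = not joined.wait(timeout=0.2)

# One task_done() per fetched item releases join().
for _ in items:
    q.task_done()
released = joined.wait(timeout=2)

# One call too many is rejected with ValueError.
try:
    q.task_done()
    extra_call_rejected = False
except ValueError:
    extra_call_rejected = True

print(empty_after_get, still_blocked, released, extra_call_rejected)
```

All four flags come out True: an empty queue alone does not release join(), the matching task_done() calls do, and a surplus task_done() raises ValueError.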

References:

Ways for Python threads to communicate

join() and task_done() on the Queue in multithreaded code
