Multithreading and Multiprocessing

Hereto.

Article link: https://blog.csdn.net/kuangxie4668/article/details/91431701

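The GIL in Python
Multithreaded programming in Python
Inter-thread communication: Queue
- Shared variables
- Queue
Thread synchronization (Lock, RLock, semaphores, Condition)
Thread pools with concurrent.futures
Multiprocess programming with multiprocessing
Inter-process communication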

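The GIL in Python

# GIL: global interpreter lock
# A Python thread corresponds to one native (C-level) thread.
# The GIL lets only one thread execute Python bytecode at any moment, so the CPU-bound threads of one process cannot run in parallel on multiple CPUs.

# The interpreter releases the GIL periodically so that other threads can run: after a fixed number of bytecode instructions in Python 2, and after a time slice (the switch interval, 5 ms by default) in Python 3.2 and later. A thread also releases the GIL while it waits on blocking I/O.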
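
The effect of the GIL on CPU-bound work can be observed with a small timing experiment. The sketch below is illustrative only: the helper names (count_down, timed_sequential, timed_threaded) and the loop size are made up for this example, and the measured times depend on the machine and the interpreter build. On a standard CPython build, the threaded run is usually not faster than the sequential one.

```python
import sys
import threading
import time


def count_down(n):
    """CPU-bound busy loop: pure bytecode, so the running thread holds the GIL."""
    while n > 0:
        n -= 1


def timed_sequential(n):
    """Run the loop twice, one after the other, and return the elapsed seconds."""
    start = time.perf_counter()
    count_down(n)
    count_down(n)
    return time.perf_counter() - start


def timed_threaded(n):
    """Run the loop in two threads at once and return the elapsed seconds."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # The time slice after which a running thread is asked to give up the GIL.
    print("switch interval (s):", sys.getswitchinterval())
    n = 2_000_000  # arbitrary workload size for the demo
    print("sequential:", timed_sequential(n))
    print("threaded:  ", timed_threaded(n))
```

For I/O-bound work such as the crawling examples in this article, threads still help, because a thread waiting on I/O does not hold the GIL.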

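Multithreaded programming in Python

Plain function-based threads

import threading
import time

def Get_Html():
    print("Started crawling the web page")
    time.sleep(2)
    print("Finished crawling the web page")

def Get_Url():
    print("Started crawling URLs")
    time.sleep(4)
    print("Finished crawling URLs")

if __name__ == "__main__":
    thread1 = threading.Thread(target=Get_Html)
    thread2 = threading.Thread(target=Get_Url)

    start_time = time.time()

    # daemon = True marks a thread as a daemon thread: once the main thread and all other
    # non-daemon threads have finished, the program exits and any daemon thread still running is stopped.
    # It must be set before start() is called.
    # The default is False for threads created from the main thread.
    # (The older setDaemon() method is deprecated since Python 3.10.)
    thread1.daemon = True
    thread2.daemon = True

    thread1.start()
    thread2.start()

    thread1.join()  # join() blocks the main thread until thread1 has finished
    thread2.join()

    print("Work finished, elapsed time: {}".format(time.time() - start_time))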

Implementing threads by subclassing Thread

import threading
import time


class GetDetaiHtml(threading.Thread):
    """
    A class that inherits from Thread overrides run() to define the thread's work
    """
    def __init__(self, name):
        super().__init__(name=name)

    def run(self):
        print("Started crawling the {} web page".format(self.name))
        time.sleep(2)
        print("Finished crawling the web page")

class GetDetaiUrl(threading.Thread):
    def __init__(self, name):
        super().__init__(name=name)

    def run(self):
        print("Started crawling {} URLs".format(self.name))
        time.sleep(4)
        print("Finished crawling URLs")

if __name__ == "__main__":
    thread1 = GetDetaiHtml("百度")
    thread2 = GetDetaiUrl("贴吧")

    thread1.start()
    thread2.start()

    thread1.join()
    thread2.join()

    print("Done")
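
A Thread subclass is also a convenient place to keep the result of the work, since run() cannot return a value to the caller. The sketch below is an illustration with a hypothetical SquareThread class and squares helper; the result is stored on the instance and read after join().

```python
import threading


class SquareThread(threading.Thread):
    """Hypothetical worker thread that stores its result on the instance."""

    def __init__(self, value):
        super().__init__(name="square-{}".format(value))
        self.value = value
        self.result = None  # filled in by run()

    def run(self):
        # run() executes in the new thread once start() is called.
        self.result = self.value * self.value


def squares(values):
    """Start one thread per value, wait for all of them, collect the results."""
    threads = [SquareThread(v) for v in values]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # after join() returns, t.result is safe to read
    return [t.result for t in threads]


if __name__ == "__main__":
    print(squares([1, 2, 3]))
```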

Inter-thread communication: Queue

There are two ways for threads to communicate:
Shared variables
Queue

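Shared variables

import time
import threading

detail_url_list = []  # shared variable

def get_detail_html(detail_url_list):
    # Crawl article detail pages
    while True:
        # Checking and then popping is only safe with a single consumer thread;
        # with several consumers, guard it with a lock or use a Queue instead.
        if len(detail_url_list):
            url = detail_url_list.pop()  # pop() removes one element from the list (the last one by default) and returns it
            print('Started fetching details from http://{url}'.format(url=url))
            time.sleep(2)
            print('Finished fetching details from http://{url}'.format(url=url))
        else:
            time.sleep(0.1)  # the list is empty: wait briefly instead of spinning

def get_detail_url(detail_url_list):
    # Crawl the article list
    while True:
        print('Started fetching article URLs')
        time.sleep(4)
        for i in range(20):
            detail_url_list.append(i)
        print('Finished fetching article URLs')

if __name__ == "__main__":
    # args must be a tuple; passing the list itself would unpack its elements as arguments
    thread1 = threading.Thread(target=get_detail_html, args=(detail_url_list,))
    thread2 = threading.Thread(target=get_detail_url, args=(detail_url_list,))

    thread1.start()
    thread2.start()

    # Both worker loops run forever, so these joins never return in this demo
    thread1.join()
    thread2.join()

    print('Web page crawling finished')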

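Queue

import time
from queue import Queue
import threading

def get_detail_html(queue):
    # Crawl article detail pages
    while True:
        url = queue.get()  # get() removes and returns one item, blocking while the queue is empty
        print('Started fetching details from http://{url}'.format(url=url))
        time.sleep(2)
        print('Finished fetching details from http://{url}'.format(url=url))
        queue.task_done()  # mark this item as fully processed

def get_detail_url(queue):
    # Crawl the article list
    while True:
        print('Started fetching article URLs')
        time.sleep(4)
        for i in range(20):
            queue.put(i)
        print('Finished fetching article URLs')

if __name__ == "__main__":

    detail_url_queue = Queue(maxsize=1000)

    thread1 = threading.Thread(target=get_detail_html, args=(detail_url_queue, ))
    for i in range(10):
        thread2 = threading.Thread(target=get_detail_url, args=(detail_url_queue, ))
        thread2.start()

    thread1.start()

    # task_done() and join() work as a pair.
    # q.task_done(): the consumer calls it once per item, after get() and after it has finished processing that item.
    # q.join(): blocks until every item that was put() has been marked done with task_done().
    # Here the producers sleep before their first put(), so the queue is still empty and join() returns at once;
    # in a real program, call join() only after the producers have put their items.
    detail_url_queue.join()

    print('Web page crawling finished')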
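
The pairing of task_done() and join() is easier to see in a version that terminates. The sketch below is an illustration, not the article's crawler: the SENTINEL marker, the function names, and the doubling "work" are all made up for the example. The producer puts a fixed number of items followed by a sentinel, and the consumer calls task_done() once for every item it takes.

```python
import threading
from queue import Queue

SENTINEL = None  # hypothetical marker that tells the consumer to stop


def producer(queue, count):
    """Put `count` integers on the queue, then the stop marker."""
    for i in range(count):
        queue.put(i)
    queue.put(SENTINEL)


def consumer(queue, results):
    """Take items until the stop marker arrives; record each doubled item."""
    while True:
        item = queue.get()
        try:
            if item is SENTINEL:
                break
            results.append(item * 2)  # stand-in for real processing
        finally:
            queue.task_done()  # exactly one task_done() per get(), even on break


def run_pipeline(count):
    queue = Queue(maxsize=100)
    results = []
    worker = threading.Thread(target=consumer, args=(queue, results))
    worker.start()
    producer(queue, count)  # the main thread acts as the producer
    queue.join()            # returns once every put() item was marked done
    worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(5))
```

Calling join() only after the producer has finished guarantees that it waits for real work rather than returning on an empty queue.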

Thread synchronization (Lock, RLock, semaphores, Condition)

Lock

'''
    Use a mutex when multiple threads modify a global variable.
    Key point: declare one global mutex that all threads share.

    1. Locking costs performance.
    2. Locks can cause deadlock.
'''
import threading
import time

counter = 0
mutex = threading.Lock()

class MyThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        global counter
        time.sleep(1)
        # "with" acquires the lock and always releases it, even if the body raises
        with mutex:
            counter += 1
            print("I am %s, set counter:%s" % (self.name, counter))

if __name__ == "__main__":
    for i in range(0, 100):
        my_thread = MyThread()
        my_thread.start()
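
To check that the lock really protects the counter, the final value can be compared with the expected total. This is a minimal sketch with a hypothetical increment_many helper; the thread and iteration counts are arbitrary.

```python
import threading


def increment_many(n_threads, n_increments):
    """Increment one shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(n_increments):
            # "counter += 1" is a read-modify-write sequence; the lock makes
            # sure no other thread interleaves between the read and the write.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(increment_many(4, 10000))
```

With the lock in place the result always equals n_threads * n_increments. Without it, updates can be lost, depending on the interpreter version and timing.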

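RLock: a reentrant lock

'''
Within a single thread, acquire() can be called several times in a row without blocking.
Make sure the number of release() calls matches the number of acquire() calls.
'''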
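
A typical case for RLock is a method that holds the lock and calls another method that takes the same lock. The sketch below uses a hypothetical Account class for illustration; with a plain Lock, deposit_all would deadlock on the nested acquire.

```python
import threading


class Account:
    """Hypothetical example class whose methods share one reentrant lock."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.RLock()

    def deposit(self, amount):
        with self._lock:  # acquired a second time when called from deposit_all
            self.balance += amount

    def deposit_all(self, amounts):
        with self._lock:  # first acquire by this thread
            for amount in amounts:
                self.deposit(amount)  # same thread re-acquires without blocking
        # Each "with" block releases once, so acquires and releases stay balanced.


if __name__ == "__main__":
    account = Account(10)
    account.deposit_all([1, 2, 3])
    print(account.balance)
```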

Condition: a condition variable, used for more complex synchronization between threads
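
A minimal sketch of how a Condition coordinates threads: two threads take strict turns appending to a log. The ping_pong function and the turn names are made up for this illustration. Each thread waits until it is its turn, does its step, hands the turn over, and notifies the other thread.

```python
import threading


def ping_pong(rounds):
    """Let two threads alternate strictly, `rounds` times each."""
    cond = threading.Condition()
    log = []
    turn = ["ping"]  # one-element list so the nested function can update it

    def player(name, other):
        for _ in range(rounds):
            with cond:
                # wait_for() releases the lock while waiting and re-checks
                # the predicate every time the thread is notified.
                cond.wait_for(lambda: turn[0] == name)
                log.append(name)
                turn[0] = other
                cond.notify_all()  # wake the other thread to re-check its turn

    threads = [
        threading.Thread(target=player, args=("ping", "pong")),
        threading.Thread(target=player, args=("pong", "ping")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    print(ping_pong(3))
```

Using wait_for() with a predicate, rather than a bare wait(), protects against a notification that arrives before the thread starts waiting.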

Thread pools with concurrent.futures
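
A minimal sketch of a thread pool, assuming the standard-library concurrent.futures module is what this section refers to. The get_html and crawl_all names and the sleep-based "download" are placeholders for illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def get_html(seconds):
    """Placeholder task: pretend a download takes `seconds`, then return it."""
    time.sleep(seconds)
    return seconds


def crawl_all(delays, max_workers=2):
    """Run get_html for every delay on a pool of worker threads."""
    # The "with" block waits for all submitted tasks before it exits.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in the order of the inputs,
        # not in the order in which the tasks finish.
        return list(executor.map(get_html, delays))


if __name__ == "__main__":
    print(crawl_all([0.2, 0.1, 0.3]))

    # submit() returns a Future; result() blocks until the task is done.
    with ThreadPoolExecutor(max_workers=2) as executor:
        future = executor.submit(get_html, 0.1)
        print(future.result())
```

Compared with creating Thread objects by hand, the pool limits the number of threads running at once and returns each task's result directly.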

Multiprocess programming with multiprocessing
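
Because each process has its own interpreter and its own GIL, CPU-bound work can run in parallel across processes. The following is a minimal sketch using multiprocessing.Pool; the square and parallel_squares names are hypothetical. The worker function must be defined at module level so that it can be sent to the worker processes, and the entry point must sit under the __main__ guard.

```python
import multiprocessing


def square(n):
    """CPU-bound placeholder task, executed inside a worker process."""
    return n * n


def parallel_squares(numbers, processes=2):
    """Distribute the inputs over a pool of worker processes."""
    with multiprocessing.Pool(processes=processes) as pool:
        # map() blocks until all results are ready and keeps the input order.
        return pool.map(square, numbers)


if __name__ == "__main__":
    # The guard keeps child processes from re-running this block
    # when they import the main module.
    print(parallel_squares([1, 2, 3, 4]))
```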

Inter-process communication
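
Processes do not share memory the way threads do, so a module-level list cannot serve as a shared variable, and queue.Queue only works between threads. A minimal sketch, assuming multiprocessing.Queue as the communication channel; the producer, drain, and run names and the None stop marker are made up for this example.

```python
import multiprocessing


def producer(queue, items):
    """Runs in the child process: send every item, then a stop marker."""
    for item in items:
        queue.put(item)
    queue.put(None)  # hypothetical stop marker


def drain(queue):
    """Receive items until the stop marker arrives."""
    received = []
    while True:
        item = queue.get()  # blocks until the other side has sent something
        if item is None:
            break
        received.append(item)
    return received


def run(items):
    queue = multiprocessing.Queue()
    child = multiprocessing.Process(target=producer, args=(queue, items))
    child.start()
    received = drain(queue)  # read before join() so the child can flush its data
    child.join()
    return received


if __name__ == "__main__":
    print(run(["a", "b", "c"]))
```

Items placed on a multiprocessing.Queue are pickled and sent through a pipe, so they must be picklable.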
