Learning Python, Day 12: Parallel Programming

Latest recommended article published on 2024-09-01 22:30:04

史壮

Article link: https://blog.csdn.net/weixin_41928342/article/details/84777953

1. Parallel programming:

Concepts:

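First, what is non-concurrent programming? The program consists of a single sequence of steps, so a program that contains independent subtasks still runs them one after another and performs poorly.

Concurrent programming, in contrast, is asynchronous and efficient: it breaks the work into subtasks and simplifies the program's flow and logic.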

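Process: an instance of a running program. Each process has its own address space, memory, data stack, and auxiliary data.

Thread: a flow of control inside a single process that can run in parallel with others. Threads in the same process share the same context (address space and data structures).
Advantage: sharing information and communicating between threads is easy. Drawback: differences in the order in which threads access shared data can produce inconsistent results (a race condition).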
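To make the race condition concrete, here is a minimal sketch. The names `run`, `unsafe_increment`, and `safe_increment` are invented for this illustration, and the `threading.Barrier` is used only to force the unlucky interleaving on every run; in real programs it shows up sporadically.

```python
import threading

N_THREADS = 5


def run(target):
    """Run `target` in N_THREADS threads against a fresh shared state dict."""
    state = {"counter": 0}
    barrier = threading.Barrier(N_THREADS)
    lock = threading.Lock()
    threads = [
        threading.Thread(target=target, args=(state, barrier, lock))
        for _ in range(N_THREADS)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]


def unsafe_increment(state, barrier, lock):
    value = state["counter"]      # every thread reads 0 ...
    barrier.wait()                # ... because nobody writes until all have read
    state["counter"] = value + 1  # each thread writes 1, so four updates are lost


def safe_increment(state, barrier, lock):
    barrier.wait()                # start together, as before
    with lock:                    # the read-modify-write is now one protected step
        state["counter"] += 1


if __name__ == "__main__":
    print("without lock:", run(unsafe_increment))
    print("with lock:   ", run(safe_increment))
```

The unsafe version loses updates because reading and writing the shared counter are separate steps; the lock makes them behave as one.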

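Python GIL (Global Interpreter Lock): Python code is executed by the Python virtual machine (the interpreter's main loop), and only one thread can execute in that main loop at any given time. In CPython, threads therefore do not speed up CPU-bound Python code, although they still help when threads spend most of their time waiting on I/O.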

2. Multithreading (Thread):

Running in a single thread:

import time


def worker(n):
    print("Function started at: {0}".format(time.ctime()))
    time.sleep(n)
    print("Function ended at: {0}".format(time.ctime()))


if __name__ == '__main__':
    worker(10)

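Multithreading with _thread. Python no longer recommends using this module directly. Its characteristics: it gives no control over when the process exits (when the main thread finishes, the other threads are killed), it offers only one synchronization primitive (the lock), and it has fewer features than threading. A thread is started with _thread.start_new_thread.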

import time
import _thread


def worker(n):
    print("Function started at: {0}".format(time.ctime()))
    time.sleep(n)
    print("Function ended at: {0}".format(time.ctime()))


def main():
    print('Main function started at: {}'.format(time.ctime()))
    _thread.start_new_thread(worker, (4,))
    _thread.start_new_thread(worker, (2,))

    # Keep the main thread alive long enough for both workers to finish.
    time.sleep(6)
    print('Main function ended at: {}'.format(time.ctime()))


if __name__ == '__main__':
    main()

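The duration in time.sleep(6) is adjustable. Without this sleep, "Main function ended at" is printed first and the program exits without waiting, because _thread does nothing to tie the end of the main function to the threads; the workers may never get to print their end messages. To see everything that worker prints, the time.sleep() in main must be at least as long as the longest-running thread. As developers, however, we usually do not know when a thread will finish, so no fixed duration is reliable. Locks can be used to meet this requirement; lock code is shown in section 4. Next comes the module that Python recommends instead.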
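One way to replace the guessed sleep with locks is sketched below, under the assumption that each worker is handed its own lock. `_thread.allocate_lock()` is the real call; the names `done_lock` and `finished` are made up for this example, and the sleep times are shortened.

```python
import _thread
import time

finished = []  # filled in by the workers so the outcome can be inspected


def worker(n, done_lock):
    time.sleep(n)
    finished.append(n)
    done_lock.release()          # signal "this worker is done"


def main():
    locks = []
    for n in (0.2, 0.1):
        done_lock = _thread.allocate_lock()
        done_lock.acquire()      # held until the worker releases it
        locks.append(done_lock)
        _thread.start_new_thread(worker, (n, done_lock))

    for done_lock in locks:
        done_lock.acquire()      # blocks until that worker has released
    print("all workers finished:", sorted(finished))


if __name__ == "__main__":
    main()
```

The main thread now waits exactly as long as the slowest worker, without knowing in advance how long that will be.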

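3. The threading module:

3.1: First approach: construct the thread directly with threading.Thread(target=target_function, args=(arg1, arg2, ...)). This is the most direct way to build a thread.

threading.current_thread().name returns the name of the current thread.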
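A small sketch of reading the thread name from inside a worker follows. The `name=` argument of `threading.Thread` is part of the standard library; the `worker-N` names and the `seen` list are assumptions made for this example.

```python
import threading

seen = []  # thread names recorded by the workers


def report():
    # Each worker asks which thread it is running in.
    seen.append(threading.current_thread().name)


def main():
    threads = [
        threading.Thread(target=report, name="worker-{}".format(i))
        for i in range(3)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("main is running in:", threading.current_thread().name)
    print("workers ran in:", sorted(seen))


if __name__ == "__main__":
    main()
```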

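# Multithreading with threading.Thread()
import time
import threading


def worker(n):
    print("Function started at: {0}".format(time.ctime()))
    time.sleep(n)
    print("Function ended at: {0}".format(time.ctime()))


def main():
    print('Main function started at: {}'.format(time.ctime()))
    # _thread.start_new_thread(worker,(4,))
    # _thread.start_new_thread(worker,(2,))

    threads = []
    t1 = threading.Thread(target=worker, args=(4,))
    threads.append(t1)

    t2 = threading.Thread(target=worker, args=(2,))
    threads.append(t2)

    # Starting the threads alone does not make the main thread wait for them.
    for t in threads:
        t.start()

    # Make the main thread wait until every child thread has finished.
    for t in threads:
        t.join()  # join() blocks until this child thread ends
    print('Main function ended at: {}'.format(time.ctime()))


if __name__ == '__main__':
    main()
"""Output when the join() loop is left out:
Main function started at: Mon Dec  3 20:32:56 2018
Function started at: Mon Dec  3 20:32:56 2018
Function started at: Mon Dec  3 20:32:56 2018
Main function ended at: Mon Dec  3 20:32:56 2018
Function ended at: Mon Dec  3 20:32:58 2018
Function ended at: Mon Dec  3 20:33:00 2018
"""
# Without join(), the main function finishes while the threads are still running:
# the main thread completes before the child threads do,
# because nothing synchronizes the main thread with its children.
# With the join() loop in place, the "Main function ended at" line is printed last.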
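The effect of join() can also be observed with `Thread.is_alive()`. This is a minimal sketch; the 0.3 second sleep is an arbitrary value chosen for the example.

```python
import threading
import time


def worker(n):
    time.sleep(n)


def main():
    t = threading.Thread(target=worker, args=(0.3,))
    t.start()
    alive_before = t.is_alive()   # checked right away: the worker is still sleeping
    t.join()                      # block until the worker returns
    alive_after = t.is_alive()    # the worker has finished by now
    return alive_before, alive_after


if __name__ == "__main__":
    print("alive before join, alive after join:", main())
```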

3.2: Second approach: implement multithreading by subclassing threading.Thread.

# Multithreading with a subclass of threading.Thread
import time
import threading


def worker(n):
    print("Function started at: {0}".format(time.ctime()))
    time.sleep(n)
    print("Function ended at: {0}".format(time.ctime()))


# Define our own thread subclass
class MuThread(threading.Thread):
    def __init__(self, func, args):
        threading.Thread.__init__(self)
        self.func = func
        self.args = args

    def run(self):
        # run() is what executes in the new thread after start() is called
        self.func(*self.args)


def main():
    print('Main function started at: {}'.format(time.ctime()))

    threads = []
    # t1 = threading.Thread(target=worker,args=(4,))
    t1 = MuThread(worker, (4,))
    threads.append(t1)

    # t2 = threading.Thread(target=worker,args=(2,))
    t2 = MuThread(worker, (2,))
    threads.append(t2)

    # Starting the threads alone does not make the main thread wait for them.
    for t in threads:
        t.start()

    # Make the main thread wait until every child thread has finished.
    for t in threads:
        t.join()  # join() blocks until this child thread ends
    print('Main function ended at: {}'.format(time.ctime()))


if __name__ == '__main__':
    main()
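One thing a subclass makes easy is keeping the value that the target function returns. The sketch below is an illustration only: `ResultThread`, its `result` attribute, and `square` are hypothetical names, not part of the threading module.

```python
import threading


class ResultThread(threading.Thread):
    """Hypothetical Thread subclass that remembers what its function returned."""

    def __init__(self, func, args=()):
        super().__init__()
        self.func = func
        self.args = args
        self.result = None

    def run(self):
        # run() executes in the new thread; store the value for later.
        self.result = self.func(*self.args)


def square(x):
    return x * x


def main():
    threads = [ResultThread(square, (n,)) for n in (2, 3, 4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()          # results are only safe to read after join()
    return [t.result for t in threads]


if __name__ == "__main__":
    print(main())
```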

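4. Synchronization primitives (locks):

Why use a lock? Consider putting eggs in a basket: when three people each put in a batch of eggs at the same time, their eggs end up interleaved instead of each person's batch staying together.

Each person stands for a thread, and this is where a lock, one of the synchronization primitives, is needed.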

import threading
import time
import random

eggs = []
lock = threading.Lock()


# Put n eggs into the shared list
def put_egg(n, lst):
    # Take the lock before placing any eggs.
    # lock.acquire()  would take the key explicitly
    with lock:
        for i in range(1, n + 1):
            # Pause for a random number of seconds before placing each egg
            time.sleep(random.randint(0, 2))
            lst.append(i)
    # lock.release()  would hand the key back explicitly


def main():
    threads = []

    # Create the threads
    for i in range(3):
        t = threading.Thread(target=put_egg, args=(5, eggs))
        threads.append(t)
    # Start the threads
    for t in threads:
        t.start()

    # Make the main thread wait
    for t in threads:
        t.join()
    print(eggs)


if __name__ == '__main__':
    main()

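That is the egg-placing operation. Note that locking reduces performance accordingly, because the threads now run the locked section one at a time.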
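The commented-out `lock.acquire()` and `lock.release()` calls in the listing correspond to the explicit form of `with lock:`. The sketch below shows both forms side by side; the shared `counter` and the function names are assumptions made for this example.

```python
import threading

lock = threading.Lock()
counter = 0


def add_with_statement(times):
    global counter
    for _ in range(times):
        with lock:               # acquire on entry, release on exit
            counter += 1


def add_explicit(times):
    global counter
    for _ in range(times):
        lock.acquire()
        try:
            counter += 1
        finally:
            lock.release()       # always release, even if the body raises


def main():
    global counter
    counter = 0
    threads = [
        threading.Thread(target=add_with_statement, args=(1000,)),
        threading.Thread(target=add_explicit, args=(1000,)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(main())
```

The `with` form is usually preferable because the release cannot be forgotten.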

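5. Queues (the queue module):

The queue module provides a first-in, first-out (FIFO) queue, queue.Queue.
It also provides a last-in, first-out (LIFO) queue, queue.LifoQueue, which other languages call a stack.
Finally, there is a priority queue, queue.PriorityQueue.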
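The difference between the three queue types is only the order in which items come back out. A short sketch, with `drain` and `demo` as helper names invented for this example:

```python
import queue


def drain(q):
    """Remove and return every item currently in the queue."""
    items = []
    while not q.empty():
        items.append(q.get())
    return items


def demo():
    fifo = queue.Queue()
    lifo = queue.LifoQueue()
    prio = queue.PriorityQueue()
    for item in (3, 1, 2):       # same items, same insertion order
        fifo.put(item)
        lifo.put(item)
        prio.put(item)
    return drain(fifo), drain(lifo), drain(prio)


if __name__ == "__main__":
    fifo_order, lifo_order, prio_order = demo()
    print("FIFO:    ", fifo_order)   # insertion order
    print("LIFO:    ", lifo_order)   # reverse insertion order
    print("Priority:", prio_order)   # smallest item first
```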

Here is a FIFO queue in action.

import threading
import queue
import time
import random


# Producer: puts five random items into the queue
def produce(date_queue):
    for i in range(5):
        time.sleep(0.5)
        item = random.randint(1, 100)
        date_queue.put(item)
        print('{} put item {} into the queue'.format(threading.current_thread().name, item))


# Consumer: takes items until the queue stays empty for 3 seconds
def consumer(date_queue):
    while True:
        try:
            item = date_queue.get(timeout=3)
            print('{} removed {} from the queue'.format(threading.current_thread().name, item))
        except queue.Empty:
            break
        else:
            date_queue.task_done()


def main():
    q = queue.Queue()

    threads = []
    p = threading.Thread(target=produce, args=(q,))
    p.start()

    for i in range(2):
        c = threading.Thread(target=consumer, args=(q,))
        threads.append(c)
    for t in threads:
        t.start()

    for t in threads:
        t.join()

    # Returns once every item that was put has been marked done
    q.join()


if __name__ == '__main__':
    main()
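The consumers above stop after the queue has been empty for three seconds. A common alternative, sketched here as an assumption rather than the article's method, is for the producer to send one end marker per consumer. `SENTINEL`, `out`, and `out_lock` are names invented for this example.

```python
import queue
import threading

SENTINEL = None  # hypothetical "no more data" marker


def producer(q, items, n_consumers):
    for item in items:
        q.put(item)
    for _ in range(n_consumers):
        q.put(SENTINEL)          # one marker per consumer


def consumer(q, out, out_lock):
    while True:
        item = q.get()           # blocks until something is available
        if item is SENTINEL:
            q.task_done()
            break
        with out_lock:
            out.append(item)
        q.task_done()


def main():
    q = queue.Queue()
    out, out_lock = [], threading.Lock()
    n_consumers = 2
    consumers = [
        threading.Thread(target=consumer, args=(q, out, out_lock))
        for _ in range(n_consumers)
    ]
    for c in consumers:
        c.start()
    p = threading.Thread(target=producer, args=(q, [10, 20, 30, 40, 50], n_consumers))
    p.start()
    p.join()
    q.join()                     # every put() has been matched by task_done()
    for c in consumers:
        c.join()
    return sorted(out)


if __name__ == "__main__":
    print(main())
```

With markers, the consumers exit as soon as the work is done instead of waiting out a timeout.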

6. The multiprocessing module: it makes full use of multiple cores and multiple CPUs and suits CPU-bound tasks. Its usage closely mirrors that of threading.
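As a minimal sketch of that similarity, the earlier threading example can be written with `multiprocessing.Process` in place of `threading.Thread`. The sleep times and the use of `os.getpid()` to show which process is running are choices made for this example.

```python
import multiprocessing
import os
import time


def worker(n):
    print("process {} started, sleeping {}s".format(os.getpid(), n))
    time.sleep(n)
    print("process {} finished".format(os.getpid()))


def main():
    procs = [
        multiprocessing.Process(target=worker, args=(n,))
        for n in (0.2, 0.1)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()                 # wait for each child process, as with threads
    return [p.exitcode for p in procs]


if __name__ == "__main__":
    print("exit codes:", main())
```

The `if __name__ == "__main__":` guard matters more here than with threads: on platforms that start child processes by re-importing the main module, it keeps the children from launching processes of their own.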

7. More parallel programming: the concurrent.futures module. It provides two main classes: ThreadPoolExecutor, a thread pool, and ProcessPoolExecutor, a process pool.

# Exploring the concurrent.futures module
import time
import concurrent.futures

number = list(range(1, 11))


def count(n):
    # CPU-bound busy work
    for i in range(1000000):
        i += i
    return i * n


def worker(x):
    result = count(x)
    print("Number {}: the result is {}".format(x, result))


# Sequential execution
def sequential_execution():
    # time.clock() was removed in Python 3.8; perf_counter() replaces it
    start_time = time.perf_counter()
    for i in number:
        worker(i)
    print('Sequential execution took: {} seconds'.format(time.perf_counter() - start_time))


# Thread pool execution
def threading_execution():
    start_time = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        for i in number:
            executor.submit(worker, i)
    print('Thread pool took: {}'.format(time.perf_counter() - start_time))


# Process pool execution
def process_excution():
    start_time = time.perf_counter()
    with concurrent.futures.ProcessPoolExecutor(max_workers=5) as executer:
        for i in number:
            executer.submit(worker, i)
    print('Process pool took: {}'.format(time.perf_counter() - start_time))


if __name__ == '__main__':
    # The trailing comments are the timings recorded by the author.
    sequential_execution()  # Sequential execution took: 0.6267245815110348 seconds
    threading_execution()   # Thread pool took: 0.5682613913639004
    process_excution()      # Process pool took: 0.558646438043795

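The above compares the three approaches. In practice, prefer multiple processes when the task is CPU-bound; when it is I/O-bound, threads release the GIL while they wait, so use multithreading.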
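For I/O-bound work a thread pool is usually enough, since a thread that is waiting lets the others run. The sketch below simulates the waiting with `time.sleep`; `fake_download` and the 0.2 second delay are stand-ins invented for this example, not a real network call.

```python
import concurrent.futures
import time


def fake_download(n):
    """Stand-in for an I/O-bound call: it only waits, then returns a value."""
    time.sleep(0.2)
    return n * 10


def run_sequential(items):
    start = time.perf_counter()
    results = [fake_download(n) for n in items]
    return results, time.perf_counter() - start


def run_thread_pool(items):
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # map() returns the results in the same order as the inputs
        results = list(executor.map(fake_download, items))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    items = [1, 2, 3, 4, 5]
    print("sequential:  {} in {:.2f}s".format(*run_sequential(items)))
    print("thread pool: {} in {:.2f}s".format(*run_thread_pool(items)))
```

Because the five waits overlap in the pool, its elapsed time should be close to one wait rather than five.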