Python Basics: Shenlan Academy (深蓝学院) Exercise Answers (Part 7)

Article link: https://blog.csdn.net/weixin_40127330/article/details/103823880

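1. How do threads and processes differ in Python, and what problem does the Python GIL cause?

A process is one execution of a program. Each process has its own memory space, data stack, and so on, so processes cannot share information directly and must use inter-process communication (IPC).

A thread is the smallest unit of execution within a process and is an entity inside that process. It shares all of the resources owned by the process with the other threads of the same process.

Differences between processes and threads:

* (1) Scheduling: the thread is the basic unit of scheduling and dispatch; the process is the basic unit of resource ownership.

* (2) Concurrency: processes can run concurrently with each other, and so can multiple threads within the same process.

* (3) Resource ownership: a process is an independent unit that owns resources; a thread owns no system resources of its own but can access the resources of the process it belongs to.

* (4) System overhead: creating or destroying a process requires the system to allocate or reclaim its resources, so it costs noticeably more than creating or destroying a thread.

The biggest problem with the GIL (global interpreter lock) is that a multithreaded Python program cannot take advantage of multiple CPU cores: in standard CPython only one thread executes Python bytecode at a time, so CPU-bound threads effectively run one after another. I/O-bound threads are less affected because the GIL is released while a thread waits on blocking I/O.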
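The point that threads share the memory of their process can be seen in a few lines. This is a minimal sketch; the names `shared` and `worker` are made up for the illustration.

```python
import threading

# A plain module-level list: every thread in this process sees the same object.
shared = []


def worker(tag):
    # Each thread appends its own tag to the shared list.
    shared.append(tag)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All four tags are visible from the main thread (arrival order may vary).
print(sorted(shared))
```

Separate processes would each get their own copy of `shared`, which is why they need a Queue, a Value, or another IPC mechanism instead.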

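2. Programming exercise: create a thread that prints a random string and exits after printing it 10 times (using both a thread function and a thread class)

Reference: https://www.cnblogs.com/yeayee/p/4952022.html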

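from threading import Thread

def sayhi(name):
    # Print the greeting 10 times, then return so the thread exits.
    for _ in range(10):
        print('%s say hello' % name)

if __name__ == '__main__':
    # args must be a tuple, hence the trailing comma
    t = Thread(target=sayhi, args=('hh',))
    t.start()
    t.join()  # wait for the worker thread to finish
    print('main thread')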

from threading import Thread

class Sayhi(Thread):
    def __init__(self, name):
        # Thread already manages a `name` attribute, so pass it through.
        super().__init__(name=name)

    def run(self):
        # run() executes in the new thread once start() is called.
        print('I am here')
        for _ in range(10):
            print('%s say hello' % self.name)


if __name__ == '__main__':
    t = Sayhi('hi')
    t.start()
    t.join()
    print('main thread')
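The exercise asks for random strings, while both answers above print a fixed greeting. A minimal sketch of a variant that generates a new random string on each iteration might look like this; `random_string` and `print_random_strings` are hypothetical helper names, and the default length of 8 is an arbitrary choice.

```python
import random
import string
from threading import Thread


def random_string(length=8):
    # Build a string of `length` random ASCII letters and digits.
    alphabet = string.ascii_letters + string.digits
    return ''.join(random.choice(alphabet) for _ in range(length))


def print_random_strings(times=10):
    # Print a fresh random string `times` times, then return so the thread exits.
    for _ in range(times):
        print(random_string())


t = Thread(target=print_random_strings)
t.start()
t.join()
```

The same `print_random_strings` body could be moved into the `run()` method of a Thread subclass for the class-based version.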

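3. Programming exercise: create several threads that together count n times, using a Lock for control, so that the final count is n

Reference: https://blog.csdn.net/comprel/article/details/72798354

See also python入门——并发编程 (Getting Started with Python: Concurrent Programming), section 2.1.1, "Mutex Lock"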
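No answer code is listed for this exercise, so here is one possible sketch. It assumes the n increments are split as evenly as possible across a fixed number of threads; `count_with_lock` and its parameters are names invented for this example.

```python
import threading


def count_with_lock(n, num_threads=4):
    counter = 0
    lock = threading.Lock()

    def worker(times):
        nonlocal counter
        for _ in range(times):
            # The lock makes the read-modify-write on counter atomic.
            with lock:
                counter += 1

    # Split n increments across the threads; the first `extra` threads do one more.
    base, extra = divmod(n, num_threads)
    threads = [
        threading.Thread(target=worker, args=(base + (1 if i < extra else 0),))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(count_with_lock(100000))
```

Without the `with lock:` block the increments from different threads can interleave and the final count may come out lower than n.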

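4. Programming exercise: employees must swipe a card to get through the company door. Create one thread as the "door" and several threads as "employees". An employee who sees that the door is closed swipes a card; once the card is swiped the door opens and the employee can pass through.

Reference: https://blog.csdn.net/fly910905/article/details/77076119

Notes to self on this exercise: first, door_swiping_event.set() can be called on the shared, module-level Event from any thread; second, prefer function-style threads where possible; third, my earlier attempt did not account for how long the door stays open. This exercise is worth rereading and thinking over.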

import threading
import time
import random

def door():
    door_open_time_counter = 0
    while True:
        if door_swiping_event.is_set():
            print('door opening...')
            door_open_time_counter += 1
        else:
            print('door closed..., swipe to open')
            door_open_time_counter = 0
            door_swiping_event.wait()  # block until someone swipes a card

        if door_open_time_counter > 3:
            # the door has been open long enough, so close it again
            door_swiping_event.clear()
        time.sleep(0.5)

def staff(n):  # function-style threads are more convenient than subclassing Thread
    print('staff %d is coming...' % n)
    while True:
        if door_swiping_event.is_set():  # check the event: if it is set, the door is open, so pass through
            print('door is opened, %d passing...' % n)
            break
        else:
            print('staff %d sees door got closed, swiping the card...' % n)
            door_swiping_event.set()  # the key step: any thread may call set() on the shared Event
            print('after set, %d passing ...' % n)
        time.sleep(0.5)


if __name__ == '__main__':
    door_swiping_event = threading.Event()  # module-level event shared by all threads
    # daemon=True so the endless door loop does not keep the program alive once all staff have passed
    door_thread = threading.Thread(target=door, daemon=True)
    door_thread.start()

    for i in range(5):
        p = threading.Thread(target=staff, args=(i,))  # create a staff thread
        time.sleep(random.randrange(3))  # random.randrange(3) returns 0, 1, or 2
        p.start()

5. Programming exercise: thread synchronization, the classic producer-consumer problem

Implemented with Queue

Implemented with Semaphore

Reference: https://blog.csdn.net/u010339879/article/details/82914139 (Queue)

https://blog.csdn.net/weixin_34355559/article/details/91481850 (Semaphore)

import time
import queue
import threading
import random

# Unique sentinel that tells the consumer no more items are coming.
FINISHED = object()

def producer(q):
    for i in range(10):
        print('%d is producing to the queue!' % i)
        q.put(i)
        time.sleep(random.randint(1, 10) * 0.1)

    q.put(FINISHED)
    print('finished!')

def consumer(q):
    while True:
        value = q.get()  # blocks until an item is available
        if value is FINISHED:
            break
        print("{} in the queue is consumed!".format(value))
    print('finished!')

if __name__ == '__main__':
    # named q so that it does not shadow the queue module
    q = queue.Queue()
    producer1 = threading.Thread(target=producer, args=(q,))
    consumer1 = threading.Thread(target=consumer, args=(q,))

    producer1.start()
    consumer1.start()

    producer1.join()
    consumer1.join()

import threading
import random
import time
# The consumer can only obtain the product (access the resource) after the producer has finished
# producing it (released the semaphore); until then the consumer thread is suspended, waiting on the semaphore.
semaphore = threading.Semaphore(0)

# the resource being produced
item_number = 0


# consumer
def consumer():
    print('Consumer is waiting for Producer')

    # wait to acquire the semaphore
    semaphore.acquire()

    print('get the product , number is {}'.format(item_number))


# producer
def producer():
    global item_number

    # simulate the production process
    time.sleep(2)
    item_number = random.randint(1, 100)
    time.sleep(2)

    print('made the product , number is {}'.format(item_number))

    # release the semaphore
    semaphore.release()


if __name__ == "__main__":
    for i in range(5):
        # wrap the producer and consumer functions in threads
        thread_consumer = threading.Thread(target=consumer)
        thread_producer = threading.Thread(target=producer)

        thread_consumer.start()
        thread_producer.start()

        thread_consumer.join()
        thread_producer.join()

    print('consumer-producer example end.')
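The semaphore answer above hands over one item per producer/consumer pair. The textbook bounded-buffer form of the problem usually uses two semaphores plus a mutex. The sketch below is one such arrangement, not the referenced article's code; `CAPACITY`, `empty_slots`, `filled_slots`, and `consumed` are names chosen for the illustration.

```python
import threading
from collections import deque

CAPACITY = 3                                   # assumed buffer size
buffer = deque()
empty_slots = threading.Semaphore(CAPACITY)    # counts free slots in the buffer
filled_slots = threading.Semaphore(0)          # counts items ready to consume
mutex = threading.Lock()                       # protects the deque itself
consumed = []


def producer(n):
    for item in range(n):
        empty_slots.acquire()      # wait for a free slot
        with mutex:
            buffer.append(item)
        filled_slots.release()     # signal that one more item is available


def consumer(n):
    for _ in range(n):
        filled_slots.acquire()     # wait for an item
        with mutex:
            item = buffer.popleft()
        empty_slots.release()      # signal that one more slot is free
        consumed.append(item)


threads = [
    threading.Thread(target=producer, args=(10,)),
    threading.Thread(target=consumer, args=(10,)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(consumed)
```

With a single producer and a single consumer the items come out in the order they went in, and the producer never gets more than `CAPACITY` items ahead.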

6. Programming exercise: create multiple processes

See: python基础入门——并发编程 (Python Basics: Concurrent Programming)
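The answer for this exercise lives in the referenced post. For completeness, a minimal sketch of starting several processes with multiprocessing.Process could look like the following; `describe` and `work` are hypothetical helper names, and the count of three workers is arbitrary.

```python
import multiprocessing
import os


def describe(worker_id):
    # Build the message a worker prints; os.getpid() differs in each child process.
    return 'worker %d running in process %d' % (worker_id, os.getpid())


def work(worker_id):
    print(describe(worker_id))


if __name__ == '__main__':
    # The __main__ guard is required on platforms that start children by
    # re-importing this module (for example Windows and macOS).
    processes = [multiprocessing.Process(target=work, args=(i,)) for i in range(3)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print('all workers finished')
```

Each worker reports a different process id, which is the visible sign that the work is running in separate processes rather than in threads of one process.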

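7. Programming exercise: process synchronization, the classic producer-consumer problem (Queue implementation, plus a shared-data approach)

See: python基础入门——并发编程 (Python Basics: Concurrent Programming)

Two shared-data versions are given below; the Queue version is already covered in the concurrent programming post.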

from multiprocessing import Process, Value
import time

def produce(n):
    for i in range(5):
        n.value = i
        time.sleep(1)
        print('produce %d' % n.value)

def consume(n):
    for i in range(5):
        time.sleep(1)
        print('consume %d' % n.value)


if __name__ == '__main__':
    num = Value('i', 0)  # shared integer ('i' = signed int), initial value 0

    p1 = Process(target=produce, args=(num,))  # note: the trailing comma is required so that args is a tuple
    p2 = Process(target=consume, args=(num,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

import multiprocessing
import time


class Producer(multiprocessing.Process):
    def __init__(self, n):
        super().__init__()
        self.n = n

    def run(self):
        for item in range(1, 6):
            self.n.value = item  # note: the attribute is lowercase `value`
            print('<--process producer : item %d written to shared value by %s>' % (self.n.value, self.name))
            time.sleep(1)


class Consumer(multiprocessing.Process):
    def __init__(self, n):
        super().__init__()
        self.n = n

    def run(self):
        time.sleep(1)
        # Reading a shared Value never raises an "empty" error, so read a fixed
        # number of times instead of looping forever.
        for _ in range(5):
            time.sleep(1)
            item = self.n.value
            print('<--process consumer : item %d read from shared value by %s>' % (item, self.name))


if __name__ == '__main__':
    num = multiprocessing.Value('i', 0)
    process_producer = Producer(num)
    process_consumer = Consumer(num)
    process_producer.start()
    process_consumer.start()
    process_producer.join()
    process_consumer.join()

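8. How are coroutines written with the asyncio library in Python 3, and what are their advantages and disadvantages compared with thread- and process-based concurrency?

Reference: https://www.liaoxuefeng.com/wiki/1016959663602400/1017968846697824

https://blog.51cto.com/13786054/2132997

https://blog.csdn.net/zheng199172/article/details/88800275

https://blog.csdn.net/twt936457991/article/details/90048234 (advantages and disadvantages)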

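# Python can implement coroutines with yield/send. Since Python 3.5, async/await is the better alternative.
def consume():  # the consumer coroutine waits to receive data
    while True:
        # Execution pauses at yield until the main thread sends a value with send();
        # the coroutine then resumes with that value.
        number = yield
        # A coroutine is paused and resumed entirely by the program, whereas blocked threads are
        # switched by the OS kernel, so coroutines cost far less than threads.
        print('start consuming:', number)

consumer = consume()
next(consumer)  # prime the coroutine: run it up to the first yield
for num in range(0, 100):  # the main thread produces data, the coroutine consumes it
    print('start producing:', num)
    consumer.send(num)  # send the data to the consumer coroutine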

import asyncio  # asyncio.run() requires Python 3.7+
import time

async def test(i):  # a coroutine function, defined with the async keyword
    print("test_1", i, time.ctime())
    await asyncio.sleep(1)
    print("test_2", i, time.ctime())

async def main():
    # Run the five coroutines concurrently and wait for all of them.
    # (Passing bare coroutines to asyncio.wait() is no longer allowed as of Python 3.11.)
    await asyncio.gather(*(test(i) for i in range(5)))

print('main process 1:', time.ctime())
asyncio.run(main())  # creates an event loop, runs main(), then closes the loop
print('main process 2:', time.ctime())

import time
import asyncio

async def do_some_work(x):  # a coroutine function, defined with the async keyword
    print('waiting:', x)

start = time.time()
coroutine = do_some_work(2)  # returns a coroutine object; do_some_work has not run yet
print(coroutine)

loop = asyncio.new_event_loop()  # create an event loop
task = loop.create_task(coroutine)  # wrap the coroutine in a Task (still pending)
print(task)
loop.run_until_complete(task)  # run the event loop until the task finishes
print(task)  # the task is now finished
loop.close()
print('time:', time.time() - start)

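Coroutines run inside a single thread. Compared with multithreading, their biggest advantage is very high execution efficiency: switching between coroutines is controlled by the program itself rather than by the operating system, so there is no thread-switching overhead, and the more threads the alternative would need, the more pronounced the coroutine's performance advantage. The second advantage is that no multithreaded locking mechanism is needed. Combining multiple processes with coroutines makes full use of multiple cores while keeping the efficiency of coroutines, and can give very high performance. Coroutines are not managed by the operating system kernel; they are controlled entirely by the program (that is, they run in user space). The drawbacks are that a single event loop cannot use more than one core on its own, and that one blocking call stalls every coroutine in that thread.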
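The claim that coroutines need no locks can be illustrated with a shared counter. In the sketch below (names `counter` and `bump` are invented for the example), ten coroutines increment the same variable without any lock, and the total is still exact because a coroutine can only be suspended at an `await`.

```python
import asyncio

counter = 0


async def bump(times):
    global counter
    for _ in range(times):
        # No await sits between the read and the write, so no other
        # coroutine can interleave with this increment.
        counter += 1
        # Explicitly hand control back to the event loop.
        await asyncio.sleep(0)


async def main():
    # Ten coroutines take turns on one thread.
    await asyncio.gather(*(bump(1000) for _ in range(10)))


asyncio.run(main())
print(counter)
```

The same code written with ten threads and no lock would not carry that guarantee, which is the contrast the paragraph above draws.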

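9. Programming exercise: write a regular expression that matches the email addresses in a text, and print the matched addresses

Reference: python正则表达式 (Python regular expressions), the findall section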

import re
key = r'<user01@mail.com> <usr02@mail.com> user04@mail.com'
k = re.findall(r'(\w+@m....[a-z]{3})',key)
print(k)
 
k = re.finditer(r'(\w+@m....[a-z]{3})',key)
for i in k:
    print(type(i))
    print(i.group())
 
Output (Python 3.7+; older versions show <class '_sre.SRE_Match'>):
['user01@mail.com', 'usr02@mail.com', 'user04@mail.com']
<class 're.Match'>
user01@mail.com
<class 're.Match'>
usr02@mail.com
<class 're.Match'>
user04@mail.com
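The pattern above is tailored to the sample text: it expects the domain to start with "m" and relies on wildcard dots. A somewhat more general sketch is shown below. It is a deliberately simplified pattern, not a full RFC 5322 validator, and `EMAIL_RE` and `find_emails` are names made up for this example.

```python
import re

# Simplified assumption: a local part, "@", then two or more dot-separated domain labels.
EMAIL_RE = re.compile(r'[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+')


def find_emails(text):
    # The pattern has no capturing groups, so findall returns whole matches.
    return EMAIL_RE.findall(text)


sample = '<user01@mail.com> <usr02@mail.com> user04@mail.com'
print(find_emails(sample))
```

Because the groups are non-capturing, `findall` returns plain strings and the angle brackets around the first two addresses are left out.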

10. Programming exercise: match dates such as 1900-07-01 in a text and print them

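import re

number_1 = r'Today is 1990-10-21, hello everyone! 1999-11-22'
pattern = r'(\d{4}-(0|1)?[0-9]-[0-3]?[0-9])'
k = re.findall(pattern, number_1)

if k:
    for i in k:
        # findall returns one tuple per match because the pattern has two groups; i[0] is the full date
        print(i[0])
else:
    print('no match found')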
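The pattern above also accepts impossible values such as month 19 or day 39. A stricter sketch, assuming dates are always written as zero-padded YYYY-MM-DD, narrows the month and day ranges and then lets datetime reject dates that do not exist on the calendar. `DATE_RE` and `find_dates` are hypothetical names.

```python
import re
from datetime import datetime

# Lookarounds are used instead of \b so the pattern also works when a date
# is directly adjacent to CJK characters, which count as word characters.
DATE_RE = re.compile(r'(?<!\d)\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])(?!\d)')


def find_dates(text):
    dates = []
    for candidate in DATE_RE.findall(text):
        try:
            # strptime raises ValueError for dates such as 2021-02-30.
            datetime.strptime(candidate, '%Y-%m-%d')
        except ValueError:
            continue
        dates.append(candidate)
    return dates


print(find_dates('x 1900-07-01 y 2023-13-40 z 2021-02-30 w 1999-11-22'))
```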

11. Programming exercise: given a plain-text English file, use a regular expression to count how many times each word occurs.

Reference: https://blog.csdn.net/junli_chen/article/details/49079523

#-*- coding: utf8 -*-
import re

words_dict ={}
lines_list =[]
with open('a1.txt','r',encoding='utf-8') as f:
    for line in f:
        match = re.findall(r'[^a-zA-Z0-9]+',line)
        #print(type(match))
        for i in match:
            line = line.replace(i,' ') # replace each run of non-alphanumeric characters with a space
        lines_list = line.split()
        print(lines_list)
        for i in lines_list:
            if i not in words_dict:
                words_dict[i] = 1
            else:
                words_dict[i] += 1

for k,v in words_dict.items():
    print(k,v)
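The same count can be written more compactly by matching the words directly and letting collections.Counter do the tallying. This sketch works on a string rather than a file, and it lowercases the text first, which is an added assumption so that "The" and "the" count as one word; `count_words` is a name invented for the example.

```python
import re
from collections import Counter


def count_words(text):
    # Match runs of letters and digits directly instead of replacing separators.
    return Counter(re.findall(r'[a-zA-Z0-9]+', text.lower()))


counts = count_words('The cat saw the hat. The end!')
for word, n in counts.most_common():
    print(word, n)
```

To use it on a file, the text from `f.read()` can be passed to `count_words` in one call.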

12. Programming exercise: strip the tags from the following HTML file and show only the text.

<div class="article" id="article">
<p>   原标题：专家解读<a href="http://news.sina.com.cn/c/nd/2018-03-21/doc-ifysniku881460.shtml" target="_blank">机构改革方案</a>:合并归类
重组 改革力度空前</p>
<p>   中新社北京3月21日电 （记者 马海燕）备受关注的《深化党和国家机构改革方案》21日对外公布。有关专家指出，全面加强党的领导，构建系统完备、科学规范、运行高效的
党和国家机构职能体系，成为此次改革的最大特点。</p>
</div>
----------

#-*- coding: utf8 -*-
import re

with open('a1.txt','r',encoding='utf-8') as f:
    number_1 = f.read()

print(number_1)

pattern = r'\<[^\>]+\>'
# r'\<[^\>]+\>' gives the same result as r'<[^>]+>'
result = re.sub(pattern,'',number_1)
print(result)
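A regular expression is enough for the sample above, but it leaves character references such as &amp;amp; untouched. As an alternative sketch, the standard library's html.parser can collect only the text nodes; `TextExtractor` and `strip_tags` are names chosen for this illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; character references arrive already decoded.
        self.parts.append(data)


def strip_tags(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return ''.join(parser.parts)


print(strip_tags('<p>Hello <a href="x">world</a>!</p>'))
```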

13. Programming exercise: extract the domain name from a URL, for example news.sina.com.cn

import re

number_1 = r'http://news.sina.com.cn/c/nd/2018-03-21/doc-ifysniku8811460.shtml'


pattern = r'(http|https)://((www\.)?(\w+(\.)?)+)'
# (http|https)://((www\.)?(\w+(\.)?)+) captures the domain in group 2; the dot after www is escaped so it matches a literal dot


k = re.search(pattern, number_1)
if k:
    print(k.group(2))
else:
    print('no match found')
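Outside a regex exercise, the standard library's urllib.parse already knows how to split a URL, including ports and user info that the pattern above does not handle. A minimal sketch, with `get_domain` as a hypothetical helper name:

```python
from urllib.parse import urlparse


def get_domain(url):
    # hostname is the lowercased host without any port or user info.
    return urlparse(url).hostname


url = 'http://news.sina.com.cn/c/nd/2018-03-21/doc-ifysniku8811460.shtml'
print(get_domain(url))
```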

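14. Programming exercise: process phone numbers in the format three-digit country code - three-digit area code - eight-digit number, subject to the following two conditions:

1. The country code and area code may be omitted, so both 80-010-12345678 and 12345678 should match.

2. The country code may be joined to the area code with either parentheses or a hyphen, so (80) 010-12345678 should match as well.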

import re

number_1 = r'080-010-12345678'
number_2 = r'12345678'
number_3 = r'(80)010-12345678'
number_4 = r'080-020-67135691'

pattern = r'(([0-9]?\d{2,3}-|\([0-9]?\d{2,3}\) ?)0\d{2,3}-)?[1-9]\d{6,7}'
# (([0-9]?\d{2,3}-|\([0-9]?\d{2,3}\) ?)0\d{2,3}-)?  optional prefix such as 80-010-, (80)010-, (80) 010- or (180)010-
# [1-9]\d{6,7}  matches 12345678; the number itself must not start with 0
# 0\d{2,3}-  matches 010-
# ([0-9]?\d{2,3}-|\([0-9]?\d{2,3}\) ?)  has the form (a|b); followed by c it matches ac or bc
# [0-9]?\d{2,3}-  matches 80- or 080-
# \([0-9]?\d{2,3}\) ?  matches (80) or (180), with an optional trailing space
# \(  matches a literal (

k = re.match(pattern, number_4)
if k:
    print(k.group())
else:
    print('no match found')
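Because re.match only anchors at the start, the pattern above also accepts strings with trailing junk. The sketch below uses re.fullmatch and follows the stated format more literally. It assumes a two- or three-digit country code, exactly three area-code digits, and exactly eight number digits, and it allows the area code to appear without a country code; `PHONE_RE` and `is_phone` are names invented for the example.

```python
import re

PHONE_RE = re.compile(
    r'(?:(?:\d{2,3}-|\(\d{2,3}\) ?)?'   # optional country code: "80-", "(80)" or "(80) "
    r'\d{3}-)?'                         # optional three-digit area code followed by "-"
    r'\d{8}'                            # eight-digit number
)


def is_phone(text):
    # fullmatch requires the whole string to fit the pattern.
    return PHONE_RE.fullmatch(text) is not None


for number in ['080-010-12345678', '12345678', '(80) 010-12345678', '1234']:
    print(number, is_phone(number))
```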