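Reading about coroutines and asyncio left me a bit lost. I understand the principle, but it takes a practical application to make it stick. Coroutines have a natural advantage for IO-bound work. I haven't looked at aiohttp yet, so this example is a first hands-on use of coroutines and a chance to learn how chunked downloading works.
1. First, get the size of the file to download
The size of the file is usually given in the "Content-Length" response header, so read the headers first to get size: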
resp = requests.head(url)
size = int(resp.headers["Content-Length"])
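Not every server includes Content-Length in its reply to a HEAD request, so the lookup may be worth guarding. A minimal sketch, where `content_length` is a hypothetical helper name (not part of requests) and the header values are made up for illustration:

```python
from typing import Mapping, Optional


def content_length(headers: Mapping[str, str]) -> Optional[int]:
    """Return Content-Length as an int, or None if it is missing or malformed."""
    # requests exposes a case-insensitive mapping; comparing lower-cased keys
    # makes the helper work with a plain dict as well.
    for key, value in headers.items():
        if key.lower() == "content-length":
            try:
                return int(value)
            except ValueError:
                return None
    return None


print(content_length({"Content-Length": "1024"}))
print(content_length({"Content-Type": "audio/mpeg"}))
```

With requests it could be called as content_length(requests.head(url).headers). If it returns None, the file cannot be split up front and a plain single-request download is the fallback.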
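2. Split the file into n chunks and record the chunk boundaries
The size usually does not divide evenly into n chunks, so the last chunk is made a little larger: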
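persize = size // n
spos = []
fpos = []
# Exactly n chunks; stepping range(0, size, persize) would yield n + 1 chunks
# whenever size is not a multiple of n
for i in range(n):
    spos.append(i * persize)
    fpos.append((i + 1) * persize - 1)
# Range offsets are inclusive, so the last chunk ends at byte size - 1
fpos[-1] = size - 1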
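The boundary calculation can also be packaged as a small function that is easy to check on tiny inputs. A sketch, assuming inclusive byte offsets as used by the HTTP Range header; `split_ranges` is a hypothetical name:

```python
from typing import List, Tuple


def split_ranges(size: int, n: int) -> List[Tuple[int, int]]:
    """Split `size` bytes into at most `n` inclusive (start, end) ranges."""
    if size <= 0 or n <= 0:
        raise ValueError("size and n must be positive")
    n = min(n, size)  # never create empty chunks
    persize = size // n
    ranges = []
    for i in range(n):
        start = i * persize
        # The last chunk absorbs the remainder and ends at the final byte.
        end = size - 1 if i == n - 1 else start + persize - 1
        ranges.append((start, end))
    return ranges


print(split_ranges(10, 3))  # [(0, 2), (3, 5), (6, 9)]
```

Summing end - start + 1 over all ranges should give back size, which is a quick sanity check that no byte is missed or requested twice.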
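3. Download a given byte range of the file
Passing a "Range" header to requests.get() specifies which bytes of the file to download. Both ends of the range are inclusive. For example, to download bytes 10 through 20 of the file:
headers = {"Range": "bytes=10-20"}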
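Because both ends of the range are inclusive, bytes 10 through 20 cover 11 bytes, not 10. A small sketch of building the header and computing the expected length; `range_header` and `range_length` are hypothetical helper names:

```python
def range_header(start: int, end: int) -> dict:
    """Build a headers dict requesting bytes start..end, both inclusive."""
    if start < 0 or end < start:
        raise ValueError("need 0 <= start <= end")
    return {"Range": "bytes=%d-%d" % (start, end)}


def range_length(start: int, end: int) -> int:
    """Number of bytes covered by an inclusive range."""
    return end - start + 1


print(range_header(10, 20))  # {'Range': 'bytes=10-20'}
print(range_length(10, 20))  # 11
```

A server that honors the range answers with status 206 (Partial Content). One that ignores it answers 200 with the whole file, so checking resp.status_code before writing a chunk is a reasonable safeguard.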
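4. Define the coroutine function
An ordinary blocking function cannot be awaited, so the call has to be wrapped with loop.run_in_executor, which runs it in a thread pool and returns something awaitable. The catch is that loop.run_in_executor only forwards positional arguments and does not accept **kwargs, so requests.get(url, headers=headers) needs one more layer of wrapping: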
get = lambda:requests.get(url,headers=headers)
resp = await loop.run_in_executor(None, get)
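Besides a lambda, functools.partial is a common way to bind keyword arguments before handing a callable to run_in_executor. A self-contained sketch, where `blocking_fetch` is a made-up stand-in for a blocking call such as requests.get:

```python
import asyncio
import functools
import time


def blocking_fetch(name, delay=0.0, prefix=""):
    """Stand-in for a blocking call (hypothetical, for illustration only)."""
    time.sleep(delay)
    return prefix + name


async def main():
    loop = asyncio.get_running_loop()
    # run_in_executor only forwards positional arguments, so the keyword
    # arguments are bound up front with functools.partial.
    call = functools.partial(blocking_fetch, "part-1", delay=0.01, prefix="done:")
    # None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, call)


result = asyncio.run(main())
print(result)  # done:part-1
```

Unlike a lambda, partial binds its arguments at creation time, which avoids surprises when such callables are built inside a loop.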
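Also, when a downloaded chunk is written to the file, the write has to begin at that chunk's start position, which is what f.seek(offset, whence) is for.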
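The effect of seek can be tried on a temporary file without any network access. In this sketch the chunk data and offsets are made up, and the chunks are deliberately written out of order:

```python
import os
import tempfile

# (offset, data) pairs, listed out of order on purpose
chunks = [(6, b"world"), (0, b"hello ")]

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "wb") as f:
        for offset, data in chunks:
            f.seek(offset, 0)  # whence=0 (os.SEEK_SET): offset from file start
            f.write(data)
    with open(path, "rb") as f:
        content = f.read()
finally:
    os.remove(path)

print(content)  # b'hello world'
```

Seeking past the current end of a file opened for writing is allowed, so the chunks can arrive in any order and still end up in the right place.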
The complete code is as follows:
import asyncio
import time

import requests

url = "http://xia2.kekenet.com/Sound/2018/06/bbcdqmd175_3937944FiP.mp3"
path = "D:\\kekenet.mp3"


async def download(spos, fpos, f, i):
    """Download bytes spos..fpos (inclusive) and write them at offset spos."""
    headers = {"Range": "bytes=%d-%d" % (spos, fpos)}
    loop = asyncio.get_running_loop()
    try:
        # run_in_executor takes no keyword arguments, so wrap the call
        get = lambda: requests.get(url, headers=headers)
        print('part of %d is ready!' % i)
        resp = await loop.run_in_executor(None, get)
        # No await between seek and write, so parts cannot interleave here
        f.seek(spos, 0)
        f.write(resp.content)
        print('part of %d is completed!' % i)
    except Exception as e:
        print("download file error:", e)


async def main(n):
    resp = requests.head(url)
    size = int(resp.headers["Content-Length"])
    persize = size // n
    spos = [i * persize for i in range(n)]
    fpos = [s + persize - 1 for s in spos]
    fpos[-1] = size - 1  # Range offsets are inclusive
    print(spos)
    print(fpos)
    start_time = time.time()
    # 'wb' creates (or truncates) the file and still allows seek()
    with open(path, 'wb') as f:
        await asyncio.gather(*[download(spos[i], fpos[i], f, i + 1) for i in range(n)])
    finish_time = time.time()
    # Speed is size divided by elapsed time
    print('average speed is %0.2f KB/s' % (size / 1000.0 / (finish_time - start_time)))


if __name__ == '__main__':
    asyncio.run(main(10))
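As the output below shows, the parts did not start in order, and they did not necessarily finish in the order they started. The gain in efficiency presumably comes from the requests overlapping: while one part is waiting on the network, the others keep making progress.
The printed output is as follows: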
[0, 207553, 415106, 622659, 830212, 1037765, 1245318, 1452871, 1660424, 1867977, 2075530]
[207552, 415105, 622658, 830211, 1037764, 1245317, 1452870, 1660423, 1867976, 2075529, 2283082]
part of 4 is ready!
part of 10 is ready!
part of 5 is ready!
part of 2 is ready!
part of 6 is ready!
part of 1 is ready!
part of 7 is ready!
part of 3 is ready!
part of 8 is ready!
part of 9 is ready!
part of 4 is completed!
part of 5 is completed!
part of 8 is completed!
part of 3 is completed!
part of 2 is completed!
part of 9 is completed!
part of 7 is completed!
part of 1 is completed!
part of 10 is completed!
part of 6 is completed!
average speed is 6390.67 KB/s
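The out-of-order completion can be reproduced without a network. In this sketch asyncio.sleep stands in for the time each part spends waiting, and the delays are made-up values:

```python
import asyncio


async def part(i, delay, finished):
    """Simulate one download part; `delay` stands in for network time."""
    await asyncio.sleep(delay)
    finished.append(i)
    return i


async def main():
    finished = []
    delays = [0.15, 0.05, 0.10]  # made-up waiting times for parts 1, 2, 3
    results = await asyncio.gather(
        *[part(i + 1, d, finished) for i, d in enumerate(delays)]
    )
    return results, finished


results, finished = asyncio.run(main())
print(results)   # [1, 2, 3]
print(finished)  # [2, 3, 1]
```

asyncio.gather returns its results in the order the coroutines were passed in, even though the parts finish in order of their waiting time. The whole run takes roughly as long as the slowest part rather than the sum of all three.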