Python requests: a file download example

1. Background

Downloading a file with Python's requests library takes very little code. The only point that needs explaining is how to get keep-alive through the Session mechanism.

I use the Session mechanism of the Python requests library to reuse HTTP connections. The official documentation is at https://requests.readthedocs.io/en/latest/user/advanced/

1.1 The requests Session object

Session Objects

The Session object allows you to persist certain parameters across requests. It also persists cookies across all requests made from the Session instance, and will use urllib3’s connection pooling. So if you’re making several requests to the same host, the underlying TCP connection will be reused, which can result in a significant performance increase (see HTTP persistent connection).
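
As a minimal sketch of the passage above, the snippet below creates a Session, sets a default header that is sent with every request made through it, and mounts an HTTPAdapter with an explicit connection pool size. The helper name `build_session`, the User-Agent string, and the pool size are assumptions chosen for illustration. No request is actually sent.

```python
import requests
from requests.adapters import HTTPAdapter


def build_session(user_agent="example-downloader/0.1", pool_size=4):
    """Return a Session whose headers and connection pool are shared by all its requests."""
    session = requests.Session()
    # Headers set on the session are merged into every request it sends.
    session.headers.update({"User-Agent": user_agent})
    # HTTPAdapter wraps urllib3's connection pooling; mounting it on a prefix
    # applies it to every URL that starts with that prefix.
    adapter = HTTPAdapter(pool_connections=pool_size, pool_maxsize=pool_size)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = build_session()
print(session.headers["User-Agent"])
session.close()
```

Passing the same session object to every download call is what allows the underlying TCP connection to the host to be reused.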

1.2 Keep-Alive

https://requests.readthedocs.io/en/latest/user/advanced/#keep-alive
Keep-Alive

Excellent news — thanks to urllib3, keep-alive is 100% automatic within a session! Any requests that you make within a session will automatically reuse the appropriate connection!

Note that connections are only released back to the pool for reuse once all body data has been read; be sure to either set stream to False or read the content property of the Response object.
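
The note above matters when `stream=True` is used. One way to satisfy it, sketched here under assumptions, is to read the body to the end in chunks and close the response with a `with` block. The function name `download_streaming` and the 64 KiB chunk size are hypothetical choices, and `session` is expected to be a `requests.Session`.

```python
def download_streaming(session, url, file_name, chunk_size=64 * 1024):
    """Stream the response body to file_name and return the number of bytes written."""
    written = 0
    # Using the response as a context manager closes it on exit, even on errors.
    with session.get(url, stream=True) as r:
        r.raise_for_status()
        with open(file_name, "wb") as f:
            # iter_content yields the body piece by piece, so the whole
            # file is never held in memory at once.
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
                written += len(chunk)
    return written
```

Compared with reading `r.content`, this keeps memory use roughly constant for large files while still consuming the entire body.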

2. Code

# A sample Python script that downloads a file and writes it to disk.

from datetime import datetime

import requests

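"""
  Prepare in advance a URL that serves a downloadable file without authentication.
  This example adds no authentication headers and fetches the file with a plain GET.
"""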

timeFormat = "%Y-%m-%d %H:%M:%S.%f"


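def download_file(session, url, file_name):
    # Left empty in this example; put a browser User-Agent string here if the server needs one.
    chrome = ""
    headers = {"User-Agent": chrome}

    r = session.get(url, headers=headers, stream=True)
    # Fail early on 4xx/5xx instead of saving an error page as the file.
    r.raise_for_status()
    with open(file_name, 'wb') as f:
        # Reading r.content consumes the whole body, which lets the
        # connection go back to the pool for reuse (keep-alive).
        f.write(r.content)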


def downloadAction(downloadCount):

    url = "https://a.b.c/dw?file=a.txt"

    startTime = datetime.now()
    startTimeStr = startTime.strftime(timeFormat)
    print("file download startTime:%s" % startTimeStr)

    session = requests.Session()
    # Save into the data folder under the current directory (it must already exist);
    # file names start with "file".
    for index in range(downloadCount):
        filePath = "./data/file{0}.txt".format(index)
        download_file(session, url, filePath)

    session.close()

    endTime = datetime.now()
    endTimeStr = endTime.strftime(timeFormat)
    print("file download   endTime:%s" % endTimeStr)
    # total_seconds() keeps the fractional part; .seconds would truncate it.
    consumedTimeBySecond = (endTime - startTime).total_seconds()
    # Assume each file is 10 MB.
    totalFileSize = downloadCount * 10
    avgSpeed = totalFileSize / consumedTimeBySecond if consumedTimeBySecond > 0 else 0.0
    print(" %d times download file, consumed(second):%.2f, avgSpeed(MB/s):%f" %
          (downloadCount, consumedTimeBySecond, avgSpeed))


if __name__ == '__main__':
    downloadAction(10)
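
The script above assumes that every file is 10 MB. A hedged alternative is to compute the average speed from the sizes actually written to disk. The helper name `average_speed_mb_per_s` is hypothetical, and 1 MB is taken here as 1024 * 1024 bytes.

```python
import os


def average_speed_mb_per_s(file_paths, elapsed_seconds):
    """Return average throughput in MB/s based on the real sizes of the downloaded files."""
    if elapsed_seconds <= 0:
        # Avoid dividing by zero when the run finished too quickly to measure.
        return 0.0
    total_bytes = sum(os.path.getsize(p) for p in file_paths)
    return total_bytes / (1024 * 1024) / elapsed_seconds
```

It could be called with the list of paths written in the download loop and the elapsed time from `total_seconds()`.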

