Python crawler download retries: several ways to retry requests that repeatedly time out

Respect yourself

Published 2021-02-04 06:02:29

Article tags: Python crawler download retry

Article link: https://blog.csdn.net/weixin_29216957/article/details/113672574

Method 1

import requests

headers = {}
url = 'https://www.baidu.com'

try:
    proxies = None
    response = requests.get(url, headers=headers, verify=False, proxies=proxies, timeout=3)
except requests.RequestException:
    # logdebug('requests failed one time')
    try:
        proxies = None
        response = requests.get(url, headers=headers, verify=False, proxies=proxies, timeout=3)
    except requests.RequestException:
        # logdebug('requests failed two time')
        print('requests failed two times')

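Summary: the code is redundant. Every additional retry needs another nested try, so the line count grows with the number of retries. On the plus side, logging each failure is easy.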

Method 2

import requests


def requestDemo(url):
    headers = {}
    trytimes = 3  # number of attempts
    response = None
    for i in range(trytimes):
        try:
            proxies = None
            response = requests.get(url, headers=headers, verify=False, proxies=proxies, timeout=3)
            # Note: the status code here may also be 302 or similar
            if response.status_code == 200:
                break
        except requests.RequestException:
            # logdebug(f'requests failed {i + 1} time')
            print(f'requests failed {i + 1} time')
    return response

Summary: the loop is clearly much simpler than Method 1, and logging is just as easy.
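The loop approach can also wait between attempts and hand the result back to the caller. The following is a minimal sketch under stated assumptions, not the author's code: `fetch_with_retry`, `flaky_fetch`, `base_delay` and the injectable `sleep` parameter are hypothetical names, and `flaky_fetch` stands in for `requests.get` so the example runs without a network.

```python
import time


def fetch_with_retry(fetch, url, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url) up to `attempts` times, doubling the wait after each failure."""
    last_error = None
    for i in range(attempts):
        try:
            return fetch(url)
        except Exception as exc:  # in real code, narrow this to the request errors you expect
            last_error = exc
            print(f'attempt {i + 1} failed: {exc}')
            if i < attempts - 1:
                sleep(base_delay * 2 ** i)  # 0.5 s, 1 s, 2 s, ...
    raise last_error  # every attempt failed


# Hypothetical stand-in for requests.get: fails twice, then succeeds.
calls = []


def flaky_fetch(url):
    calls.append(url)
    if len(calls) < 3:
        raise TimeoutError('simulated timeout')
    return 'ok'


delays = []  # record the requested waits instead of really sleeping
result = fetch_with_retry(flaky_fetch, 'https://www.baidu.com', sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter is only a convenience for running the sketch quickly; with the default, the function really pauses between attempts.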

Method 3

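import requests


def requestDemo(url, times=1):
    headers = {}
    try:
        proxies = None
        response = requests.get(url, headers=headers, verify=False, proxies=proxies, timeout=3)
        html = response.text  # .text is a property, not a method
        # TODO: normal processing logic goes here
        return html
    except Exception:
        # logdebug(f'requests failed {times} time')
        trytimes = 3  # number of attempts
        # The source listing breaks off after "if times"; the lines below are an
        # assumed completion based on the times/trytimes names.
        if times < trytimes:
            return requestDemo(url, times + 1)
        return None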

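Summary: recursion looks more sophisticated, and an error anywhere in the processing code still triggers a retry. The drawbacks are that it is harder to follow and easier to get wrong. In addition, when the try block wraps too much code, unrelated errors re-send the request, which hurts running time.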
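One way to limit the drawback of an oversized try block is to keep only the network call inside it. The sketch below is illustrative and rests on assumptions: `request_demo`, `fetch`, `parse` and `flaky_fetch` are hypothetical names, and the fake fetch function replaces `requests.get` so the example runs offline.

```python
def request_demo(fetch, parse, url, times=1, trytimes=3):
    """Recursive retry in which only the network call sits inside the try block."""
    try:
        raw = fetch(url)
    except OSError:  # TimeoutError and ConnectionError are subclasses of OSError
        if times < trytimes:
            return request_demo(fetch, parse, url, times + 1, trytimes)
        raise  # attempts exhausted: surface the last error
    # An error raised by parse() now propagates at once instead of re-sending the request.
    return parse(raw)


# Hypothetical stand-in for requests.get: fails once, then succeeds.
attempts = []


def flaky_fetch(url):
    attempts.append(url)
    if len(attempts) < 2:
        raise TimeoutError('simulated timeout')
    return '<html>hello</html>'


html = request_demo(flaky_fetch, str.upper, 'https://www.baidu.com')
print(html, len(attempts))
```

The trade-off is that processing errors are no longer retried, which is usually what is wanted when the failure is a bug rather than a flaky connection.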

Method 4

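import requests


def retry(times):
    def wrapper(func):
        def inner_wrapper(*args, **kwargs):
            i = 0
            # The source listing breaks off after "while i"; the loop body below is
            # an assumed completion.
            while i < times:
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # logdebug(f'{func.__name__}() failed {i + 1} time')
                    i += 1
        return inner_wrapper
    return wrapper


@retry(3)  # number of attempts: 3
def requestDemo(url):
    headers = {}
    proxies = None
    response = requests.get(url, headers=headers, verify=False, proxies=proxies, timeout=3)
    html = response.text  # .text is a property, not a method
    # TODO: normal processing logic goes here
    return html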

Summary: the advantage of a decorator is that many functions can reuse it, which makes it very convenient.
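To illustrate the reuse point, here is a sketch of one decorator applied to two unrelated functions. It is an assumption-laden variant rather than the author's listing: the `exceptions` parameter and the functions `download_page` and `download_image` are hypothetical, and both functions simulate failures instead of touching the network.

```python
import functools


def retry(times, exceptions=(Exception,)):
    """Retry the decorated function up to `times` times on the given exceptions."""
    def wrapper(func):
        @functools.wraps(func)  # keep func.__name__ and the docstring for logging
        def inner_wrapper(*args, **kwargs):
            for attempt in range(1, times + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    print(f'{func.__name__}() failed on attempt {attempt}: {exc}')
                    if attempt == times:
                        raise  # give up and surface the last error
        return inner_wrapper
    return wrapper


state = {'page': 0, 'image': 0}  # call counters for the simulated downloads


@retry(3, exceptions=(TimeoutError,))
def download_page():
    state['page'] += 1
    if state['page'] < 3:
        raise TimeoutError('simulated timeout')
    return 'page body'


@retry(2, exceptions=(ConnectionError,))
def download_image():
    state['image'] += 1
    raise ConnectionError('simulated reset')


print(download_page())
try:
    download_image()
except ConnectionError:
    print('download_image gave up after', state['image'], 'attempts')
```

Re-raising on the final attempt, rather than silently returning `None`, lets the caller tell a failed download apart from an empty result.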

Method 5

#!/usr/bin/python
# -*- coding: utf-8 -*-
import requests
import warnings

warnings.filterwarnings("ignore")


def get_xiaomi():
    url = "https://www.mi.com/22"
    headers = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
        "Connection": "keep-alive",
        # "Cookie": "xmuuid=XMGUEST-D80D9CE0-910B-11EA-8EE0-3131E8FF9940; Hm_lvt_c3e3e8b3ea48955284516b186acf0f4e=1588929065; XM_agreement=0; pageid=81190ccc4d52f577; lastsource=www.baidu.com; mstuid=1588929065187_5718; log_code=81190ccc4d52f577-e0f893c4337cbe4d|https%3A%2F%2Fwww.mi.com%2F; Hm_lpvt_c3e3e8b3ea48955284516b186acf0f4e=1588929099; mstz=||1156285732.7|||; xm_vistor=1588929065187_5718_1588929065187-1588929100964",
        "Host": "www.mi.com",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.90 Safari/537.36"
    }
    res = "exception"  # returned if every attempt raises
    # for n in range(5):  # retry 5 times
    #     print("attempt " + str(n))
    for _ in range(5):  # retry 5 times
        try:
            response = requests.get(url, headers=headers, timeout=10, verify=False)
        except requests.RequestException:
            # The try sits inside the loop so that a timeout moves on to the
            # next attempt instead of ending the whole function.
            continue
        res = response.text
        # print(res)
        print(response.status_code)
        if response.status_code == 200:
            break
    return res


if __name__ == '__main__':
    print(get_xiaomi())
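Method 5 repeats the request whenever the status code is not 200, so a page that genuinely does not exist is requested five times. A possible refinement, sketched here under assumptions, is to retry only on status codes that tend to be temporary. The `RETRYABLE` set, `FakeResponse` and `get_with_status_retry` are hypothetical; `FakeResponse` only imitates the `status_code` and `text` attributes of a response object so the example runs offline.

```python
from dataclasses import dataclass

RETRYABLE = {429, 500, 502, 503, 504}  # assumed set; tune it for the target site


@dataclass
class FakeResponse:
    """Stand-in exposing the two response attributes used below."""
    status_code: int
    text: str = ''


def get_with_status_retry(fetch, url, attempts=5):
    """Repeat fetch(url) only while the status code looks like a temporary failure."""
    response = None
    for _ in range(attempts):
        response = fetch(url)
        if response.status_code not in RETRYABLE:
            break  # success, or an error that retrying will not fix (such as 404)
    return response


# Two temporary server errors followed by a success.
replies = iter([FakeResponse(503), FakeResponse(502), FakeResponse(200, 'ok')])
final = get_with_status_retry(lambda url: next(replies), 'https://www.mi.com/22')
print(final.status_code, final.text)

# A 404 is returned after a single request.
calls = []


def not_found(url):
    calls.append(url)
    return FakeResponse(404)


missing = get_with_status_retry(not_found, 'https://www.mi.com/22')
print(missing.status_code, len(calls))
```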

Method 6
