At first glance, every failure was the same runtime error: RuntimeError: Timeout context manager should be used inside a task
# Async fetch code
import time

import cchardet


async def fetch(session, url, headers=None, timeout=10, binary=False):
    # Default request headers; a caller-supplied dict replaces them entirely.
    _headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.81 Safari/537.36 SE 2.X MetaSr 1.0",
    }
    if headers:
        _headers = headers
    try:
        async with session.get(url, headers=_headers, timeout=timeout) as response:
            status = response.status
            html = await response.read()
            if not binary:
                # Detect the encoding from the raw bytes; fall back to UTF-8 if detection fails.
                encoding = cchardet.detect(html)['encoding'] or 'utf-8'
                html = html.decode(encoding, errors='ignore')
            redirected_url = str(response.url)
    except Exception as e:
        # On any failure, log it and return an empty result with status 0.
        msg = "Failed to download url:{}, time:{},\n\t exception:{},\n{}".format(
            url, time.strftime('%H:%M:%S'), str(type(e)), str(e))
        print(msg)
        html = ""
        status = 0
        redirected_url = url
    return status, html, redirected_url
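For context, the message in the error above is raised when the timeout logic cannot find a currently running task. Below is a stdlib-only sketch of that kind of check. It is not aiohttp's actual code: `require_task`, `main`, and `call_outside_task` are hypothetical names chosen for illustration, and `asyncio.current_task` assumes Python 3.7 or later.

```python
import asyncio


def require_task():
    """Hypothetical helper: raise if no asyncio task is currently running."""
    try:
        task = asyncio.current_task()
    except RuntimeError:
        # No event loop is running at all, e.g. when called from __init__.
        task = None
    if task is None:
        raise RuntimeError("Timeout context manager should be used inside a task")
    return task


async def main():
    # asyncio.run wraps this coroutine in a task, so the check passes.
    return require_task() is not None


def call_outside_task():
    # Plain synchronous code: there is no running task, so the check fails.
    try:
        require_task()
    except RuntimeError as exc:
        return str(exc)
    return None


inside_ok = asyncio.run(main())
outside_msg = call_outside_task()
```

The same call succeeds inside a coroutine driven by `asyncio.run` and fails in ordinary synchronous code, which is the situation that setup code in a constructor ends up in.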
Solution
I asked a more experienced developer for help. Two things had to change:
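1. At first, the event loop handed to the async fetch function was set up in the __init__ constructor. On Python 3.7 and later, all async-related code should live inside an async function, so that it runs as part of a task on the running event loop. So I moved that setup into an async function.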
2. On Python 3.7 and later, with a recent aiohttp, the timeout should not be passed as a bare integer; it needs to be aiohttp's own timeout type:
timeout = aiohttp.ClientTimeout(total=10)
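Putting the two fixes together, a minimal sketch might look like the following. The `Crawler` class, its `fetch_status` and `run` methods, and the `total_timeout` parameter are hypothetical names for illustration, not the original crawler; only `aiohttp.ClientTimeout`, `aiohttp.ClientSession`, and `session.get` are real aiohttp APIs.

```python
import asyncio

import aiohttp


class Crawler:
    """Hypothetical crawler, used only to illustrate the two fixes."""

    def __init__(self, total_timeout=10):
        # Plain configuration only: no event loop and no session are created here.
        self.timeout = aiohttp.ClientTimeout(total=total_timeout)

    async def fetch_status(self, session, url):
        # The request is made from inside a coroutine, with a ClientTimeout object.
        async with session.get(url, timeout=self.timeout) as response:
            return response.status

    async def run(self, urls):
        # The session is created inside a coroutine, on the running event loop.
        async with aiohttp.ClientSession() as session:
            jobs = [self.fetch_status(session, url) for url in urls]
            # Failed requests come back as exception objects instead of raising.
            return await asyncio.gather(*jobs, return_exceptions=True)
```

With this layout the constructor stays synchronous, and a single call such as `asyncio.run(Crawler().run(urls))` starts the event loop and does all async work inside it.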
After fixing a few other bugs as well, everything worked.