import requests

url = "https://www.baidu.com"  # the scheme is required; a bare "baidu.com" raises MissingSchema
query_headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.20 Safari/537.36"}
response = requests.get(url)
responsePost = requests.post(url, data=query_headers)  # data= sends the dict as the form-encoded body
print(response)
print(response.content.decode())
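3. Response attributes and methods
response.text
Returns the page's HTML as a str.
The text is often garbled because requests has to guess the encoding; if that happens, set response.encoding = 'utf-8' before reading response.text.
response.content.decode()
Decodes the raw bytes of the response body into a str (UTF-8 by default).
response.request.url
The URL the request was sent to.
response.url
The URL of the response.
response.request.headers
The request headers.
response.headers
The response headers.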
4. Reliable ways to get the page source (one of the following three will almost always yield a correctly decoded string)
response.content.decode()
response.content.decode('gbk')
response.text
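Trying these decodings one after another can be wrapped in a small helper. The following is only a sketch: decode_body is a hypothetical name, and the order of encodings (UTF-8 first, then GBK) is an assumption that suits many Chinese sites but not every page. It takes the bytes you would get from response.content.

```python
def decode_body(raw, encodings=("utf-8", "gbk")):
    """Return raw decoded with the first encoding that does not fail."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # wrong guess, try the next encoding
    # Nothing matched: fall back to the first encoding and mark bad bytes.
    return raw.decode(encodings[0], errors="replace")


# Stand-in for response.content from a GBK-encoded page.
sample = "百度一下".encode("gbk")
decoded = decode_body(sample)
print(decoded)
```

With a real response the call would be decode_body(response.content).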
5. Using headers
Headers let the request imitate a browser, so the server returns the same content a browser would get.
headers = {"User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) AppleWebKit/604.1.38 (KHTML, like Gecko) Version/11.0 Mobile/15A372 Safari/604.1", "Referer": "https://fanyi.baidu.com/translate?aldtype=16047"}
response = requests.get(url, headers=headers)
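The headers dict has to be passed by keyword, because the second positional parameter of requests.get is params. A quick way to see the difference without sending anything is to build a prepared request and inspect it. This is a sketch; the URL and the shortened User-Agent string are placeholders chosen for illustration.

```python
import requests

url = "https://www.baidu.com"  # placeholder URL
headers = {"User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X)"}

# Build the request without sending it, so it can be inspected offline.
prepared = requests.Request("GET", url, headers=headers).prepare()
print(prepared.headers["User-Agent"])

# The same dict passed as params ends up in the query string instead.
misplaced = requests.Request("GET", url, params=headers).prepare()
print(misplaced.url)
print("User-Agent" in misplaced.headers)
```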
6. Using the timeout parameter
requests.get(url, headers=headers, timeout=3)
The server must respond within 3 seconds; otherwise requests raises an exception (requests.exceptions.Timeout).
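In a crawler you usually do not want one slow page to stop the whole run, so the timeout error is typically caught. A minimal sketch, assuming a hypothetical helper named fetch_or_none that returns None when the request times out:

```python
import requests


def fetch_or_none(url, headers=None, timeout=3):
    """Return the response, or None if no response arrives within the timeout."""
    try:
        return requests.get(url, headers=headers, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both connect and read timeouts.
        return None
```

A caller can then check the result, for example: response = fetch_or_none(url, headers=headers), followed by a test for None before using it.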
7. Retrying
Installation
pip install retrying
from retrying import retry
@retry(stop_max_attempt_number=3)
def fun1():
    print('this is func1')
    raise ValueError('this is test error')
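To make the behaviour of the decorator concrete, here is a rough standard-library sketch of the same idea: call the function again after a failure and give up after a fixed number of attempts. simple_retry and flaky are hypothetical names, and this is not the retrying package's actual implementation, only an illustration of what stop_max_attempt_number=3 means.

```python
import functools


def simple_retry(max_attempts=3):
    """Re-run the decorated function until it succeeds or attempts run out."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # out of attempts: let the last error propagate
        return wrapper
    return decorator


calls = []


@simple_retry(max_attempts=3)
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise ValueError("temporary failure")  # fails on the first two calls
    return "ok"


result = flaky()
print(result, len(calls))
```

Here flaky fails twice and succeeds on the third call. A function that always raises, like fun1, is attempted three times before its error reaches the caller.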