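Before you start
Sending GET and POST requests and getting the response
- response = requests.get(url)  # send a GET request and get the response for the URL
- response = requests.post(url, data={"key": "value"})  # send a POST request; data is a dict holding the request body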
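As a rough illustration of what these two calls put on the wire, the sketch below builds both requests without sending them, using requests.Request(...).prepare(). The URL and the form fields are placeholders made up for this example.

```python
import requests

# Placeholder URL and form fields, assumed for illustration only.
url = "https://example.com/login"
form_data = {"username": "alice", "lang": "zh"}

# Build the requests without sending them, so no network access is needed.
get_request = requests.Request("GET", url).prepare()
post_request = requests.Request("POST", url, data=form_data).prepare()

print(get_request.method, get_request.url)
print(post_request.method, post_request.body)   # the dict is form-encoded
print(post_request.headers["Content-Type"])
```

The dict passed as data ends up as a form-encoded body, which is why a plain dict is enough for a typical form POST.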
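Ways to read the response
- response.text
  - This often comes out garbled. If it does, set response.encoding = "utf-8" before reading response.text
- response.content.decode()  # decodes the raw bytes, UTF-8 by default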
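One likely cause of the garbling: when the server declares no charset for a text response, requests falls back to ISO-8859-1 for response.text. The sketch below reproduces the effect with plain bytes and no network. The sample string is an assumption for illustration.

```python
# Sample page content, assumed for illustration.
original = "中文网页"
raw = original.encode("utf-8")      # what response.content would hold

garbled = raw.decode("iso-8859-1")  # decoding with the wrong codec
correct = raw.decode("utf-8")       # what response.content.decode() does by default

print(garbled == original)  # False: the text is garbled
print(correct == original)  # True
```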
Getting the page source correctly (mobile): try these in order
- 1. response.content.decode()
- 2. response.content.decode("gbk")
- 3. response.text
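The first two steps can be wrapped in a small helper. This is only a sketch: decode_html is a hypothetical name, and because it works on raw bytes it replaces step 3 (response.text) with a lossy UTF-8 decode as the last resort.

```python
def decode_html(raw, encodings=("utf-8", "gbk")):
    """Try each encoding in turn and return the first successful decode."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Last resort (an assumption of this sketch): replace undecodable bytes.
    return raw.decode("utf-8", errors="replace")


# Simulated response.content values; no network needed.
print(decode_html("你好".encode("utf-8")))
print(decode_html("你好".encode("gbk")))
```

With a real response it would be called as decode_html(response.content).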
Sending a request with headers
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"}
response = requests.get(url,headers=headers)
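To see why the User-Agent is worth setting, the sketch below compares the library's default value with the one from the headers dict, again without sending anything. The example.com URL is a placeholder, and using a Session here is just a convenient way to inspect the merged headers.

```python
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"
}
url = "https://example.com/"  # placeholder URL

session = requests.Session()
default_agent = session.headers["User-Agent"]  # identifies the client as python-requests

# Prepare (but do not send) a request that carries the custom headers.
prepared = session.prepare_request(requests.Request("GET", url, headers=headers))
sent_agent = prepared.headers["User-Agent"]

print(default_agent)
print(sent_agent)
```

Without the custom header the server sees a python-requests client, which some sites treat differently from a browser.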
Using the timeout parameter
- requests.get(url, headers=headers, timeout=3)  # raises an exception if the server has not responded within 3 seconds
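The exception raised on a timeout is a subclass of requests.exceptions.Timeout, so it can be caught specifically. A minimal sketch: fetch_or_none and always_times_out are hypothetical names, and the get parameter exists only so the timeout can be simulated without a network.

```python
import requests


def fetch_or_none(url, timeout=3, get=requests.get):
    """Return the response, or None if the request times out."""
    try:
        return get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        return None


def always_times_out(url, timeout):
    # Stand-in for requests.get that simulates a server that never answers.
    raise requests.exceptions.ReadTimeout("simulated timeout")


result = fetch_or_none("https://example.com/", timeout=3, get=always_times_out)
print(result)  # None
```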
Learning the retrying module
import requests
from retrying import retry

'''
Helper for requesting a URL.
'''

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'}


@retry(stop_max_attempt_number=3)  # try up to 3 times before giving up
def _parse_url(url):
    print("*" * 100)  # printed once per attempt, so retries are visible
    response = requests.get(url, headers=headers, timeout=5)
    return response.content.decode()


def parse_url(url):
    try:
        html_str = _parse_url(url)
    except Exception:  # every attempt failed
        html_str = None
    return html_str


if __name__ == '__main__':
    url = "http://www.baidu.com"
    print(parse_url(url))
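To make the retry behaviour easier to picture, here is a hand-written stand-in using only the standard library. It is not how the retrying package is implemented. retry_simple and flaky are hypothetical names, and the failure is simulated so no network is needed.

```python
import functools


def retry_simple(max_attempts=3):
    """Call the wrapped function up to max_attempts times."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    last_error = exc  # remember the failure and try again
            raise last_error  # every attempt failed
        return wrapper
    return decorator


calls = {"count": 0}


@retry_simple(max_attempts=3)
def flaky():
    # Fails on the first two calls, succeeds on the third.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated failure")
    return "ok"


result = flaky()
print(result, calls["count"])  # ok 3
```

If all attempts fail, the last exception propagates to the caller, which is what the try/except in parse_url relies on.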