Handling HTTP requests in Python: the requests module
Installing requests
python -m pip install requests
Sending requests
requests supports the HTTP methods GET, POST, HEAD, PUT, DELETE, OPTIONS, and others.
requests.get(url, params=None, **kwargs)
requests.post(url, data=None, json=None, **kwargs)
requests.head(url, **kwargs)
requests.put(url, data=None, **kwargs)
requests.delete(url, **kwargs)
requests.options(url, **kwargs)
requests.request(method, url, **kwargs) lets you specify the request method explicitly.
Parameters: url, params, data, json, kwargs
url is the address of the HTTP request, given as a string.
url = "https://api.github.com/events"
params holds the query parameters (for example user=admin&password=123), given as a dict.
params = {"user":"admin","password":"123"}
>>> import requests
>>> url = "https://api.github.com/events"
>>> params = {"user":"admin","password":"123"}
>>> respond = requests.get(url = url,params = params)
>>> respond.url
'https://api.github.com/events?user=admin&password=123'
>>>
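To see how a params dict ends up in the URL without sending anything, the request can be prepared locally. The following is a minimal sketch rather than part of the original walkthrough: the tag and q keys are made-up values, and it uses requests.Request(...).prepare(), which builds a PreparedRequest without contacting the server.

```python
import requests

# Build the request locally; nothing is sent over the network.
req = requests.Request(
    "GET",
    "https://api.github.com/events",
    # "tag" and "q" are hypothetical parameters, used only for illustration.
    params={"user": "admin", "tag": ["a", "b"], "q": "hello world"},
)
prepared = req.prepare()

# A list value becomes a repeated key, and the space is percent/plus-encoded.
print(prepared.url)
```

Preparing a request this way is handy for checking the encoded query string before any traffic is generated.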
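data is the body of a POST request. If it is a dict, it is form-encoded by default, in the same way as query parameters. data can also be any other content, such as a string (data="hello world") or binary data.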
>>> data = {"username":"admin","password":"123"}
>>> url = "http://httpbin.org/post"
>>> respond = requests.post(url = url, data=data)
>>> respond.request.body
'username=admin&password=123'
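The json parameter serializes a dict into a JSON string and sends it as the request body; it also sets the Content-Type header to application/json.
Passing data=json.dumps(data_dict) produces the same body, but does not set that Content-Type header for you.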
>>> json_data = {"username":"admin","password":"123"}
>>> url = "http://httpbin.org/post"
>>> respond = requests.post(url = url,json = json_data)
>>> respond.request.body
b'{"username": "admin", "password": "123"}'
>>>
>>> import json
>>> data = {"username":"admin","password":"123"}
>>> data_json = json.dumps(data)
>>> url = "http://httpbin.org/post"
>>> respond = requests.post(url = url,data = data_json)
>>> respond.request.body
'{"username": "admin", "password": "123"}'
>>>
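The difference between the three ways of passing a body shows up most clearly in the Content-Type header. The sketch below prepares the same payload three times without sending it; the variable names (as_form, as_text, as_json) are illustrative and not from the original text.

```python
import json
import requests

url = "http://httpbin.org/post"
payload = {"username": "admin", "password": "123"}

# Prepare (but do not send) the same payload in three different ways.
as_form = requests.Request("POST", url, data=payload).prepare()
as_text = requests.Request("POST", url, data=json.dumps(payload)).prepare()
as_json = requests.Request("POST", url, json=payload).prepare()

# Show the Content-Type header and the body that each variant would send.
for name, req in [("data=dict", as_form), ("data=str", as_text), ("json=dict", as_json)]:
    print(name, "|", req.headers.get("Content-Type"), "|", req.body)
```

If a server expects JSON, json= is usually the more convenient choice, since a string passed through data= leaves the Content-Type header for the caller to set.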
Setting request headers: headers, cookies
Among the optional parameters, headers sets the request headers; its value is a dict.
headers = {"Content-Type": "123"}
>>> import requests
>>> url = "http://httpbin.org/get"
>>> headers = {"Content-Type":"123"}
>>> respond = requests.get(url = url,headers = headers)
>>> respond.text
'{\n "args": {}, \n "headers": {\n "Accept": "*/*", \n "Accept-Encoding": "gzip, deflate", \n "Content-Type": "123", \n "Host": "httpbin.org", \n "User-Agent": "python-requests/2.25.1", \n "X-Amzn-Trace-Id": "Root=1-65542d54-75dde17108146dd8099e305e"\n }, \n "origin": "", \n "url": "http://httpbin.org/get"\n}\n'
>>> respond.request.headers
{'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Content-Type': '123'}
Cookies can also be set directly through headers; the corresponding HTTP header has the form Cookie: xx1=xxx; xx2=xxx.
headers = {"Cookie": "username=admin;info=test"}
>>> url = "http://httpbin.org/cookies"
>>> headers = {"Cookie":"username=admin;info=test"}
>>> respond = requests.get(url = url,headers = headers)
>>> respond.text
'{\n "cookies": {\n "info": "test", \n "username": "admin"\n }\n}\n'
>>> respond.request.headers
{'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Cookie': 'username=admin;info=test'}
>>>
Alternatively, use the cookies parameter, which takes a dict.
cookies = {"username": "admin", "info": "test"}
>>> cookies = {"username":"admin","info":"test"}
>>> url = "http://httpbin.org/cookies"
>>> respond = requests.get(url,cookies = cookies)
>>> respond.text
'{\n "cookies": {\n "info": "test", \n "username": "admin"\n }\n}\n'
>>>
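The two approaches above lead to the same kind of Cookie header. As a rough illustration, the following sketch prepares a request with the cookies parameter and prints the header that requests generates from it; nothing is sent, and the snippet is an addition for illustration only.

```python
import requests

# Prepare the request locally to inspect the generated Cookie header.
prepared = requests.Request(
    "GET",
    "http://httpbin.org/cookies",
    cookies={"username": "admin", "info": "test"},
).prepare()

cookie_header = prepared.headers["Cookie"]
print(cookie_header)

# Split the header back into its name=value pairs.
pairs = set(cookie_header.split("; "))
print(pairs)
```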
Other optional parameters: verify, proxies, timeout
verify=False  # skip HTTPS certificate verification
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}  # HTTP proxy
proxies = {
    "http": "socks5://user:pass@host:port",
    "https": "socks5://user:pass@host:port",
}  # SOCKS proxy (requires the optional dependency: pip install requests[socks])
timeout=0.001  # in seconds
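These options are typically combined with some error handling, because a timeout or a bad proxy raises an exception instead of returning a response. The helper below is a hypothetical sketch (fetch is not a requests function) showing one way to wrap requests.get; the default timeout of 5 seconds is an arbitrary assumption.

```python
import requests


def fetch(url, timeout=5, proxies=None, verify=True):
    """Hypothetical helper: return the Response, or None if the request fails."""
    try:
        return requests.get(url, timeout=timeout, proxies=proxies, verify=verify)
    except requests.exceptions.Timeout:
        # Raised when the server does not answer within `timeout` seconds.
        print("timed out:", url)
    except requests.exceptions.RequestException as exc:
        # Base class of the exceptions raised by requests (bad URL, connection error, ...).
        print("request failed:", exc)
    return None


# A URL without a scheme is rejected before any connection is attempted.
result = fetch("not-a-url")
print(result)
```

timeout also accepts a (connect, read) tuple when the two phases need different limits.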
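Reading the response
Sending an HTTP request returns an HTTP response object, requests.Response.
respond = requests.get(url)
Response body: content, text, json()
The response body is the data of the response, not including the response headers.
content is the body as raw bytes; text is the body decoded to text, that is, content decoded using the encoding in respond.encoding.
For JSON data, json() parses the body directly and returns the corresponding Python object (a dict for a JSON object).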
>>> url = "http://httpbin.org/get"
>>> respond = requests.get(url = url)
>>> respond.content
b'{\n "args": {}, \n "headers": {\n "Accept": "*/*", \n "Accept-Encoding": "gzip, deflate", \n "Host": "httpbin.org", \n "User-Agent": "python-requests/2.25.1", \n "X-Amzn-Trace-Id": "Root=1-65543a3d-39c7ec8424b8f6cd0bd56861"\n }, \n "origin": "", \n "url": "http://httpbin.org/get"\n}\n'
>>> respond.text
'{\n "args": {}, \n "headers": {\n "Accept": "*/*", \n "Accept-Encoding": "gzip, deflate", \n "Host": "httpbin.org", \n "User-Agent": "python-requests/2.25.1", \n "X-Amzn-Trace-Id": "Root=1-65543a3d-39c7ec8424b8f6cd0bd56861"\n }, \n "origin": "", \n "url": "http://httpbin.org/get"\n}\n'
>>> respond.encoding
'utf-8'
>>> respond.json()
{'args': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.25.1', 'X-Amzn-Trace-Id': 'Root=1-65543a3d-39c7ec8424b8f6cd0bd56861'}, 'origin': '', 'url': 'http://httpbin.org/get'}
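The relationship between content, text, and json() can be approximated with the standard library alone. This is a simplified sketch of the idea, not the library's actual implementation (requests also handles decoding errors and encoding detection); the byte string is a shortened, made-up version of the body shown above.

```python
import json

# Stand-in for respond.content: the raw bytes of the body.
content = b'{"args": {}, "url": "http://httpbin.org/get"}'
# Stand-in for respond.encoding.
encoding = "utf-8"

# Roughly what respond.text gives: the bytes decoded to str.
text = content.decode(encoding)
# Roughly what respond.json() gives: the text parsed into Python objects.
data = json.loads(text)

print(type(content).__name__, type(text).__name__, type(data).__name__)
print(data["url"])
```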
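Response headers: headers, cookies, status_code, reason
The head of a response consists of the status line and the header fields; headers contains only the header fields.
status_code is the status code from the status line, and reason is the accompanying status text.
respond.headers returns a dict-like object:
>>> respond.headers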
{'Date': 'Wed, 15 Nov 2023 03:25:49 GMT', 'Content-Type': 'application/json', 'Content-Length': '305', 'Connection': 'keep-alive', 'Server': 'gunicorn/19.9.0', 'Access-Control-Allow-Origin': '*', 'Access-Control-Allow-Credentials': 'true'}
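Although the headers print like a plain dict, requests stores them in a case-insensitive mapping, so lookups do not depend on how the server capitalized the names. A small offline sketch, using requests.structures.CaseInsensitiveDict directly with two of the header values shown above:

```python
from requests.structures import CaseInsensitiveDict

# Same kind of mapping that respond.headers uses, filled by hand for illustration.
headers = CaseInsensitiveDict(
    {"Content-Type": "application/json", "Content-Length": "305"}
)

# Lookups ignore the case of the header name.
print(headers["content-type"])
print(headers.get("CONTENT-LENGTH"))
# A missing header returns the default instead of raising.
print(headers.get("X-Missing", "not set"))
```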
>>> respond.status_code
200
>>> respond.reason
'OK'
>>>
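In practice the status code is often checked before the body is used. The sketch below assumes a 404 response; the Response object is built by hand only so that the example runs without a server, which is not how responses are normally obtained.

```python
import requests

# Hand-built Response, used here only so the sketch runs offline.
resp = requests.Response()
resp.status_code = 404
resp.reason = "Not Found"
resp.url = "http://httpbin.org/status/404"

# ok is False for 4xx and 5xx status codes.
print(resp.ok)

error = None
try:
    # Raises requests.exceptions.HTTPError for 4xx and 5xx status codes.
    resp.raise_for_status()
except requests.exceptions.HTTPError as exc:
    error = str(exc)
    print(error)
```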
cookies is a cookie jar object; get_dict() returns its contents as a plain dict.
respond.cookies.get_dict()
The request behind a response: request
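To show what get_dict() returns without depending on a live server, this sketch fills a RequestsCookieJar by hand. The cookie names and the httpbin.org domain are illustrative assumptions; on a real response the jar is populated from the Set-Cookie headers.

```python
from requests.cookies import RequestsCookieJar

# Same jar type as respond.cookies, populated manually for illustration.
jar = RequestsCookieJar()
jar.set("username", "admin", domain="httpbin.org", path="/")
jar.set("info", "test", domain="httpbin.org", path="/")

# Plain dict view of the jar: cookie names mapped to values.
cookie_dict = jar.get_dict()
print(cookie_dict)
# Individual cookies can also be read by name.
print(jar.get("username"))
```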
>>> respond.request
<PreparedRequest [GET]>
>>>
The main attributes of the request object are url, headers, and body.
>>> respond.request.url
'http://httpbin.org/get'
>>> respond.request.headers
{'User-Agent': 'python-requests/2.25.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
>>> respond.request.body
>>>