Using requests

weixin_43143740

Article link: https://blog.csdn.net/weixin_43143740/article/details/100671169

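Requests
1. What is requests?
requests is built on top of urllib3 (not the standard-library urllib). Its documentation is thorough, and the Chinese translation is also quite good. Requests covers most everyday HTTP needs. When this article was written it supported Python 2.6–3.5 and ran well on PyPy; recent releases support Python 3 only.
2. Installation: pip3 install requests
3. GET and POST requests. Let's start with GET.
GET request: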

import requests

response = requests.get("http://www.baidu.com/")

# Equivalent call through the generic request() function:
# response = requests.request("get", "http://www.baidu.com/")

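Commonly used attributes and methods of the returned response object:

response.text: the body decoded to a string

response.content: the body as raw bytes

response.status_code: the HTTP status code of the response

response.request.headers: the headers that were sent with the request

response.headers: the response headers

response.encoding = 'utf-8': sets the encoding used to decode response.text

response.encoding: reads the encoding currently in use

response.json(): the built-in JSON decoder; it returns the parsed body and raises an exception if the body is not valid JSON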
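
Because response.json() raises when the body is not valid JSON, it can help to guard the call. The following is a minimal sketch of one way to do that. The names parse_json_or_none and FakeResponse are hypothetical and not part of requests; FakeResponse only stands in for a real response so the snippet runs without a network connection.

```python
import json


def parse_json_or_none(response):
    """Return the parsed JSON body, or None if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:
        # JSON decoding errors raised by response.json() are ValueError subclasses
        return None


class FakeResponse:
    """Hypothetical stand-in for requests.Response, used only for this offline demo."""

    def __init__(self, text):
        self.text = text

    def json(self):
        return json.loads(self.text)


print(parse_json_or_none(FakeResponse('{"wd": "长城"}')))
print(parse_json_or_none(FakeResponse("<html>not json</html>")))
```

With a real response object the call would simply be parse_json_or_none(response).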
To send custom request headers, pass a dict through the headers parameter. To pass parameters in the URL query string, use the params parameter.

import requests

kw = {'wd': '长城'}

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"
}

# params accepts a dict or a string of query parameters.
# A dict is URL-encoded automatically; no urlencode() call is needed
response = requests.get(
    "http://www.baidu.com/s?",
    params=kw,
    headers=headers
)

# Inspect the body; response.text returns a decoded (Unicode) string
print(response.text)
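
To see what params does to the URL without sending anything, the request can be built and prepared locally. This is a small sketch, assuming the same kw dictionary as above; it uses requests.Request and prepare(), which the text above does not cover.

```python
import requests

kw = {"wd": "长城"}

# Build the request without sending it, so the final URL can be inspected offline
prepared = requests.Request("GET", "http://www.baidu.com/s", params=kw).prepare()

# The non-ASCII value appears percent-encoded in the query string
print(prepared.url)
```

The same encoding is applied when requests.get() is called with params.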

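One thing to watch out for: reading response.text directly sometimes produces garbled characters. The reason:
When a response arrives, Requests guesses its encoding and uses that guess to decode the body when you read response.text. It first looks in the HTTP headers for a declared encoding. For text content types that declare no charset it assumes ISO-8859-1, and when the headers give no usable hint it falls back to a detection library (chardet or charset_normalizer), which can guess wrong. It is therefore more reliable to decode the bytes yourself with response.content.decode() (UTF-8 by default), or to set response.encoding before reading response.text.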
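
The effect can be reproduced offline. The sketch below assumes a server that sends UTF-8 bytes, and it uses requests.utils.get_encoding_from_headers to show which encoding Requests derives from a Content-Type header; the header values are illustrative.

```python
import requests

raw = "长城".encode("utf-8")  # bytes as a server might send them

# Encoding that requests derives from the Content-Type header
no_charset = requests.utils.get_encoding_from_headers({"content-type": "text/html"})
with_charset = requests.utils.get_encoding_from_headers(
    {"content-type": "text/html; charset=utf-8"}
)
print(no_charset, with_charset)

garbled = raw.decode("ISO-8859-1")  # wrong codec: produces mojibake
correct = raw.decode("utf-8")  # right codec: recovers the original text
print(repr(garbled), repr(correct))
```

When the header carries no charset, decoding response.content explicitly avoids the wrong default.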

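POST request:

The most basic form is response = requests.post(url=url, data=data)
url: the target URL of the POST request
data: the form data sent with the POST request
Passing data
A POST request usually carries parameters in its body. The simplest way to send them is the data parameter.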

Example:

import requests

form_data = {
    'sex': 'f',
    'key': '',
    'stc': '1:11,2:20.28,23:1',
    'sn': 'default',
    'sv': '1',
    'p': '1',
    'f': 'search',
    'listStyle': 'bigPhoto',
    'pri_uid': '0',
    'jsversion': 'v5',
}

header = {
    'Referer': 'http://search.jiayuan.com/v2/index.php?sex=f&stc=1:11,2:20.28,23:1&f=search',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36',
}

url = 'http://search.jiayuan.com/v2/search_v2.php'

response = requests.post(url, headers=header, data=form_data)

print(response.text)

If the response body is JSON, it can be parsed and printed directly:

print(response.json())
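
To see how the data parameter ends up in the request body, the request can be prepared without sending it. This sketch uses a shortened, made-up payload and also shows the json parameter, which the text above does not cover, for comparison.

```python
import json

import requests

url = "https://httpbin.org/post"
payload = {"sex": "f", "p": "1"}  # shortened example payload

# prepare() builds the request without sending it, so nothing goes over the network
as_form = requests.Request("POST", url, data=payload).prepare()
as_json = requests.Request("POST", url, json=payload).prepare()

# data= produces a form-encoded body, json= produces a JSON body
print(as_form.headers["Content-Type"], as_form.body)
print(as_json.headers["Content-Type"], json.loads(as_json.body))
```

Which of the two a server expects depends on the API; HTML form endpoints usually expect the form-encoded variant.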

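POST can also upload files, which is worth knowing about; this uses the files parameter.
url = 'https://httpbin.org/post'
# Open the file in binary mode; the with block closes it after the upload
with open('image.png', 'rb') as f:
    response = requests.post(url, files={'file': f})
print(response.text)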
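
The files parameter turns the request into a multipart/form-data upload. The sketch below prepares such a request offline; the in-memory bytes are placeholder content standing in for a real image.png, and the (filename, file object, content type) tuple is one of the forms files accepts.

```python
import io

import requests

# In-memory stand-in for open('image.png', 'rb'); the bytes are placeholder content
fake_image = io.BytesIO(b"not really a PNG")

# files accepts (filename, file object, content type) tuples as well as bare file objects
files = {"file": ("image.png", fake_image, "image/png")}
prepared = requests.Request("POST", "https://httpbin.org/post", files=files).prepare()

# The Content-Type header carries the multipart boundary chosen by requests
print(prepared.headers["Content-Type"])
print(b'filename="image.png"' in prepared.body)
```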

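Requests can also handle web client authentication (HTTP Basic Auth). For this, pass auth = (username, password); it works with any request method, and the example below uses GET.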
import requests

auth=('test', '123456')

response = requests.get(
'http://192.168.199.107',
auth = auth
)

print (response.text)
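
Passing a (username, password) tuple as auth makes requests add an HTTP Basic Authorization header. The following sketch checks this offline with the same example credentials; no request is sent.

```python
import base64

import requests

# Prepare the request locally to inspect the header that auth= generates
prepared = requests.Request(
    "GET", "http://192.168.199.107", auth=("test", "123456")
).prepare()

# Basic auth is "Basic " followed by base64("username:password")
expected = "Basic " + base64.b64encode(b"test:123456").decode("ascii")
print(prepared.headers["Authorization"] == expected)
```

Since the credentials are only base64-encoded, Basic Auth should be combined with HTTPS on untrusted networks.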

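Setting a proxy
1. To use a proxy, pass the proxies parameter to any request method; this configures the proxy for that single request:

import requests

# The proxy is chosen according to the scheme of the target URL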
proxies = {
"http": "http://12.34.56.79:9527",
"https": "http://12.34.56.79:9527",
}

response = requests.get(
"http://www.baidu.com",
proxies = proxies
)
print(response.text)
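
To apply the same proxies to many requests, they can be set once on a session instead of per call. This is a sketch under the same placeholder proxy addresses as above; requests.utils.select_proxy is used only to show which entry would be picked for a URL, and nothing is sent.

```python
import requests

proxies = {
    "http": "http://12.34.56.79:9527",
    "https": "http://12.34.56.79:9527",
}

session = requests.Session()
session.proxies.update(proxies)  # applies to every request sent through this session

# select_proxy shows which entry would be used for a given URL (no request is sent)
chosen = requests.utils.select_proxy("http://www.baidu.com", session.proxies)
print(chosen)
```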

Getting cookies
If a response contains cookies, they are available through its cookies attribute:

import requests

response = requests.get("https://www.kuaidaili.com/login/")

# response.cookies is a CookieJar object:
cookiejar = response.cookies

# Convert the CookieJar to a dict:
cookiedict = requests.utils.dict_from_cookiejar(
    cookiejar
)

print(cookiejar)

print(cookiedict)
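
The conversion between a CookieJar and a dict can be tried without a live site. In this sketch the cookie name, value and domain are made up for illustration.

```python
import requests

# Build a jar by hand; the cookie name, value and domain are made-up examples
jar = requests.cookies.RequestsCookieJar()
jar.set("sessionid", "abc123", domain="example.com", path="/")

as_dict = requests.utils.dict_from_cookiejar(jar)
print(as_dict)

# The reverse conversion is also available
restored = requests.utils.cookiejar_from_dict(as_dict)
print(restored.get("sessionid"))
```

Note that the dict form keeps only names and values; domain and path information is lost in the conversion.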

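Using Session
In requests, the session object is used very often. It represents one user session: from the moment the client connects to the server until it disconnects.
A session lets certain parameters persist across requests; for example, cookies are kept across all requests made from the same Session instance.
For comparison, the example here again uses kuaidaili.com (快代理) to simulate a login: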

import requests

def kdl_login(url, data):
    # 1. Create a session object; it stores cookies between requests
    session = requests.Session()
    # 2. Set the request headers
    headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36',
    }

    try:
        # 3. Simulate the login request and collect the cookies.
        # allow_redirects=False stops requests from following the redirect,
        # so the 302 response itself can be inspected
        login_response = session.post(url=url, data=data, headers=headers, allow_redirects=False)
        if login_response.status_code == 302:
            print(login_response.headers, login_response.cookies)
            # Convert the CookieJar to a dict:
            logincookie = requests.utils.dict_from_cookiejar(
                login_response.cookies
            )
            print("cookies after login", logincookie)
            # 4. With the cookies stored in the session, fetch the user center page
            response = session.get(
                url='https://www.kuaidaili.com/usercenter/',
                headers=headers,
            )
            if response.status_code == 200:
                html = response.text
                # Marker string expected on the page only after a successful login
                if '2295808193' in html:
                    print('login succeeded')

    except requests.RequestException as err:
        print(err)

if __name__ == '__main__':
    data = {
        'next': '',
        'kf5_return_to': '',
        # Placeholders: fill in your own account details
        'username': 'YOUR_USERNAME',
        'passwd': 'YOUR_PASSWORD',
    }
    url = 'https://www.kuaidaili.com/login/'
    kdl_login(url, data)
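
How a session carries state into later requests can be observed offline with prepare_request. This is a minimal sketch; the User-Agent string, cookie and example.com URL are made up and are not taken from the login example.

```python
import requests

session = requests.Session()
session.headers.update({"User-Agent": "demo-client/0.1"})  # hypothetical UA string
session.cookies.set("token", "t1", domain="example.com", path="/")  # made-up cookie

# prepare_request merges session-level headers and cookies into the request
request = requests.Request("GET", "https://example.com/usercenter/")
prepared = session.prepare_request(request)

print(prepared.headers["User-Agent"])
print(prepared.headers.get("Cookie"))
```

In the login flow above the cookie is stored by the server's response rather than set by hand, but it reaches later requests the same way.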

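Handling HTTPS requests: SSL certificate verification
Requests can verify SSL certificates for HTTPS requests.
To check a host's SSL certificate, use the verify parameter (it defaults to True).
import requests
response = requests.get("https://www.baidu.com/", verify=True)

An error like the following means certificate verification failed:
SSLError: ("bad handshake: Error([('SSL routines', 'ssl3_get_server_certificate', 'certificate verify failed')],)",)
To skip certificate verification, set verify to False and the request will go through. This turns off protection against forged certificates, so use it only with hosts you trust.

import requests
response = requests.get("https://www.12306.cn/mormhweb/", verify = False)
print (response.text)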
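
Instead of switching verification off, a failed check can be caught and handled. The sketch below is one possible wrapper; fetch_verified and its ca_bundle argument are hypothetical names introduced here, relying on the fact that verify also accepts a path to a CA bundle file.

```python
import requests


def fetch_verified(url, ca_bundle=None, timeout=5):
    """Fetch url with certificate checks on; return None if verification fails."""
    # verify may be True (default CA bundle) or a path to a CA bundle file
    verify = ca_bundle if ca_bundle else True
    try:
        return requests.get(url, verify=verify, timeout=timeout)
    except requests.exceptions.SSLError as err:
        print("certificate verification failed:", err)
        return None
```

A call such as fetch_verified("https://www.baidu.com/") returns the response on success and None when the certificate cannot be verified.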

That covers the basics of requests.
