Using the requests library


soud11111111

Article link: https://blog.csdn.net/qq_44167916/article/details/100187656

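Introduction

Requests is an elegant and simple HTTP library for Python, built for human beings.
Requests is one of the most downloaded Python packages of all time, with more than 400,000 downloads per day.
urllib, the earlier option from Python's standard library, is cumbersome and complicated to use for historical reasons, and its official documentation is so sparse that you often have to read the source code.
Requests, by contrast, is simple, intuitive, and user-friendly, so programmers spend far less effort on the library itself.

The sections below walk through each feature with short code snippets.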

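Request methods

Unlike urllib, Requests does not require you to build Request, opener, and handler objects. You call the function that matches the HTTP method and pass in the parameters you need.

The GET method

Example

Every HTTP method has a corresponding API function. For example, a GET request uses get():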
import requests
url = 'http://httpbin.org/get'
response = requests.get(url=url)
print(response.content.decode())

Output

{
  "args": {}, 
  "headers": {
    "Accept": "*/*", 
    "Accept-Encoding": "gzip, deflate", 
    "Host": "httpbin.org", 
    "User-Agent": "python-requests/2.21.0"
  }, 
  "origin": "183.64.80.162, 183.64.80.162", 
  "url": "https://httpbin.org/get"
}

A POST request uses post(), with the data to submit passed through the data parameter:

import requests
url = 'http://httpbin.org/post'
data = {'name':'Nanfeng'}
response = requests.post(url=url,data=data)
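
To see what post() would send, the request can be built and inspected without sending it. The following is a minimal sketch, assuming that preparing a request with requests.Request(...).prepare() is an acceptable way to look at the outgoing method, body, and headers. Nothing goes over the network here.

```python
import requests

# Build the same POST request as above, but only prepare it instead of sending it
request = requests.Request('POST', 'http://httpbin.org/post', data={'name': 'Nanfeng'})
prepared = request.prepare()

print(prepared.method)                   # POST
print(prepared.body)                     # name=Nanfeng
print(prepared.headers['Content-Type'])  # application/x-www-form-urlencoded
```

The dictionary passed as data is form-encoded, which is the same encoding an HTML form submission uses.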

The other request types each have their own function:

The PUT method

import requests
url = 'http://httpbin.org/put'
data = {'name':'Nanfeng'}
response = requests.put(url=url,data=data)

The DELETE method

import requests
url = 'http://httpbin.org/delete'
response = requests.delete(url=url)

The HEAD method

import requests
url = 'http://httpbin.org/get'
response = requests.head(url=url)

The OPTIONS method

import requests
url = 'http://httpbin.org/get'
response = requests.options(url=url)
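
Because all of these functions share one calling convention, the HTTP method can also be chosen at run time. The sketch below assumes the generic requests.Request class for building requests. The helper name build is hypothetical, and the requests are only prepared, not sent.

```python
import requests


def build(method, url, **kwargs):
    """Hypothetical helper: prepare a request for any HTTP method without sending it."""
    return requests.Request(method, url, **kwargs).prepare()


for method in ('GET', 'POST', 'PUT', 'DELETE', 'HEAD', 'OPTIONS'):
    prepared = build(method, 'http://httpbin.org/get')
    print(prepared.method, prepared.url)

# To actually send a request this way, requests.request('PUT', url, data=data)
# does the same job as requests.put(url, data=data).
```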

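Passing URL parameters

Unlike with urllib, there is no need to splice URL parameters into the URL by hand. Build a dictionary and pass it to the params parameter when making the request: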

import requests
url = 'http://httpbin.org/get'
params = {'name':'Nanfeng','age':18}
response = requests.get(url=url,params=params)
print(response.url)

Output

http://httpbin.org/get?name=Nanfeng&age=18

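Sometimes the same URL parameter name appears more than once with different values. A Python dictionary cannot hold duplicate keys, so give the key a list of values instead: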

import requests
url = 'http://httpbin.org/get'
params = {'name':['Nanfeng','Xizhou'],'age':18}
response = requests.get(url=url,params=params)
print(response.url)

Output

http://httpbin.org/get?name=Nanfeng&name=Xizhou&age=18
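
The same URL can be produced offline, which is handy for checking how a params dictionary will be encoded before any request is made. This is a small sketch using a prepared request. The city key is a hypothetical addition that illustrates one more rule: a key whose value is None is left out of the query string.

```python
import requests

params = {'name': ['Nanfeng', 'Xizhou'], 'age': 18, 'city': None}
prepared = requests.Request('GET', 'http://httpbin.org/get', params=params).prepare()

# A list becomes a repeated key; a key whose value is None is dropped
print(prepared.url)  # http://httpbin.org/get?name=Nanfeng&name=Xizhou&age=18
```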

Custom headers

To customize the request headers, pass a dictionary to the headers parameter in the same way.

import requests
url = 'http://httpbin.org/get'
headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Host": "httpbin.org",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
}
response = requests.get(url=url,headers=headers)
print(response.url)
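
Custom headers do not have to be as complete as the browser-style set above. As a rough sketch, assuming a Session is used to prepare the request, headers you pass are merged with the library defaults, and your value wins where the names collide. The User-Agent string here is hypothetical.

```python
import requests

session = requests.Session()
request = requests.Request(
    'GET',
    'http://httpbin.org/get',
    headers={'User-Agent': 'my-crawler/0.1'},  # hypothetical User-Agent string
)
# prepare_request() merges the session's default headers with the per-request ones
prepared = session.prepare_request(request)

print(prepared.headers['User-Agent'])         # my-crawler/0.1 (the custom value wins)
print('Accept-Encoding' in prepared.headers)  # True (the default header is still present)
```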

Custom cookies

Custom cookies do not require building a CookieJar object either. Pass a dictionary directly to the cookies parameter.

import requests
url = 'http://httpbin.org/cookies'
cookies = {'cookies_are':'working'}
response = requests.get(url=url,cookies=cookies)
print(response.text)

Output

{
  "cookies": {
    "cookies_are": "working"
  }
}
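
What httpbin echoes back is simply the Cookie header that Requests generated from the dictionary. A minimal sketch that shows the header without sending anything, assuming a prepared request is used for inspection:

```python
import requests

prepared = requests.Request(
    'GET', 'http://httpbin.org/cookies', cookies={'cookies_are': 'working'}
).prepare()

# The dictionary is serialized into a single Cookie request header
print(prepared.headers['Cookie'])  # cookies_are=working
```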

Setting a proxy

To send a request through a proxy, build a dictionary of proxies and pass it to the proxies parameter.

import requests
url = 'http://httpbin.org/get'
proxies = {
    'http':'http://10.10.10.1:8888',
    'https':'http://10.10.10.1:8888'
}
response = requests.get(url=url,proxies=proxies)
print(response.text)

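Redirects

Network requests often run into redirects, signaled by status codes beginning with 3. Requests follows redirects by default: when it receives one, it automatically requests the new location. To turn this off, pass allow_redirects=False, as in the example here, which prints the status code of the first response.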

import requests
url = 'http://www.baidu.com'
response = requests.get(url=url,allow_redirects=False)
print(response.status_code)

The printed status code is 200, meaning this URL answered directly without redirecting.
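
The difference between the two settings is easier to see against a server that is known to redirect. The sketch below is only an illustration under stated assumptions: it starts a throwaway local server (RedirectHandler and fetch_status are hypothetical names) in place of a real website, and it sets trust_env to False so that proxy environment variables do not interfere with the local connection.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /old redirects to /new, any other path returns 200."""

    def do_GET(self):
        if self.path == '/old':
            self.send_response(302)
            self.send_header('Location', '/new')
            self.send_header('Content-Length', '0')
            self.end_headers()
        else:
            body = b'arrived'
            self.send_response(200)
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def fetch_status(allow_redirects):
    """Request /old and return (final status code, status codes of followed redirects)."""
    server = HTTPServer(('127.0.0.1', 0), RedirectHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        session = requests.Session()
        session.trust_env = False  # ignore proxy environment variables
        url = 'http://127.0.0.1:%d/old' % server.server_port
        response = session.get(url, allow_redirects=allow_redirects, timeout=5)
        return response.status_code, [r.status_code for r in response.history]
    finally:
        server.shutdown()
        server.server_close()


print(fetch_status(True))   # (200, [302]): the redirect was followed
print(fetch_status(False))  # (302, []): the redirect response itself is returned
```

When redirects are followed, the intermediate responses remain available through response.history.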

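Disabling certificate verification

When a packet-capture tool is running, the certificate it presents is not issued by a trusted certificate authority, so certificate verification fails and has to be turned off.

Set the verify parameter to False in the request to disable certificate verification.

import requests
# verify only matters for HTTPS requests, so use an https:// URL
url = 'https://httpbin.org/get'
response = requests.get(url=url,verify=False)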

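Setting a timeout

To limit how long a request may take, set the timeout parameter, given in seconds. A value as small as the 0.01 used here will almost certainly raise a timeout exception before the response arrives.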

import requests
url = 'http://httpbin.org/get'
response = requests.get(url=url,timeout=0.01)
print(response.content.decode())
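
In practice a timeout is usually paired with exception handling. The following is a minimal sketch, not a definitive pattern: get_or_none is a hypothetical helper, and a local socket that accepts connections but never answers stands in for a slow server. The timeout is given as a (connect, read) tuple.

```python
import socket

import requests


def get_or_none(session, url, timeout):
    """Hypothetical helper: return the response, or None if the request timed out."""
    try:
        return session.get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        return None


# A listening socket that never replies, standing in for a slow server
silent = socket.socket()
silent.bind(('127.0.0.1', 0))  # port 0: pick a free port
silent.listen(1)

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables
url = 'http://127.0.0.1:%d/' % silent.getsockname()[1]

# Allow 1 second to connect and 0.2 seconds to wait for the response
result = get_or_none(session, url, timeout=(1, 0.2))
silent.close()

print(result)  # None: the read timed out
```

requests.exceptions.Timeout covers both connection timeouts and read timeouts, so one except clause handles either case.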

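Response content

A request made with Requests returns a requests.models.Response object, which makes the response content easy to access. With urllib, the response body is read as bytes and has to be converted to a string with decode() by hand. With Requests, the text attribute returns the response content as a string.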

import requests
url = 'https://github.com/events'
response = requests.get(url=url)
print(response.text)

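Character encoding

Requests guesses the page encoding from the response headers and uses that guess to decode the content, which works for most pages. If text comes out decoded incorrectly, set the encoding manually before reading text.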

import requests
url = 'https://github.com/events'
response = requests.get(url=url)
response.encoding = 'utf-8'
print(response.text)
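
A wrong guess typically happens when a page is served without a charset in its Content-Type header. The sketch below reproduces that situation under stated assumptions: GbkHandler is a hypothetical local test server that sends GBK-encoded bytes labeled only as text/html, and trust_env is set to False so proxy settings do not affect the local connection.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

BODY = '中文'.encode('gbk')  # the bytes of a GBK-encoded page


class GbkHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: sends GBK bytes but declares no charset."""

    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Type', 'text/html')
        self.send_header('Content-Length', str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the console quiet


server = HTTPServer(('127.0.0.1', 0), GbkHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables
response = session.get('http://127.0.0.1:%d/' % server.server_port, timeout=5)
server.shutdown()
server.server_close()

guessed = response.encoding  # the encoding guessed from the headers
garbled = response.text      # decoded with the guess, so the characters are wrong
response.encoding = 'gbk'    # override the guess
fixed = response.text        # decoded again with the correct encoding

print(guessed)          # ISO-8859-1
print(fixed == '中文')  # True
```

The text attribute is decoded each time it is read, which is why changing encoding after the request still takes effect.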

Binary data

To get the raw binary data, use the content attribute.

import requests
url = 'https://github.com/events'
response = requests.get(url=url)
print(response.content)

JSON data

If the response data is in JSON format, the json() method returns it already converted to a dictionary.

import requests
url = 'http://httpbin.org/get'
response = requests.get(url=url)
print(response.json())

Output

{'args': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.21.0'}, 'origin': '183.64.80.162, 183.64.80.162', 'url': 'https://httpbin.org/get'}

Response headers

import requests
url = 'http://httpbin.org/get'
response = requests.get(url=url)
print(response.headers)
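
The headers attribute is a dictionary whose keys are matched case-insensitively, so a header can be looked up however its name happens to be capitalized. A small offline sketch using the same dictionary type, with made-up header values standing in for a real response:

```python
from requests.structures import CaseInsensitiveDict

# Stand-in for response.headers, which uses this same dictionary type;
# the values here are made up for illustration
headers = CaseInsensitiveDict({'Content-Type': 'application/json', 'Content-Length': '266'})

print(headers['content-type'])        # application/json
print(headers.get('CONTENT-LENGTH'))  # 266
print(headers.get('X-Missing'))       # None
```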

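Session objects

Requests implements sessions. A Session behaves like a browser that has not been closed: it keeps state, such as cookies, across requests. This is commonly used to fetch data after logging in, without passing cookies again on every request.

import requests
session = requests.Session()
# The first request sets a cookie, which the session stores
url = 'http://httpbin.org/cookies/set/sessioncookie/123456789'
session.get(url=url)
# Later requests through the same session send the cookie back automatically
response = session.get(url='http://httpbin.org/cookies')
print(response.text)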
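
The cookie persistence described above can also be observed without network access. This is a minimal sketch, assuming the cookie is placed in the session by hand (as if a login response had set it) and that prepared requests are used to inspect what would be sent.

```python
import requests

session = requests.Session()
session.cookies.set('sessioncookie', '123456789')  # as if a login response had set it

# Every request prepared through the session carries the stored cookie
first = session.prepare_request(requests.Request('GET', 'http://httpbin.org/cookies'))
second = session.prepare_request(requests.Request('GET', 'http://httpbin.org/get'))

print(first.headers['Cookie'])   # sessioncookie=123456789
print(second.headers['Cookie'])  # sessioncookie=123456789
```

Requests made through the module-level functions such as requests.get() do not share this state, so the session object itself has to be used for every call that depends on the cookies.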
