An Introduction to Python's Requests Library

Latest recommended article published 2024-08-09 23:34:20

weixin_39805529


Requests is written in Python. Its design follows these idioms from PEP 20 (the Zen of Python):

Beautiful is better than ugly.

Explicit is better than implicit.

Simple is better than complex.

Complex is better than complicated.

Readability counts.

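The seven most commonly used functions in the requests library:

requests.request()

requests.get('https://github.com/timeline.json')  # GET request

requests.post("http://httpbin.org/post")  # POST request

requests.put("http://httpbin.org/put")  # PUT request (replaces the whole resource)

requests.delete("http://httpbin.org/delete")  # DELETE request

requests.head("http://httpbin.org/get")  # HEAD request

requests.patch("http://httpbin.org/patch")  # PATCH request (updates part of the resource)

The other six functions are all implemented on top of request(), which makes request() the most basic of the seven.

Servers rarely let arbitrary clients modify their data, so in practice get() is by far the most frequently used.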
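One way to see that the convenience functions delegate to the basic request function is to swap that function for a mock and record what gets forwarded. This is only a sketch: it assumes the helpers live in the requests.api module and pass the method name and URL as the first two positional arguments, which holds for the releases I am aware of but is an implementation detail rather than a documented guarantee.

```python
from unittest import mock

import requests

# Replace the underlying request() function with a mock so that no network
# traffic is generated; we only want to see what the helpers pass along.
with mock.patch("requests.api.request") as fake_request:
    requests.get("http://httpbin.org/get")
    requests.post("http://httpbin.org/post")
    requests.delete("http://httpbin.org/delete")

# The first two positional arguments of each recorded call are (method, url).
forwarded = [call.args[:2] for call in fake_request.call_args_list]
for method, url in forwarded:
    print(method, url)
```

Each helper shows up as a single call carrying its own method name, which is why the remaining parameters can be described once for the basic function and reused everywhere else.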

1. The request() function:

requests.request(method, url, **kwargs)

method: the HTTP method, one of GET, PUT, POST, HEAD, PATCH, DELETE, or OPTIONS

url: the URL to request

**kwargs: 13 optional parameters, demonstrated below

params: a dictionary or byte sequence appended to the URL as the query string

>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.get("http://httpbin.org/get", params=payload)

Printing the URL shows that it has been encoded correctly:

>>> print(r.url)
http://httpbin.org/get?key1=value1&key2=value2
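The same URL encoding can be inspected without contacting the server by building a Request object and preparing it. The sketch below reuses the payload from the text; requests.Request and its prepare() method are part of the library, and nothing is sent over the network.

```python
import requests

payload = {"key1": "value1", "key2": "value2"}

# Build the request and prepare it; prepare() encodes params into the URL
# but does not send anything.
prepared = requests.Request(
    "GET", "http://httpbin.org/get", params=payload
).prepare()

print(prepared.url)
```

Preparing a request this way is handy for checking how parameters will be encoded before pointing the code at a real service.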

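json: JSON-serializable data sent as the request body. The session below shows the related r.json() method, which decodes a JSON response body:

>>> import requests
>>> r = requests.get('https://github.com/timeline.json')
>>> r.json()
[{'repository': {'open_issues': 0, 'url': 'https://github.com/...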

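headers: a dictionary of custom HTTP headers

data: a dictionary, byte sequence, or file object sent to the server as the request body

import json
import requests

data = {'some': 'data'}
headers = {'content-type': 'application/json',
           'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0'}

# The Content-Type header announces JSON, so serialize the dict explicitly;
# a bare dict passed to data= would be form-encoded instead.
r = requests.post('https://api.github.com/some/endpoint', data=json.dumps(data), headers=headers)
print(r.text)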
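The data and json parameters both fill the request body, but they encode a dictionary differently. The following sketch prepares two POST requests offline to compare them; the httpbin URL is just a placeholder, since nothing is actually sent.

```python
import json

import requests

payload = {"some": "data"}
url = "http://httpbin.org/post"

# data= with a dict produces a form-encoded body.
form_req = requests.Request("POST", url, data=payload).prepare()
# json= serializes the dict to JSON and sets the matching Content-Type.
json_req = requests.Request("POST", url, json=payload).prepare()

print(form_req.headers["Content-Type"], form_req.body)
print(json_req.headers["Content-Type"], json_req.body)
```

When a server expects JSON, passing the dictionary through json= avoids having to call json.dumps() and set the Content-Type header by hand.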

cookies: a dictionary or CookieJar of cookies to send with the request

If a response contains cookies, you can access them quickly:

import requests

r = requests.get('http://www.google.com.hk/')
print(r.cookies['NID'])
print(tuple(r.cookies))

To send your own cookies to the server, use the cookies parameter:

import requests

url = 'http://httpbin.org/cookies'
cookies = {'testCookies_1': 'Hello_Python3', 'testCookies_2': 'Hello_Requests'}
# Cookie Version 0 forbids whitespace, square brackets, parentheses, equals signs, commas, double quotes, slashes, question marks, at signs, colons, semicolons, and similar special characters in cookie values.
r = requests.get(url, cookies=cookies)
print(r.json())
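To see what the cookies parameter turns into on the wire, the request can be prepared offline and its Cookie header inspected. This is a minimal sketch using the same cookie names as above; no request is sent.

```python
import requests

cookies = {"testCookies_1": "Hello_Python3", "testCookies_2": "Hello_Requests"}

# prepare() converts the cookies dict into a single Cookie request header.
prepared = requests.Request(
    "GET", "http://httpbin.org/cookies", cookies=cookies
).prepare()

cookie_header = prepared.headers["Cookie"]
print(cookie_header)
```

The header holds the name=value pairs joined by semicolons, which is what the httpbin endpoint echoes back in the example above.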

auth: a tuple (or auth object) that enables HTTP authentication

import requests
from requests.auth import HTTPBasicAuth

r = requests.get('https://httpbin.org/hidden-basic-auth/user/passwd', auth=HTTPBasicAuth('user', 'passwd'))
# r = requests.get('https://httpbin.org/hidden-basic-auth/user/passwd', auth=('user', 'passwd'))  # shorthand
print(r.json())
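With Basic authentication, the auth tuple ends up as an Authorization header containing the Base64-encoded "user:password" pair. The sketch below prepares the request offline and decodes that header to make this visible; the credentials are the sample ones from the text.

```python
import base64

import requests

# A 2-tuple passed as auth is treated as HTTP Basic credentials.
prepared = requests.Request(
    "GET",
    "https://httpbin.org/hidden-basic-auth/user/passwd",
    auth=("user", "passwd"),
).prepare()

auth_header = prepared.headers["Authorization"]
scheme, token = auth_header.split(" ", 1)
decoded = base64.b64decode(token).decode("latin1")
print(scheme, decoded)
```

Because Base64 is an encoding and not encryption, Basic credentials are only protected when the request travels over HTTPS.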

files: a dictionary of files to upload

import requests

url = 'http://127.0.0.1:5000/upload'
# Open the file in binary mode and close it once the upload finishes.
with open('/home/lyb/sjzl.mpg', 'rb') as f:
    files = {'file': f}
    # files = {'file': ('report.jpg', f)}  # set the file name explicitly
    r = requests.post(url, files=files)
print(r.text)
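A file upload is sent as a multipart/form-data body. The sketch below prepares such a request offline, with an in-memory io.BytesIO object standing in for a real file; the file name report.txt and its contents are made up for illustration.

```python
import io

import requests

# In-memory stand-in for a file opened in binary mode (hypothetical content).
fake_file = io.BytesIO(b"hello upload")
files = {"file": ("report.txt", fake_file)}  # (file name, file object)

prepared = requests.Request(
    "POST", "http://127.0.0.1:5000/upload", files=files
).prepare()

content_type = prepared.headers["Content-Type"]
body = prepared.body
print(content_type)
print(len(body), "bytes in the multipart body")
```

The dictionary key becomes the form field name and the first tuple element becomes the file name reported to the server.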

timeout: the timeout, in seconds

>>> requests.get('http://github.com', timeout=0.001)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
requests.exceptions.Timeout: HTTPConnectionPool(host='github.com', port=80): Request timed out. (timeout=0.001)
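In real code a timeout is usually caught, not left to propagate. The sketch below assumes a helper named fetch_with_timeout (hypothetical, not part of requests) and provokes a timeout locally with a socket that accepts connections but never replies, so it does not depend on an external site. It also assumes no proxy is configured through environment variables.

```python
import socket

import requests


def fetch_with_timeout(url, timeout):
    """Return the response body, or None if the request timed out."""
    try:
        response = requests.get(url, timeout=timeout)
        return response.text
    except requests.exceptions.Timeout:
        return None


# A listening socket that never calls accept() or sends a reply: the TCP
# connection succeeds, but the HTTP response never arrives.
silent_server = socket.socket()
silent_server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
silent_server.listen(1)
port = silent_server.getsockname()[1]

# timeout may be a (connect, read) tuple; here the read phase times out.
result = fetch_with_timeout("http://127.0.0.1:%d/" % port, timeout=(1, 0.5))
print(result)
silent_server.close()
```

Passing a tuple sets the connect and read timeouts separately, while a single number applies the same limit to both.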

proxies: a dictionary mapping protocols to proxy servers; login credentials can be included

import requests

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
requests.get("http://www.zhidaow.com", proxies=proxies)

If the proxy requires a username and password, write it like this:

proxies = {
    "http": "http://user:pass@10.10.1.10:3128/",
}

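allow_redirects: True/False, default True; toggles following redirects

stream: True/False, default False; when False the response body is downloaded immediately, when True the download is deferred until the content is accessed

verify: True/False, default True; toggles verification of the server's SSL certificate

cert: path to a local client-side SSL certificate file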

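2. The get() function:

requests.get(url, params=None, **kwargs)

url: URL of the page to fetch

params: extra parameters for the URL, as a dictionary or byte stream; optional

**kwargs: the 12 remaining control parameters, i.e. everything request() accepts except params

3. The head() function:

requests.head(url, **kwargs)

url: URL of the page whose headers are to be fetched

**kwargs: the 13 control parameters

4. The post() function:

requests.post(url, data=None, json=None, **kwargs)

url: URL of the resource to post to

data: a dictionary, byte sequence, or file object used as the request body

json: JSON-serializable data used as the request body

**kwargs: the 11 remaining control parameters

5. The put() function:

requests.put(url, data=None, **kwargs)

url: URL of the resource to replace

data: a dictionary, byte sequence, or file object used as the request body

**kwargs: the 12 remaining control parameters

6. The patch() function:

requests.patch(url, data=None, **kwargs)

url: URL of the resource to update

data: a dictionary, byte sequence, or file object used as the request body

**kwargs: the 12 remaining control parameters

7. The delete() function:

requests.delete(url, **kwargs)

url: URL of the resource to delete

**kwargs: the 13 control parameters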

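The Response object

Every requests call returns a Response object that holds the server's reply.

Common attributes:

r.status_code  # HTTP status code; 200 means success, 404 means the resource was not found

r.content  # the response body as bytes

r.text  # the response body as a string, decoded using r.encoding

r.headers  # the response headers as a dictionary-like object with case-insensitive keys; r.headers.get() returns None for a missing key

(Take care to distinguish r.headers from the headers parameter described earlier: the former is an attribute of the Response object, the latter is an argument you pass in.)

r.encoding  # the encoding taken from the HTTP headers (the headers may not declare one)

r.apparent_encoding  # the encoding guessed from the content itself (a best guess, often more useful than a missing or wrong header, but not guaranteed to be correct)

r.raw  # the raw response, i.e. the underlying urllib3 response object; read it with r.raw.read() after requesting with stream=True

One attribute deserves special mention: r.request.headers shows the headers of the HTTP request that was sent; do not confuse it with r.headers.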
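The case-insensitive behaviour of r.headers comes from the CaseInsensitiveDict class in requests.structures, which can be tried out on its own. In this sketch the header value is invented for illustration and no request is made.

```python
from requests.structures import CaseInsensitiveDict

# r.headers is an instance of this class; build one by hand for the demo.
headers = CaseInsensitiveDict({"Content-Type": "text/html; charset=utf-8"})

lower = headers["content-type"]         # lookup ignores case
upper = headers["CONTENT-TYPE"]
missing = headers.get("X-Not-There")    # .get() returns None for a missing key

print(lower)
print(upper)
print(missing)
```

Note that only .get() returns None for an absent header; indexing with square brackets raises KeyError, as with an ordinary dictionary.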

Common methods:

r.raise_for_status()  # raises requests.HTTPError if the request failed (a 4xx or 5xx status code)
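Which status codes trigger the exception can be checked offline by filling in a Response object by hand. The helper name check and the URLs below are hypothetical; a real Response would of course come from an actual request.

```python
import requests


def check(status_code):
    """Return 'ok' or the HTTPError message for a hand-built Response."""
    r = requests.Response()  # empty Response, filled in by hand
    r.status_code = status_code
    r.url = "http://httpbin.org/status/%d" % status_code
    try:
        r.raise_for_status()
    except requests.HTTPError as exc:
        return "error: %s" % exc
    return "ok"


for code in (200, 301, 404, 500):
    print(code, check(code))
```

Redirect codes such as 301 do not raise; only client errors (4xx) and server errors (5xx) do.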

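Exceptions raised by the Requests library:

requests.ConnectionError: network connection errors, such as a failed DNS lookup or a refused connection

requests.HTTPError: HTTP error responses

requests.URLRequired: a required URL is missing

requests.TooManyRedirects: the maximum number of redirects was exceeded

requests.ConnectTimeout: timed out while connecting to the remote server

requests.Timeout: the request timed out (covers both connect and read timeouts)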
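These exceptions form a hierarchy rooted at requests.RequestException, so the order of except clauses matters. The sketch below uses a hypothetical helper, describe_failure, and simulates failures by raising the exceptions directly so it runs without a network; in real code the callable would wrap a call such as requests.get(url, timeout=5).

```python
import requests


def describe_failure(action):
    """Run action() and classify any Requests exception it raises."""
    try:
        action()
    except requests.Timeout:
        # Checked first: ConnectTimeout is both a Timeout and a ConnectionError.
        return "timeout"
    except requests.ConnectionError:
        return "connection error"
    except requests.HTTPError:
        return "http error"
    except requests.RequestException:
        # Base class of all the exceptions above; catches anything else.
        return "other requests error"
    return "ok"


def raiser(exc):
    """Build a callable that raises exc, to simulate a failing request."""
    def action():
        raise exc
    return action


results = {
    "ConnectTimeout": describe_failure(raiser(requests.ConnectTimeout())),
    "ConnectionError": describe_failure(raiser(requests.ConnectionError())),
    "HTTPError": describe_failure(raiser(requests.HTTPError())),
    "TooManyRedirects": describe_failure(raiser(requests.TooManyRedirects())),
    "no error": describe_failure(lambda: None),
}
for name, outcome in results.items():
    print(name, "->", outcome)
```

Catching requests.RequestException last gives a single fallback for any library error that the more specific clauses did not handle.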
