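`requests` is a third-party Python library for sending HTTP requests. It offers a simple, elegant API and supports all the common HTTP methods. The following is a complete guide to using `requests`, covering common scenarios and advanced features.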
1. Install requests
Install `requests` before using it:
```bash
pip install requests
```
2. Basic usage
2.1 GET requests
```python
import requests

response = requests.get("https://api.github.com")

# Check the status code
print(response.status_code)
# View the response body as text
print(response.text)
# Parse the JSON response
print(response.json())
```
2.2 POST requests
```python
import requests

# A dict passed as data= is sent form-encoded in the request body
data = {'key': 'value'}
response = requests.post("https://httpbin.org/post", data=data)
print(response.json())
```
2.3 Passing URL parameters
Use the `params` argument to pass a query string:
```python
import requests

params = {'q': 'python', 'sort': 'stars'}
response = requests.get("https://api.github.com/search/repositories", params=params)
print(response.url)  # Print the final URL
print(response.json())
```
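To see how `params` ends up in the URL without sending anything, the request can be prepared locally. This is a minimal sketch using `requests.Request(...).prepare()`; the helper name `build_search_url` is an assumption made for illustration and is not part of `requests`.

```python
import requests


def build_search_url(query, sort):
    """Return the URL requests would send, without touching the network."""
    # build_search_url is a hypothetical helper used only for this illustration
    request = requests.Request(
        "GET",
        "https://api.github.com/search/repositories",
        params={"q": query, "sort": sort},
    )
    # prepare() encodes the params dict into the query string
    return request.prepare().url


search_url = build_search_url("python", "stars")
print(search_url)
```

Preparing a request this way is handy for checking how a query string is encoded before any traffic is sent.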
3. Request headers and other parameters
3.1 Custom request headers
```python
import requests

headers = {'User-Agent': 'my-app/1.0'}
response = requests.get("https://api.github.com", headers=headers)
print(response.headers)  # Headers returned by the server
```
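3.2 Sending JSON data
```python
import requests

# json= serializes the dict and sets the Content-Type header to application/json
json_data = {'key': 'value'}
response = requests.post("https://httpbin.org/post", json=json_data)
print(response.json())
```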
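The difference between `data=` (section 2.2) and `json=` can be checked offline by preparing the same payload both ways and inspecting the result. This is a sketch for illustration only; the variable names are arbitrary.

```python
import json

import requests

payload = {"key": "value"}
url = "https://httpbin.org/post"

# Prepare (but do not send) the same payload with data= and with json=
form_request = requests.Request("POST", url, data=payload).prepare()
json_request = requests.Request("POST", url, json=payload).prepare()

# data= produces a form-encoded body
print(form_request.headers["Content-Type"], form_request.body)
# json= produces a JSON body
print(json_request.headers["Content-Type"], json.loads(json_request.body))
```

Use `data=` when the server expects a form submission and `json=` when it expects a JSON document.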
3.3 File upload
```python
import requests

# Open the file in a with block so it is closed after the upload
with open('example.txt', 'rb') as f:
    files = {'file': f}
    response = requests.post("https://httpbin.org/post", files=files)
print(response.json())
```
3.4 Cookies
```python
import requests

# Send cookies
cookies = {'session_id': '12345'}
response = requests.get("https://httpbin.org/cookies", cookies=cookies)
print(response.json())

# Read the cookies set by the server
print(response.cookies)
```
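When cookies need a domain or path, a `RequestsCookieJar` can be passed instead of a plain dict. The sketch below prepares a request offline to show which cookies would be sent; the cookie names and values are made up for illustration.

```python
import requests

jar = requests.cookies.RequestsCookieJar()
# Only cookies whose domain and path match the request URL are sent
jar.set("session_id", "12345", domain="httpbin.org", path="/cookies")
jar.set("other", "abc", domain="httpbin.org", path="/elsewhere")

prepared = requests.Request(
    "GET", "https://httpbin.org/cookies", cookies=jar
).prepare()

cookie_header = prepared.headers["Cookie"]
print(cookie_header)
```

Here only `session_id` matches the `/cookies` path, so the `other` cookie is left out of the `Cookie` header.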
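4. Timeouts and retries
4.1 Setting a timeout
```python
import requests

# The timeout is in seconds and may be a float.
# It limits how long to wait for the server, not the total download time.
response = requests.get("https://api.github.com", timeout=5)
```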
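4.2 Using retries
Combine a `Session` with the `Retry` class from urllib3:
```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
# Retry up to 5 times on the listed status codes, backing off between attempts
retries = Retry(total=5, backoff_factor=1, status_forcelist=[500, 502, 503, 504])
session.mount('https://', HTTPAdapter(max_retries=retries))

try:
    response = session.get("https://httpbin.org/status/500")
    print(response.status_code)
except requests.exceptions.RetryError as e:
    # Raised when retries run out and the server still returns a listed status
    print(f"Retries exhausted: {e}")
```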
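Because an adapter applies only to the URL prefix it is mounted on, a retry policy mounted on `https://` does not cover plain `http://` URLs. The sketch below wraps the setup in a helper that mounts both; `make_retrying_session` and its default values are assumptions for illustration, not a `requests` API.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(total=3, backoff_factor=0.5):
    """Build a Session whose http:// and https:// requests share one retry policy."""
    # make_retrying_session is a hypothetical helper used only for this illustration
    retry = Retry(
        total=total,
        backoff_factor=backoff_factor,
        status_forcelist=[500, 502, 503, 504],
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


session = make_retrying_session()
# get_adapter() returns the adapter (and so the retry policy) a URL would use
print(session.get_adapter("https://httpbin.org/status/500").max_retries.total)
```

Checking `get_adapter()` is a quick way to confirm the policy is attached without making a request.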
5. Authentication and security
5.1 Basic authentication
```python
import requests
from requests.auth import HTTPBasicAuth

response = requests.get(
    "https://httpbin.org/basic-auth/user/pass",
    auth=HTTPBasicAuth('user', 'pass'),
)
print(response.status_code)
```
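Basic authentication simply adds an `Authorization` header containing the base64-encoded `user:pass` pair. The sketch below prepares the request offline to show that header, using the tuple shorthand that `requests` accepts in place of `HTTPBasicAuth`.

```python
import base64

import requests

prepared = requests.Request(
    "GET",
    "https://httpbin.org/basic-auth/user/pass",
    auth=("user", "pass"),  # tuple shorthand for HTTPBasicAuth("user", "pass")
).prepare()

auth_header = prepared.headers["Authorization"]
# Rebuild the expected value by hand to show what the header contains
expected = "Basic " + base64.b64encode(b"user:pass").decode("ascii")
print(auth_header)
print(auth_header == expected)
```

Since base64 is an encoding and not encryption, basic authentication should only be used over HTTPS.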
5.2 Token authentication
```python
import requests

# Replace YOUR_TOKEN with a real token
headers = {'Authorization': 'Bearer YOUR_TOKEN'}
response = requests.get("https://api.example.com/protected", headers=headers)
```
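5.3 SSL certificate verification
By default, `requests` verifies SSL certificates. To skip verification, set `verify=False`.
```python
import requests

# Skips certificate verification; an InsecureRequestWarning is emitted
response = requests.get("https://expired.badssl.com", verify=False)
```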
6. Handling responses
6.1 Common attributes
```python
import requests

response = requests.get("https://api.github.com")
print(response.status_code)  # Status code
print(response.headers)      # Response headers
print(response.text)         # Body decoded as text
print(response.json())       # Body parsed as JSON
```
6.2 Checking the status code
```python
# `response` is the result of an earlier request
if response.status_code == requests.codes.ok:
    print("Request successful")
```
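Comparing against `requests.codes.ok` only accepts exactly 200. `Response.raise_for_status()` instead treats every 4xx and 5xx status as an error. The sketch below builds bare `Response` objects by hand so it runs offline; the `describe` helper is a hypothetical name, and real responses would come from an actual request.

```python
import requests


def describe(status_code):
    """Classify a status code the way raise_for_status() does."""
    # A hand-built Response is used purely for illustration
    response = requests.Response()
    response.status_code = status_code
    try:
        response.raise_for_status()
    except requests.HTTPError:
        return "error"
    return "ok"


results = {code: describe(code) for code in (200, 301, 404, 503)}
print(results)
```

The `response.ok` property applies the same rule and returns a boolean instead of raising.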
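6.3 Streaming responses
```python
import requests

# stream=True defers downloading the body until it is iterated
response = requests.get("https://httpbin.org/stream/20", stream=True)
for line in response.iter_lines():
    if line:  # Skip empty keep-alive lines
        print(line)
```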
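7. Advanced features
7.1 Session management
A `Session` persists certain parameters (such as cookies) across requests.
```python
import requests

session = requests.Session()
# Headers set on the session are sent with every request it makes
session.headers.update({'User-Agent': 'my-app/1.0'})
response = session.get("https://httpbin.org/get")
print(response.headers)
```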
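How session-level defaults combine with per-request values can be inspected offline with `Session.prepare_request()`. This is a sketch for illustration; the header and cookie values are arbitrary.

```python
import requests

session = requests.Session()
session.headers.update({"User-Agent": "my-app/1.0"})
session.cookies.set("session_id", "12345")

# Per-request headers are merged with the session-level defaults
request = requests.Request(
    "GET",
    "https://httpbin.org/get",
    headers={"Accept": "application/json"},
)
prepared = session.prepare_request(request)

print(prepared.headers["User-Agent"])
print(prepared.headers["Accept"])
print(prepared.headers["Cookie"])
```

The prepared request carries both the session's `User-Agent` and cookie and the request's own `Accept` header.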
7.2 Using proxies
```python
import requests

# Map each URL scheme to the proxy that should handle it
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
response = requests.get("https://httpbin.org/ip", proxies=proxies)
print(response.text)
```
7.3 Custom request methods
```python
import requests

# requests.request() accepts the HTTP method as a string
response = requests.request("PATCH", "https://httpbin.org/patch", data={'key': 'value'})
print(response.text)
```
8. Error handling
Catch exceptions raised during a request:
```python
import requests

try:
    response = requests.get("https://api.github.com", timeout=0.001)
    response.raise_for_status()  # Raises HTTPError for 4xx and 5xx status codes
except requests.Timeout:
    print("The request timed out")
except requests.RequestException as e:
    print(f"An error occurred: {e}")
```
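The same pattern can be wrapped in a helper that separates the main failure modes. This is a minimal sketch: `safe_get` and its return convention are assumptions for illustration, not part of `requests`. The demo call uses a URL without a scheme, which fails before any network traffic is sent.

```python
import requests


def safe_get(url, timeout=5):
    """Return (response, error_message); exactly one of the two is None."""
    # safe_get is a hypothetical helper used only for this illustration
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
    except requests.Timeout:
        return None, "timed out"
    except requests.ConnectionError:
        return None, "connection failed"
    except requests.HTTPError as e:
        return None, f"HTTP error: {e.response.status_code}"
    except requests.RequestException as e:
        # Catch-all for anything else requests raises, such as an invalid URL
        return None, f"request failed: {e}"
    return response, None


response, error = safe_get("not-a-valid-url")
print(error)
```

The more specific exceptions are listed first because they all derive from `requests.RequestException`, which would otherwise catch them.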
9. Advanced usage
9.1 Streaming a file download
```python
import requests

url = "https://example.com/largefile.zip"
# stream=True avoids loading the whole file into memory
with requests.get(url, stream=True) as response:
    with open("largefile.zip", "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
```
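A streamed download can also check the status code and compute a checksum as the chunks arrive. The sketch below is self-contained: it starts a throwaway local server in place of a real file host so it runs without internet access. The `download` helper, the `PayloadHandler` class, and the payload size are all assumptions made for illustration.

```python
import hashlib
import http.server
import os
import tempfile
import threading

import requests

PAYLOAD = os.urandom(100_000)  # Stand-in for a large file


class PayloadHandler(http.server.BaseHTTPRequestHandler):
    """Tiny local server, used only so the example runs offline."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, format, *args):
        pass  # Keep the example output quiet


def download(url, dest, chunk_size=8192):
    """Stream url into dest and return (bytes_written, sha256_hex)."""
    digest = hashlib.sha256()
    written = 0
    with requests.Session() as session:
        session.trust_env = False  # Ignore proxy environment variables for the local server
        with session.get(url, stream=True, timeout=10) as response:
            response.raise_for_status()  # Do not save an error page as the file
            with open(dest, "wb") as f:
                for chunk in response.iter_content(chunk_size=chunk_size):
                    f.write(chunk)
                    digest.update(chunk)
                    written += len(chunk)
    return written, digest.hexdigest()


server = http.server.HTTPServer(("127.0.0.1", 0), PayloadHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    dest = os.path.join(tempfile.mkdtemp(), "largefile.bin")
    url = f"http://127.0.0.1:{server.server_port}/largefile.bin"
    written, checksum = download(url, dest)
finally:
    server.shutdown()
    server.server_close()

print(written, checksum == hashlib.sha256(PAYLOAD).hexdigest())
```

For a real download, only the `download` function is needed; the local server exists purely to make the sketch runnable.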
9.2 Concurrent requests with threads
Combine `requests` with `concurrent.futures`:
```python
from concurrent.futures import ThreadPoolExecutor

import requests

urls = ["https://httpbin.org/get"] * 10


def fetch(url):
    response = requests.get(url)
    return response.status_code


# Run up to 5 requests at a time; map() returns results in input order
with ThreadPoolExecutor(max_workers=5) as executor:
    results = executor.map(fetch, urls)
    print(list(results))
```
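10. Summary
`requests` is a feature-rich HTTP client library that covers everything from simple GET requests to complex multithreaded downloads. By combining its parameters and methods, you can handle most web request scenarios with ease.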