The requests library
requests is a third-party HTTP library that is considerably more convenient to use than the standard-library urllib.
Installing requests
pip install requests
Sending a GET request
Call the requests.get method, for example:
# -*- coding:utf-8 -*-
import requests
url = 'https://www.baidu.com/s'
# Query parameters
params = {
    'wd': '中国'
}
# Request headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36'
}
# Send the GET request
response = requests.get(url, params=params, headers=headers)
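# Set the response encoding explicitly. For text responses whose Content-Type
# header carries no charset, requests falls back to ISO-8859-1, which cannot
# represent Chinese characters
response.encoding = 'utf-8'
# Response body as str (decoded using response.encoding)
print(response.text)
# Response body as raw bytes (not decoded)
print(response.content)
# Response status code
print(response.status_code)
# Final request URL, including the encoded query string
print(response.url)
with open('中国.html', 'w', encoding='utf-8') as f:
    f.write(response.text)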
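Two details of the GET example can be checked without touching the network. The sketch below assumes that a prepared request (requests.Request(...).prepare()) is an acceptable stand-in for what requests.get builds internally; it shows how params ends up in the URL, and why UTF-8 bytes decoded as ISO-8859-1 come out garbled. The variable names are illustrative only.

```python
import requests

# Build the request without sending it, so the final URL can be inspected.
prepared = requests.Request(
    'GET', 'https://www.baidu.com/s', params={'wd': '中国'}
).prepare()
print(prepared.url)  # the query value is percent-encoded as UTF-8

# response.content holds bytes; response.text is those bytes decoded with
# response.encoding. Decoding UTF-8 bytes with the wrong codec garbles them.
raw = '中国'.encode('utf-8')          # what a UTF-8 server would send
garbled = raw.decode('iso-8859-1')    # wrong codec: one character per byte
decoded = raw.decode('utf-8')         # correct codec
print(repr(garbled))
print(decoded)
```

This is why the example sets response.encoding before reading response.text: the attribute only controls how the bytes in response.content are decoded.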
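Sending a POST request
This works like GET, but uses requests.post. The difference is that GET passes its query parameters through params, while POST passes the form body through data, for example: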
# -*- coding:utf-8 -*-
import requests
url = 'https://www.lagou.com/jobs/positionAjax.json?needAddtionalResult=false&isSchoolJob=1'
# Form data sent in the request body
data = {
    'first': 'true',
    'pn': 1,
    'kd': '游戏'
}
# Request headers
headers = {
    'referer': 'https://www.lagou.com/jobs/list_%E6%B8%B8%E6%88%8F/p-city_0?&cl=false&fromSearch=true&labelWords=&suginput=&isSchoolJob=1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36',
    'cookie': 'user_trace_token=20200919203627-7b8cf5e1-78a4-4213-9e64-6ce460468c76; LGUID=20200919203629-4d75fd56-325b-452c-907c-7525c7327b12; _ga=GA1.2.527659171.1600518989; RECOMMEND_TIP=true; index_location_city=%E5%85%A8%E5%9B%BD; __lg_stoken__=1c74fbadb2f1da5c77effc9cd3ffe9ca452d00300ae34e94e6111a5216fa4f8c99f70eb99738a0ac3106d5415605ff9848408d413d28ac093e8ecec0fc7736175a42a9d1f8f3; JSESSIONID=ABAAAECABFAACEA9390993A5C8E8BE30C4B076EB2B9E10C; WEBTJ-ID=20210128%E4%B8%8B%E5%8D%883:32:14153214-17747e82f2d1ea-0361634da4227f-13e3563-1327104-17747e82f2e9e4; PRE_UTM=; PRE_HOST=; PRE_LAND=https%3A%2F%2Fwww.lagou.com%2F; LGSID=20210128153216-50a2d58b-f4ff-48dd-9b92-046eb640bfd2; PRE_SITE=https%3A%2F%2Fwww.lagou.com; TG-TRACK-CODE=index_search; LGRID=20210128155856-e7a7b6a2-3575-4644-9e84-1778457ebd81; X_MIDDLE_TOKEN=ca905f071f494dbd571c019c6b2bbda9; X_HTTP_TOKEN=09a6287fb7a05f5301802811615ab6431eb3d0a917; SEARCH_ID=a0fbf355d8e747fdbd1952e11c459f8d'
}
response = requests.post(url, data=data, headers=headers)  # Send the POST request
print(response.json())  # Parse the JSON response body into a dict or list
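The difference between params and data can be seen by preparing the same fields both ways without sending anything. This is a minimal sketch; the endpoint https://httpbin.org/post is a placeholder chosen for illustration, not the site used above, and the variable names are hypothetical.

```python
import requests

url = 'https://httpbin.org/post'  # placeholder endpoint (assumption)
fields = {'first': 'true', 'pn': 1, 'kd': '游戏'}

# params: the fields are appended to the URL as a query string, no body.
as_query = requests.Request('POST', url, params=fields).prepare()
# data: the URL is left alone and the fields are form-encoded into the body.
as_form = requests.Request('POST', url, data=fields).prepare()

print(as_query.url)
print(as_query.body)
print(as_form.url)
print(as_form.body)
print(as_form.headers['Content-Type'])
```

Both arguments can be combined in one call, which is what the example above does in effect: its URL already carries a query string, and the form fields travel in the body.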
Using a proxy
Pass a proxies argument to the request method, giving it a dict of proxy addresses, for example:
# -*- coding:utf-8 -*-
import requests
url = 'http://httpbin.org/ip'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36',
}
# Proxy, keyed by the URL scheme it applies to
proxy = {
    'http': '120.232.150.110:80'
}
# Send the request through the proxy
response = requests.get(url, headers=headers, proxies=proxy)
print(response.json())
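The keys of the proxies dict are matched against the scheme of the target URL, so a dict with only an 'http' entry does not cover https:// URLs. The sketch below uses select_proxy from requests.utils purely to inspect which entry would be chosen; no request is sent. Reusing the same address for the 'https' key is an assumption made for illustration, and that proxy may not actually support HTTPS traffic.

```python
from requests.utils import select_proxy

# Same proxy address as in the example, written with an explicit scheme.
http_only = {'http': 'http://120.232.150.110:80'}
# Hypothetical variant that also covers HTTPS targets (assumption).
both = {
    'http': 'http://120.232.150.110:80',
    'https': 'http://120.232.150.110:80',
}

# Which proxy would be picked for each target URL?
plain_choice = select_proxy('http://httpbin.org/ip', http_only)
secure_choice = select_proxy('https://httpbin.org/ip', http_only)
secure_choice_both = select_proxy('https://httpbin.org/ip', both)

print(plain_choice)
print(secure_choice)       # None: no 'https' entry, so no proxy is used
print(secure_choice_both)
```

If the response from http://httpbin.org/ip shows the proxy's address rather than the local one, the proxy was applied.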