Making HTTP Requests in Python with Requests

临久

Article link: https://blog.csdn.net/s741026400/article/details/113033336


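Installing Requests.
Requests is a third-party library and must be installed separately.
(1) Install with pip: pip install requests (if an older copy is already installed, this will not upgrade it; add --upgrade to get the latest release)
(2) Download the Requests source code directly from GitHub.
(3) PyCharm users can install it from within the IDE.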
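
After installing, it can be useful to confirm which version is actually present. The following is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is only for illustration and is not part of Requests.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed Requests version, or None if it is not installed.
print(installed_version("requests"))
```

If this prints None, the install step above did not reach the interpreter you are running.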
1. A complete request and response model
(1) GET request

import requests
r = requests.get('http://www.baidu.com')
print(r.content)

(2) POST request

import requests
postdata={'key':'value'}
# Use requests.post so that postdata is sent as the form-encoded request body
r = requests.post('http://www.baidu.com',data=postdata)
print(r.content)
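
To make the difference between the two request types concrete, here is a small offline sketch of where the parameters end up. It uses urllib.parse.urlencode from the standard library as a stand-in for the form encoding that Requests applies to a dict; it illustrates the wire format and is not Requests' own code.

```python
from urllib.parse import urlencode

payload = {'key': 'value'}

# Form encoding: the same representation serves a GET query string
# and the body of a form POST.
encoded = urlencode(payload)

# GET: parameters travel in the URL, as with requests.get(url, params=payload).
get_url = 'http://www.baidu.com/?' + encoded

# POST: parameters travel in the request body, as with requests.post(url, data=payload).
post_body = encoded.encode('ascii')

print(get_url)
print(post_body)
```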

2. Response and encoding

import requests
r = requests.get('http://www.baidu.com')
print(r.content)
print("+"*50)
print(r.text)
print("+"*50)
r.encoding='utf-8'  # setting the encoding by hand is tedious; the tip below improves on this
print(r.text)

r.content returns the body as bytes; r.text returns it as text; r.encoding holds the encoding guessed from the HTTP headers.
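
Why the encoding matters can be shown without any network access. The sketch below decodes the same bytes with two codecs. ISO-8859-1 is used as the "wrong" codec because Requests falls back to it for text responses whose headers carry no charset; the sample string is an arbitrary example.

```python
# Bytes as they might arrive on the wire (the kind of value r.content holds).
raw = '百度一下'.encode('utf-8')

# Decoding with the correct codec recovers the text (what r.text should be).
good = raw.decode('utf-8')

# ISO-8859-1 maps every byte to a character, so decoding never fails,
# but the result is unreadable for UTF-8 input.
wrong = raw.decode('iso-8859-1')

print(good)
print(wrong == good)
```

Setting r.encoding before reading r.text selects which of these decodings you get.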

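Tip: detecting and setting the encoding yourself
First install the chardet library (pip install chardet).
Then call chardet.detect(), which returns a dictionary: confidence is the detector's confidence in its guess, and encoding is the detected encoding.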

import requests
import chardet
r=requests.get('http://www.baidu.com')
print(chardet.detect(r.content))
r.encoding=chardet.detect(r.content)['encoding']
print(r.encoding)
print(r.text)
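
chardet.detect() can return None as the encoding (for example on empty input) or a guess with low confidence. One possible guard is sketched below; the helper pick_encoding and its 0.5 threshold are assumptions made for illustration, not part of chardet or Requests.

```python
def pick_encoding(detection, default='utf-8', min_confidence=0.5):
    """Choose an encoding from a chardet-style result dict.

    Falls back to `default` when no encoding was detected or when the
    confidence is below `min_confidence` (a hypothetical threshold).
    """
    encoding = detection.get('encoding')
    confidence = detection.get('confidence') or 0.0
    if not encoding or confidence < min_confidence:
        return default
    return encoding


# Hand-written dicts in the shape that chardet.detect() returns.
print(pick_encoding({'encoding': 'GB2312', 'confidence': 0.99}))
print(pick_encoding({'encoding': None, 'confidence': 0.0}))
print(pick_encoding({'encoding': 'Windows-1252', 'confidence': 0.3}, default='gb18030'))
```

In the example above it would be used as r.encoding = pick_encoding(chardet.detect(r.content)).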

3. Handling request headers

import requests
import chardet
user_agent='Mozilla/4.0(compatible;MSIE 5.5;Windows NT)'
# The header name is written with a hyphen: User-Agent
headers={'User-Agent':user_agent}
r = requests.get('http://www.baidu.com',headers=headers)
print(r.content)
r.encoding=chardet.detect(r.content)['encoding']
print(r.text)
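
To check which headers a request would carry without sending it, the request can be prepared offline. This sketch uses requests.Request and its prepare() method. Note that HTTP header names use hyphens (User-Agent), and that lookups on the prepared headers are case-insensitive.

```python
import requests

headers = {'User-Agent': 'Mozilla/4.0(compatible;MSIE 5.5;Windows NT)'}

# Build the request without sending it, so the headers can be inspected.
prepared = requests.Request('GET', 'http://www.baidu.com', headers=headers).prepare()

# Header lookup ignores case.
print(prepared.headers['user-agent'])

# A name spelled with an underscore is a different, unrelated header.
print('User_Agent' in prepared.headers)
```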

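4. Handling the response status code and response headers
The status code is available through the status_code field of the response, and the response headers through its headers field.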

import requests
r =requests.get('http://www.baidu.com')
if r.status_code==requests.codes.ok:
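    print(r.status_code)  # status code
    print(r.headers)  # response headers
    print(r.headers.get('content-type'))  # returns None if the header is missing
    print(r.headers['content-type'])  # raises KeyError if the header is missing, so not recommended
else:
    r.raise_for_status()  # raises an exception for 4XX or 5XX status codes; returns None for 200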
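
The rule in the last comment (4XX and 5XX raise, everything else passes) can be illustrated offline. The sketch below uses http.HTTPStatus from the standard library as a stand-in for readable status names; is_error_status is a hypothetical helper that mirrors the range check and is not a Requests function.

```python
from http import HTTPStatus


def is_error_status(code):
    """True for 4XX and 5XX codes, the range that raise_for_status() rejects."""
    return 400 <= code < 600


for code in (200, 301, 404, 503):
    label = 'error' if is_error_status(code) else 'ok'
    print(code, HTTPStatus(code).phrase, label)
```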

5. Cookie handling
(1) If the response contains cookies, their values can be read as follows

import requests
user_agent='Mozilla/4.0(compatible;MSIE 5.5;Windows NT)'
headers={'User-Agent':user_agent}
r = requests.get('http://www.zhihu.com',headers=headers)
# Iterate over all cookie fields
for cookie in r.cookies.keys():
    print(cookie+':'+r.cookies.get(cookie))

(2) Sending custom cookie values

import requests
user_agent='Mozilla/4.0(compatible;MSIE 5.5;Windows NT)'
headers={'User-Agent':user_agent}
cookies = dict(name='xxx',age='xx')
r = requests.get('http://www.baidu.com',headers=headers,cookies=cookies)
print(r.text)
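
For reference, the cookies dict is sent to the server as a single Cookie request header. The sketch below builds that header by hand and parses it back with http.cookies.SimpleCookie from the standard library. It shows the wire format only and is not how Requests implements it internally.

```python
from http.cookies import SimpleCookie

cookies = dict(name='xxx', age='xx')

# A Cookie request header is a '; '-separated list of name=value pairs.
cookie_header = '; '.join(f'{key}={value}' for key, value in cookies.items())
print(cookie_header)

# Parse the header back into names and values, as a server would.
parsed = SimpleCookie()
parsed.load(cookie_header)
print({key: morsel.value for key, morsel in parsed.items()})
```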

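(3) A more advanced approach handles cookies automatically: requests provides a Session object, so that across consecutive page visits and login redirects you do not need to manage the cookie details yourself.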

import requests
loginurl='http://www.zhihu.com/login'
s=requests.Session()
r=s.get(loginurl,allow_redirects=True)
datas={'username':'xxxxx','password':'xxxxxx'}
r=s.post(loginurl,data=datas,allow_redirects=True)
print(r.text)
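
What a session carries from one request to the next can be inspected offline with Session.prepare_request. In this sketch the cookie name token, its value, and the User-Agent string are made-up stand-ins for whatever a real login would set.

```python
import requests

s = requests.Session()
s.headers.update({'User-Agent': 'demo-client/0.1'})  # hypothetical client name
s.cookies.set('token', 'abc123')                     # stand-in for a login cookie

# Prepare (but do not send) a later request made through the same session.
prepared = s.prepare_request(requests.Request('GET', 'http://www.zhihu.com/'))

# Session-level headers and cookies are merged into the request automatically.
print(prepared.headers['User-Agent'])
print(prepared.headers['Cookie'])
```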

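6. Redirects and history
Redirect handling is controlled by the allow_redirects parameter.
Setting allow_redirects to True allows redirects and False disables them. The r.history field shows the redirect history.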

import requests
r=requests.get('http://github.com')
print(r.url)
print(r.status_code)
print(r.history)

The output is as follows

https://github.com/
200
[<Response [301]>]
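
A small helper can turn r.history plus the final response into a readable chain of hops. The sketch below is one way to do it; redirect_chain is a hypothetical helper, and the SimpleNamespace objects are offline stand-ins that only mimic the status_code, url, and history attributes of a real response.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return [(status_code, url), ...] for every hop, final response last."""
    hops = list(response.history) + [response]
    return [(hop.status_code, hop.url) for hop in hops]


# Offline stand-ins with the same attribute names as a requests response.
first = SimpleNamespace(status_code=301, url='http://github.com/', history=[])
final = SimpleNamespace(status_code=200, url='https://github.com/', history=[first])

for status, url in redirect_chain(final):
    print(status, url)
```

With a real response, redirect_chain(r) can be called directly on the value returned by requests.get.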

7. Timeout settings
The timeout is set through the timeout parameter, given in seconds

import requests
requests.get('http://github.com',timeout=2)
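
When the timeout expires, Requests raises an exception rather than returning a response, so the call is usually wrapped in try/except. The sketch below demonstrates this against a local socket that accepts connections but never replies. The probe helper and the local listener are assumptions made for the demo, and trust_env is switched off only so that proxy environment variables do not interfere.

```python
import socket

import requests


def probe(url, timeout):
    """Return 'ok', 'timeout', or 'error' for a single GET request."""
    session = requests.Session()
    session.trust_env = False  # ignore proxy environment variables for this demo
    try:
        session.get(url, timeout=timeout)
        return 'ok'
    except requests.exceptions.Timeout:
        return 'timeout'
    except requests.exceptions.RequestException:
        return 'error'


# A local listener that completes the TCP handshake but never replies,
# so the request stalls while waiting for the response.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
port = listener.getsockname()[1]

# timeout=(connect, read): up to 1 s to connect, 0.5 s to wait for data.
result = probe(f'http://127.0.0.1:{port}/', timeout=(1, 0.5))
listener.close()
print(result)
```

A single number such as timeout=2 applies the same limit to both the connect and the read phase.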

8. Proxy settings
Using a proxy

import requests
# Keys are URL schemes; each scheme maps to the proxy used for it
proxies={
    "http":"http://10.10.1.10:3128",
    "https":"http://10.10.1.10:1080"
}
requests.get("http://example.org",proxies=proxies)
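
The proxies mapping is keyed by URL scheme, so each scheme should appear once. The sketch below builds such a mapping with a hypothetical helper, build_proxies, and also shows why a repeated key in a dict literal is a problem: Python silently keeps only the last value.

```python
def build_proxies(http_proxy, https_proxy=None):
    """Map each URL scheme to a proxy; reuse the HTTP proxy for HTTPS if none is given."""
    return {'http': http_proxy, 'https': https_proxy or http_proxy}


proxies = build_proxies('http://10.10.1.10:3128', 'http://10.10.1.10:1080')
print(proxies)

# A dict literal keeps only the last value for a repeated key.
repeated = {'http': 'http://10.10.1.10:3128', 'http': 'http://10.10.1.10:1080'}
print(repeated)
```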
