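When working with requests you inevitably have to send requests to all sorts of URLs. Today I ran into one like this:
http://127.0.0.1/?test=%25test
Out of habit, I called requests.get and put the query parameter into params: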
>>> import requests
>>> url = 'http://127.0.0.1:8080'  # port 8080 is where nc is listening
>>> params = {
...     'test': '%25test'
... }
>>> r = requests.get(url, params=params)
# Receive the request with nc: while true; do echo -e "HTTP/1.1 200 OK\n\n ❤" | nc -l -p 8080 -q 1; done
GET /?test=%2525test HTTP/1.1
Host: 127.0.0.1:8080
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
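As you can see, the request line printed by nc is /?test=%2525test rather than /?test=%25test.
After some searching, I found that the percent sign is a special character in URLs: it introduces a percent-encoded sequence, %XX, where XX is the hexadecimal ASCII code of the character being escaped. See the Wikipedia article on percent-encoding for details.
Python's requests library escapes any percent sign found in params, so the % in %25test is itself encoded as %25, which produces %2525test.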
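The double encoding can be reproduced without a server. This is only a minimal sketch: it assumes that requests builds the query string from a dict much like the standard library's urllib.parse.urlencode does, which matches the nc output above but is an assumption about its internals.

```python
from urllib.parse import quote, urlencode

# '%' is reserved as the escape character, so a literal percent sign
# has to be written as %25 (0x25 is the ASCII code of '%').
encoded_percent = quote('%')

# The value already contains an escape sequence; encoding it again
# escapes the '%' of '%25' a second time.
double_encoded = urlencode({'test': '%25test'})

print(encoded_percent)   # %25
print(double_encoded)    # test=%2525test
```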
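So the fix is to keep the % from being encoded a second time: decode the value with unquote first, then let requests encode it once.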
>>> import requests
>>> from urllib.parse import unquote
>>> url = 'http://127.0.0.1:8080'  # port 8080 is where nc is listening
>>> params = {
...     'test': '%25test'
... }
>>> params['test'] = unquote(params['test'])  # '%25test' -> '%test'
>>> r = requests.get(url, params=params)
# Receive the request with nc: while true; do echo -e "HTTP/1.1 200 OK\n\n ❤" | nc -l -p 8080 -q 1; done
GET /?test=%25test HTTP/1.1
Host: 127.0.0.1:8080
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
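When several parameters arrive already encoded, the same unquote step can be applied to all of them at once. A rough sketch follows; decode_params is a hypothetical helper name of my own, not part of requests, and urlencode only stands in for the single encoding pass that happens when the request is built.

```python
from urllib.parse import unquote, urlencode

def decode_params(params):
    """Percent-decode every value once (hypothetical helper, not a requests API)."""
    return {key: unquote(value) for key, value in params.items()}

already_encoded = {'test': '%25test', 'q': 'a%20b'}
decoded = decode_params(already_encoded)

# Stand-in for the one encoding pass applied to the decoded values.
query = urlencode(decoded)

print(decoded)   # {'test': '%test', 'q': 'a b'}
print(query)     # test=%25test&q=a+b
```

Note that urlencode writes the space back as + rather than %20; both are accepted forms inside a query string.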
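Note: in a well-formed URL the percent sign only ever appears as the escape character, so whenever you see one, just replace each '%xx' with the character it decodes to 🤦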
http://127.0.0.1/?test=%25test
>>> url = 'http://127.0.0.1'
>>> params = {
...     'test': '%test'
... }
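Rather than decoding by hand, the query string of an existing URL can be split into decoded parameters with the standard library. A small sketch under that assumption; raw_url, base_url and params are just illustrative names.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

raw_url = 'http://127.0.0.1/?test=%25test'

parts = urlsplit(raw_url)
# URL without the query string, to be passed as the url argument.
base_url = f'{parts.scheme}://{parts.netloc}{parts.path}'
# parse_qsl percent-decodes names and values; dict() would collapse
# repeated keys, which is fine for this single-parameter URL.
params = dict(parse_qsl(parts.query))

print(base_url)            # http://127.0.0.1/
print(params)              # {'test': '%test'}
print(urlencode(params))   # test=%25test
```

The resulting base_url and params are in the form that requests.get(url, params=params) expects, so the value gets encoded exactly once.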
ASCII table
+----+-----+----+-----+----+-----+----+-----+
| Hx | Chr | Hx | Chr | Hx | Chr | Hx | Chr |
+----+-----+----+-----+----+-----+----+-----+
| 00 | NUL | 20 | SPC | 40 | @ | 60 | ` |
| 01 | SOH | 21 | ! | 41 | A | 61 | a |
| 02 | STX | 22 | " | 42 | B | 62 | b |
| 03 | ETX | 23 | # | 43 | C | 63 | c |
| 04 | EOT | 24 | $ | 44 | D | 64 | d |
| 05 | ENQ | 25 | % | 45 | E | 65 | e |
| 06 | ACK | 26 | & | 46 | F | 66 | f |
| 07 | BEL | 27 | ' | 47 | G | 67 | g |
| 08 | BS | 28 | ( | 48 | H | 68 | h |
| 09 | TAB | 29 | ) | 49 | I | 69 | i |
| 0A | LF | 2A | * | 4A | J | 6A | j |
| 0B | VT | 2B | + | 4B | K | 6B | k |
| 0C | FF | 2C | , | 4C | L | 6C | l |
| 0D | CR | 2D | - | 4D | M | 6D | m |
| 0E | SO | 2E | . | 4E | N | 6E | n |
| 0F | SI | 2F | / | 4F | O | 6F | o |
| 10 | DLE | 30 | 0 | 50 | P | 70 | p |
| 11 | DC1 | 31 | 1 | 51 | Q | 71 | q |
| 12 | DC2 | 32 | 2 | 52 | R | 72 | r |
| 13 | DC3 | 33 | 3 | 53 | S | 73 | s |
| 14 | DC4 | 34 | 4 | 54 | T | 74 | t |
| 15 | NAK | 35 | 5 | 55 | U | 75 | u |
| 16 | SYN | 36 | 6 | 56 | V | 76 | v |
| 17 | ETB | 37 | 7 | 57 | W | 77 | w |
| 18 | CAN | 38 | 8 | 58 | X | 78 | x |
| 19 | EM | 39 | 9 | 59 | Y | 79 | y |
| 1A | SUB | 3A | : | 5A | Z | 7A | z |
| 1B | ESC | 3B | ; | 5B | [ | 7B | { |
| 1C | FS | 3C | < | 5C | \ | 7C | | |
| 1D | GS | 3D | = | 5D | ] | 7D | } |
| 1E | RS | 3E | > | 5E | ^ | 7E | ~ |
| 1F | US | 3F | ? | 5F | _ | 7F | DEL |
+----+-----+----+-----+----+-----+----+-----+
References:
https://stackoverflow.com/a/6182386/7151777
https://stackoverflow.com/a/23471183/7151777