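While using the requests library to log in to a website and fetch personal information, I ran into quite a few problems, so I am writing them down here.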
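- About request parameters (data)
Check the request headers (Request Headers). If they contain
content-type: application/json...
then, for both POST and GET requests, the data is passed as a Python dictionary (with requests, through the json= argument, which serializes the dictionary to JSON).
If the content type is application/x-www-form-urlencoded, the data has to be converted to the a=xxx&b=xxx form.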
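The difference between the two body formats can be seen with the standard library alone. This is a minimal sketch; the field names in payload are made up for illustration. In requests, the first form corresponds to passing the dictionary with json=, and the second to sending a form-encoded body with data=.

```python
import json
from urllib.parse import urlencode

# Hypothetical parameters, only for illustration
payload = {"username": "alice", "page": 2}

# Body for content-type: application/json
json_body = json.dumps(payload)

# Body for content-type: application/x-www-form-urlencoded
form_body = urlencode(payload)

print(json_body)
print(form_body)
```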
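- About parameters that contain %2B, %2F, etc. and are sent as application/x-www-form-urlencoded
To send parameters as application/x-www-form-urlencoded, use urllib.parse.urlencode() from the urllib library to convert the dictionary to the urlencoded form. Note that urlencode() percent-encodes the values in the dictionary again.
So if a parameter contains sequences such as %2B or %2F, it has already been encoded and must be restored to its original form before it is passed to urlencode(). If there are only a few such sequences, they can be replaced by hand using the table below before calling urlencode().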
No. | Special character (encoded) | Original character |
1 | %2B | + |
2 | %20 | space |
3 | %2F | / |
4 | %3F | ? |
5 | %25 | % |
6 | %23 | # |
7 | %26 | & |
8 | %3D | = |
If there are many of them, decode the whole value in one step with urllib.parse.unquote() first, and then convert the parameters to the urlencoded form with urllib.parse.urlencode().
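The double-encoding problem described above can be reproduced with a short value. This is only a sketch: captured is a made-up string standing in for a value copied from the browser's developer tools, and token is a hypothetical field name.

```python
from urllib.parse import unquote, urlencode

# Hypothetical value that is already percent-encoded
captured = "a%2Bb%2Fc%3D"

# Encoding it directly escapes every "%" again as "%25"
wrong = urlencode({"token": captured})

# Decoding first restores "a+b/c=", which urlencode() then encodes exactly once
right = urlencode({"token": unquote(captured)})

print(wrong)  # token=a%252Bb%252Fc%253D
print(right)  # token=a%2Bb%2Fc%3D
```

The server decodes the body only once, so in the first case it would receive the literal text a%2Bb%2Fc%3D instead of a+b/c=.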
For example:
# The original __VIEWSTATE value is already encoded, and parse.urlencode would encode it again, so it has to be decoded first
viewitem = '%2FwEPDwUKLTM0ODE2Mjg4Ng8WKB4HU3RyU3FsMgUFIDE9MSAeBHR5cGUFAnliHgdTdHJTcWwxBQUgMT0xIB4LU3RyU2VhblNxbDNlHgpPcmRlckJ5U3FsBQdJRCBERVNDHgtTdHJTZWFuU3FsMmUeClN0clNlYW5TcWxlHg9Qb3N0TWFzdGVyVGFibGUFDFhUX1pCR0xfWkJDUR4JU3RyU3FsU2VsBQUgMT0xIB4OU29ydEV4cHJlc3Npb24FBjEgZGVzYx4HU3RyU3FsMwUFIDE9MSAeC0N1cnJlbnRHdWlkBQZOb0d1aWQeCElzRXhwb3J0aB4JVGFibGVUb1NwBTtYVF9aQkdMX1pCQ1FAU0VMRUNUICogRlJPTSBWWFRfWEhQRCBXSEVSRSBJU05VTEwoVFlQRVMsMCk9MR4GU3RyU3FsBTAgYW5kICggMT0xICkgYW5kICggMT0xICkgYW5kICggMT0xICkgYW5kICggMT0xICkeC1N0clNlYW5TcWwxZR4VRGF0YUdyaWRTZWxlY3RlZEluZGV4Av%2F%2F%2F%2F8PHg9Db2RlTWFzdGVyVGFibGUFDFhUX1pCR0xfWkJDUR4FUGtleXMymAcAAQAAAP%2F%2F%2F%2F8BAAAAAAAAAAwCAAAASVN5c3RlbSwgVmVyc2lvbj0yLjAuMC4wLCBDdWx0dXJlPW5ldXRyYWwsIFB1YmxpY0tleVRva2VuPWI3N2E1YzU2MTkzNGUwODkFAQAAADJTeXN0ZW0uQ29sbGVjdGlvbnMuU3BlY2lhbGl6ZWQuTmFtZVZhbHVlQ29sbGVjdGlvbgcAAAAIUmVhZE9ubHkMSGFzaFByb3ZpZGVyCENvbXBhcmVyBUNvdW50BEtleXMGVmFsdWVzB1ZlcnNpb24AAwMABgUAATJTeXN0ZW0uQ29sbGVjdGlvbnMuQ2FzZUluc2Vuc2l0aXZlSGFzaENvZGVQcm92aWRlcipTeXN0ZW0uQ29sbGVjdGlvbnMuQ2FzZUluc2Vuc2l0aXZlQ29tcGFyZXIICAIAAAAACQMAAAAJBAAAAAEAAAAJBQAAAAkGAAAAAgAAAAQDAAAAMlN5c3RlbS5Db2xsZWN0aW9ucy5DYXNlSW5zZW5zaXRpdmVIYXNoQ29kZVByb3ZpZGVyAQAAAAZtX3RleHQDHVN5c3RlbS5HbG9iYWxpemF0aW9uLlRleHRJbmZvCQcAAAAEBAAAACpTeXN0ZW0uQ29sbGVjdGlvbnMuQ2FzZUluc2Vuc2l0aXZlQ29tcGFyZXIBAAAADW1fY29tcGFyZUluZm8DIFN5c3RlbS5HbG9iYWxpemF0aW9uLkNvbXBhcmVJbmZvCQgAAAARBQAAAAEAAAAGCQAAAAJJRBAGAAAAAQAAAAkKAAAABAcAAAAdU3lzdGVtLkdsb2JhbGl6YXRpb24uVGV4dEluZm8GAAAAD21fbGlzdFNlcGFyYXRvcgxtX2lzUmVhZE9ubHkRY3VzdG9tQ3VsdHVyZU5hbWULbV9uRGF0YUl0ZW0RbV91c2VVc2VyT3ZlcnJpZGUNbV93aW4zMkxhbmdJRAEAAQAAAAEIAQgKAQrKAAAAAH8AAAAECAAAACBTeXN0ZW0uR2xvYmFsaXphdGlvbi5Db21wYXJlSW5mbwMAAAAJd2luMzJMQ0lEB2N1bHR1cmUGbV9uYW1lAAABCAh%2FAAAAfwAAAAYLAAAAAAQKAAAAHFN5c3RlbS5Db2xsZWN0aW9ucy5BcnJheUxpc3QDAAAABl9pdGVtcwVfc2l6ZQhfdmVyc2lvbgUAAAgICQwAAAABAAAAAQAAABAMAAAAAQAAAAkLAAAACx4JSXNXZWJPZGJjaBYCZg9kFhYCAQ8WAh4Dc3JjBS4vZ3gvdGhlbWUvQmx1ZUNoYXJtL1RyZWVWaWV3SW1hZ2VzL2pvdXJuYWwuZ2lmZAIDDw8WAh4EVGV4dAUJ5Lia5Yqh
6KGoZGQCBw8WAh4HVmlzaWJsZWhkAggPFgQeCGRpc2FibGVkZB4Hb25jbGljawVnZm5PcGVuTW9kYWwoJ2xvY2FsaXplci5hc3B4P01hc3RlclRhYmxlPVhUX1pCR0xfWkJDUSRVc2VUeXBlPUFkZCcsIDgwMCwgNjAwLCAnc2Nyb2xsOjE7Jyk7cmV0dXJuIGZhbHNlO2QCCg8WAh8XBQhkaXNhYmxlZGQCCw8WAh8XBQhkaXNhYmxlZGQCDA8WAh8XBQhkaXNhYmxlZGQCEw8QDxYGHg1EYXRhVGV4dEZpZWxkBQljb25kaXRpb24eDkRhdGFWYWx1ZUZpZWxkBQljb25kaXRpb24eC18hRGF0YUJvdW5kZ2QQFQES6YCJ5oup6aKE6K6%2B5p2h5Lu2FQEAFCsDAWcWAWZkAhQPDxYCHxUFDOWFqOmDqOaVsOaNrmRkAhgPDxYeHg1TaG93UGFnZUluZGV4Zx4IUGFnZVNpemUCDx4NU2hvd0ZpcnN0TGFzdGceCkFsd2F5c1Nob3dnHgxTaG93UHJldk5leHRnHgxTaG93SW5wdXRCb3goKWpXdXFpLldlYmRpeWVyLlNob3dJbnB1dEJveCwgQXNwTmV0UGFnZXIsIFZlcnNpb249NC4zLjEuMCwgQ3VsdHVyZT1uZXV0cmFsLCBQdWJsaWNLZXlUb2tlbj1mYjBhMGZlMDU1ZDQwZmQ0BEF1dG8eDUlucHV0Qm94U3R5bGUFH1RFWFQtQUxJR046IGNlbnRlcjtoZWlnaHQ6MTZweDseElRleHRCZWZvcmVJbnB1dEJveAUH6L2s5YiwIB4OQ3VzdG9tSW5mb1RleHQFUjx0YWJsZT48dHI%2BPHRkPjxub2JyPlvlhbEgPGZvbnQgY29sb3I9cmVkPjM1MTwvZm9udD4g5p2hXTwvbm9icj48L3RkPjwvdHI%2BPC90YWJsZT4eEEN1cnJlbnRQYWdlSW5kZXhmHgtSZWNvcmRjb3VudALfAh4RVGV4dEFmdGVySW5wdXRCb3gFA%2BmhtR4RU3VibWl0QnV0dG9uU3R5bGUFCGRpc3BsYXk6HhBTaG93Qm94VGhyZXNob2xkAgEeE1Nob3dEaXNhYmxlZEJ1dHRvbnNnZGQCGQ8PFgIfFQUCMTUWBB4Fc3R5bGUFJElNRS1NT0RFOmRpc2FibGVkO1RFWFQtQUxJR046Y2VudGVyOx4Kb25rZXlwcmVzcwUKaW50X2tleSgpO2RkGh3qatR90Lxgs5ldHGt%2BPfW8D58%3D'
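from urllib import parse

# Undo the existing percent-encoding
__VIEWSTATE = parse.unquote(viewitem)

# Form fields of the paging request
data = {
    '__func': '',
    '__detailguid': '',
    '__EVENTTARGET': 'AspNetPager1',
    '__EVENTARGUMENT': 2,
    '__LASTFOCUS': '',
    '__VIEWSTATE': __VIEWSTATE,
}

# Encode the dictionary into the a=xxx&b=xxx form
data = parse.urlencode(data)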
In the code above, data is the converted payload. It can be used in a requests.post or requests.get call whose content-type is "application/x-www-form-urlencoded".
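One way to check the encoded string before sending it is to decode it back and compare it with the original fields. The sketch below assumes a shortened, made-up __VIEWSTATE value in place of the real one, and uses urllib.parse.parse_qsl for the round trip.

```python
from urllib.parse import parse_qsl, urlencode

# Shortened, hypothetical stand-in for the decoded form fields
fields = {
    "__EVENTTARGET": "AspNetPager1",
    "__EVENTARGUMENT": 2,
    "__LASTFOCUS": "",
    "__VIEWSTATE": "/wEP+abc=",
}

body = urlencode(fields)

# keep_blank_values=True keeps empty fields such as __LASTFOCUS
decoded = dict(parse_qsl(body, keep_blank_values=True))

print(body)
print(decoded == {key: str(value) for key, value in fields.items()})
```

If the comparison prints True, the body decodes to the same values the server is expected to see. When a pre-encoded string like this is passed to requests through data=, the content-type header most likely has to be set explicitly, since requests only adds it automatically for dictionary data.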