Today I studied the advanced features of requests.
Keywords:
file upload
cookies
sessions
SSL
Files are uploaded mainly through the files parameter of post, which carries the file to the target server.
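As a rough sketch of what the files parameter produces, the request below is built but never sent, so the multipart body can be inspected offline. The file name, contents, and content type are made-up sample values, not taken from the exercises.

```python
import requests

# Sample upload: (file name, file contents, content type). All three are illustrative.
files = {"file": ("hello.txt", b"hello world", "text/plain")}

# Build the request without sending it, so nothing touches the network.
req = requests.Request("POST", "http://httpbin.org/post", files=files)
prepared = req.prepare()

# requests encodes the upload as multipart/form-data and chooses the boundary itself.
content_type = prepared.headers["Content-Type"]
print(content_type)

# The body carries the file name and the raw file contents.
print(b'filename="hello.txt"' in prepared.body)
print(b"hello world" in prepared.body)
```

Passing an open file object, as in Exercise 1, goes through the same encoding; the tuple form only makes the file name and content type explicit.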
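Handling cookies in requests is much more convenient than in urllib: declare the cookies directly in the request headers, pass them through the headers parameter of get/post, and the request goes out carrying the cookies.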
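Besides writing the Cookie header by hand, requests also accepts a cookies parameter. The sketch below compares the two approaches offline by preparing the requests without sending them; the cookie names and values are hypothetical.

```python
import requests

URL = "http://httpbin.org/cookies"

# Hypothetical cookie values, used only for illustration.
cookies = {"sessionid": "abc123", "theme": "dark"}

# Option 1: pass a dict through the cookies parameter.
via_param = requests.Request("GET", URL, cookies=cookies).prepare()

# Option 2: write the Cookie header by hand and pass it through headers.
headers = {"Cookie": "sessionid=abc123; theme=dark"}
via_header = requests.Request("GET", URL, headers=headers).prepare()

# Both prepared requests end up with a Cookie header.
print(via_param.headers["Cookie"])
print(via_header.headers["Cookie"])
```

The dict form avoids assembling the header string manually, which helps when the values come from code rather than from a browser copy and paste.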
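To simulate further actions on a site after logging in, use requests.Session to open a session, which is similar to opening another tab in the same browser. Calling get/post repeatedly without a Session is like launching a brand-new browser each time, so the steps that should follow a successful login cannot be carried out.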
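A minimal offline sketch of why the session matters: a cookie stored in the session's jar (here set by hand, standing in for what a login response would set) is attached to every request prepared through that session, while a standalone request carries nothing. The cookie name and value are hypothetical.

```python
import requests

URL = "http://httpbin.org/cookies"

session = requests.Session()

# Stand-in for a login response: store a cookie in the session's cookie jar.
session.cookies.set("number", "123456789")

# A request prepared through the session picks the cookie up automatically.
with_session = session.prepare_request(requests.Request("GET", URL))

# A request built outside the session has no cookie attached.
without_session = requests.Request("GET", URL).prepare()

print(with_session.headers.get("Cookie"))
print(without_session.headers.get("Cookie"))
```

Nothing is sent here; prepare_request only shows what the session would put on the wire.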
For HTTPS sites, requests verifies the SSL certificate; the verify parameter controls this check (for example, verify=False skips it).
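The values verify accepts can be sketched as follows. fetch_status is a hypothetical helper written for these notes, not part of requests, and the timeout value is an arbitrary choice. Nothing is fetched until the helper is called.

```python
import requests

# verify defaults to True on a fresh session, so certificates are checked
# unless the caller opts out.
session = requests.Session()
default_verify = session.verify
print(default_verify)


def fetch_status(url, verify=True, timeout=5):
    """Hypothetical helper: return the status code, or None if the certificate check fails.

    verify may be True (use the default CA bundle), False (skip the check),
    or a string path to a CA bundle file.
    """
    try:
        response = requests.get(url, verify=verify, timeout=timeout)
    except requests.exceptions.SSLError as exc:
        # Raised when the server certificate cannot be validated.
        print("certificate verification failed:", exc)
        return None
    return response.status_code
```

Calling fetch_status('https://www.12306.cn', verify=False) would correspond to Exercise 5. Skipping verification is best kept to testing, since it removes the protection HTTPS is meant to provide.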
Exercise 1: Uploading a file with requests
# import requests
# file = {
#     'file': open('favicon.ico', 'rb')
# }
# r = requests.post('http://httpbin.org/post', files=file)
# print(r.text)
Exercise 2: Getting cookies with requests
# import requests
# r = requests.get('https://www.baidu.com')
# print(r.cookies)
# for key, value in r.cookies.items():
#     print(key, '=', value)
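Exercise 3: Using Zhihu as an example, maintaining a logged-in state directly with a Cookie
Judging from the response returned after running this, adding the cookies did not keep me logged in. This may be an issue on Zhihu's side; the unmodified example from the book does not work either.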
# import requests
# headers = {
# 'Cookie':'q_c1=3bf00974176f472ca72c824534cc7431|1499865451000|1496407303000; q_c1=3bf00974176f472ca72c824534cc7431|1499865451000|1496407303000; aliyungf_tc=AQAAAEoieUByaw4Akezd3VaWaXwAHvr9; d_c0="AGCC8l5TJgyPToQQUyVdBsxAuLf4tupet5g=|1501510422"; _zap=7e0d2e73-734c-4097-96fb-4de3e8bf63cb; r_cap_id="ZmIxMWIxOWNmNzZmNDMzYmE1NjU3YjE3NzA1YjBlOTU=|1501679120|9c62f3168346cab1cd571d5fc59e35e5923b62b3"; cap_id="MzNiZTgxOTkyNDhkNGQzMGFjNDY4NjliMDAxMTE5NDk=|1501679120|65cc239f4b465d8bffa0b0b38fca613cad82fa2b"; __utma=51854390.1926991723.1501510421.1501510421.1501679121.2; __utmb=51854390.0.10.1501679121; __utmc=51854390; __utmz=51854390.1501679121.2.2.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); __utmv=51854390.000--|3=entry_date=20170602=1; z_c0=Mi4wQUFEQU5xUERqUW9BWUlMeVhsTW1EQmNBQUFCaEFsVk5KMXVwV1FCZWduY09wei14UkxwOXg5SmhWejdIVHg5d1h3|1501679143|bb651c30a76390fb5e15c4628c8cf56343187b7e; _xsrf=2b9e998a-66a9-467b-9aa1-bdf2c2b181b7',
# 'Host':'www.zhihu.com',
# 'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36',
# 'Referer':'https://www.google.co.jp/'
# }
# r = requests.get('http://www.zhihu.com',headers=headers)
# print(r.text)
# print(r.cookies)
Exercise 4: requests.Session
# import requests
# s = requests.Session()
# s.get('http://httpbin.org/cookies/set/number/123456789')
# r = s.get('http://httpbin.org/cookies')
# print(r.text)
Exercise 5: SSL certificate verification
# import requests
# r = requests.get('https://www.12306.cn',verify=False)
# print(r.status_code)
Exercise 6: Suppressing warnings
# import requests
# import urllib3  # requests.packages.urllib3 is only a compatibility alias for this module
# urllib3.disable_warnings()
# r = requests.get('https://www.12306.cn',verify=False)
# print(r.status_code)
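Called with no argument, urllib3.disable_warnings silences every urllib3 warning. A narrower variant, sketched below, passes only the InsecureRequestWarning category. To stay offline, the warnings are raised by hand instead of by a real unverified request, and the message texts are made up.

```python
import warnings

import urllib3
from urllib3.exceptions import InsecureRequestWarning

# catch_warnings restores the global warning filters on exit,
# so the suppression below stays local to this block.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    # Silence only the warning that verify=False triggers.
    urllib3.disable_warnings(InsecureRequestWarning)

    # Stand-ins for warnings a real program might emit.
    warnings.warn("unverified HTTPS request", InsecureRequestWarning)
    warnings.warn("something unrelated", UserWarning)

# Only the unrelated warning was recorded.
print([w.category.__name__ for w in caught])
```

Other warnings still surface this way, which makes it harder to hide an unrelated problem by accident.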