When you request a web page, the response usually writes cookies into the session through Set-Cookie headers, for example:
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Length: 41
Access-Control-Allow-Methods: GET,PATCH,PUT,POST,DELETE,OPTIONS
Content-Encoding: gzip
Vary: Accept-Encoding
ETag: "fa4cf03c0ac47ca1c52ed2df2b71dfda86db6655"
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin:
Access-Control-Allow-Headers:
X-Backend-Server: zhihuaccount.account-api-web.a52568c8---10.64.91.2:31023[10.64.91.2:31023]
Server: ZWS
Expires: Mon, 05 Nov 2018 05:56:33 GMT
Cache-Control: max-age=0, no-cache, no-store
Pragma: no-cache
Date: Mon, 05 Nov 2018 05:56:33 GMT
Connection: keep-alive
Set-Cookie: tgw_l7_route=1c2b7f9548c57cd7d5a535ac4812e20e; Expires=Mon, 05-Nov-2018 06:11:33 GMT; Path=/
Set-Cookie: capsion_ticket="2|1:0|10:1541397393|14:capsion_ticket|44:ZWU2ZDA5YzNlM2I5NGQ3Y2E4ZjBlOTEwNWUyMDU2ZTI=|68d278e65ecb127845d2caab375312237cf9b9e211fb209bd9771f80248b3c20"; Domain=zhihu.com; expires=Wed, 05 Dec 2018 05:56:33 GMT; httponly; Path=/
Set-Cookie: _xsrf=KklHXAIziThtvjSTe7DSHbECOyr7IIR2; path=/; domain=zhihu.com; expires=Fri, 23-Apr-21 05:56:33 GMT
This response sets three cookies: capsion_ticket, tgw_l7_route, and _xsrf. If you look closely, you will notice that _xsrf=KklHXAIziThtvjSTe7DSHbECOyr7IIR2 has no double quotes around its value, whereas capsion_ticket="2|1:0|10:1541397393|14:capsion_ticket|44:ZWU2ZDA5YzNlM2I5NGQ3Y2E4ZjBlOTEwNWUyMDU2ZTI=|68d278e65ecb127845d2caab375312237cf9b9e211fb209bd9771f80248b3c20" is wrapped in double quotes.
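In Python 2, session.cookies.get_dict() returns all cookies as key-value pairs. Inspecting the result shows 'capsion_ticket': '"2|1:0|10:1541397393|14:capsion_ticket|44:ZWU2ZDA5YzNlM2I5NGQ3Y2E4ZjBlOTEwNWUyMDU2ZTI=|68d278e65ecb127845d2caab375312237cf9b9e211fb209bd9771f80248b3c20"'
The value is quoted twice: the double quotes from the header are stored as part of the string itself. This leads to many problems later on. For example, a simulated login flow that is otherwise correct still reports that the login failed. I ran into this while logging in to Zhihu.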
The fix: set the cookie manually, with the quotes removed.
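from requests.cookies import RequestsCookieJar

cookies = session.cookies.get_dict()  # snapshot taken before the quoted cookie is removed
session.cookies.set('capsion_ticket', None)  # setting the value to None deletes the cookie
c = RequestsCookieJar()
c.set('capsion_ticket', cookies['capsion_ticket'].replace('"', ''))  # strip the double quotes
session.cookies.update(c)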
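
If more than one cookie turns out to be quoted, the same idea can be applied to the whole jar. The following is only a minimal sketch, not something from the original workflow: strip_cookie_quotes and make_cookie are hypothetical helper names, the cookie value is shortened for readability, and the demo uses the standard library's http.cookiejar (Python 3; the module is called cookielib in Python 2) so that it runs without a network request. Since RequestsCookieJar subclasses CookieJar, passing session.cookies to the function should work the same way.

```python
from http.cookiejar import Cookie, CookieJar


def strip_cookie_quotes(jar: CookieJar) -> int:
    """Remove one pair of surrounding double quotes from each cookie value, in place.

    Returns the number of cookies that were changed.
    """
    changed = 0
    for cookie in jar:
        value = cookie.value
        if value and len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            cookie.value = value[1:-1]
            changed += 1
    return changed


def make_cookie(name: str, value: str, domain: str = "zhihu.com") -> Cookie:
    """Hypothetical helper: build a Cookie object for the demo below."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


# Demo jar: one quoted value (shortened sample) and one unquoted value.
jar = CookieJar()
jar.set_cookie(make_cookie("capsion_ticket", '"2|1:0|10:1541397393"'))
jar.set_cookie(make_cookie("_xsrf", "KklHXAIziThtvjSTe7DSHbECOyr7IIR2"))

changed = strip_cookie_quotes(jar)
values = {c.name: c.value for c in jar}
print(changed, values)
```

In the login script this would presumably be called as strip_cookie_quotes(session.cookies) right after the response that sets the cookies. Because it edits the existing cookie objects, their domain, path, and expiry stay untouched.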