First, open the GitHub login page: https://github.com/login
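Enter your username and password, open the browser's developer tools, and tick Preserve log on the Network tab so that entries survive the page navigation. Click Sign in, then inspect the request named session and note its request URL, Form Data, and Headers. At this point every parameter needed to simulate the login is known, except the cookies and authenticity_token, which cannot be read off directly.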
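To make the Form Data concrete, here is a minimal sketch of how those fields end up form-encoded in the POST body. The field names are the ones used in the script later in this post; the helper name `build_form_body`, the sample credentials, and the token value are placeholders made up for illustration.

```python
from urllib.parse import urlencode


def build_form_body(login, password, token):
    """Return the form-encoded body for the fields seen under Form Data."""
    fields = {
        "commit": "Sign in",
        "utf8": "✓",
        "authenticity_token": token,  # placeholder here; obtained from the login page
        "login": login,
        "password": password,
    }
    # urlencode percent-encodes non-ASCII characters as UTF-8 and turns spaces into '+'
    return urlencode(fields)


# Hypothetical values, only to show what the encoded body looks like
body = build_form_body("user@example.com", "secret", "EXAMPLE-TOKEN")
print(body)
```

Passing a dict to the `data` argument of `requests` performs this encoding automatically, so the helper is only for inspecting what gets sent.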
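The browser sends these values to the server as soon as we click Sign in on the login page, so they must already be set when the login page loads. Searching the login page source for authenticity_token does indeed turn up its value in a hidden input field. As for the cookies, the Response Headers contain a Set-Cookie field, which is where the server sets them. The complete script is given below.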
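Both lookups can be tried offline first. The sketch below pulls the token out of an HTML snippet and parses a Set-Cookie value, using only the standard library (`html.parser` and `http.cookies`) rather than the lxml and requests used in the full script. The HTML snippet, the token value, and the cookie name are invented for illustration and are not what GitHub actually returns.

```python
from html.parser import HTMLParser
from http.cookies import SimpleCookie


class TokenParser(HTMLParser):
    """Remembers the value of the first <input name="authenticity_token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if (tag == "input"
                and attrs.get("name") == "authenticity_token"
                and self.token is None):
            self.token = attrs.get("value")


def extract_token(html):
    """Return the token found in the page source, or None if absent."""
    parser = TokenParser()
    parser.feed(html)
    return parser.token


# Made-up stand-in for the login page source; the real page is much larger
SAMPLE_HTML = """
<form action="/session" method="post">
  <input type="hidden" name="authenticity_token" value="EXAMPLE-TOKEN-123">
  <input type="text" name="login">
  <input type="password" name="password">
</form>
"""

# Made-up Set-Cookie header value, for illustration only
SAMPLE_SET_COOKIE = "example_session=abc123; Path=/; Secure; HttpOnly"

token = extract_token(SAMPLE_HTML)

cookie = SimpleCookie()
cookie.load(SAMPLE_SET_COOKIE)

print(token)
print(cookie["example_session"].value)
```

In practice there is no need to parse Set-Cookie by hand: a `requests.Session` stores the cookies from each response and sends them back on later requests.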
import sys

import requests
from lxml import etree


class Login(object):
    def __init__(self):
        # Request headers
        self.headers = {
            'Referer': 'https://github.com/',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 Safari/537.36',
            'Host': 'github.com'
        }
        self.login_url = 'https://github.com/login'
        self.post_url = 'https://github.com/session'
        self.logined_url = 'https://github.com/settings/profile'
        # The Session object keeps the session alive and handles cookies
        self.session = requests.Session()

    # Get the authenticity_token parameter from the login page source
    def token(self):
        response = self.session.get(self.login_url, headers=self.headers)
        selector = etree.HTML(response.text)
        # xpath() returns a list of matches; use the first one as the token
        token = selector.xpath('//input[@name="authenticity_token"]/@value')
        return token[0] if token else ''

    # Log in
    def login(self, email, password):
        # Fields taken from the Form Data of the session request
        post_data = {
            'commit': 'Sign in',
            'utf8': '✓',
            'authenticity_token': self.token(),
            'login': email,
            'password': password
        }
        response = self.session.post(self.post_url, data=post_data, headers=self.headers)
        if response.status_code == 200:
            print(response.text)
        else:
            print(response)


def main(args):
    login = Login()
    # Replace these with your own email and password
    login.login(email='email', password='password')
    return 0


if __name__ == '__main__':
    sys.exit(main(sys.argv))
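The script defines logined_url but never requests it, and a 200 status on the POST alone does not prove the login worked. One possible check, sketched below, is to request a page that requires authentication and see where the session ends up. The helper `looks_logged_in` is hypothetical, and it rests on the assumption that an unauthenticated request to the settings page is redirected to a URL containing /login; verify that behaviour before relying on it.

```python
from types import SimpleNamespace


def looks_logged_in(response):
    """Heuristic check on a response for a page that requires login.

    Assumption: when the session is not authenticated, the site redirects
    to a URL containing '/login', so the final URL reveals the outcome.
    """
    return response.status_code == 200 and "/login" not in response.url


# Intended use with the Login class from the script (not executed here):
#   response = login.session.get(login.logined_url, headers=login.headers)
#   print(looks_logged_in(response))

# Stand-in objects exposing the same two attributes as requests.Response
stayed = SimpleNamespace(status_code=200, url="https://github.com/settings/profile")
bounced = SimpleNamespace(status_code=200, url="https://github.com/login")

print(looks_logged_in(stayed))
print(looks_logged_in(bounced))
```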