Logging in to GitHub automatically (web crawler)

1. Send a GET request to the GitHub login page to get the csrf_token
import requests
from bs4 import BeautifulSoup

# request the login page to get the CSRF token
resp = requests.get('https://github.com/login')
login_bs = BeautifulSoup(resp.text, 'html.parser')
token = login_bs.find(name='input', attrs={'name': 'authenticity_token'}).get('value')
get_cookies_dict = resp.cookies.get_dict()  # get cookies
print(token)
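
One thing to watch in the step above: find() returns None when the page has no authenticity_token field, so the chained .get('value') would fail with an AttributeError. The sketch below shows one way to make that failure explicit. It is only an illustration: it uses the standard library's html.parser in place of BeautifulSoup so that it runs offline without extra packages, and TokenParser, extract_token and sample_html are hypothetical names that are not part of the original post.

```python
from html.parser import HTMLParser


class TokenParser(HTMLParser):
    """Collects the value of the first <input name="authenticity_token">."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        attr_map = dict(attrs)
        if (tag == 'input'
                and attr_map.get('name') == 'authenticity_token'
                and self.token is None):
            self.token = attr_map.get('value')


def extract_token(html):
    """Return the CSRF token found in html, or raise ValueError if absent."""
    parser = TokenParser()
    parser.feed(html)
    if parser.token is None:
        raise ValueError('authenticity_token input not found')
    return parser.token


# Hypothetical, hand-written stand-in for the real login page
sample_html = '''
<form action="/session" method="post">
  <input type="hidden" name="authenticity_token" value="abc123">
  <input type="text" name="login">
</form>
'''
print(extract_token(sample_html))  # prints abc123
```

With the real page, the same check could be applied to resp.text before the login request is sent.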


2. Send a POST request to log in to GitHub and get the cookies

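First collect the parameters that the login request has to carry. They are the form fields passed as data in the code below.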

import requests
from bs4 import BeautifulSoup

# request the login page to get the CSRF token
resp = requests.get('https://github.com/login')
login_bs = BeautifulSoup(resp.text, 'html.parser')
token = login_bs.find(name='input', attrs={'name': 'authenticity_token'}).get('value')
get_cookies_dict = resp.cookies.get_dict()  # get cookies
# log in to GitHub with the cookies and form parameters; remember to save the returned cookies
resp2 = requests.post(
    'https://github.com/session',
    data={
        'utf8': '✓',
        'authenticity_token': token,
        'login': 'username',  # your username
        'password': 'password',  # your password
        'webauthn-support': 'unknown',
        'commit': 'Sign in',
    },
    cookies=get_cookies_dict
)
post_cookie_dict = resp2.cookies.get_dict()
# print(get_cookies_dict)
# print(post_cookie_dict)
cookies_dict = {}
cookies_dict.update(get_cookies_dict)
cookies_dict.update(post_cookie_dict)
print(cookies_dict)
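
The order of the two update() calls matters: when the same cookie name appears in both responses, the value from the login response replaces the one from the login page. The following offline sketch illustrates that behaviour. The cookie names and values are made up for the example and merely stand in for what resp.cookies.get_dict() and resp2.cookies.get_dict() might return.

```python
# Hypothetical cookies, standing in for resp.cookies.get_dict()
get_cookies_dict = {'_gh_sess': 'before-login', '_octo': 'GH1.1.x'}
# Hypothetical cookies, standing in for resp2.cookies.get_dict()
post_cookie_dict = {'_gh_sess': 'after-login', 'logged_in': 'yes'}

cookies_dict = {}
cookies_dict.update(get_cookies_dict)   # start from the GET cookies
cookies_dict.update(post_cookie_dict)   # POST cookies override duplicates

print(cookies_dict)
# prints {'_gh_sess': 'after-login', '_octo': 'GH1.1.x', 'logged_in': 'yes'}
```

As an alternative to merging by hand, a requests.Session object keeps cookies across requests on its own, which may be simpler if more than two requests are involved.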


3. Request a page with the cookies and get its content
import requests
from bs4 import BeautifulSoup

# request the login page to get the CSRF token
resp = requests.get('https://github.com/login')
login_bs = BeautifulSoup(resp.text, 'html.parser')
token = login_bs.find(name='input', attrs={'name': 'authenticity_token'}).get('value')
get_cookies_dict = resp.cookies.get_dict()  # get cookies
# log in to GitHub with the cookies and form parameters; remember to save the returned cookies
resp2 = requests.post(
    'https://github.com/session',
    data={
        'utf8': '✓',
        'authenticity_token': token,
        'login': 'username',  # your username
        'password': 'password',  # your password
        'webauthn-support': 'unknown',
        'commit': 'Sign in',
    },
    cookies=get_cookies_dict
)
post_cookie_dict = resp2.cookies.get_dict()
# print(get_cookies_dict)
# print(post_cookie_dict)
cookies_dict = {}
cookies_dict.update(get_cookies_dict)
cookies_dict.update(post_cookie_dict)
# request a page that is only available after login
request_url = 'https://github.com/settings/profile'
resp3 = requests.get(url=request_url, cookies=cookies_dict)
print(resp3.text)
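
Printing resp3.text alone does not show whether the login actually worked. requests follows redirects for GET requests by default and exposes the final address as resp3.url, so one rough check is to see whether the request ended up back on the login page. The sketch below assumes that GitHub redirects unauthenticated requests for the settings page to /login; that is an assumption and is not stated in the original post. redirected_to_login is a hypothetical helper name.

```python
from urllib.parse import urlparse


def redirected_to_login(final_url):
    """Return True if final_url points at the /login path."""
    path = urlparse(final_url).path
    return path.rstrip('/') == '/login'


# Hypothetical values of resp3.url after redirects were followed
print(redirected_to_login('https://github.com/settings/profile'))  # prints False
print(redirected_to_login(
    'https://github.com/login?return_to=%2Fsettings%2Fprofile'))   # prints True
```

In the script above this could be called as redirected_to_login(resp3.url) before the page content is used.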

Reposted from: https://www.cnblogs.com/qikeyishu/p/10972251.html
