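I wanted to publish my articles on CSDN, but I also run my own blog, and publishing every article twice, once on my own site and once on CSDN, is too tedious. So I thought of simulating the login: log in to CSDN programmatically, then push the article content to CSDN.
1. The first step is to capture the login URL and the data sent with the login request.
CSDN's login URL is https://passport.csdn.net/account/verify.
The login fields can be seen in the POST data: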
username:weixin_42740498
password:your password
lt:LT-52961-eMo02Bz4tlj3wQzDMSDA2dsqE
execution:e15s2
_eventId:submit
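Besides the account name and password, the request carries three more parameters: lt, execution, and _eventId. As in my earlier post on simulating a CSDN login with Python, these are hidden fields submitted with the login form, and they can be read directly from the login page's HTML:
<!-- This parameter can be thought of as a serial number issued to each user who needs to log in. Only with a valid serial number issued by webflow is the user considered to have entered the webflow process. Without one, webflow assumes the user has not entered the flow yet and restarts it, so the login page is shown again. -->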
<input type="hidden" name="lt" value="LT-2435240-Nu7f3yuoN1eMZvcAmDsRmPgq17xIMd" />
<input type="hidden" name="execution" value="e2s1" />
<input type="hidden" name="fkid" id="fkid" value="" />
<input type="hidden" name="_eventId" value="submit" />
<input class="logging" accesskey="l" value="登 录" tabindex="6" type="button" />
<!-- data-kuick='{"act":"csdn05","desc":"直接登录"}' -->
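The hidden fields above can be pulled out of the page with a few lines of Python. The following is only a minimal sketch: it uses the standard library's html.parser so that it runs without extra packages, and the names HiddenInputParser and extract_hidden_fields are made up for this illustration. The sample string is the form snippet shown above.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input ... /> also end up here.
        if tag != "input":
            return
        attr = dict(attrs)
        if attr.get("type") == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of all hidden input fields found in the HTML text."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


# Sample taken from the login form shown above.
sample = '''
<input type="hidden" name="lt" value="LT-2435240-Nu7f3yuoN1eMZvcAmDsRmPgq17xIMd" />
<input type="hidden" name="execution" value="e2s1" />
<input type="hidden" name="fkid" id="fkid" value="" />
<input type="hidden" name="_eventId" value="submit" />
<input class="logging" accesskey="l" value="登 录" tabindex="6" type="button" />
'''

fields = extract_hidden_fields(sample)
print(fields["lt"], fields["execution"])
```

The login button is skipped because its type is not hidden. Since lt changes with every page load, the values have to be extracted from a freshly fetched login page each time.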
In my tests, sending only these parameters still failed to log in. Looking at the cookies sent during login, I found that several of them never change:
"UserName":"weixin_42740498",
"UserInfo":"FK6HcljRDYdAyejwq9BQ558Z4N5YHLP4UXz3RVLRtNUVa5irrsuv558Z4N5YHLP4UXz3RVLRtNU5uH2hewDCAS6%2BaDo1eyiQx%2FWlsARwnvVC8rg5SyXoPql%2BZ12KxAo%2FKqmjShy6U1b9DQQXzKE8HKZOj5Cay0%2Bew%3D%3D",
"UserNick":"%E5%AD%A4%7%A1%E7%e%E5%B7%A1%E7%A4%BC",
"UserToken":"FK6HcljRDYdAyejwq9BQQ%2B1Pql%2BZ12KxAo%2FKqmjSv558Z4N5YHLP4UXz3RVLRtNU5uH2hewDCAS6%2BaDo1eyiQx%2FWlsARwnvVC8rg5SyXoPql%2BZ12KxAo%2FKqmjShy6U1bubjIKZ4i5P9pepVRmPmIKdCZhxA6KuC%2Fyuola8bAelGWgqOOCmrfEW7X%2BFxAPRi5"
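Save the account details and the fixed cookies in user.ini, one item per line (username, password, cookies), so that the login script can read them later: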
weixin_42740498
your password
{"UserName":"weixin_42740498","UserInfo":"FK6HcljRDYdAy*********B1U31%2FsQAfXVa5irrsuv558Z4N5YHLP4UXz3RVLRtNU5uH2hewDCAS6%2BaDo1eyiQx%2FWlsARwnvVC8rg5SyXoPql%2BZ12KxAo%2FKqmjShy6U1b9DQQXzKE8HKZOj5Cay0%2Bew%3D%3D","UserNick":"%E5%******de%E5%B7%A1%E7%A4%BC","UserToken":"FK6HcljRDYdA*****FsQAfXVa5irrsuv558Z4N5YHLP4UXz3RVLRtNU5uH2hewDCAS6%2BaDo1eyiQx%2FWlsARwnvVC8rg5SyXoPql%2BZ12KxAo%2FKqmjShy6U1bubjIKZ4i5P9pepVRmPmIKdCZhxA6KuC%2Fyuola8bAelGWgqOOCmrfEW7X%2BFxAPRi5"}
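Because the third line of user.ini is a valid JSON object, it can be parsed with json.loads. The sketch below shows one way to read the three-line layout from a string; parse_user_ini is a hypothetical helper name, and the cookie values in the sample are placeholders rather than real cookies.

```python
import json


def parse_user_ini(text):
    """Split user.ini content into (username, password, cookies dict)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 3:
        raise ValueError("user.ini needs three lines: username, password, cookies")
    username, password, cookie_line = lines[0], lines[1], lines[2]
    # The cookie line is plain JSON, so no code from the file is executed.
    cookies = json.loads(cookie_line)
    return username, password, cookies


# Placeholder content in the same layout as user.ini above.
sample = (
    "weixin_42740498\n"
    "your password\n"
    '{"UserName":"weixin_42740498","UserInfo":"placeholder","UserToken":"placeholder"}\n'
)

name, password, cookies = parse_user_ini(sample)
print(name, sorted(cookies))
```

Note that the file keeps the password and login cookies in plain text, so it should stay out of version control.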
The source code for the simulated login is below. To access other pages afterwards, simply make the requests with the same session.
import json
import os
import sys

import requests
from bs4 import BeautifulSoup

login_url = 'https://passport.csdn.net/account/verify'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36',
    'Accept-Encoding': 'gzip, deflate',
    'Connection': 'keep-alive',
    'Accept': '*/*',
}


def get_setting():
    """Read the username, password and fixed cookies from user.ini."""
    try:
        with open(os.path.join(os.path.abspath('.'), 'user.ini'), 'r') as f:
            u_name = f.readline().strip()
            u_password = f.readline().strip()
            # The third line is a JSON object; json.loads avoids eval() on file content.
            cookies = json.loads(f.readline().strip())
    except IOError:
        print("user.ini not found, please check!")
        sys.exit(1)
    return u_name, u_password, cookies


def get_lt(session):
    """Request the login page and extract the hidden lt and execution fields."""
    # Use the shared session so that cookies set by the login page are
    # sent again with the login request.
    response = session.post(login_url, headers=headers)
    soup = BeautifulSoup(response.text, "html.parser")
    lt = execution = None
    for field in soup.form.find_all("input"):
        if field.get("name") == "lt":
            lt = field.get("value")
        if field.get("name") == "execution":
            execution = field.get("value")
    return lt, execution


def get_session():
    """Log in and keep the logged-in session in the global `session`."""
    global session
    session = requests.Session()
    lt, execution = get_lt(session)
    u_name, u_pwd, cookies = get_setting()
    data = {
        "username": u_name,
        "password": u_pwd,
        "lt": lt,
        "execution": execution,
        "_eventId": "submit",
    }
    # Add the fixed cookies from user.ini to the session.
    for k, v in cookies.items():
        session.cookies.set(k, v)
    session.post(login_url, headers=headers, data=data)
    # To check the result, request a page that requires login, e.g.:
    # html = session.post('https://my.csdn.net', headers=headers).text
    print('ok')
    return session
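If the login page changes and lt or execution can no longer be found, a small guard can make the script fail with a clear message before anything is posted. This is only a sketch under that assumption; build_login_data is a hypothetical helper that is not part of the script above, and it assembles the same five form fields.

```python
def build_login_data(u_name, u_pwd, lt, execution):
    """Assemble the login form fields, failing early if a hidden field is missing."""
    missing = [name for name, value in (("lt", lt), ("execution", execution)) if not value]
    if missing:
        raise ValueError("login page did not provide: " + ", ".join(missing))
    return {
        "username": u_name,
        "password": u_pwd,
        "lt": lt,
        "execution": execution,
        "_eventId": "submit",
    }


# Example with the values captured earlier in this post.
data = build_login_data(
    "weixin_42740498", "your password",
    "LT-52961-eMo02Bz4tlj3wQzDMSDA2dsqE", "e15s2",
)
print(data["_eventId"])
```

The returned dict could then be passed as the data argument of session.post(login_url, ...).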