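This time I'm bringing you a crawler outsourcing job that paid 2500. Some people will say it isn't worth that much, and some will sneer. But remember: people see things differently, so they naturally make different decisions.
If paying 2500 could bring you 25,000 in profit, would you pay without hesitating? Enough talk, take a look at the chat log; this job was done quite a while ago. (I'll go through the details later. For now, on to the tutorial.)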
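Analysis (x0)
Go to the target site: https://www.laifeng.com/
Pick any category and enter a live room.
When we click the danmaku (chat) input box, it asks us to log in. Fair enough, so let's log in.
Once logged in, capture the request that sends a danmaku:
Awkward: I didn't know what to send, so I just said hello. To my surprise the streamer replied "hello, hello to you too"… and then said she would go offline after one more song, 《飘向北方》. I don't know whether I can still send danmaku after she goes offline.
Never mind. I'll have a smoke and carry on writing once she has finished the song.
Emmm, roomId is the room number of the live room, and content is the message I sent.
roomId can be seen in the URL. I didn't capture it in the earlier screenshot, but you'll spot it if you look.
t is a timestamp, and the sign signature is generated by JavaScript. The other values don't change; send two danmaku and compare the captured requests to confirm this.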
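As a rough illustration of the data parameter described above, here is a minimal sketch that builds the JSON string from a room ID and a message. The helper name build_data is hypothetical; the compact formatting is chosen only to match the hard-coded string in the article's own Python code.

```python
import json


def build_data(room_id: str, content: str) -> str:
    """Build the JSON string carried in the POST form field named 'data'."""
    # Compact separators and ensure_ascii=False reproduce the style of the
    # hard-coded string in the article's code: no spaces, raw non-ASCII text.
    return json.dumps(
        {"roomId": room_id, "content": content},
        separators=(",", ":"),
        ensure_ascii=False,
    )


payload = build_data("711329", "hello")
print(payload)
```

Whatever string is produced here has to be used byte for byte both in the signature input and in the POST body, since sign is computed over it.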
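Analysis (x1)
Some will ask how I know t is a timestamp… Does that really need explaining? Fine, let's trace where t comes from. For a short parameter name like this, don't search for t directly; you'll get a flood of matches. I suggest searching for the neighbouring parameter sign instead: since the submitted form carries all these parameters, the JS file will generally contain the matching names close together.
t:i means that i is assigned to t, and
i = (new Date).getTime()
If you've never done front-end work this may look cryptic. It is simply JavaScript's way of getting the current time, in milliseconds since the Unix epoch.
Let's see what it does in the 鬼鬼 JS debugging tool:
You can see the value keeps changing, just as time keeps passing. If you don't know what a timestamp is, look it up.
You can also get it in the browser console:
Since the parameter is generated by this bit of JavaScript, how do we produce it in Python?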
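One way to do it, as a minimal sketch: time.time() returns seconds as a float, so multiplying by 1000 and truncating gives the same millisecond value that (new Date).getTime() produces. The function name current_millis is just an illustrative choice.

```python
import time


def current_millis() -> str:
    """Milliseconds since the Unix epoch, as a string ready for the URL."""
    # time.time() is in seconds; JavaScript's getTime() is in milliseconds.
    return str(int(time.time() * 1000))


t = current_millis()
print(t)
```

The value is kept as a string because it is later concatenated into both the signature input and the request URL.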
OK, that takes care of the first generated parameter.
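Analysis (x2)
Next comes the main course: working out how the sign signature is produced:
Well, there are a lot of positional parameters and it isn't obvious where they come from, but we can see it uses i, which is our timestamp.
Let's debug it:
Set a breakpoint and send any danmaku from the browser. g turns out to be the appKey, a constant. c is a dict (a JS object), and what we need is the value under its data key:
Looks familiar? It's exactly the data value from our POST. That leaves only d.
d is a dict too, and what we need is the value under its token key:
From experience, there's no need to hunt for d; it should simply be our cookies. Search for it and you're done…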
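To make the cookie lookup concrete, here is a hedged sketch. The article does not say which cookie holds the token; this assumes it is the part of the _m_h5_tk cookie before the underscore (that cookie appears in the commented-out cookie strings in the code listing). Both helper names are hypothetical, so check the assumption against what the debugger shows for d.token.

```python
def parse_cookie_header(cookie_header: str) -> dict:
    """Split a raw 'k1=v1; k2=v2' Cookie header string into a dict."""
    cookies = {}
    for part in cookie_header.split(";"):
        if "=" in part:
            name, value = part.strip().split("=", 1)
            cookies[name] = value
    return cookies


def extract_token(cookie_header: str, cookie_name: str = "_m_h5_tk") -> str:
    """Return the assumed token, or '' if the cookie is absent."""
    # Assumption: the cookie value looks like '<token>_<timestamp>' and the
    # token is everything before the first underscore.
    value = parse_cookie_header(cookie_header).get(cookie_name, "")
    return value.split("_", 1)[0]


example = "uk=1; _m_h5_tk=abc123_1621744352184; isg=xyz"
print(extract_token(example))
```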
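With that, all the parameters have been analysed, so let's run a test:
It reports a missing object (对象, which also means "partner"; emmmmm, I'm thirty and still don't have one…). Clearly what's missing here is the function object h.
So let's go and find it.
Click the opening brace; the matching closing brace at the end of the function gets underlined as well. Then copy out that JavaScript and test again:
OK, at this point we're 50% done. Why only 50% when the core part is finished? Because the project consists of three programs: one that hooks into an SMS code-receiving platform to register accounts automatically, one that collects room IDs and keeps only those whose streamer is online, and this one, the key sending program.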
Code
Python code:
import requests
import execjs
import time
# Load the home page with cookies (replace the placeholder below with your own cookie string)
ck = '123456'
headers = {
'authority': 'www.laifeng.com',
'method': 'GET',
'path': '/?',
'scheme': 'https',
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7',
'cache-control': 'no-cache',
'cookie': ck,
'pragma': 'no-cache',
'sec-ch-ua-mobile': '?0',
'sec-fetch-dest': 'document',
'sec-fetch-mode': 'navigate',
'sec-fetch-site': 'none',
'sec-fetch-user': '?1',
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3870.400 QQBrowser/10.8.4405.400',
}
url = 'https://www.laifeng.com/'
r = requests.get(url, headers=headers)
print(r.text)
# Enter the live room
url = f'https://v.laifeng.com/711329'
headers = {
'authority': 'v.laifeng.com',
'method': 'GET',
'path': '/711329',
'scheme': 'https',
'accept': 'application/json, text/javascript, */*; q=0.01',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7',
'cache-control': 'no-cache',
# 'cookie': 'mk=453ed14d2f6e4871ba8f09cfefcba1a3; cna=XacYGCzzLgkCAW411IXRpIVk; P_ck_ctl=5633DEE6E3B29B5783C835500682978B; xlly_s=1; P_ck_ctr=870A7039ED41E1AF8A4E5AF9A7D413C5; premium_cps=0_0%7C76%7C85232%7C0___; cmk=80ff16e81ec844f7946cfb91262dc197; P_pck_rm=z99ATxqw33e2f40cb442c0ZBnFgC5IrE%2BWMeN4%2BR%2BFlfXo6CU2HuinwjayRmNzYP5BIz5HRZzLPXYEPoNGojmxulCSHs6dFdWS1WNMYs6WkVelQxcsN%2FkHwmDvakV1b8hA0MqQXvBvTdMeakZiDzsBNT%2BuFifi6PNbRVoQ%3D%3D%5FV2; P_gck=NA%7CPmRanzni%2BsGuV8NRBrUBaw%3D%3D%7CNA%7C1621735722621; P_sck=8agJlNkqujZS6MrSyNJwjanMcMbXipu2qC%2BxD4UmyvNoTDHSq7Nah1Epvqm%2FaUXXcspBt9AU9cvP8ksA8NHQKpdD9h1%2Bd0oOFKVzm2HD0ZkEUaPPVJ28NNQmgMPfzqvrbS6Rz1TAHSvGhiEJt9gmuQ%3D%3D; uk=1362040016; anchor-task-tips=vistived; fansTuan-tips=vistived; _m_h5_tk=298bfbf0d1f474b3cd1e7566f68193a1_1621744352184; _m_h5_tk_enc=2d0745b898b59366cf07b6efa7cd875b; isg=BGVlUl6Rac0flo0Rbvh1ju8VdCGfohk0iyhzDmdLjxyrfoTwL_CVBFBWCOII_jHs; imk=MTM2MjA0MDAxNi0xLTE2MjE3NDA2ODg4MDUtMTYyMTgyNzA4ODgwNQ%3D%3D-FC64AFD62649932D1C07520D9BCA6A50; __ysuid=1621740688797e9b',
'pragma': 'no-cache',
'sec-ch-ua-mobile': '?0',
'sec-fetch-dest': 'empty',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-origin',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3870.400 QQBrowser/10.8.4405.400',
'x-requested-with': 'XMLHttpRequest',
}
r = requests.get(url, headers=headers)
print(r)
print(r.text)
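# Compute sign
t = str(int(time.time()*1000))
with open('js1.js', 'r', encoding='utf-8') as f:
    ctx = execjs.compile(f.read())
# test() in js1.js takes (token, data, t), so the token has to be passed first
token = ''  # fill in the token value taken from your own cookies
sign = ctx.call('test', token, '{"roomId":"711329","content":"找个壮男薇123456"}', t)
print(sign)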
# Send the danmaku
url = f'https://acs.laifeng.com/h5/mtop.youku.live.platform.chat/1.0/?jsv=2.6.1&appKey=24679788&t={t}&sign={sign}&type=originaljson&dataType=json&api=mtop.youku.live.platform.chat&v=1.0&ecode=1'
data = {
'data': '{"roomId":"711329","content":"找个壮男薇123456"}'
}
headers = {
'authority': 'acs.laifeng.com',
'method': 'POST',
'path': f'/h5/mtop.youku.live.platform.chat/1.0/?jsv=2.6.1&appKey=24679788&t={t}&sign={sign}&type=originaljson&dataType=json&api=mtop.youku.live.platform.chat&v=1.0&ecode=1',
'scheme': 'https',
'accept': 'application/json',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7',
'cache-control': 'no-cache',
'content-length': '65',
'content-type': 'application/x-www-form-urlencoded',
# 'cookie': 'mk=453ed14d2f6e4871ba8f09cfefcba1a3; cna=XacYGCzzLgkCAW411IXRpIVk; xlly_s=1; cmk=80ff16e81ec844f7946cfb91262dc197; P_gck=NA%7CPmRanzni%2BsGuV8NRBrUBaw%3D%3D%7CNA%7C1621735722621; uk=1362040016; anchor-task-tips=vistived; fansTuan-tips=vistived; P_sck=%2BUdo1iqrKx4%2FVd0vPYAbiHOpyqJ39%2Fw8brn2mmb2jYLlctldDNJ2qXSzYFPJDEYzodI65rDbJDzRtM6T7xkFNfREb9ajH8aAhsEioWLTbTp9LqNh%2ByYY7yW43dhpBBcSerlcOCmoajgMf%2BWzmhN7zw%3D%3D; P_pck_rm=z99ATxqw33e2f40cb442c0ZBnFgC5IrE%2BWMeN4%2BR%2BFlfXo6CU2HuinwjayRmNzYP5BIz5HRZzLPXYEPoNGojmxulCSHs6dFdWS1WNMYs6WkVelQxcsN%2FkHwmDvakV1b8hA0MqQXvBvTdMeakZiDzsBNT%2BuFifi6PNbRVoQ%3D%3D_V2; _m_h5_tk=83a19c51d0630a852efa9b4189393fca_1621764171783; _m_h5_tk_enc=2346991bccefcfc40b4ddb78c83c888f; __ysuid=1621760043001XhI; imk=MTM2MjA0MDAxNi0xLTE2MjE3NjAwNDM3MTAtMTYyMTg0NjQ0MzcxMA%3D%3D-1AEDB0971C2FA6CB9B62CAA7858E1C42; isg=BNHRDPTLVUpHT7ldmhTZitMx4N1rPkWwB8xnQrNkNxiMWvOs-ouggV6o_i68yd3o',
'cookie': ck,
'origin': 'https://v.laifeng.com',
'pragma': 'no-cache',
'referer': 'https://v.laifeng.com/711329',
'sec-ch-ua-mobile': '?0',
'sec-fetch-dest': 'empty',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-site',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3870.400 QQBrowser/10.8.4405.400',
}
r = requests.post(url, headers=headers, data=data)
print(r)
print(r.text)
You need to grab the cookies from the logged-in page by hand, then change the message content and the room ID to suit.
JavaScript source:
function h(a) {
function b(a, b) {
return a << b | a >>> 32 - b
}
function c(a, b) {
var c, d, e, f, g;
return e = 2147483648 & a,
f = 2147483648 & b,
c = 1073741824 & a,
d = 1073741824 & b,
g = (1073741823 & a) + (1073741823 & b),
c & d ? 2147483648 ^ g ^ e ^ f: c | d ? 1073741824 & g ? 3221225472 ^ g ^ e ^ f: 1073741824 ^ g ^ e ^ f: g ^ e ^ f
}
function d(a, b, c) {
return a & b | ~a & c
}
function e(a, b, c) {
return a & c | b & ~c
}
function f(a, b, c) {
return a ^ b ^ c
}
function g(a, b, c) {
return b ^ (a | ~c)
}
function h(a, e, f, g, h, i, j) {
return a = c(a, c(c(d(e, f, g), h), j)),
c(b(a, i), e)
}
function i(a, d, f, g, h, i, j) {
return a = c(a, c(c(e(d, f, g), h), j)),
c(b(a, i), d)
}
function j(a, d, e, g, h, i, j) {
return a = c(a, c(c(f(d, e, g), h), j)),
c(b(a, i), d)
}
function k(a, d, e, f, h, i, j) {
return a = c(a, c(c(g(d, e, f), h), j)),
c(b(a, i), d)
}
function l(a) {
for (var b, c = a.length,
d = c + 8,
e = (d - d % 64) / 64, f = 16 * (e + 1), g = new Array(f - 1), h = 0, i = 0; c > i;) b = (i - i % 4) / 4,
h = i % 4 * 8,
g[b] = g[b] | a.charCodeAt(i) << h,
i++;
return b = (i - i % 4) / 4,
h = i % 4 * 8,
g[b] = g[b] | 128 << h,
g[f - 2] = c << 3,
g[f - 1] = c >>> 29,
g
}
function m(a) {
var b, c, d = "",
e = "";
for (c = 0; 3 >= c; c++) b = a >>> 8 * c & 255,
e = "0" + b.toString(16),
d += e.substr(e.length - 2, 2);
return d
}
function n(a) {
a = a.replace(/\r\n/g, "\n");
for (var b = "",
c = 0; c < a.length; c++) {
var d = a.charCodeAt(c);
128 > d ? b += String.fromCharCode(d) : d > 127 && 2048 > d ? (b += String.fromCharCode(d >> 6 | 192), b += String.fromCharCode(63 & d | 128)) : (b += String.fromCharCode(d >> 12 | 224), b += String.fromCharCode(d >> 6 & 63 | 128), b += String.fromCharCode(63 & d | 128))
}
return b
}
var o, p, q, r, s, t, u, v, w, x = [],
y = 7,
z = 12,
A = 17,
B = 22,
C = 5,
D = 9,
E = 14,
F = 20,
G = 4,
H = 11,
I = 16,
J = 23,
K = 6,
L = 10,
M = 15,
N = 21;
for (a = n(a), x = l(a), t = 1732584193, u = 4023233417, v = 2562383102, w = 271733878, o = 0; o < x.length; o += 16) p = t,
q = u,
r = v,
s = w,
t = h(t, u, v, w, x[o + 0], y, 3614090360),
w = h(w, t, u, v, x[o + 1], z, 3905402710),
v = h(v, w, t, u, x[o + 2], A, 606105819),
u = h(u, v, w, t, x[o + 3], B, 3250441966),
t = h(t, u, v, w, x[o + 4], y, 4118548399),
w = h(w, t, u, v, x[o + 5], z, 1200080426),
v = h(v, w, t, u, x[o + 6], A, 2821735955),
u = h(u, v, w, t, x[o + 7], B, 4249261313),
t = h(t, u, v, w, x[o + 8], y, 1770035416),
w = h(w, t, u, v, x[o + 9], z, 2336552879),
v = h(v, w, t, u, x[o + 10], A, 4294925233),
u = h(u, v, w, t, x[o + 11], B, 2304563134),
t = h(t, u, v, w, x[o + 12], y, 1804603682),
w = h(w, t, u, v, x[o + 13], z, 4254626195),
v = h(v, w, t, u, x[o + 14], A, 2792965006),
u = h(u, v, w, t, x[o + 15], B, 1236535329),
t = i(t, u, v, w, x[o + 1], C, 4129170786),
w = i(w, t, u, v, x[o + 6], D, 3225465664),
v = i(v, w, t, u, x[o + 11], E, 643717713),
u = i(u, v, w, t, x[o + 0], F, 3921069994),
t = i(t, u, v, w, x[o + 5], C, 3593408605),
w = i(w, t, u, v, x[o + 10], D, 38016083),
v = i(v, w, t, u, x[o + 15], E, 3634488961),
u = i(u, v, w, t, x[o + 4], F, 3889429448),
t = i(t, u, v, w, x[o + 9], C, 568446438),
w = i(w, t, u, v, x[o + 14], D, 3275163606),
v = i(v, w, t, u, x[o + 3], E, 4107603335),
u = i(u, v, w, t, x[o + 8], F, 1163531501),
t = i(t, u, v, w, x[o + 13], C, 2850285829),
w = i(w, t, u, v, x[o + 2], D, 4243563512),
v = i(v, w, t, u, x[o + 7], E, 1735328473),
u = i(u, v, w, t, x[o + 12], F, 2368359562),
t = j(t, u, v, w, x[o + 5], G, 4294588738),
w = j(w, t, u, v, x[o + 8], H, 2272392833),
v = j(v, w, t, u, x[o + 11], I, 1839030562),
u = j(u, v, w, t, x[o + 14], J, 4259657740),
t = j(t, u, v, w, x[o + 1], G, 2763975236),
w = j(w, t, u, v, x[o + 4], H, 1272893353),
v = j(v, w, t, u, x[o + 7], I, 4139469664),
u = j(u, v, w, t, x[o + 10], J, 3200236656),
t = j(t, u, v, w, x[o + 13], G, 681279174),
w = j(w, t, u, v, x[o + 0], H, 3936430074),
v = j(v, w, t, u, x[o + 3], I, 3572445317),
u = j(u, v, w, t, x[o + 6], J, 76029189),
t = j(t, u, v, w, x[o + 9], G, 3654602809),
w = j(w, t, u, v, x[o + 12], H, 3873151461),
v = j(v, w, t, u, x[o + 15], I, 530742520),
u = j(u, v, w, t, x[o + 2], J, 3299628645),
t = k(t, u, v, w, x[o + 0], K, 4096336452),
w = k(w, t, u, v, x[o + 7], L, 1126891415),
v = k(v, w, t, u, x[o + 14], M, 2878612391),
u = k(u, v, w, t, x[o + 5], N, 4237533241),
t = k(t, u, v, w, x[o + 12], K, 1700485571),
w = k(w, t, u, v, x[o + 3], L, 2399980690),
v = k(v, w, t, u, x[o + 10], M, 4293915773),
u = k(u, v, w, t, x[o + 1], N, 2240044497),
t = k(t, u, v, w, x[o + 8], K, 1873313359),
w = k(w, t, u, v, x[o + 15], L, 4264355552),
v = k(v, w, t, u, x[o + 6], M, 2734768916),
u = k(u, v, w, t, x[o + 13], N, 1309151649),
t = k(t, u, v, w, x[o + 4], K, 4149444226),
w = k(w, t, u, v, x[o + 11], L, 3174756917),
v = k(v, w, t, u, x[o + 2], M, 718787259),
u = k(u, v, w, t, x[o + 9], N, 3951481745),
t = c(t, p),
u = c(u, q),
v = c(v, r),
w = c(w, s);
var O = m(t) + m(u) + m(v) + m(w);
return O.toLowerCase()
}
function test(tk, data_one, i) {
// appKey, a constant
var g = "24679788";
// i = (new Date).getTime();  // the timestamp is passed in by the caller instead
var j = h(tk + "&" + i + "&" + g + "&" + data_one);
return j;
}
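The initial constants in h (1732584193, 4023233417, …) are the standard MD5 initial values, so h looks like a plain MD5 over the UTF-8 bytes of its input. Assuming that holds, the signature could be computed in Python without execjs. This is a sketch, not a verified drop-in: make_sign and md5_hex are hypothetical names, and the result may differ from h for input containing "\r\n" or characters outside the Basic Multilingual Plane, which the JavaScript handles in its own way.

```python
import hashlib

APP_KEY = "24679788"  # the constant g from test() in the JavaScript above


def md5_hex(text: str) -> str:
    """Lower-case hex MD5 digest of the UTF-8 encoding of text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def make_sign(token: str, t: str, data: str, app_key: str = APP_KEY) -> str:
    """Join the parts in the same order as test(): token & t & appKey & data."""
    return md5_hex("&".join([token, t, app_key, data]))


print(make_sign("token", "1621744352184", '{"roomId":"711329","content":"hello"}'))
```

Before relying on this, compare its output with the execjs result for the same token, t and data.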
Result