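I. Scraping a single song
Qianqian Music (千千音乐): https://music.91q.com
Getting the song's playback URL
1. Play a song to open the player page.
Open the browser developer tools, select the Media filter in the Network panel, and refresh the page so that the media requests are listed.
The URL of the second request is the playback address we want. Copy it and open it in another tab to confirm that it is the song.
The goal now is to find the request whose response contains both this playback URL and the song title.
Use the search button in the developer tools: copy a distinctive substring of the URL and search for it.
The second matching request contains both the playback URL and the song title.
Single-song scraping code: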
import requests
cookies = {
'Hm_lvt_d0ad46e4afeacf34cd12de4c9b553aa6': '1711952990',
'token_type': 'access_token',
'access_token': 'ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
'refresh_token': 'ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
'userid': '993887197314678784',
'cuid': 'ce4f7807-296e-f5da-6015-f780d3c10873',
'Hm_lpvt_d0ad46e4afeacf34cd12de4c9b553aa6': '1711956161',
}
headers = {
'authority': 'music.91q.com',
'accept': '*/*',
'accept-language': 'zh-CN,zh;q=0.9',
'authorization': 'access_token ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
# 'cookie': 'Hm_lvt_d0ad46e4afeacf34cd12de4c9b553aa6=1711952990; token_type=access_token; access_token=ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=; refresh_token=ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=; userid=993887197314678784; cuid=ce4f7807-296e-f5da-6015-f780d3c10873; Hm_lpvt_d0ad46e4afeacf34cd12de4c9b553aa6=1711956161',
'from': 'web',
'referer': 'https://music.91q.com/player',
'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Windows"',
'sec-fetch-dest': 'empty',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-origin',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
}
params = {
'sign': '1f9967971bf0ab3ed546312f5846aabb',
'appid': '16073360',
'TSID':'T10038929666',
'timestamp': '1711956161',
}
# Request the track metadata, then download the audio from the returned URL
json_data = requests.get('https://music.91q.com/v1/song/tracklink', params=params, cookies=cookies, headers=headers).json()
title = json_data['data']['title']
play_url = json_data['data']['path']
music_content = requests.get(url=play_url, headers=headers).content
# The music\ directory must already exist
with open('music\\' + title + '.mp3', 'wb') as f:
    f.write(music_content)
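Song titles can contain characters that are not valid in file names (for example / or ?), and the music directory may not exist yet. A minimal sketch of one way to handle both, assuming two hypothetical helpers named safe_filename and build_output_path and an output directory named music; none of these are part of the original script:

```python
import os
import re


def safe_filename(title, replacement="_"):
    """Replace characters that Windows and most file systems reject in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, title).strip()
    # Fall back to a fixed name if nothing usable is left
    return cleaned or "untitled"


def build_output_path(title, directory="music"):
    """Create the target directory if needed and return the .mp3 path for a title."""
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, safe_filename(title) + ".mp3")
```

The open(...) call could then take build_output_path(title) instead of the hand-built 'music\\' string, which also avoids the Windows-only path separator.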
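II. Scraping multiple songs
To scrape multiple songs, check how the parameters of that second request (tracklink) change: play a few more tracks and compare the requests.
There are four parameters, three of which change. sign is a signature parameter that is encrypted and has to be reverse-engineered; appid stays constant; TSID is the song ID; timestamp is a Unix timestamp in seconds.
The next step is to obtain these three changing parameters.
Code for the timestamp: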
import time
date_time = int(time.time())
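TSID is the song ID. This kind of data usually appears on a listing page, and it has to be found in the dynamically loaded requests rather than in the static HTML. For example, search for 许嵩 in the search box to bring up a results page, open the developer tools, select the XHR filter in the Network panel, and click to the next page.
The response to this request contains the TSID of multiple songs. Its query parameters are plain, unencrypted values.
Each response holds one page of results; to scrape more pages, change pageNo.
Code for getting TSID: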
search_cookies = {
'Hm_lvt_d0ad46e4afeacf34cd12de4c9b553aa6': '1711952990',
'token_type': 'access_token',
'access_token': 'ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
'refresh_token': 'ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
'userid': '993887197314678784',
'cuid': 'ce4f7807-296e-f5da-6015-f780d3c10873',
'Hm_lpvt_d0ad46e4afeacf34cd12de4c9b553aa6': '1711955156',
}
search_headers = {
'authority': 'music.91q.com',
'accept': 'application/json, text/plain, */*',
'accept-language': 'zh-CN,zh;q=0.9',
'authorization': 'access_token ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
# 'cookie': 'Hm_lvt_d0ad46e4afeacf34cd12de4c9b553aa6=1711952990; token_type=access_token; access_token=ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=; refresh_token=ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=; userid=993887197314678784; cuid=ce4f7807-296e-f5da-6015-f780d3c10873; Hm_lpvt_d0ad46e4afeacf34cd12de4c9b553aa6=1711955156',
'device-id': '9c1ce27f08b16479d2e17743062b28ed',
'from': 'web',
'referer': 'https://music.91q.com/search?word=%E8%AE%B8%E5%B5%A9',
'requestid': '1711955480_j2CdnTt',
'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Windows"',
'sec-fetch-dest': 'empty',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-origin',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
}
search_params = {
'sign': '2ebe418b78ece60302c662a2daa7b2dd',
'word': '许嵩',
'type': '1',
'pageNo': '1',
'pageSize': '20',
'appid': '16073360',
'timestamp': '1711955480',
}
search_json = requests.get('https://music.91q.com/v1/search', params=search_params, cookies=search_cookies, headers=search_headers).json()
# Each entry in data.typeTrack describes one song
for index in search_json['data']['typeTrack']:
    TSID = index['TSID']
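The loop above handles one TSID at a time from a single page. A small sketch of collecting every TSID across several pages, assuming the response shape used above (data.typeTrack, each entry with a TSID key). The fetch_page callable is hypothetical: it stands in for the requests.get call with pageNo set, including whatever sign and timestamp that page requires.

```python
def extract_tsids(search_json):
    """Return the TSID of every track in one search response."""
    tracks = (search_json.get("data") or {}).get("typeTrack") or []
    return [track["TSID"] for track in tracks if "TSID" in track]


def collect_tsids(fetch_page, max_pages):
    """Call fetch_page(page_no) for pages 1..max_pages and gather all TSIDs.

    Stops early when a page comes back empty.
    """
    tsids = []
    for page_no in range(1, max_pages + 1):
        page_tsids = extract_tsids(fetch_page(page_no))
        if not page_tsids:
            break
        tsids.extend(page_tsids)
    return tsids
```

Passing the page fetcher in as a function keeps the paging logic separate from the HTTP details, so it can be tried out on saved responses before hitting the site.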
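III. Obtaining the sign parameter
Run a search using the search box at the bottom of the developer tools.
Open the first JS file in the results, press Ctrl+F to locate the occurrences of sign, and judge which one is the right place.
Set a breakpoint there, then play a song so that execution pauses at the breakpoint. Reading the code, sign comes from i, where i = r.sign and r is the return value of a function.
Copy that function, and print the argument it receives in the console.
Initial JS code: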
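function createSign(e) {
    if ("[object Object]" !== Object.prototype.toString.call(e))
        throw new Error("The parameter of query must be a Object.");
    // Current Unix time in seconds; this overwrites any timestamp passed in e
    var t = Math.floor(Date.now() / 1e3);
    Object.assign(e, {
        timestamp: t
    });
    // Sort the keys and join them as key=value pairs separated by "&"
    var n = Object.keys(e);
    n.sort();
    for (var r = "", i = 0; i < n.length; i++) {
        var s = n[i];
        r += (0 == i ? "" : "&") + s + "=" + e[s]
    }
    // Append secret and hash the result; secret and md5 are defined elsewhere in the site's JS
    return {
        sign: md5(r += secret),
        timestamp: t,
        md5: md5
    }
}
// Test input, taken from the values printed in the console
e = {
    "TSID": 'T10038929666',
    "appid": 16073360,
    "timestamp": 1711956284
}
console.log(createSign(e))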
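To make the structure of the signature easier to see, the string that createSign feeds into md5 can be reproduced in Python. This is only a sketch of the string-building step: the hashing step is left out because it depends on the site's own md5 function, and the secret value below is a placeholder, not the site's real one. The name build_sign_payload is hypothetical.

```python
def build_sign_payload(params, secret):
    """Mirror createSign's string building: sorted key=value pairs joined by '&', then the secret."""
    # Python's sorted() and JavaScript's default sort() agree for plain ASCII keys
    pairs = [f"{key}={params[key]}" for key in sorted(params)]
    return "&".join(pairs) + secret


# Placeholder secret for illustration only
example = build_sign_payload(
    {"TSID": "T10038929666", "appid": 16073360, "timestamp": 1711956284},
    "SECRET",
)
print(example)  # TSID=T10038929666&appid=16073360&timestamp=1711956284SECRET
```

Note that uppercase letters sort before lowercase ones, so TSID comes first even though it would be last alphabetically.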
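Run it, then copy the value of secret from just above the function in the source file.
Run it again.
I first checked whether md5 here is the standard MD5 hash, and found that it is not.
If it were the standard algorithm, the function could be supplied like this: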
const CryptoJs = require('crypto-js');
function md5(pwd) {
    const encryptedPwd = CryptoJs.MD5(pwd).toString();
    return encryptedPwd
}
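The same check can be run from Python, as a sketch: hash the pre-hash string with hashlib's standard MD5 and compare it with the sign the browser actually sent. The function names standard_md5 and matches_standard_md5 are hypothetical.

```python
import hashlib


def standard_md5(text):
    """Hex digest of the standard MD5 hash of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def matches_standard_md5(payload, observed_sign):
    """True if observed_sign is the standard MD5 of payload (case-insensitive)."""
    return standard_md5(payload) == observed_sign.lower()
```

If this returns False for a payload and sign captured from the same request, either the site's md5 or the way the payload is built differs from the standard, and the site's own code has to be reused.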
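Since it is not, copy the site's md5 function from higher up in the same file.
Running the script then shows what is still missing.
Keep copying the missing pieces from above and rerunning until nothing else is reported.
Then run the script to get the result: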
IV. Final code
import time
import execjs
import requests
date_time = int(time.time())
# Fetch the song IDs
search_cookies = {
'Hm_lvt_d0ad46e4afeacf34cd12de4c9b553aa6': '1711952990',
'token_type': 'access_token',
'access_token': 'ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
'refresh_token': 'ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
'userid': '993887197314678784',
'cuid': 'ce4f7807-296e-f5da-6015-f780d3c10873',
'Hm_lpvt_d0ad46e4afeacf34cd12de4c9b553aa6': '1711955156',
}
search_headers = {
'authority': 'music.91q.com',
'accept': 'application/json, text/plain, */*',
'accept-language': 'zh-CN,zh;q=0.9',
'authorization': 'access_token ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
# 'cookie': 'Hm_lvt_d0ad46e4afeacf34cd12de4c9b553aa6=1711952990; token_type=access_token; access_token=ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=; refresh_token=ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=; userid=993887197314678784; cuid=ce4f7807-296e-f5da-6015-f780d3c10873; Hm_lpvt_d0ad46e4afeacf34cd12de4c9b553aa6=1711955156',
'device-id': '9c1ce27f08b16479d2e17743062b28ed',
'from': 'web',
'referer': 'https://music.91q.com/search?word=%E8%AE%B8%E5%B5%A9',
'requestid': '1711955480_j2CdnTt',
'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Windows"',
'sec-fetch-dest': 'empty',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-origin',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
}
search_params = {
'sign': '2ebe418b78ece60302c662a2daa7b2dd',
'word': '许嵩',
'type': '1',
'pageNo': '1',
'pageSize': '20',
'appid': '16073360',
'timestamp': '1711955480',
}
search_json = requests.get('https://music.91q.com/v1/search', params=search_params, cookies=search_cookies, headers=search_headers).json()
# Compile demo.js (createSign plus its dependencies) once, outside the loop
with open('demo.js', 'r', encoding='utf-8') as js_file:
    js_function = execjs.compile(js_file.read())
cookies = {
    'Hm_lvt_d0ad46e4afeacf34cd12de4c9b553aa6': '1711952990',
    'token_type': 'access_token',
    'access_token': 'ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
    'refresh_token': 'ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
    'userid': '993887197314678784',
    'cuid': 'ce4f7807-296e-f5da-6015-f780d3c10873',
    'Hm_lpvt_d0ad46e4afeacf34cd12de4c9b553aa6': '1711956161',
}
headers = {
    'authority': 'music.91q.com',
    'accept': '*/*',
    'accept-language': 'zh-CN,zh;q=0.9',
    'authorization': 'access_token ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=',
    # 'cookie': 'Hm_lvt_d0ad46e4afeacf34cd12de4c9b553aa6=1711952990; token_type=access_token; access_token=ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=; refresh_token=ZGNiMWRiMWIyNGQyNTkxYTU0YTdlMTRmZjM1NmE5YmE=; userid=993887197314678784; cuid=ce4f7807-296e-f5da-6015-f780d3c10873; Hm_lpvt_d0ad46e4afeacf34cd12de4c9b553aa6=1711956161',
    'from': 'web',
    'referer': 'https://music.91q.com/player',
    'sec-ch-ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'empty',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-origin',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
}
# Sign and download each song returned by the search
for index in search_json['data']['typeTrack']:
    TSID = index['TSID']
    e = {"TSID": TSID, "appid": 16073360, "timestamp": date_time}
    print(e)
    return_data = js_function.call('createSign', e)
    print(return_data)
    params = {
        'sign': return_data['sign'],
        'appid': '16073360',
        'TSID': TSID,
        'timestamp': return_data['timestamp'],
    }
    try:
        json_data = requests.get('https://music.91q.com/v1/song/tracklink', params=params, cookies=cookies, headers=headers).json()
        title = json_data['data']['title']
        play_url = json_data['data']['path']
        music_content = requests.get(url=play_url, headers=headers).content
        with open('music\\' + title + '.mp3', 'wb') as f:
            f.write(music_content)
    except Exception as error:
        # Report the failure and move on to the next song
        print(TSID, error)
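When many songs are downloaded in one run, it helps to know afterwards which ones failed. A sketch of a loop that records failures and pauses between requests, assuming two hypothetical callables that are not in the original script: fetch_track(tsid) returns (title, audio_bytes), and save(title, audio_bytes) writes the file.

```python
import time


def download_all(tsids, fetch_track, save, delay_seconds=1.0):
    """Download each song in turn; return the list of (tsid, error message) failures."""
    failures = []
    for position, tsid in enumerate(tsids):
        if position > 0 and delay_seconds > 0:
            # Pause between songs to avoid hammering the server
            time.sleep(delay_seconds)
        try:
            title, audio = fetch_track(tsid)
            save(title, audio)
        except Exception as error:
            failures.append((tsid, str(error)))
    return failures
```

The returned list can be printed or retried later, so a single missing path or network error does not silently drop a song.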
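The function that creates sign takes a parameter e:
e = {"TSID": TSID, "appid": 16073360, "timestamp": date_time}
Here e is built from TSID, the fixed appid, and a timestamp. createSign overwrites the timestamp with the current time, which is why the request sends return_data['timestamp'] rather than date_time.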
V. Final result