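I. Background
This analysis follows https://cuiqingcai.com/5024.html . The goal here is to lay out an approach for handling this kind of JS obfuscation and to record the pitfalls hit along the way.
II. Process
1. Identify the encrypted parameter
After opening the page and capturing traffic, no obvious data API request shows up. After switching to another city, two data requests appear, both to https://www.aqistudy.cn/apinew/aqistudyapi.php . The URL is the same but the post_data differs (one requests AQI data, the other weather data). The entire post_data is encrypted, and so is the response.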
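2. Identify the encryption function
Searching globally for 'd' (the name of the only POST field) is clearly impractical, so the click event bound to each city in the city list is used as the entry point for finding the JS function.
Hover over 北京 (Beijing) in the city list. In the Event Listeners tab to the right of (or below) the Elements panel, the click event lists the bound handler; clicking the source link next to it (marked with a red arrow in the screenshot) jumps to the corresponding JS function.
That handler calls a function named getData(), which is evidently the JS that fetches the data. A global search finally locates it in https://www.aqistudy.cn/html/city_detail.html
getData() calls two functions, getAQIData() and getWeatherData(). Both of them call the same function, getServerData(), which is found in https://www.aqistudy.cn/js/jquery-1.8.0.min.js?v=1.2 , but only as obfuscated JS.
Using an online deobfuscator (http://www.bm8.com.cn/jsConfusion/ ), copy everything from eval downward and run it through the tool to recover the plain source.
In getServerData, the request parameters are encrypted by calling getParam, and the response is decrypted by decodeData: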
var getParam = (function () {
    function ObjectSort(obj) {
        var newObject = {};
        Object.keys(obj).sort().map(function (key) {
            newObject[key] = obj[key]
        });
        return newObject
    }
    return function (method, obj) {
        var appId = '1a45f75b824b2dc628d5955356b5ef18';
        var clienttype = 'WEB';
        var timestamp = new Date().getTime();
        var param = {
            appId: appId,
            method: method,
            timestamp: timestamp,
            clienttype: clienttype,
            object: obj,
            secret: hex_md5(appId + method + timestamp + clienttype + JSON.stringify(ObjectSort(obj)))
        };
        param = BASE64.encrypt(JSON.stringify(param));
        return AES.encrypt(param, aes_client_key, aes_client_iv)
    }
})();
getParam signs the parameters with an MD5 hash (secret), Base64-encodes the JSON text, and then AES-encrypts the result.
function decodeData(data) {
    data = AES.decrypt(data, aes_server_key, aes_server_iv);
    data = DES.decrypt(data, des_key, des_iv);
    data = BASE64.decrypt(data);
    return data
}
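decodeData reverses the server-side layers in order: AES decryption, then DES decryption, then Base64 decoding.
3. Handle the JS functions
Method 1: reimplement in Python. Not finished yet: the AES encryption/decryption step is not implemented.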
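import base64
import hashlib
import time

def MD5(data_str):
    # Hex MD5 of the UTF-8 bytes, the counterpart of hex_md5 in the JS
    md5 = hashlib.md5()
    md5.update(data_str.encode('utf-8'))
    return md5.hexdigest()

appId = '1a45f75b824b2dc628d5955356b5ef18'
method = 'GETDETAIL'
timestamp = int(time.time() * 1000)  # milliseconds, like new Date().getTime()
clienttype = 'WEB'
# Keys already in sorted order, written compactly the way JSON.stringify prints them
obj = '{"city":"北京","endTime":"2019-05-14 08:00:00","startTime":"2019-05-14 05:00:00","type":"HOUR"}'
data = appId + method + str(timestamp) + clienttype + obj
secret = MD5(data)
# JSON text in the same key order as the JS param object
dicts = '{"appId":"%s","method":"%s","timestamp":%d,"clienttype":"%s","object":%s,"secret":"%s"}' % (appId, method, timestamp, clienttype, obj, secret)
# Assumes BASE64.encrypt in the JS is plain Base64 over UTF-8 bytes
ss = base64.b64encode(dicts.encode('utf-8'))
# Still missing: AES-encrypt ss with aes_client_key / aes_client_iv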
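When rewriting this in Python, pay attention to key order: the JS sorts the object's keys before signing, whereas a Python dict is not sorted by key (it keeps insertion order, and before Python 3.7 no order was guaranteed at all), so the keys have to be sorted explicitly.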
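The key-order point can be sketched as follows. This is a minimal illustration, not the site's code: the helper names canonical_json and make_secret are made up here, the timestamp is a fixed placeholder value, and it assumes that hex_md5 hashes the UTF-8 bytes of the string (the same assumption the MD5 helper in this post makes).

```python
import hashlib
import json


def canonical_json(obj):
    """Serialize obj the way JSON.stringify(ObjectSort(obj)) prints it:
    keys sorted, no spaces after separators, non-ASCII left unescaped."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def make_secret(app_id, method, timestamp, clienttype, obj):
    """Hex MD5 over the fields concatenated in the order used by getParam."""
    raw = app_id + method + str(timestamp) + clienttype + canonical_json(obj)
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


# Keys are deliberately given out of order to show that sorting is applied.
query = {
    "type": "HOUR",
    "city": "北京",
    "startTime": "2019-05-14 05:00:00",
    "endTime": "2019-05-14 08:00:00",
}

# Placeholder timestamp (milliseconds); a real request would use the current time.
secret = make_secret("1a45f75b824b2dc628d5955356b5ef18", "GETDETAIL", 1557792000000, "WEB", query)

print(canonical_json(query))
print(secret)
```

Because sort_keys=True is applied at serialization time, the order in which the dict was built no longer matters, which is the property the JS ObjectSort helper provides.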
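Method 2: execute the JS code directly, using js2py
Add a function that receives the query values and calls the parameter-encryption function: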
function getEncryptedData(method, city, type, startTime, endTime) {
    var param = {};
    param.city = city;
    param.type = type;
    param.startTime = startTime;
    param.endTime = endTime;
    xx = getParam(method, param);
    return xx
}
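When this script is run with node.js, the encrypted parameter works and the request returns normal data. When it is run through js2py, the response is {"success":false,"errcode":1005,"errmsg":"invalid timestamp"}. Logging the intermediate values revealed where the two runs differ: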
secret: hex_md5(appId + method + timestamp + clienttype + JSON.stringify(ObjectSort(obj)))
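The value of JSON.stringify(ObjectSort(obj)) in this expression prints differently in the two environments, so the string fed to hex_md5, and therefore secret, differs as well:
node.js: {"city":"北京","endTime":"2019-05-14 08:00:00","startTime":"2019-05-14 05:00:00","type":"HOUR"}
js2py: {"city":"\\u5317\\u4eac","endTime":"2019-05-14 08:00:00","startTime":"2019-05-14 05:00:00","type":"HOUR"}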
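The effect of this difference on the signature can be reproduced in plain Python. The sketch below is only an analogy: it uses json.dumps with and without ensure_ascii to produce an escaped and an unescaped string resembling the two outputs above, and the md5_hex helper is a name chosen for this example.

```python
import hashlib
import json

obj = {
    "city": "北京",
    "endTime": "2019-05-14 08:00:00",
    "startTime": "2019-05-14 05:00:00",
    "type": "HOUR",
}

# Default json.dumps escapes non-ASCII characters, similar to the js2py output.
escaped = json.dumps(obj, sort_keys=True, separators=(",", ":"))
# ensure_ascii=False keeps the characters as-is, like the node.js output.
literal = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def md5_hex(text):
    """Hex MD5 of the UTF-8 encoding of text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


print(escaped)
print(literal)
# The two strings describe the same object but hash to different digests.
print(md5_hex(escaped) == md5_hex(literal))
```

Since the hash input differs by even a few characters, the two digests cannot agree, which is why the string must be produced in exactly the form the server expects before hashing.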
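The workaround is to compute secret in Python and pass it, together with the timestamp, into the JS, which then performs the remaining encryption. The rewritten getEncryptedData and getParam functions: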
function getEncryptedData(method, city, type, startTime, endTime, timestamp, secret) {
    var param = {};
    param.city = city;
    param.type = type;
    param.startTime = startTime;
    param.endTime = endTime;
    xx = getParam(method, param, timestamp, secret);
    return xx
}
var getParam = (function () {
    function ObjectSort(obj) {
        var newObject = {};
        Object.keys(obj).sort().map(function (key) {
            newObject[key] = obj[key]
        });
        return newObject
    }
    return function (method, obj, timestamp, secret) {
        var appId = '1a45f75b824b2dc628d5955356b5ef18';
        var clienttype = 'WEB';
        var param = {
            appId: appId,
            method: method,
            timestamp: timestamp,
            clienttype: clienttype,
            object: obj,
            secret: secret
        };
        param = BASE64.encrypt(JSON.stringify(param));
        return AES.encrypt(param, aes_client_key, aes_client_iv)
    }
})();
Python calling code:
import time
import requests
import hashlib
import js2py

def MD5(data_str):
    # Hex MD5 of the UTF-8 bytes, the counterpart of hex_md5 in the JS
    md5 = hashlib.md5()
    md5.update(data_str.encode('utf-8'))
    return md5.hexdigest()

def get_data(city, type, startTime, endTime, method):
    appId = '1a45f75b824b2dc628d5955356b5ef18'
    timestamp = str(int(time.time() * 1000))  # milliseconds
    clienttype = 'WEB'
    # Keys in sorted order, matching JSON.stringify(ObjectSort(obj)) under node.js
    obj = '{"city":"%s","endTime":"%s","startTime":"%s","type":"%s"}' % (city, endTime, startTime, type)
    data = appId + method + timestamp + clienttype + obj
    secret = MD5(data)
    # tac.js holds the deobfuscated JS plus the rewritten getEncryptedData/getParam
    with open('tac.js', encoding='utf-8') as f:
        js = f.read()
    context = js2py.EvalJs()
    context.execute(js)
    param_sign = context.getEncryptedData(method, city, type, startTime, endTime, timestamp, secret)
    url = 'https://www.aqistudy.cn/apinew/aqistudyapi.php'
    post_data = {
        'd': param_sign,
    }
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36',
    }
    response = requests.post(url, data=post_data, headers=headers)
    # Decrypt the response with the site's own decodeData
    data = context.decodeData(response.text)
    print(data)

if __name__ == '__main__':
    city = '北京'
    type = 'HOUR'
    startTime = '2019-05-14 05:00:00'
    endTime = '2019-05-14 08:00:00'
    method = 'GETDETAIL'
    get_data(city, type, startTime, endTime, method)
    time.sleep(1)
    method = 'GETCITYWEATHER'
    get_data(city, type, startTime, endTime, method)
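III. Summary
- When rewriting JS code in Python, pay attention to key order: the JS sorts the object's keys before signing, whereas a Python dict is not sorted by key, so the keys have to be sorted explicitly.
- General approach to JS encryption:
  - If the URL or a single post_data parameter is encrypted, search globally for the parameter name or value to locate the JS function.
  - If the whole post_data is encrypted, or the parameter cannot be found by searching, use the element's event listeners to locate the JS function.
- Watch for differences in how node.js and Python libraries that execute JS (such as js2py) handle data, for example the escaping of non-ASCII characters in JSON.stringify.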