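Overview
This article explains how to use JS reverse engineering to turn encrypted dynamic data into real, human-readable data. The code builds on crawler basics. If you lack that background, read up on the basics first (this article does not explain the basic code), or see my post: 会python就学得会的爬虫基础(只讲实战)_怎么才能把python学到什么都能爬的地步 (crawler basics that anyone who knows Python can learn, hands-on only) on CSDN.
Locating the target data
Target site: 企名片科创平台 (Qimingpian sci-tech innovation platform)
The site at a glance: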

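What we want to crawl is each piece of [text content] in the figure. As the figure shows, every main [title] is followed by a short [summary], and an article's [summary] is clearly not its full content. Common [dynamic data] (loaded through XHR or Fetch) typically shows up in situations like these:
- Clicking "Load more"
- Search box autocomplete
- Table pagination
- Real-time data updates
From this we can roughly infer that the data here is [dynamic data]. So open the [packet-capture tool] (the browser's developer tools), set the filter in the [Network] panel to [Fetch/XHR], and you get the following resources: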

After looking through them, only the second file contains an encrypt_data field in its [Preview]. That is the data we want, so the data is now located.
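When a response has many fields, the same inspection can be done in code. The following is a minimal sketch, not part of the original workflow: the helper name find_opaque_fields, the length threshold, and the sample payload are all assumptions made for illustration. It walks a parsed JSON body and reports the paths of long strings made up only of Base64 characters, which is what a field like encrypt_data tends to look like.

```python
import re

# Strings made only of Base64 alphabet characters, with optional "=" padding.
BASE64_RE = re.compile(r"^[A-Za-z0-9+/]+={0,2}$")


def find_opaque_fields(payload, min_length=32, path=""):
    """Return the paths of long, Base64-looking strings inside parsed JSON."""
    hits = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            child = f"{path}.{key}" if path else key
            hits.extend(find_opaque_fields(value, min_length, child))
    elif isinstance(payload, list):
        for index, value in enumerate(payload):
            hits.extend(find_opaque_fields(value, min_length, f"{path}[{index}]"))
    elif isinstance(payload, str):
        if len(payload) >= min_length and BASE64_RE.match(payload):
            hits.append(path)
    return hits


# Made-up payload for illustration; the real response may have other fields.
sample = {"status": 0, "message": "ok", "encrypt_data": "QUJD" * 10}
print(find_opaque_fields(sample))
```

A hit only means the field looks opaque. Whether it is actually encrypted still has to be confirmed in the JS.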
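With the file located, we can also build the basic crawler code. For convenience, use the online tool "Convert curl commands to Python" to generate the basic Python code. Paste the request's curl command (copied from the packet-capture tool) into the tool's [Curl command] box, then copy the generated code beneath it.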
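To give a feel for what such a converter does, here is a rough sketch using only the standard library. It is not the online tool's actual implementation: parse_curl is a hypothetical helper that only understands a URL, -H/--header, -d/--data/--data-raw and -X/--request, and the sample command is shortened for illustration.

```python
import shlex


def parse_curl(command):
    """Split a simple curl command into method, URL, headers and body."""
    # Treat backslash-newline as a line continuation, as a shell would.
    tokens = shlex.split(command.replace("\\\n", " "))
    if not tokens or tokens[0] != "curl":
        raise ValueError("expected a command starting with 'curl'")
    request = {"method": "GET", "url": None, "headers": {}, "data": None}
    i = 1
    while i < len(tokens):
        token = tokens[i]
        if token in ("-H", "--header"):
            name, _, value = tokens[i + 1].partition(":")
            request["headers"][name.strip()] = value.strip()
            i += 2
        elif token in ("-d", "--data", "--data-raw"):
            request["data"] = tokens[i + 1]
            request["method"] = "POST"  # curl switches to POST when a body is given
            i += 2
        elif token in ("-X", "--request"):
            request["method"] = tokens[i + 1]
            i += 2
        elif token.startswith("-"):
            i += 1  # flags this sketch does not model are skipped
        else:
            request["url"] = token
            i += 1
    return request


command = (
    "curl 'https://wyiosapi.qmpsee.com/Web/getCaDetail' \\\n"
    "  -H 'Platform: web' --data-raw 'page=1&num=20'"
)
parsed = parse_curl(command)
print(parsed)
```

The resulting dictionary maps directly onto the arguments of a requests call. A real converter handles many more flags than this.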
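Decrypting the data
Looking at the value, it is completely unreadable to a human, so it is presumably encrypted. How do we find out how it was encrypted? This is where [JS reverse engineering] comes in.
As the name suggests, [JS reverse engineering] means that we, the crawler authors, work backward from the site developers' JS files to the logic behind the page. Finding the encryption scheme therefore comes down to finding the decryption code that turns the ciphertext into the text we see on the page.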
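Before digging into the JS, a quick look at the ciphertext itself can give a hint. The sketch below is an optional check, not a step from the original walkthrough: describe_ciphertext is a hypothetical helper, and the sample value is generated locally rather than taken from the site. It tests whether the string is valid Base64 and whether the decoded length is a multiple of 8 or 16 bytes, which would be consistent with (but does not prove) a block cipher.

```python
import base64


def describe_ciphertext(text):
    """Return basic facts about a Base64 string, or None if it is not Base64."""
    try:
        raw = base64.b64decode(text, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
    return {
        "decoded_length": len(raw),
        "multiple_of_8": len(raw) % 8 == 0,
        "multiple_of_16": len(raw) % 16 == 0,
    }


# Locally generated sample: 24 arbitrary bytes, Base64-encoded.
sample_text = base64.b64encode(bytes(range(24))).decode("ascii")
info = describe_ciphertext(sample_text)
print(info)
```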
To find the decryption code, use the field name as an [anchor] and search for the places in the code where it appears.

Click the search button at the far right, type encrypt_data into the search box, and look for .js files:

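Conveniently, there is exactly one .js file, the first result. So we only need to open it and analyze the JS code. However, the [anchor] appears in three places here. Which one do we pick?
[Note] Tips for finding the anchor
1. We are looking for a method, a function, because what we are missing is how this data gets turned into the real data. Occurrences that are merely object property access with . are [out]; occurrences that are passed into a call with () are [in].
2. If the keyword shows up too often, add a filter: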

Add the path portion of the file's URL (everything after the protocol and the domain).
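The first tip can also be applied to a saved copy of the JS file. The following sketch is an illustration under assumptions: classify_anchor_hits is a hypothetical helper, and its regular expression only recognizes the simple case where the anchor is the sole argument of a call. It separates occurrences that are passed to a function from occurrences that are plain property access.

```python
import re


def classify_anchor_hits(js_source, anchor):
    """Return (names of functions called with the anchor, count of other hits)."""
    # Matches calls such as Kc(e.encrypt_data) or Kc(encrypt_data).
    call_pattern = re.compile(
        r"([A-Za-z_$][\w$]*)\(\s*[\w$.]*\b" + re.escape(anchor) + r"\s*\)"
    )
    call_names = []
    call_spans = []
    for match in call_pattern.finditer(js_source):
        call_names.append(match.group(1))
        call_spans.append(match.span())
    property_hits = 0
    for match in re.finditer(re.escape(anchor), js_source):
        inside_call = any(start <= match.start() < end for start, end in call_spans)
        if not inside_call:
            property_hits += 1
    return call_names, property_hits


statement = (
    'e.encrypt_data && (t.baseURL === "https://businessapi.qimingpian.cn" '
    "? e.data = Mc(e.encrypt_data) : e.data = Kc(e.encrypt_data))"
)
calls, others = classify_anchor_hits(statement, "encrypt_data")
print(calls, others)
```

On the statement examined in this article, the two call hits are the candidates worth a breakpoint, and the remaining hit is the property check that can be ignored.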
So click the second or third hit to open the JS file. Set a breakpoint at both places and reload the page:

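This pauses the page while it processes the response, so we can inspect the data we want. You can see that it is a simple conditional assignment:
e.encrypt_data && (t.baseURL === "https://businessapi.qimingpian.cn" ? e.data = Mc(e.encrypt_data) : e.data = Kc(e.encrypt_data))
This makes it even clearer that the data is loaded dynamically. The ternary operator reveals its two-faced behavior: roughly speaking, the main page seems to get a [condensed version] and everything else the [full version]. So all we need to do is call the function on the right-hand side on the data to get the plaintext. Kc is the one used in the rest of this article.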
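For readers less used to minified JS, the same logic can be written out in Python. This is a sketch for reading purposes only: fill_data is a hypothetical name, and the decryptors here are stand-in functions, not the real Mc and Kc.

```python
BUSINESS_API = "https://businessapi.qimingpian.cn"


def fill_data(response, base_url, decryptors):
    """Mirror of the JS statement: decrypt only when encrypt_data is present."""
    encrypted = response.get("encrypt_data")
    if encrypted:
        # Same choice as the ternary: Mc for the business API, Kc otherwise.
        name = "Mc" if base_url == BUSINESS_API else "Kc"
        response["data"] = decryptors[name](encrypted)
    return response


# Stand-in decryptors that only tag their input, for illustration.
fake_decryptors = {"Mc": lambda s: "mc:" + s, "Kc": lambda s: "kc:" + s}
print(fill_data({"encrypt_data": "x"}, "https://wyiosapi.qmpsee.com", fake_decryptors))
```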
[Note] Getting the JS code
1. Hover the mouse over Kc and you will see:

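Click the blue link to jump to the function's definition, then copy it.
For variables, just search with Ctrl+F.
2. In the [Console], enter:
functionName.toString()
What gets printed is the function's source: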

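Note that this method only works if the function is reachable at the point where execution is paused. In other words, adjust where and when your breakpoint hits until you can get the function you want. The same goes for variables: just print the variable name.
The next step is to debug the JS code and see whether it works. Create a JS file in PyCharm and enter the Kc function plus an output statement (if it will not run, install the Node.js plugin):
Clearly there is no result. Check the error message: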
ReferenceError: Vc is not defined
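This means the Vc function is not defined. Repeat the methods from the [Note] above to fill in each missing function and variable until the file runs, which gives the final, complete decryption JS file: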
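The repeat-until-it-runs loop can be sketched in Python as follows. This is an illustration under assumptions, not the author's tooling: missing_identifier and patch_until_runs are hypothetical names, run stands for whatever executes the file and returns its error text, and lookup stands for the manual step of copying a definition out of the browser.

```python
import re

REFERENCE_ERROR_RE = re.compile(r"ReferenceError: ([A-Za-z_$][\w$]*) is not defined")


def missing_identifier(error_text):
    """Extract the undefined name from a Node.js ReferenceError message."""
    match = REFERENCE_ERROR_RE.search(error_text)
    return match.group(1) if match else None


def patch_until_runs(source, run, lookup, max_rounds=20):
    """Prepend missing definitions until run() reports no ReferenceError."""
    for _ in range(max_rounds):
        name = missing_identifier(run(source))
        if name is None:
            # No ReferenceError left; other kinds of errors still need a manual look.
            return source
        source = lookup(name) + "\n" + source
    raise RuntimeError("still failing after max_rounds; inspect manually")


def fake_run(src):
    """Stand-in runner: pretends Vc and Fc must be defined for the file to run."""
    for name in ("Vc", "Fc"):
        if f"function {name}" not in src:
            return f"ReferenceError: {name} is not defined"
    return ""


patched = patch_until_runs("function Kc() {}", fake_run, lambda n: f"function {n}() {{}}")
print(missing_identifier("ReferenceError: Vc is not defined"))
```

Once no ReferenceError remains, the file is complete. The full listing follows.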
function Kc(e) {
    return JSON.parse(Vc("sjdqmp20161205#_316@gfmt", decode(e), 0, 0, "012345677890123", 1))
}
function Fc(e) {
    for (var t = new Array(0, 4, 536870912, 536870916, 65536, 65540, 536936448, 536936452, 512, 516, 536871424, 536871428, 66048, 66052, 536936960, 536936964), n = new Array(0, 1, 1048576, 1048577, 67108864, 67108865, 68157440, 68157441, 256, 257, 1048832, 1048833, 67109120, 67109121, 68157696, 68157697), o = new Array(0, 8, 2048, 2056, 16777216, 16777224, 16779264, 16779272, 0, 8, 2048, 2056, 16777216, 16777224, 16779264, 16779272), a = new Array(0, 2097152, 134217728, 136314880, 8192, 2105344, 134225920, 136323072, 131072, 2228224, 134348800, 136445952, 139264, 2236416, 134356992, 136454144), r = new Array(0, 262144, 16, 262160, 0, 262144, 16, 262160, 4096, 266240, 4112, 266256, 4096, 266240, 4112, 266256), i = new Array(0, 1024, 32, 1056, 0, 1024, 32, 1056, 33554432, 33555456, 33554464, 33555488, 33554432, 33555456, 33554464, 33555488), c = new Array(0, 268435456, 524288, 268959744, 2, 268435458, 524290, 268959746, 0, 268435456, 524288, 268959744, 2, 268435458, 524290, 268959746), l = new Array(0, 65536, 2048, 67584, 536870912, 536936448, 536872960, 536938496, 131072, 196608, 133120, 198656, 537001984, 537067520, 537004032, 537069568), s = new Array(0, 262144, 0, 262144, 2, 262146, 2, 262146, 33554432, 33816576, 33554432, 33816576, 33554434, 33816578, 33554434, 33816578), d = new Array(0, 268435456, 8, 268435464, 0, 268435456, 8, 268435464, 1024, 268436480, 1032, 268436488, 1024, 268436480, 1032, 268436488), m = new Array(0, 32, 0, 32, 1048576, 1048608, 1048576, 1048608, 8192, 8224, 8192, 8224, 1056768, 1056800, 1056768, 1056800), C = new Array(0, 16777216, 512, 16777728, 2097152, 18874368, 2097664, 18874880, 67108864, 83886080, 67109376, 83886592, 69206016, 85983232, 69206528, 85983744), E = new Array(0, 4096, 134217728, 134221824, 524288, 528384, 134742016, 134746112, 16, 4112, 134217744, 134221840, 524304, 528400, 134742032, 134746128), P = new Array(0, 4, 256, 260, 0, 4, 256, 260, 1, 5, 257, 261, 1, 5, 257, 261), x = e.length > 8 ? 
3 : 1, y = new Array(32 * x), w = new Array(0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0), p, S, b = 0, h = 0, v, $ = 0; $ < x; $++) {
        var g = e.charCodeAt(b++) << 24 | e.charCodeAt(b++) << 16 | e.charCodeAt(b++) << 8 | e.charCodeAt(b++),
            k = e.charCodeAt(b++) << 24 | e.charCodeAt(b++) << 16 | e.charCodeAt(b++) << 8 | e.charCodeAt(b++);
        v = (g >>> 4 ^ k) & 252645135, k ^= v, g ^= v << 4, v = (k >>> -16 ^ g) & 65535, g ^= v, k ^= v << -16, v = (g >>> 2 ^ k) & 858993459, k ^= v, g ^= v << 2, v = (k >>> -16 ^ g) & 65535, g ^= v, k ^= v << -16, v = (g >>> 1 ^ k) & 1431655765, k ^= v, g ^= v << 1, v = (k >>> 8 ^ g) & 16711935, g ^= v, k ^= v << 8, v = (g >>> 1 ^ k) & 1431655765, k ^= v, g ^= v << 1, v = g << 8 | k >>> 20 & 240, g = k << 24 | k << 8 & 16711680 | k >>> 8 & 65280 | k >>> 24 & 240, k = v;
        for (var L = 0; L < w.length; L++) w[L] ? (g = g << 2 | g >>> 26, k = k << 2 | k >>> 26) : (g = g << 1 | g >>> 27, k = k << 1 | k >>> 27), g &= -15, k &= -15, p = t[g >>> 28] | n[g >>> 24 & 15] | o[g >>> 20 & 15] | a[g >>> 16 & 15] | r[g >>> 12 & 15] | i[g >>> 8 & 15] | c[g >>> 4 & 15], S = l[k >>> 28] | s[k >>> 24 & 15] | d[k >>> 20 & 15] | m[k >>> 16 & 15] | C[k >>> 12 & 15] | E[k >>> 8 & 15] | P[k >>> 4 & 15], v = (S >>> 16 ^ p) & 65535, y[h++] = p ^ v, y[h++] = S ^ v << 16
    }
    return y
}
// Base64 decoder copied from the site, with local variables declared and
// the undefined error helper a(...) replaced by a thrown Error.
var decode = function (p) {
    var u = /[\t\n\f\r ]/g;
    var c = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
    p = String(p).replace(u, "");
    var m = p.length;
    if (m % 4 == 0) {
        p = p.replace(/==?$/, "");
        m = p.length;
    }
    if (m % 4 == 1 || /[^+a-zA-Z0-9/]/.test(p)) {
        throw new Error("Invalid character: the string to be decoded is not correctly encoded.");
    }
    for (var _ = 0, T, w, y = "", E = -1; ++E < m; )
        w = c.indexOf(p.charAt(E)),
        T = _ % 4 ? T * 64 + w : w,
        _++ % 4 && (y += String.fromCharCode(255 & T >> (-2 * _ & 6)));
    return y
}
function Vc(e, t, n, o, a, r) {
    var i = new Array(16843776, 0, 65536, 16843780, 16842756, 66564, 4, 65536, 1024, 16843776, 16843780, 1024, 16778244, 16842756, 16777216, 4, 1028, 16778240, 16778240, 66560, 66560, 16842752, 16842752, 16778244, 65540, 16777220, 16777220, 65540, 0, 1028, 66564, 16777216, 65536, 16843780, 4, 16842752, 16843776, 16777216, 16777216, 1024, 16842756, 65536, 66560, 16777220, 1024, 4, 16778244, 66564, 16843780, 65540, 16842752, 16778244, 16777220, 1028, 66564, 16843776, 1028, 16778240, 16778240, 0, 65540, 66560, 0, 16842756),
        c = new Array(-2146402272, -2147450880, 32768, 1081376, 1048576, 32, -2146435040, -2147450848, -2147483616, -2146402272, -2146402304, -2147483648, -2147450880, 1048576, 32, -2146435040, 1081344, 1048608, -2147450848, 0, -2147483648, 32768, 1081376, -2146435072, 1048608, -2147483616, 0, 1081344, 32800, -2146402304, -2146435072, 32800, 0, 1081376, -2146435040, 1048576, -2147450848, -2146435072, -2146402304, 32768, -2146435072, -2147450880, 32, -2146402272, 1081376, 32, 32768, -2147483648, 32800, -2146402304, 1048576, -2147483616, 1048608, -2147450848, -2147483616, 1048608, 1081344, 0, -2147450880, 32800, -2147483648, -2146435040, -2146402272, 1081344),
        l = new Array(520, 134349312, 0, 134348808, 134218240, 0, 131592, 134218240, 131080, 134217736, 134217736, 131072, 134349320, 131080, 134348800, 520, 134217728, 8, 134349312, 512, 131584, 134348800, 134348808, 131592, 134218248, 131584, 131072, 134218248, 8, 134349320, 512, 134217728, 134349312, 134217728, 131080, 520, 131072, 134349312, 134218240, 0, 512, 131080, 134349320, 134218240, 134217736, 512, 0, 134348808, 134218248, 131072, 134217728, 134349320, 8, 131592, 131584, 134217736, 134348800, 134218248, 520, 134348800, 131592, 8, 134348808, 131584),
        s = new Array(8396801, 8321, 8321, 128, 8396928, 8388737, 8388609, 8193, 0, 8396800, 8396800, 8396929, 129, 0, 8388736, 8388609, 1, 8192, 8388608, 8396801, 128, 8388608, 8193, 8320, 8388737, 1, 8320, 8388736, 8192, 8396928, 8396929, 129, 8388736, 8388609, 8396800, 8396929, 129, 0, 0, 8396800, 8320, 8388736, 8388737, 1, 8396801, 8321, 8321, 128, 8396929, 129, 1, 8192, 8388609, 8193, 8396928, 8388737, 8193, 8320, 8388608, 8396801, 128, 8388608, 8192, 8396928),
        d = new Array(256, 34078976, 34078720, 1107296512, 524288, 256, 1073741824, 34078720, 1074266368, 524288, 33554688, 1074266368, 1107296512, 1107820544, 524544, 1073741824, 33554432, 1074266112, 1074266112, 0, 1073742080, 1107820800, 1107820800, 33554688, 1107820544, 1073742080, 0, 1107296256, 34078976, 33554432, 1107296256, 524544, 524288, 1107296512, 256, 33554432, 1073741824, 34078720, 1107296512, 1074266368, 33554688, 1073741824, 1107820544, 34078976, 1074266368, 256, 33554432, 1107820544, 1107820800, 524544, 1107296256, 1107820800, 34078720, 0, 1074266112, 1107296256, 524544, 33554688, 1073742080, 524288, 0, 1074266112, 34078976, 1073742080),
        m = new Array(536870928, 541065216, 16384, 541081616, 541065216, 16, 541081616, 4194304, 536887296, 4210704, 4194304, 536870928, 4194320, 536887296, 536870912, 16400, 0, 4194320, 536887312, 16384, 4210688, 536887312, 16, 541065232, 541065232, 0, 4210704, 541081600, 16400, 4210688, 541081600, 536870912, 536887296, 16, 541065232, 4210688, 541081616, 4194304, 16400, 536870928, 4194304, 536887296, 536870912, 16400, 536870928, 541081616, 4210688, 541065216, 4210704, 541081600, 0, 541065232, 16, 16384, 541065216, 4210704, 16384, 4194320, 536887312, 0, 541081600, 536870912, 4194320, 536887312),
        C = new Array(2097152, 69206018, 67110914, 0, 2048, 67110914, 2099202, 69208064, 69208066, 2097152, 0, 67108866, 2, 67108864, 69206018, 2050, 67110912, 2099202, 2097154, 67110912, 67108866, 69206016, 69208064, 2097154, 69206016, 2048, 2050, 69208066, 2099200, 2, 67108864, 2099200, 67108864, 2099200, 2097152, 67110914, 67110914, 69206018, 69206018, 2, 2097154, 67108864, 67110912, 2097152, 69208064, 2050, 2099202, 69208064, 2050, 67108866, 69208066, 69206016, 2099200, 0, 2, 69208066, 0, 2099202, 69206016, 2048, 67108866, 67110912, 2048, 2097154),
        E = new Array(268439616, 4096, 262144, 268701760, 268435456, 268439616, 64, 268435456, 262208, 268697600, 268701760, 266240, 268701696, 266304, 4096, 64, 268697600, 268435520, 268439552, 4160, 266240, 262208, 268697664, 268701696, 4160, 0, 0, 268697664, 268435520, 268439552, 266304, 262144, 266304, 262144, 268701696, 4096, 64, 268697664, 4096, 266304, 268439552, 64, 268435520, 268697600, 268697664, 268435456, 262144, 268439616, 0, 268701760, 262208, 268435520, 268697600, 268439552, 268439616, 0, 268701760, 266240, 266240, 4160, 4160, 262208, 268435456, 268701696),
        P = Fc(e), x = 0, y, w, p, S, b, h, v, $, g, k, L, F, N, V, H = t.length, q = 0, ce = P.length == 32 ? 3 : 9;
    ce == 3 ? $ = n ? new Array(0, 32, 2) : new Array(30, -2, -2) : $ = n ? new Array(0, 32, 2, 62, 30, -2, 64, 96, 2) : new Array(94, 62, -2, 32, 64, 2, 30, -2, -2), r == 2 ? t += "        " : r == 1 ? n && (p = 8 - H % 8, t += String.fromCharCode(p, p, p, p, p, p, p, p), p === 8 && (H += 8)) : r || (t += "\0\0\0\0\0\0\0\0");
    var Y = "", f = "";
    for (o == 1 && (g = a.charCodeAt(x++) << 24 | a.charCodeAt(x++) << 16 | a.charCodeAt(x++) << 8 | a.charCodeAt(x++), L = a.charCodeAt(x++) << 24 | a.charCodeAt(x++) << 16 | a.charCodeAt(x++) << 8 | a.charCodeAt(x++), x = 0); x < H;) {
        for (h = t.charCodeAt(x++) << 24 | t.charCodeAt(x++) << 16 | t.charCodeAt(x++) << 8 | t.charCodeAt(x++), v = t.charCodeAt(x++) << 24 | t.charCodeAt(x++) << 16 | t.charCodeAt(x++) << 8 | t.charCodeAt(x++), o == 1 && (n ? (h ^= g, v ^= L) : (k = g, F = L, g = h, L = v)), p = (h >>> 4 ^ v) & 252645135, v ^= p, h ^= p << 4, p = (h >>> 16 ^ v) & 65535, v ^= p, h ^= p << 16, p = (v >>> 2 ^ h) & 858993459, h ^= p, v ^= p << 2, p = (v >>> 8 ^ h) & 16711935, h ^= p, v ^= p << 8, p = (h >>> 1 ^ v) & 1431655765, v ^= p, h ^= p << 1, h = h << 1 | h >>> 31, v = v << 1 | v >>> 31, w = 0; w < ce; w += 3) {
            for (N = $[w + 1], V = $[w + 2], y = $[w]; y != N; y += V) S = v ^ P[y], b = (v >>> 4 | v << 28) ^ P[y + 1], p = h, h = v, v = p ^ (c[S >>> 24 & 63] | s[S >>> 16 & 63] | m[S >>> 8 & 63] | E[S & 63] | i[b >>> 24 & 63] | l[b >>> 16 & 63] | d[b >>> 8 & 63] | C[b & 63]);
            p = h, h = v, v = p
        }
        h = h >>> 1 | h << 31, v = v >>> 1 | v << 31, p = (h >>> 1 ^ v) & 1431655765, v ^= p, h ^= p << 1, p = (v >>> 8 ^ h) & 16711935, h ^= p, v ^= p << 8, p = (v >>> 2 ^ h) & 858993459, h ^= p, v ^= p << 2, p = (h >>> 16 ^ v) & 65535, v ^= p, h ^= p << 16, p = (h >>> 4 ^ v) & 252645135, v ^= p, h ^= p << 4, o == 1 && (n ? (g = h, L = v) : (h ^= k, v ^= F)), f += String.fromCharCode(h >>> 24, h >>> 16 & 255, h >>> 8 & 255, h & 255, v >>> 24, v >>> 16 & 255, v >>> 8 & 255, v & 255), q += 8, q == 512 && (Y += f, f = "", q = 0)
    }
    if (Y += f, Y = Y.replace(/\0*$/g, ""), !n) {
        if (r === 1) {
            var H = Y.length, _ = 0;
            H && (_ = Y.charCodeAt(H - 1)), _ <= 8 && (Y = Y.substring(0, H - _))
        }
        Y = decodeURIComponent(escape(Y))
    }
    return Y
}
Then use the basic Python crawler code to call the JS file, and you get the final data:
import requests
import execjs
headers = {
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6',
    'Connection': 'keep-alive',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Origin': 'https://wx.qmpsee.com',
    'Platform': 'web',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-site',
    'Source': 'see',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.0.0 Safari/537.36 Edg/141.0.0.0',
    'appflag': 'see-h5-1.0.0',
    'sec-ch-ua': '"Microsoft Edge";v="141", "Not?A_Brand";v="8", "Chromium";v="141"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
}
data = {
    'page': '1',
    'num': '20',
    'ca_uuid': 'feef62bfdac45a94b9cd89aed5c235be',
    'appflag': 'see-h5-1.0.0',
}
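# Send the same POST request the page makes and stop on an HTTP error.
response = requests.post('https://wyiosapi.qmpsee.com/Web/getCaDetail', headers=headers, data=data)
response.raise_for_status()
# The ciphertext is in the encrypt_data field of the JSON body.
encrypt_data = response.json()['encrypt_data']
# Load the completed decryption script, saved here as encrypt.js.
with open('encrypt.js', 'r', encoding='utf-8') as f:
    js_code = f.read()
# Compile the script with PyExecJS and call Kc on the ciphertext.
result = execjs.compile(js_code).call('Kc', encrypt_data)
print(result)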
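The request body carries a page field, which suggests the endpoint is paginated. Assuming that changing page really returns further pages (not verified here), the script could be extended along these lines. The sketch keeps the network call and the JS call outside the loop: crawl_pages, fetch_page and decrypt are hypothetical names, and the stand-ins below replace the real requests.post and execjs calls.

```python
def crawl_pages(fetch_page, decrypt, first_page=1, last_page=3):
    """Fetch pages in order and decrypt each one, stopping at the first empty page."""
    results = []
    for page in range(first_page, last_page + 1):
        body = fetch_page(page)  # expected to return the parsed JSON body
        encrypted = body.get("encrypt_data")
        if not encrypted:
            break  # no ciphertext on this page, so assume there is nothing more
        results.append(decrypt(encrypted))
    return results


# Stand-ins for illustration: a dict instead of the network, upper() instead of Kc.
fake_pages = {1: {"encrypt_data": "a"}, 2: {"encrypt_data": "b"}, 3: {}}
decrypted = crawl_pages(lambda page: fake_pages[page], str.upper, 1, 3)
print(decrypted)
```

In real use, fetch_page would wrap the requests.post call with data['page'] set to str(page), and decrypt would wrap a call to Kc on a context compiled once with execjs.compile.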
And that is the end. Readers can dig further into this scattered data on their own.