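This walkthrough uses Youdao Translate as the example.
Only after finishing the write-up did I realize that reverse engineering does not seem to be necessary here, although JS reverse engineering works as well.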
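Part 1: Handling the request URL
Deleting the _o from the original URL is enough to get the unencrypted data.
1. Find the URL first
start_url = 'https://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
This endpoint is actually protected by encrypted parameters, but that is not the point here, so I will just give the workaround: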
start_url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'.replace('_o', '')
Simple, isn't it? Hahaha.
Simple, but effective.
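The replace('_o', '') call works for this URL because _o appears only once in it. As a slightly more defensive sketch (the helper name strip_signed_suffix is my own, not part of the original code), the suffix can be removed from the path component only, so that a _o appearing anywhere else in the URL is left alone:

```python
from urllib.parse import urlsplit, urlunsplit


def strip_signed_suffix(url: str, suffix: str = "_o") -> str:
    """Return url with `suffix` removed from the end of its path, if present."""
    parts = urlsplit(url)
    path = parts.path
    if path.endswith(suffix):
        path = path[: -len(suffix)]
    # Rebuild the URL; scheme, host, query string and fragment stay untouched.
    return urlunsplit(parts._replace(path=path))


signed_url = "http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule"
print(strip_signed_suffix(signed_url))
# prints: http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule
```

For a one-off script the plain str.replace is perfectly fine; the helper only matters if the same code is reused on other URLs.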
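2. Disguise the request as a browser
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/87.0.4280.66 Safari/537.36'
}
This step is the key one.
3. Build the form dictionary
Suppose the input is a single word stored in word (a sentence works the same way). The form dictionary then looks like this: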
word = input('Enter the text to translate: ')
form_data = {
    "i": word,
    "from": "AUTO",
    "to": "AUTO",
    "smartresult": "dict",
    "client": "fanyideskweb",
    "salt": "16110565853966",
    "sign": "ac9e6a1ad007208044975e79cab12b10",
    "lts": "1611056585396",
    "bv": "7b07590bbf1761eedb1ff6dbfac3c1f0",
    "doctype": "json",
    "version": "2.1",
    "keyfrom": "fanyi.web",
    "action": "FY_BY_REALTlME",
}
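When a dictionary like this is passed as data= to requests.post, it is sent as a form-encoded body. The sketch below uses only the standard library's urllib.parse (not requests itself) to show roughly what that encoded body looks like, including how the Chinese input is percent-encoded as UTF-8. The shortened dictionary is just for illustration.

```python
from urllib.parse import parse_qs, urlencode

# A shortened version of the form dictionary, for illustration only.
form_data = {
    "i": "这是一只小狗",
    "from": "AUTO",
    "to": "AUTO",
    "doctype": "json",
}

# urlencode percent-encodes non-ASCII text as UTF-8 by default.
body = urlencode(form_data)
print(body)

# Decoding the body gives the original values back (each wrapped in a list).
decoded = parse_qs(body)
print(decoded["i"][0])
```

This is handy for debugging: the printed body can be compared directly with the form data shown in the browser's developer tools.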
4. Send the request
response = requests.post(start_url, headers=headers, data=form_data).content.decode()
pprint(response)
Entering 这是一只小狗 ("This is a puppy") returns:
(' '
'{"type":"ZH_CN2EN","errorCode":0,"elapsedTime":12,"translateResult":[[{"src":"这是一只小狗","tgt":"This '
'is a small dog"}]]}\n')
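Look at its data type: the response is still a string.
We need to parse it as JSON into a Python object:
json_str = json.loads(response)
5. Print the result
The last step is to pull out the value.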
result = json_str["translateResult"][0][0]["tgt"]
print(f'Translation: {result}')
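The [0][0] indexing only reads the first segment. As a hedged sketch, the helper below (extract_translation is a name I made up) checks errorCode and joins every tgt it finds. It assumes, based only on the sample response shown above, that translateResult is a list of lists of {"src", "tgt"} dictionaries and that errorCode 0 means success.

```python
import json

# The sample response from the step above, as a single string.
sample = (
    '{"type":"ZH_CN2EN","errorCode":0,"elapsedTime":12,'
    '"translateResult":[[{"src":"这是一只小狗","tgt":"This is a small dog"}]]}\n'
)


def extract_translation(raw: str) -> str:
    """Parse the raw JSON text and join all translated segments."""
    payload = json.loads(raw)
    # Assumption: errorCode 0 means success, as in the sample response.
    if payload.get("errorCode") != 0:
        raise ValueError(f"unexpected errorCode: {payload.get('errorCode')}")
    # Assumption: translateResult is a list of lists of {"src", "tgt"} dicts.
    return " ".join(
        segment["tgt"]
        for group in payload["translateResult"]
        for segment in group
    )


print(extract_translation(sample))
# prints: This is a small dog
```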
Done!
The final code:
import json
from pprint import pprint

import requests


def main():
    # 1. URL + headers
    start_url = r'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'.replace('_o', '')
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/87.0.4280.66 Safari/537.36'
    }
    # 2. POST form data (the quotes were added automatically with a regex)
    while True:
        word = input('Enter the text to translate: ')
        form_data = {
            "i": word,
            "from": "AUTO",
            "to": "AUTO",
            "smartresult": "dict",
            "client": "fanyideskweb",
            "salt": "16110565853966",
            "sign": "ac9e6a1ad007208044975e79cab12b10",
            "lts": "1611056585396",
            "bv": "7b07590bbf1761eedb1ff6dbfac3c1f0",
            "doctype": "json",
            "version": "2.1",
            "keyfrom": "fanyi.web",
            "action": "FY_BY_REALTlME",
        }
        # 3. Response
        response = requests.post(start_url, headers=headers, data=form_data).content.decode()
        pprint(type(response))
        # 4. Convert the JSON string into a Python object
        json_str = json.loads(response)
        # pprint(json_str)
        # 5. Extract the data with ordinary dictionary indexing
        result = json_str["translateResult"][0][0]["tgt"]
        print(f'Translation: {result}')


if __name__ == '__main__':
    main()
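Part 2: Reverse engineering the sign parameter in JS
Experimenting shows that the lts and salt parameters are built from a timestamp.
sign is an MD5 value; it can be produced by executing the JS with the third-party library execjs (installed as PyExecJS).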
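For readers who would rather not run JS at all, here is a rough pure-Python sketch of the same idea. It rests on several assumptions that are not confirmed by this article: that lts is a millisecond timestamp, that salt is lts plus one random digit (which matches the hard-coded pair 1611056585396 / 16110565853966 in Part 1), and that sign is the MD5 of client + text + salt + a secret string. The concatenation order and the SECRET value are placeholders; the real ones have to be read from the site's JS.

```python
import hashlib
import random
import time

CLIENT = "fanyideskweb"
# Hypothetical placeholder: the real secret must be taken from the site's JS.
SECRET = "REPLACE_WITH_VALUE_FROM_SITE_JS"


def make_signed_fields(word, now_ms=None, digit=None):
    """Build lts, salt and sign for `word` (assumed scheme, see lead-in)."""
    # Assumption: lts is the current time in milliseconds.
    lts = str(int(time.time() * 1000) if now_ms is None else now_ms)
    # Assumption: salt is lts followed by one random digit.
    salt = lts + str(random.randint(0, 9) if digit is None else digit)
    # Assumption: this concatenation order is what gets hashed.
    raw = CLIENT + word + salt + SECRET
    sign = hashlib.md5(raw.encode("utf-8")).hexdigest()
    return {"lts": lts, "salt": salt, "sign": sign}


# Fixed inputs reproduce the lts/salt pair that is hard-coded in Part 1.
fields = make_signed_fields("dog", now_ms=1611056585396, digit=6)
print(fields["lts"], fields["salt"], len(fields["sign"]))
# prints: 1611056585396 16110565853966 32
```

With the placeholder secret the resulting sign will not be accepted by the server; the sketch only shows the shape of the computation.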
import execjs
import requests


class YDspider(object):
    def __init__(self):
        self.start_url = 'https://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
        self.headers = {
            'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'Cookie': 'OUTFOX_SEARCH_USER_ID_NCOO=1402334259.831596; OUTFOX_SEARCH_USER_ID="1686220787@10.169.0.102"; _ga=GA1.2.1761775442.1627881866; JSESSIONID=aaaEGLu579s-R_2ZLLc0x; ___rl__test__cookies=1636440210395',
            'Origin': 'https://fanyi.youdao.com',
            'Referer': 'https://fanyi.youdao.com/',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36'
        }

    def parse_self_data(self, key):
        # index.js (not shown) must define rr(), whose result provides salt, sign and ts
        with open('index.js', 'r') as f:
            js = execjs.compile(f.read())
        r = js.call('rr', f'{key}')
        self.data = {
            'i': key,
            'from': 'AUTO',
            'to': 'AUTO',
            'smartresult': 'dict',
            'client': 'fanyideskweb',
            'salt': r['salt'],
            'sign': r['sign'],
            'lts': r['ts'],
            'bv': 'c795a332c678d5063a1ee5eb15253848',
            'doctype': 'json',
            'version': '2.1',
            'keyfrom': 'fanyi.web',
            'action': 'FY_BY_REALTlME'
        }

    def parse_start_url(self, key):
        try:
            self.parse_self_data(key)
            response = requests.post(self.start_url, headers=self.headers, data=self.data).json()
            # print(response, type(response))
            self.parse_response(response)
        except Exception:
            print('=== Invalid input ===')

    def parse_response(self, response):
        # Concatenate every entry of the dictionary-style result
        data = response['smartResult']['entries']
        s = ''
        for tmp in data:
            s += tmp
        print(s)


if __name__ == '__main__':
    while True:
        n = input('1. Start translating\n2. Quit\n\t')
        if n == '2':
            break
        else:
            YD = YDspider()
            key = input('Enter the text you want to translate: ')
            YD.parse_start_url(key)
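The response handler above reads response['smartResult']['entries'] directly, so any response without that key ends up in the except branch. As a sketch of a more forgiving variant (collect_entries is a hypothetical helper, and the assumption that some responses carry only translateResult is mine), the function below falls back to the translateResult structure from Part 1 when no dictionary entries are present:

```python
def collect_entries(response: dict) -> str:
    """Return dictionary entries if present, else the plain translation."""
    smart = response.get("smartResult") or {}
    # Drop empty strings and trailing line breaks from each entry.
    entries = [e.strip() for e in smart.get("entries", []) if e and e.strip()]
    if entries:
        return "\n".join(entries)
    # Assumed fallback: the translateResult layout shown in Part 1.
    segments = [
        segment["tgt"]
        for group in response.get("translateResult", [])
        for segment in group
    ]
    return " ".join(segments)


# Made-up sample payloads, shaped like the fields used in this article.
print(collect_entries({"smartResult": {"entries": ["", "n. dog\r\n"]}}))
print(collect_entries({"translateResult": [[{"src": "a", "tgt": "b"}]]}))
```

Returning a string instead of printing inside the method also makes the logic easy to check without sending a real request.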