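I was about to cut loose over the May Day holiday when I noticed something:
How does the browser predict what I am typing?
Could I use that to do something fun?
So I pressed F12, and the page showed its true form.
There was far too much in there, so I cleared it and repeated the action.
Hmm, this entry has something in it.
And it is a GET request, which makes things easy.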
import urllib.request
# The 'Request URL' from the 'General' section
_360_url = 'https://smart.sug.so.com/suggest?crec=0&pid=webpage&word=%E5%B9%82&srcg=&src=hao_360so_suggest&encodein=utf-8&encodeout=utf-8&count=10&callback=__jsonp28__&t=1651398804750'
headers = {
    'user-agent': "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36 QIHU 360SE/13.1.5330.0"
}  # the 'user-agent' from the 'Request Headers' section
# Named req rather than re, so it cannot be confused with the standard re module
req = urllib.request.Request(url=_360_url, headers=headers)
res = urllib.request.urlopen(req)
T = res.read().decode('utf-8')
T is now the data shown in the Preview (Response) tab, just without line breaks and the like.
>>> print(T)
__jsonp28__({"abv":"b","errno":0,"data":{"query":"幂","errorcode":0,"ext":"nlpv=test_yc_18","version":"revise","result":[{"ci":"1.000000","ctrScore":"0.070790","recallType":"nginx'wangdun'ci'tail'xgb","recallScore":"0.000000","word":"幂的运算","rerankScore":"0.266065","rankScore":"0.266065"},{"ci":"1.000000","ctrScore":"0.064769","recallType":"nginx'wangdun'ci'tail'xgb","recallScore":"0.000000","word":"幂函数","rerankScore":"0.254498","rankScore":"0.254498"},{"ci":"1.000000","ctrScore":"0.014691","recallType":"nginx'wangdun'xgb","recallScore":"0.000000","word":"幂是什么","rerankScore":"0.121207","rankScore":"0.121207"},{"ci":"1.000000","ctrScore":"0.007204","recallType":"nginx'xgb","recallScore":"0.000000","word":"幂怎么读","rerankScore":"0.084879","rankScore":"0.084879"},{"ci":"1.000000","ctrScore":"0.004697","recallType":"nginx'wangdun'tail'xgb","recallScore":"0.000000","word":"幂的乘方与积的乘方","rerankScore":"0.068532","rankScore":"0.068532"},{"ci":"1.000000","ctrScore":"0.004360","recallType":"nginx'wangdun'ci'tail'xgb","recallScore":"0.000000","word":"幂函数图像","rerankScore":"0.066031","rankScore":"0.066031"},{"ci":"1.000000","ctrScore":"0.004268","recallType":"nginx'wangdun'xgb","recallScore":"0.000000","word":"幂函数公式","rerankScore":"0.065332","rankScore":"0.065332"},{"ci":"1.000000","ctrScore":"0.003952","recallType":"nginx'ci'tail'xgb","recallScore":"0.000000","word":"幂学在线","rerankScore":"0.062861","rankScore":"0.062861"},{"ci":"1.000000","ctrScore":"0.002698","recallType":"nginx'wangdun'xgb","recallScore":"0.000000","word":"幂次方","rerankScore":"0.051942","rankScore":"0.051942"},{"ci":"1.000000","ctrScore":"0.001830","recallType":"nginx'wangdun'tail'xgb","recallScore":"0.000000","word":"幂律分布","rerankScore":"0.042779","rankScore":"0.042779"}],"ssid":"cebbce4bdd6848cf8e25f8bdcf3ce3a3","src":"hao_360so_suggest"}});
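For data this long, here is how I tinkered with it:
import json
# Cut between the first "(" and the last ")"; hard-coded offsets such as
# T[13:-3] break as soon as the callback name or trailing whitespace changes
T = T[T.index('(') + 1:T.rindex(')')]
T = json.loads(T)  # safer than eval(), which would execute whatever the server sends
result = T['data']['result']
for i in result:
    print(i['word'])
A bit quirky, I know. The gist is:
1. First cut out the part that looks like a dictionary, i.e. everything inside the parentheses of __jsonp28__(...).
2. Use json.loads() to turn it into a real dictionary. (eval() would also do it, but it runs any code the server returns.)
3. Inspect the dictionary T and pull out the one useful list inside it.
4. That list is made up of several dictionaries, so loop over it with for and print the useful field from each one.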
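The four steps above can also be packed into one small function. This is only a sketch: the name `extract_words` and the shortened `sample` string are made up for illustration, and the sample just imitates the shape of the real response shown earlier.

```python
import json

def extract_words(jsonp_text):
    """Return the suggestion strings from a JSONP response body."""
    # Step 1: cut out the JSON between the first "(" and the last ")".
    start = jsonp_text.index('(') + 1
    end = jsonp_text.rindex(')')
    # Step 2: parse it into a real dictionary.
    payload = json.loads(jsonp_text[start:end])
    # Steps 3 and 4: take the result list and keep only the 'word' field.
    return [item['word'] for item in payload['data']['result']]

# Hand-made sample in the same shape as the real response (not real data).
sample = '__jsonp28__({"errno":0,"data":{"query":"幂","result":[{"word":"幂的运算"},{"word":"幂函数"}]}});'
print(extract_words(sample))
```

Because it searches for the parentheses instead of counting characters, the function keeps working if the callback name changes or the response ends with a newline.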
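OK,
but not completely OK.
How do we make it handle any query at all, instead of only knowing '幂'?
Simple: look at what the Request URL looks like.
Then look at what the 'Query String Parameters' look like.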
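Clearly, the URL is the red part plus the blue part, and the blue part is just the parameters from 'Query String Parameters' joined together with '&'. The Chinese character '幂', however, has turned into '%E5%B9%82'. This is not encryption: non-ASCII text in a URL gets URL-encoded (percent-encoded), meaning each byte of its UTF-8 form is written as % followed by two hex digits.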
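To make the "base address plus parameters joined with '&'" idea concrete, here is a minimal sketch that rebuilds the same URL from a dictionary with the standard library's `urlencode`. The names `BASE_URL` and `build_url` are hypothetical; the parameter values are copied from the request URL above.

```python
from urllib.parse import urlencode

# Everything before the "?" in the request URL.
BASE_URL = 'https://smart.sug.so.com/suggest'

def build_url(word):
    """Join the query string parameters with '&' and append them to BASE_URL."""
    params = {
        'crec': 0,
        'pid': 'webpage',
        'word': word,  # urlencode percent-encodes this value for us
        'srcg': '',
        'src': 'hao_360so_suggest',
        'encodein': 'utf-8',
        'encodeout': 'utf-8',
        'count': 10,
        'callback': '__jsonp28__',
        't': 1651398804750,
    }
    return BASE_URL + '?' + urlencode(params)

print(build_url('幂'))
```

One difference to keep in mind: `urlencode` writes a space as `+`, while `quote` writes it as `%20`.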
It can be encoded like this:
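from urllib.parse import quote
# Pass encoding by keyword: the second positional argument of quote() is safe, not encoding
s = quote('幂', encoding='utf-8')
>>> print(s)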
%E5%B9%82
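To see where those three %XX groups come from, here is a short sketch that compares the raw UTF-8 bytes of the character with the output of `quote`, and then decodes it back with `unquote`. The variable names are only for illustration.

```python
from urllib.parse import quote, unquote

word = '幂'
raw_bytes = word.encode('utf-8')              # the three UTF-8 bytes of the character
encoded = quote(word, encoding='utf-8')       # each byte written as %XX
decoded = unquote(encoded, encoding='utf-8')  # and back again

print(raw_bytes.hex().upper())
print(encoded)
print(decoded == word)
```

Plain ASCII letters and digits pass through `quote` unchanged, so English input needs no special handling.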
Combine that with the code above and you get the complete script:
import json
import urllib.request
from urllib.parse import quote

a = input()  # the text to look up suggestions for
s = quote(a, encoding='utf-8')  # percent-encode it for use in the URL
_360_url = 'https://smart.sug.so.com/suggest?crec=0&pid=webpage&word={}&srcg=&src=hao_360so_suggest&encodein=utf-8&encodeout=utf-8&count=10&callback=__jsonp28__&t=1651398804750'.format(s)
headers = {
    'user-agent': "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36 QIHU 360SE/13.1.5330.0"
}
req = urllib.request.Request(url=_360_url, headers=headers)
res = urllib.request.urlopen(req)
T = res.read().decode('utf-8')
# Keep only the JSON between the parentheses of the __jsonp28__(...) wrapper
T = T[T.index('(') + 1:T.rindex(')')]
T = json.loads(T)  # safer than eval() for data that comes from the network
result = T['data']['result']
for i in result:
    print(i['word'])
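If you want to reuse this instead of typing into input() every time, the script can be wrapped in a function. What follows is a sketch under some assumptions: the names `suggest`, `URL_TEMPLATE`, `USER_AGENT` and `fake_opener` are hypothetical, and the `opener` parameter exists only so the function can be tried offline with canned data instead of hitting the real server.

```python
import io
import json
import urllib.request
from urllib.parse import quote

URL_TEMPLATE = ('https://smart.sug.so.com/suggest?crec=0&pid=webpage&word={}'
                '&srcg=&src=hao_360so_suggest&encodein=utf-8&encodeout=utf-8'
                '&count=10&callback=__jsonp28__&t=1651398804750')
USER_AGENT = ('Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36 '
              'QIHU 360SE/13.1.5330.0')

def suggest(word, opener=urllib.request.urlopen):
    """Return the suggestion strings for `word`."""
    url = URL_TEMPLATE.format(quote(word, encoding='utf-8'))
    req = urllib.request.Request(url, headers={'user-agent': USER_AGENT})
    with opener(req) as res:
        text = res.read().decode('utf-8')
    # Strip the JSONP wrapper, then parse the JSON inside it.
    payload = json.loads(text[text.index('(') + 1:text.rindex(')')])
    return [item['word'] for item in payload['data']['result']]

def fake_opener(req):
    """Offline stand-in for urlopen that returns a hand-made response."""
    body = '__jsonp28__({"data":{"result":[{"word":"幂函数"}]}});'
    return io.BytesIO(body.encode('utf-8'))

print(suggest('幂', opener=fake_opener))
```

Calling `suggest('幂')` without the `opener` argument sends the real request, exactly as the script above does.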
The end.
I don't ask for much: if you enjoyed this, please give it a like.