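Page analysis
Find the URL that the POST request is sent to:
Find the data that has to be submitted
Code implementation
from urllib import parse,request  # parse encodes the values that are sent in the POST body
import json
## Example: translating "hello"
### After a query word is typed, the request headers show a POST to https://fanyi.baidu.com/sug; the response is JSON (content-type: application/json), and the submitted form value is kw: hell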
url="https://fanyi.baidu.com/sug"
data={'kw':'hello'}
data=parse.urlencode(data)
print(data)
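headers={'content-length': len(data)}  # optional: urllib adds Content-Length itself when data is supplied
req=request.Request(url,data=bytes(data,encoding='utf-8'),headers=headers)  # when the data argument is present, the request is sent as a POST
resp=request.urlopen(req)
### Parse the JSON string that comes back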
res_json=resp.read().decode('utf-8')
print(res_json)
myjson=json.loads(res_json)
print(myjson)
print(myjson['data'][0]['v'])
Final output:
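The two ideas in the script above (supplying `data` turns the request into a POST, and the translation sits under `data[0]['v']`) can be checked without touching the network. The following is only a minimal sketch: the helper names `build_sug_request` and `first_translation` are made up for illustration, and the sample response is a hypothetical stand-in shaped like the structure the script reads, not real server output.

```python
import json
from urllib import parse, request

SUG_URL = "https://fanyi.baidu.com/sug"  # endpoint taken from the walkthrough


def build_sug_request(word):
    """Build (but do not send) the POST request for a query word."""
    body = parse.urlencode({"kw": word}).encode("utf-8")
    # Supplying data= is what makes urllib use POST instead of GET.
    return request.Request(SUG_URL, data=body)


def first_translation(raw_json):
    """Return the 'v' field of the first entry, or None if there is no entry."""
    payload = json.loads(raw_json)
    entries = payload.get("data") or []
    return entries[0].get("v") if entries else None


req = build_sug_request("hello")
print(req.get_method())  # POST
print(req.data)          # b'kw=hello'

# Hypothetical response body (assumption): only mirrors the data[0]['v'] layout.
sample = '{"data": [{"v": "sample translation text"}]}'
print(first_translation(sample))
```

Guarding against an empty `data` list avoids the `IndexError` that `myjson['data'][0]` would raise when the service returns no suggestions.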
Note: urlencode can also be used to build the query string of a URL:
# Suppose a URL like the following has to be assembled:
#https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&rsv_idx=1&tn=baidu&wd=fiddler%E5%AE%89%E8%A3%85&fenlei=256&oq=fiddler%25E4%25B8%258B%25E8%25BD%25BD&rsv_pq=fcd8477700004214&rsv_t=1c661ksw6PxN371S62fgFQICO5MCASzR%2F2DNJRQd4JenyReCJUGpip9hwjg&rqlang=cn&rsv_enter=0&rsv_dl=tb&sug=fiddler&rsv_btype=t&rsv_sug3=16&rsv_sug1=11&rsv_sug7=100&prefixsug=fiddler%25E5%25AE%2589%25E8%25A3%2585&rsp=0&inputT=9322&rsv_sug4=10990
# It breaks down into:
#https://www.baidu.com/s
# ?
# ie=utf-8
# f=8
# rsv_bp=1
# rsv_idx=1
# tn=baidu
# wd=fiddler%E5%AE%89%E8%A3%85
# fenlei=256
# oq=fiddler%25E4%25B8%258B%25E8%25BD%25BD
# rsv_pq=fcd8477700004214
# ... and so on; the remaining parameters are omitted for brevity
# rsv_sug4=10990
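from urllib.parse import urlencode  # encodes a dict of parameters into a query string
para={
'ie':'utf-8',
'f':'8',
'rsv_bp':'1',
'rsv_idx':'1',
'tn':'baidu',
'wd':'fiddler安装',  # pass the raw value: urlencode percent-encodes it (an already-encoded value would be encoded twice)
'fenlei':'256',
'oq':'fiddler%E4%B8%8B%E8%BD%BD',  # the browser URL carries this value double-encoded, so its raw form still contains % escapes
'rsv_pq':'fcd8477700004214'
}
url='https://www.baidu.com/s?'+urlencode(para)
print(url)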
Output:
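One detail worth keeping in mind when building query strings this way: `urlencode` always percent-encodes the values it receives, so a value that is already encoded gets encoded a second time (`%` becomes `%25`). The short sketch below illustrates that with the `wd` parameter from the example, and uses `parse_qs` to decode a built URL back into a dict as a sanity check. The variable names are illustrative only.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Raw value: encoded exactly once.
once = urlencode({"wd": "fiddler安装"})
print(once)   # wd=fiddler%E5%AE%89%E8%A3%85

# Already-encoded value: every '%' is escaped again as '%25'.
twice = urlencode({"wd": "fiddler%E5%AE%89%E8%A3%85"})
print(twice)  # wd=fiddler%25E5%25AE%2589%25E8%25A3%2585

# Round trip: build a URL, then decode its query string back into a dict.
url = "https://www.baidu.com/s?" + urlencode({"ie": "utf-8", "wd": "fiddler安装"})
query = parse_qs(urlsplit(url).query)
print(query)  # {'ie': ['utf-8'], 'wd': ['fiddler安装']}
```

`parse_qs` returns a list for each key because a query string may repeat a parameter.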