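Preparation
The Douguo Meishi (豆果美食) app is installed on the phone, the Fiddler certificate is installed on the phone, the phone's WLAN connection is set to a manual proxy pointing at the computer running Fiddler, and the phone and the computer are on the same LAN.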
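Hands-on
Open the app on the phone and requests immediately start showing up in Fiddler, so all that is left is to find the right one. The data we want has some telltale features: the host should contain douguo, and the URL contains the word api. Searching with the Find tool in the toolbar turns up the matching request, and its response can be decoded in the JSON view. Once the useful request is found, right-click it and mark it with a color.
Open it in Fiddler and inspect the request headers and the request body.
Here, changing https to http is enough to get a normal response. Some experimenting shows that a number of non-essential parameters can be dropped. The resulting code is as follows: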
import requests
import pandas as pd
data = {
'client':'4',
'_vs':'2305',
}
headers = {
"client": "4",
"version": "6922.2",
"device": "MI 6",
"sdk": "19,4.4.2",
"imei": "863254010448503",
"channel": "qqkp",
"resolution": "720*1280",
"dpi": "1.5",
"brand": "Xiaomi",
"scale": "1.5",
"timezone": "28800",
"language": "zh",
"cns": "3",
"carrier": "CMCC",
"user-agent": "Mozilla/5.0 (Linux; Android 4.4.2; MI 6 Build/NMF26X) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/30.0.0.0 Mobile Safari/537.36",
"reach": "1",
"newbie": "1",
"Content-Type": "application/x-www-form-urlencoded; charset=utf-8",
"Accept-Encoding": "gzip, deflate",
"Connection": "Keep-Alive",
"Host": "api.douguo.net",
}
url = 'http://api.douguo.net/recipe/flatcatalogs'
res = requests.post(url=url,data=data,headers=headers).json()
print(res)
all_types = res['result']['cs']
rows = []  # collected rows, one per third-level entry
for one_type in all_types:  # top-level category, e.g. 热门 (Popular)
    food_first_type = one_type['name']
    for i in one_type['cs']:  # second-level category
        food_second_type = i['name']
        for j in i['cs']:  # third-level entry
            food_name = j['name']
            # 'ju' holds a jump link; keep the part after the last ':' and prefix 'http:'
            url = 'http:' + j['ju'].split(':')[-1]
            rows.append({'food_first_type': food_first_type, 'food_second_type': food_second_type, 'name': food_name, 'url': url})
df = pd.DataFrame(rows)
print(rows)
df.to_excel('数据.xlsx', index=False)
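The three nested loops can also be wrapped in a small function so that the parsing can be checked without touching the network. This is only a sketch: `flatten_catalog` is a hypothetical helper name, and the `sample` dictionary is made-up data that mimics the `result` / `cs` / `name` / `ju` layout used by the script, not a real response from the API.

```python
def flatten_catalog(response):
    """Flatten the three-level category tree into a list of row dicts."""
    rows = []
    for first in response.get('result', {}).get('cs', []):
        for second in first.get('cs', []):
            for item in second.get('cs', []):
                jump = item.get('ju', '')
                # Same conversion as in the script: keep what follows the
                # last ':' and prefix it with 'http:'.
                url = ('http:' + jump.split(':')[-1]) if jump else ''
                rows.append({
                    'food_first_type': first.get('name', ''),
                    'food_second_type': second.get('name', ''),
                    'name': item.get('name', ''),
                    'url': url,
                })
    return rows


# Made-up sample that only imitates the assumed response layout.
sample = {
    'result': {
        'cs': [
            {'name': 'Popular', 'cs': [
                {'name': 'Meat', 'cs': [
                    {'name': 'Pork', 'ju': 'recipes://www.douguo.com/search/recipe/pork'},
                ]},
            ]},
        ]
    }
}

print(flatten_catalog(sample))
```

Using `.get` with defaults means a missing key yields an empty result instead of a `KeyError`, which is convenient while the response format is still being explored.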
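To also fetch the recipes and their ingredients, keep looking for the corresponding requests in the same way and loop over them.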
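A rough sketch of what such a loop could look like is shown here. The actual endpoint, its parameter names, and the way it paginates are not given in this post and have to be read from the captured requests in Fiddler, so the network call is left as a hypothetical `fetch_page(name, offset, limit)` callable that you supply. `collect_all` and `fake_fetch` are illustrative names, and `fake_fetch` returns dummy data only to show the control flow.

```python
import time


def collect_all(names, fetch_page, page_size=20, delay=0.0):
    """For each name, call fetch_page(name, offset, page_size) until it returns nothing."""
    results = {}
    for name in names:
        items = []
        offset = 0
        while True:
            page = fetch_page(name, offset, page_size)
            if not page:
                break  # an empty page is assumed to mean "no more data"
            items.extend(page)
            offset += len(page)
            time.sleep(delay)  # pause between requests to avoid hammering the server
        results[name] = items
    return results


def fake_fetch(name, offset, limit):
    """Stand-in for the real request: pretends every name has 5 recipes."""
    dummy = [f'{name}-{i}' for i in range(5)]
    return dummy[offset:offset + limit]


demo = collect_all(['pork', 'tofu'], fake_fetch, page_size=2)
print(demo)
```

In real use, `fetch_page` would wrap a `requests.post` call built from the headers and form fields seen in Fiddler, and `delay` would be set to a non-zero value.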
Note: other requests used: