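I'm currently developing a scraper to analyze data and build charts for a website, using Python 2.7, BeautifulSoup, Requests, json, etc.
I want to search with a specific keyword, then scrape the prices of the different items and compute their average.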
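The averaging step could look like the minimal sketch below. It assumes the prices have already been scraped into a list of numbers; `average_price` and the sample values are made up for illustration and do not come from the site.

```python
def average_price(prices):
    """Return the arithmetic mean of a list of prices, or None if it is empty."""
    if not prices:
        return None
    # float() keeps the division exact on Python 2 as well as Python 3
    return sum(prices) / float(len(prices))


# Hypothetical prices, standing in for values scraped from the listings
sample_prices = [350.0, 420.0, 510.0]
average = average_price(sample_prices)
```

Returning `None` for an empty list is one possible choice here, so that a search with no results does not raise a division error.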
So I tried to fetch the JSON response with BeautifulSoup as I usually do, but the response it gives me is:
{"data":{"uuid":"YNp-EuXHrw","index_name":"Listing","default_name":null,"query":"supreme box logo","filters":{"strata":["basic","grailed","hype"]}}}
I found that "uuid":"YNp-EuXHrw" (a different value every time) is used to build the URL that displays the item data, e.g. https://www.grailed.com/feed/YNp-EuXHrw
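As a small sketch of that step, the uuid can be read out of the JSON text and appended to the feed URL. The function name `feed_url_from_search_response` is hypothetical, and the sample string is the response quoted above.

```python
import json

FEED_BASE = "https://www.grailed.com/feed/"


def feed_url_from_search_response(response_text):
    """Parse the search response and build the feed URL from its uuid."""
    data = json.loads(response_text)
    return FEED_BASE + data["data"]["uuid"]


# The response body quoted in the question
sample = ('{"data":{"uuid":"YNp-EuXHrw","index_name":"Listing",'
          '"default_name":null,"query":"supreme box logo",'
          '"filters":{"strata":["basic","grailed","hype"]}}}')
feed_url = feed_url_from_search_response(sample)
```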
So I request it from the API (see the POST to https://www.grailed.com/api/searches in my full code below).
But the problem is that when I send a request to
https://www.grailed.com/feed/YNp-EuXHrw
or whatever the uuid is, I get a 500 response.
My full code is:
# Only the modules this snippet actually uses
import json
import requests
s = requests.session()
url = "https://www.grailed.com/api/searches"
payload = {
    "index_name": "Listing_production",
    "query": "supreme box logo sweatshirts",
    "filters": {
        "strata": ["grailed", "hype", "basic"],
        "category_paths": [],
        "sizes": [],
        "locations": [],
        "designers": [],
        "min_price": "null",
        "max_price": "null",
    },
}
headers = {
"Host": "www.grailed.com",
"Connection":"keep-alive",
"Content-Length": "217",
"Origin": "null",
"x-api-version": "application/grailed.api.v1",
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36",
"content-type": "application/json",
"accept": "application/json",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4",
}
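# POST the search; the JSON response contains the uuid
response = s.post(url, headers=headers, json=payload)
res_json = json.loads(response.text)
print response
# Build the feed URL from the uuid returned by the search
# (named uuid rather than id, which would shadow the builtin)
uuid = res_json['data']['uuid']
urlID = "https://www.grailed.com/feed/" + str(uuid)
print urlID
# This GET is the request that comes back with a 500
response = s.get(urlID, headers=headers, json=res_json)
print response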
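One detail in the headers above is the hardcoded "Content-Length": "217". The size of a JSON body depends on the payload, and Requests fills in Content-Length from the body it actually sends, so a fixed value is probably unnecessary. The sketch below only illustrates how the length changes with the query; `json_body_length` and the two payloads are hypothetical, and it makes no claim about what causes the 500.

```python
import json


def json_body_length(payload):
    """Return the byte length of payload serialized as a UTF-8 JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return len(body)


# Hypothetical payloads that differ only in the search query
short_query = {"index_name": "Listing_production", "query": "supreme"}
long_query = {"index_name": "Listing_production",
              "query": "supreme box logo sweatshirts"}

short_length = json_body_length(short_query)
long_length = json_body_length(long_query)
```

The two lengths differ by exactly the number of extra characters in the longer query, which is why a single fixed Content-Length cannot be right for every search.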
When you make the request through Chrome or another browser, you can see the URL quickly change from grailed.com
to grailed.com/feed/uuid
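So I tried sending a GET request to this URL, but the response I get is a 500.
What should I do when the data shown at the uuid URL doesn't even appear in the network requests?
I hope I've made myself clear, and sorry for my English.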