Scraping Google Trends with Python

Link for the past 7 days of data: https://trends.google.com/trends/api/widgetdata/multiline?hl=zh-CN&tz=-480&req={"time":"2020-05-11T03\:38\:59 2020-05-18T03\:38\:59","resolution":"HOUR","locale":"zh-CN","comparisonItem":[{"geo":{},"complexKeywordsRestriction":{"keyword":[{"type":"BROAD","value":"NBA"}]}}],"requestOptions":{"property":"","backend":"CM","category":0}}&token=APP6_UEAAAAAXsNU002YsOS6N9Eb5Z_2BpV-LTY0_AGz&tz=-480
In this link, the req and token parameters are values we have to obtain ourselves.
Getting the req and token parameters:

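import json
import requests

def get_token(keyword):
  """Fetch the req and token parameters for a keyword from the explore endpoint."""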
  headers = {
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36',
    'x-client-data': 'CIu2yQEIo7bJAQjEtskBCKmdygEIy67KAQjQr8oBCLywygEIl7XKAQjttcoBCI66ygEYx7fKAQ==',
    'referer': 'https://trends.google.com/trends/explore?date=today%201-m&q=bitcoin,blockchain,eth',
    'cookie': '__utmc=10102256; __utma=10102256.31392724.1583402727.1586332529.1586398363.11; __utmz=10102256.1586398363.11.11.utmcsr=shimo.im|utmccn=(referral)|utmcmd=referral|utmcct=/docs/qxW86VTXr8DK6HJX; __utmt=1; __utmb=10102256.9.9.1586398779015; ANID=AHWqTUlRutPWkqC3UpC_-5XoYk6zqoDW3RQX5ePFhLykky73kQ0BpL32ATvqV3O0; CONSENT=WP.284bc1; NID=202=xLozp9-VAAGa2d3d9-cqyqmRjW9nu1zmK0j50IM4pdzJ6wpWTO_Z49JN8W0s1OJ8bySeirh7pSMew1WdqRF890iJLX4HQwwvVkRZ7zwsBDxzeHIx8MOWf27jF0mVCxktZX6OmMmSA0txa0zyJ_AJ3i9gmtEdLeopK5BO3X0LWRA; 1P_JAR=2020-4-9-2'
  }
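  # Build the explore query as JSON so a keyword containing quotes or non-ASCII text is escaped correctly.
  explore_req = json.dumps({'comparisonItem': [{'keyword': keyword, 'geo': '', 'time': 'now 7-d'}], 'category': 0, 'property': ''})
  url = 'https://trends.google.com/trends/api/explore'
  r = requests.get(url, params={'hl': 'zh-CN', 'tz': '-480', 'req': explore_req}, headers=headers)
  # The body starts with a 5-character non-JSON prefix, ")]}'" plus a newline, which must be skipped.
  data = json.loads(r.text[5:])
  widget = data['widgets'][0]  # the first widget is taken to be the trend chart
  return {'req': widget['request'], 'token': widget['token']}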
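
To make the role of the two values concrete, here is a minimal sketch of how the data link shown at the top could be assembled from them. The helper name build_multiline_url is hypothetical, the token in the example is a placeholder, and the sketch assumes req is the dict returned by get_token and that the endpoint accepts ordinary URL-encoded JSON.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

def build_multiline_url(req, token, hl='zh-CN', tz='-480'):
    """Assemble the widgetdata/multiline link from req and token (hypothetical helper)."""
    query = urlencode({
        'hl': hl,
        'tz': tz,
        # req is a dict, so it is serialized to JSON before URL-encoding
        'req': json.dumps(req),
        'token': token,
    })
    return 'https://trends.google.com/trends/api/widgetdata/multiline?' + query

# Example with placeholder values (not a real request or token)
example_url = build_multiline_url({'resolution': 'HOUR', 'locale': 'zh-CN'}, 'PLACEHOLDER_TOKEN')

# Parsing the query string back shows that req survives the round trip as JSON
parsed_query = parse_qs(urlsplit(example_url).query)
print(json.loads(parsed_query['req'][0]))
```

Serializing with json.dumps rather than str() matters here: str() on a dict produces single-quoted Python syntax, which is not valid JSON.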

Fetching the trend chart data:

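def google(keyword):
  """Fetch the Google Trends interest-over-time series for a keyword."""
  info = get_token(keyword)
  # req is a dict; serialize it to JSON (str() would emit single quotes, which is not valid JSON).
  params = {'hl': 'zh-CN', 'tz': '-480', 'req': json.dumps(info['req']), 'token': info['token']}
  url = 'https://trends.google.com/trends/api/widgetdata/multiline'
  r = requests.get(url, params=params)
  rows = []
  if r.status_code == 200:
    # Skip the 6-character prefix, ")]}'," plus a newline; json.loads decodes \u escapes by itself.
    data = json.loads(r.text[6:])['default']['timelineData']
    for data_e in data:
      timestamp = int(data_e['time']) * 1000  # seconds -> milliseconds
      value = data_e['value'][0]
      print(timestamp, value, keyword)
      rows.append((timestamp, value, keyword))
  return rows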

Usage: google('NBA')
Output (partial data): each printed line contains the timestamp in milliseconds, the index value, and the keyword.
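
The printed timestamps are not very readable as raw integers. As a small sketch, assuming the time field in the response is a Unix timestamp in seconds (which the multiplication by 1000 in the code suggests), the millisecond values can be turned back into datetimes. The helper name ms_to_utc is hypothetical.

```python
from datetime import datetime, timezone

def ms_to_utc(timestamp_ms):
    """Convert a millisecond Unix timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)

# 1589169600000 ms corresponds to 2020-05-11 04:00:00 UTC
print(ms_to_utc(1589169600000).isoformat())
```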
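Note that most responses from Google's endpoints cannot be parsed into a dict directly: the first few characters have to be stripped before the rest can be parsed as JSON.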
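
Hard-coded offsets such as [5:] and [6:] are easy to get wrong, so one possible alternative is to locate the start of the JSON payload instead. The following is only a sketch: parse_prefixed_json is a hypothetical helper, the sample string merely imitates a prefixed response, and it assumes the prefix itself contains no "{" or "[" character.

```python
import json

def parse_prefixed_json(text):
    """Parse a response body that has non-JSON characters before the payload."""
    # Find where the JSON object or array starts instead of hard-coding an offset
    starts = [i for i in (text.find('{'), text.find('[')) if i != -1]
    if not starts:
        raise ValueError('no JSON payload found')
    return json.loads(text[min(starts):])

# Illustrative body: a short prefix and newline followed by the JSON payload
sample = ")]}',\n" + '{"default": {"timelineData": []}}'
print(parse_prefixed_json(sample))
```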
