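1. Scraping Baidu hot searches and their heat indexes with Python's requests, parsed with regular expressions
Steps:
- Import the required libraries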
import requests
import re
import csv
- Send the request and read the response data
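# Note: this URL points to zhaopin.com; substitute the Baidu hot-search page URL to match the patterns used in the parsing step
url = 'https://m.zhaopin.com/sou/jl864/kwCLO66RII0PJP0NG8/p1'
headers = {
    'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Mobile Safari/537.36'
}
response = requests.get(url, headers=headers)
# Use the decoded text (str): a str regex pattern cannot be matched against bytes from response.content
content = response.text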
- Parse the page with regular expressions and collect the hot searches and their indexes into lists
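# re.S lets '.' match newlines, so a title that spans several lines is still captured
hot_searches = re.findall('<div class="c-single-text-ellipsis">(.*?)</div>',content,re.S)
# Copy the matched titles into a list
search_1 = []
for search in hot_searches:
    search_1.append(search)
print(search_1)
hot_indexes = re.findall('<div class="hot-index_1Bl1a">(.*?)</div>',conten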
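
The parsing step can be tried offline with a minimal sketch like the one below. It reuses the two class names from the patterns above, but `SAMPLE_HTML`, `parse_hot_searches`, and `write_rows` are hypothetical names introduced only for illustration, and the sample markup is an assumption about the page structure, which may differ or change. Pairing titles with indexes via `zip` and writing them with `csv` is one possible way to finish the job, not necessarily what the original script does.

```python
import csv
import io
import re

# Hypothetical sample markup that mimics the two class names used above;
# the real page layout may differ and can change without notice.
SAMPLE_HTML = """
<div class="c-single-text-ellipsis"> Topic A </div>
<div class="hot-index_1Bl1a"> 4901234 </div>
<div class="c-single-text-ellipsis"> Topic B </div>
<div class="hot-index_1Bl1a"> 4805678 </div>
"""

TITLE_RE = re.compile(r'<div class="c-single-text-ellipsis">(.*?)</div>', re.S)
INDEX_RE = re.compile(r'<div class="hot-index_1Bl1a">(.*?)</div>', re.S)


def parse_hot_searches(html):
    """Return a list of (title, index) pairs extracted from html (a str)."""
    titles = [t.strip() for t in TITLE_RE.findall(html)]
    indexes = [i.strip() for i in INDEX_RE.findall(html)]
    # zip stops at the shorter list, so an unmatched entry is dropped silently.
    return list(zip(titles, indexes))


def write_rows(rows, fileobj):
    """Write a header line plus one CSV row per (title, index) pair."""
    writer = csv.writer(fileobj)
    writer.writerow(["title", "index"])
    writer.writerows(rows)


rows = parse_hot_searches(SAMPLE_HTML)
buffer = io.StringIO()  # in-memory stand-in for a real file
write_rows(rows, buffer)
print(buffer.getvalue())
```

To save to disk instead, pass a file opened with `open(path, "w", newline="", encoding="utf-8-sig")`; `newline=""` is what the `csv` module recommends, and the `utf-8-sig` encoding is a common choice so that spreadsheet programs display Chinese titles correctly.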