
Web Crawlers
尉迟海棠
requests + bs4: scraping hot comments from NetEase Cloud Music
"""Fetch comments from NetEase Cloud Music"""
import requests
from bs4 import BeautifulSoup
import json

def comment():
    url = r'https://music.163.com/weapi/comment/resource/comments/get?csrf_token='
    headers = {
        'user-agent': "Mozilla/5.0 (Windows NT 10.0; Win64;…
Original · 2021-05-30 20:46:48 · 529 views · 0 comments
requests + bs4: scraping Beike housing listings
"""Scrape housing listings from Beike Zhaofang"""
import json
import requests
from bs4 import BeautifulSoup
import pandas as pd

def get_data(keyword):
    """
    Fetch the raw data
    :param keyword:
    :return:
    """
    ip = '114.100.0.229:9999'
    proxy = {"http": ip}
    url = 'https://…
Original · 2021-05-30 20:39:44 · 334 views · 0 comments
requests + bs4: scraping Douban Top 250 movie information
"""Scrape the Douban Top 250 movies"""
import requests
import bs4
import re

def open_url(url):
    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36'}
    res = requests.get(url,…
Original · 2021-05-29 19:45:02 · 956 views · 0 comments
Selenium: scraping Baidu Images
# coding=utf-8
"""Download 10 images from Baidu Images"""
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
import time, requests

def download_img(kw):
    # Open the browser…
Original · 2021-05-29 19:43:13 · 432 views · 0 comments