Web Crawlers
nadoudou123
Scraping Douban TOP250 movie info with a Python crawler and writing it to a database
import requests
import pymysql
from loguru import logger
from lxml import etree
db = pymysql.connect(host='localhost', port=3306, user='root', passwd='1234', db='db', charset='utf8')
logger.info("Connecting to the database")
cursor = db.cursor()
cursor.execute("DR...
Original · 2021-10-04 15:25:47 · 6334 views · 0 comments
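The preview above is cut off before any rows are written, so the following is only a minimal sketch of the "write to a database" step, not the author's code. It swaps in the standard-library `sqlite3` module for `pymysql` so it runs without a server; the table name `movies`, its columns, the helper `save_movies`, and the sample rows are all hypothetical.

```python
import sqlite3


def save_movies(conn, movies):
    """Store (movie_rank, title, rating) tuples and return the row count.

    Assumption: the table layout is invented for illustration only.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movies ("
        "movie_rank INTEGER PRIMARY KEY, title TEXT NOT NULL, rating REAL)"
    )
    # Parameterized placeholders keep scraped text out of the SQL string.
    # Using the connection as a context manager commits on success
    # and rolls back if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO movies (movie_rank, title, rating) "
            "VALUES (?, ?, ?)",
            movies,
        )
    return conn.execute("SELECT COUNT(*) FROM movies").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    # Placeholder rows, not real Douban data.
    rows = [(1, "Example Movie A", 9.0), (2, "Example Movie B", 8.5)]
    print(save_movies(connection, rows))
```

With `pymysql` the same pattern should carry over, with `%s` placeholders in place of `?` and a MySQL-specific upsert statement in place of `INSERT OR REPLACE`.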
Scraping Duitang images with a Python crawler (asynchronous loading)
import os
import jsonpath
import requests
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q.
Original · 2021-09-28 18:37:50 · 381 views · 0 comments
Batch-scraping Duitang images with XPath
import requests
from lxml import etree
kw = input("Enter a search keyword: ")
url = "https://www.duitang.com/search/?kw={}&type=feed".format(kw)
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/9.
Original · 2021-09-27 17:29:04 · 404 views · 0 comments
500.com lottery: football lottery matches (regular expressions)
import requests
import re
url = 'https://live.500.com/'
browser_headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
}
html = requests.get(url, headers=browser.
Original · 2021-09-13 18:44:32 · 599 views · 0 comments
Maoyan Movies (Python + re regular expressions)
import re
import requests
url = 'https://maoyan.com/films'
headers = {
    'Content-Type': 'text/plain; charset=UTF-8',
    'Origin': 'https://maoyan.com',
    'Referer': 'https://maoyan.com/board/4',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win6.
Original · 2021-09-13 17:36:02 · 338 views · 0 comments
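The two regex-based posts above are truncated before any pattern appears, so the following is only a rough sketch of how extraction with `re` might look. The HTML in `SAMPLE_HTML`, the class names `name` and `releasetime`, and the helper `parse_films` are assumptions made for illustration. They are not taken from the posts and may not match the live pages.

```python
import re

# Hypothetical markup: one <dd> per film with a title link and a release line.
SAMPLE_HTML = """
<dd>
  <p class="name"><a href="/films/1" title="Example Film A">Example Film A</a></p>
  <p class="releasetime">Release date: 2000-01-01</p>
</dd>
<dd>
  <p class="name"><a href="/films/2" title="Example Film B">Example Film B</a></p>
  <p class="releasetime">Release date: 2001-02-03</p>
</dd>
"""

# re.S lets "." match newlines, so one pattern can span a multi-line item.
# Non-greedy ".*?" stops at the first closing tag instead of the last one.
ITEM_RE = re.compile(
    r'<p class="name"><a[^>]*>(?P<title>.*?)</a></p>.*?'
    r'<p class="releasetime">(?P<release>.*?)</p>',
    re.S,
)


def parse_films(html):
    """Return a list of (title, release_text) tuples found in html."""
    return [
        (m.group("title").strip(), m.group("release").strip())
        for m in ITEM_RE.finditer(html)
    ]


if __name__ == "__main__":
    for title, release in parse_films(SAMPLE_HTML):
        print(title, "|", release)
```

In a real crawler the string passed to `parse_films` would be the `.text` of a `requests.get(url, headers=headers)` response. Parsing a saved sample first makes the pattern easy to check offline.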