Web Scraping
saberpan
Scraping images from Tieba and TapTap, then using AI to recognize AFK Arena stages and rename the files
# encoding:utf-8 import base64 import requests import os import re from lxml import html headers = { "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko...
Original · 2020-05-08 14:17:51 · 560 views · 0 comments
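The preview above is cut off before the recognition step, so the following is only a minimal sketch of the "recognize, then rename" idea from the title. It assumes the recognition service accepts a base64-encoded image, which the `base64` import suggests but the excerpt does not confirm. The names `encode_image`, `safe_name`, `rename_by_label` and the `recognize` callable are hypothetical and do not come from the original post.

```python
import base64
import os
import re


def encode_image(path):
    """Read an image file and return its contents as a base64 ASCII string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


def safe_name(label):
    """Turn recognized text into a filename-safe stem."""
    # Collapse characters that are invalid in filenames, plus whitespace, into "_".
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", label).strip("_")
    return cleaned or "unknown"


def rename_by_label(path, recognize):
    """Rename the image at `path` after the label returned by `recognize`.

    `recognize` is a hypothetical callable (base64 string -> label text)
    standing in for whatever recognition service is actually used.
    """
    label = recognize(encode_image(path))
    folder, filename = os.path.split(path)
    ext = os.path.splitext(filename)[1]
    stem = safe_name(label)
    target = os.path.join(folder, stem + ext)
    # Append a counter instead of overwriting another file with the same label.
    counter = 1
    while os.path.exists(target) and target != path:
        target = os.path.join(folder, f"{stem}_{counter}{ext}")
        counter += 1
    os.rename(path, target)
    return target
```

Passing the recognizer in as a callable keeps the file handling testable without network access; a real run would wrap the HTTP call to the chosen service in that callable.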
Scraping stage-clear walkthrough images from the AFK Arena Tieba forum
from lxml import html import requests import os headers = { "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36", } ...
Original · 2020-04-28 18:00:10 · 187 views · 0 comments
lxml installed with pip has no etree
On Python 3.7, the installed lxml exposes no etree. from lxml import html import requests url = 'https://www.baidu.com' text=requests.get(url).text print(text) etree = html.etree html = etree.HTML(text) title = html.xpath("//a"...
Original · 2020-04-27 17:18:24 · 482 views · 1 comment
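The workaround in the preview above can be tried without a network request. This is a minimal sketch that reaches `etree` through `lxml.html` and parses an inline sample document instead of a fetched page. The `SAMPLE` markup and the helper names `link_texts` and `link_targets` are made up for illustration. The parsed tree is named `tree` here so that it does not shadow the imported `html` module, as the reassignment in the excerpt would.

```python
from lxml import html

# Reach etree through lxml.html rather than `from lxml import etree`.
etree = html.etree

# Inline markup standing in for a downloaded page (illustrative assumption).
SAMPLE = '<html><body><a href="/a">First</a><a href="/b">Second</a></body></html>'


def link_texts(markup):
    """Return the text of every <a> element in the markup."""
    tree = etree.HTML(markup)
    return tree.xpath("//a/text()")


def link_targets(markup):
    """Return the href attribute of every <a> element in the markup."""
    tree = etree.HTML(markup)
    return tree.xpath("//a/@href")


if __name__ == "__main__":
    print(link_texts(SAMPLE))
    print(link_targets(SAMPLE))
```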
Honor of Kings voting scraper for real-time vote counts (commercial use strictly prohibited)
import requests import datetime from openpyxl import load_workbook import time headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, l...
Original · 2019-10-12 18:32:43 · 4399 views · 10 comments