Python in Practice: Scraping Paid Videos from 58pic (千图网)
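I recently picked up some code for scraping 58pic (千图网), so here it is to share.
The first thing you need is to know how to inspect a page's source!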
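To make "inspecting the source" concrete, here is a minimal sketch that lists which tags carry a `data-video` attribute. It uses the standard library's `html.parser` rather than lxml, `AttrFinder` is a hypothetical helper name, and the HTML fragment is invented for illustration, so the real 58pic markup may well differ.

```python
from html.parser import HTMLParser


class AttrFinder(HTMLParser):
    """Collect every start tag that carries a given attribute, with its value."""

    def __init__(self, attr_name):
        super().__init__()
        self.attr_name = attr_name
        self.hits = []  # list of (tag, value) pairs

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current start tag
        for name, value in attrs:
            if name == self.attr_name:
                self.hits.append((tag, value))


# Hypothetical fragment, made up for this example; not copied from 58pic.
SAMPLE_HTML = """
<div key="19">
  <div class="video-box" data-video="//example.com/clips/demo.mp4"></div>
  <p class="card-title"><span class="title-text">Demo clip</span></p>
</div>
"""

finder = AttrFinder("data-video")
finder.feed(SAMPLE_HTML)
print(finder.hits)  # [('div', '//example.com/clips/demo.mp4')]
```

Fed the saved source of a real page instead of the sample, the same finder shows which tag and attribute an XPath expression should target.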
import requests
from lxml import etree

headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"
    ),
    "Referer": "https://www.58pic.com/tupian/5848.html",
}
# Request the 58pic page to get the full HTML (sending the same browser-like headers)
response = requests.get("https://www.58pic.com/tupian/5848.html", headers=headers)
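# Extract the data we need from the page: video titles and video links
html = etree.HTML(response.text)
src_list = html.xpath('//div[@key="19"]//@data-video')
# Check the tag and attribute names carefully and watch for typos;
# I lost a lot of time to exactly that kind of mistake
tit_list = html.xpath('//div[@key="19"]/p[@class="card-title"]/span[@class="title-text"]/text()')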
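for src, tit in zip(src_list, tit_list):
    # Download the video; src is expected to start with "//", so prepend the scheme
    content = requests.get("https:" + src, headers=headers).content
    # Save the video
    filename = "video" + tit + ".mp4"
    print("Saving video file: " + filename)
    # Opening in "wb" mode creates the file if it does not exist yet
    with open(filename, "wb") as f:
        f.write(content)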
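Two details in the loop above are worth guarding. The `"https:" + src` concatenation only works when `src` starts with `//`, and a scraped title may contain characters that are not valid in a file name. The sketch below is one possible way to handle both; `absolute_url` and `safe_filename` are hypothetical helpers, not part of the original script, and the sample inputs are made up.

```python
import re
from urllib.parse import urljoin

PAGE_URL = "https://www.58pic.com/tupian/5848.html"


def absolute_url(src, base=PAGE_URL):
    """Resolve a protocol-relative, relative, or absolute src against the page URL."""
    return urljoin(base, src)


def safe_filename(title, prefix="video", ext=".mp4"):
    """Build a file name from a title, replacing characters file systems reject."""
    # Collapse runs of \ / : * ? " < > | and whitespace into a single underscore
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", title).strip("_")
    # Fall back to a placeholder when nothing usable is left
    return prefix + (cleaned or "untitled") + ext


# Made-up inputs for illustration only
print(absolute_url("//example.com/clips/demo.mp4"))
print(safe_filename("Sunset / city: timelapse?"))
```

In the loop, these would stand in for `"https:" + src` and `"video" + tit + ".mp4"` respectively; the `video` prefix is kept so the resulting names match the original naming scheme.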
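Once the scraper runs successfully, the video files are created automatically in the directory it was run from, as shown in the screenshot.
That is all it takes to save the files.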