Author: huishao
Contact: 1943133326@qq.com
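Preface
After being thoroughly worn down by Vue3, I suddenly felt like finding a multi-source music player for the PC. The search drifted off course, and I stumbled on a music download site instead.
Slurp~ The resources look pretty good, so I'll help myself~
This example scrapes the songs by 囧菌 on that site, and I have to say, the audio quality is pretty decent!!!
The code really is too simple to need a walkthrough, so I'll just paste it.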
What's involved: requests for sending requests and downloading the files; bs4 (BeautifulSoup) for parsing the HTML text.
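As a minimal sketch of the bs4 half, here is the link-collecting step run offline against a small HTML string. The sample markup, the thread file names, and the helper name extract_thread_links are all made up for illustration; only the "subject break-all" class comes from the script in this post.

```python
from bs4 import BeautifulSoup

# Made-up markup that only imitates a listing page; the real page may differ.
SAMPLE_HTML = """
<ul>
  <li><div class="subject break-all"><a href="thread-101.htm">Song A</a></div></li>
  <li><div class="subject break-all"><a href="thread-102.htm">Song B</a></div></li>
  <li><div class="other"><a href="ignored.htm">Not a song</a></div></li>
</ul>
"""


def extract_thread_links(html):
    """Return the href of the first <a> inside every div that has class 'subject'."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for div in soup.find_all("div", class_="subject"):
        if div.a is not None and div.a.get("href"):
            links.append(div.a.get("href"))
    return links


links = extract_thread_links(SAMPLE_HTML)
print(links)
```

Passing a plain string to class_ matches any element that carries that class among its classes, so the compiled regex is not strictly needed for this step.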
Full code
import re

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'}

# Walk listing pages 1 to 4 of the tag
for pageNum in range(1, 5):
    reaa = requests.get(f"https://www.hifini.com/tag-490-{pageNum}.htm", headers=headers)
    bs = BeautifulSoup(reaa.text, "html.parser")
    # Collect every thread URL on the current page
    for urlDiv in bs.find_all(class_=re.compile("subject break-all")):
        urls = urlDiv.a.get('href')
        resss = requests.get(f'https://www.hifini.com/{urls}', headers=headers)
        bsc = BeautifulSoup(resss.text, "html.parser")
        for i in bsc.find_all("script"):
            if "APlayer" in str(i):
                # Cut the player config out of the script text, then pull title and url from it
                jsonss = ((str(i).split("er(")[1])[:-1]).split(");")[0]
                title = ((jsonss.split("title: ")[1]).split(",")[0])[1:-1]
                title = title.replace('\\', "")  # needed because one song title contains a backslash "\"
                url = ((jsonss.split("url: ")[1]).split(",")[0])[1:-1]
                # Download first, then write, so a failed request leaves no empty file behind
                r1 = requests.get(url="https://www.hifini.com/" + url, headers=headers).content
                # The directory D:/Mandisa must already exist
                with open(f'D:/Mandisa/{title}.m4a', 'wb') as f:
                    f.write(r1)
                print(f"------- Song by 封茗囧菌: {title} downloaded successfully! ")
                break
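
The chained split calls above cut a title short if it contains a comma, and the backslash fix covers only one of the characters Windows rejects in file names. A possible alternative, sketched here with the standard re module, is to read the quoted value with a regular expression and replace forbidden characters with underscores (rather than deleting them, as the script does). The sample script text and the helper names extract_field and safe_filename are hypothetical; the real page's script may be laid out differently.

```python
import re

# Hypothetical script body, shaped like the "title: '...'" / "url: '...'"
# fields that the code above splits on. Raw string keeps the backslash literal.
SAMPLE_SCRIPT = r"""
var ap = new APlayer({
    music: [{
        title: 'Some\Song/Name',
        author: 'Someone',
        url: 'get_music.php?key=abc123',
        pic: 'cover.jpg'
    }]
});
"""


def extract_field(script_text, field):
    """Return the quoted value that follows `field:`, or None if it is absent."""
    pattern = rf"""\b{re.escape(field)}\s*:\s*(['"])(.*?)\1"""
    match = re.search(pattern, script_text)
    return match.group(2) if match else None


def safe_filename(title):
    """Replace the characters Windows forbids in file names with underscores."""
    return re.sub(r'[\\/:*?"<>|]', "_", title).strip()


title = extract_field(SAMPLE_SCRIPT, "title")
url = extract_field(SAMPLE_SCRIPT, "url")
print(safe_filename(title), url)
```

In the script, this would mean calling extract_field(str(i), "title") and extract_field(str(i), "url") in place of the split chains, and skipping the tag when either returns None.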