2023: Batch-Downloading and Renaming the Tracks of an Audio Album (Multi-Page List)

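I. How the download works
Multi-page lists come in two kinds: some are fetched with GET requests, others with POST. xima's multi-page album lists are the GET kind.
Take the album "https://www.xi__mala__ya.com/album/262212" as an example: it contains 1018 tracks, split across 35 pages. The JSON data for each page looks like this.
(1) Request format: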
https://www.xi___ma___laya.com/revision/album/v1/getTracksList?albumId=262212&pageNum=0&sort=0&pageSize=30
(2) Response format:

{"ret":200,"data":{"currentUid":0,"albumId":262212,"trackTotalCount":1018,"sort":0,"tracks":[{"index":1,"trackId":2968682,"isPaid":false,"tag":0,"title":"潘吉Jenny告诉你--世界杯开战!","playCount":234549,"showLikeBtn":true,"isLike":false,"showShareBtn":true,"showCommentBtn":true,"showForwardBtn":true,"createDateFormat":"2014-06","url":"/sound/2968682","duration":609,"isVideo":false,"isVipFirst":false,"breakSecond":0,"length":609,"albumId":262212,"albumTitle":"潘吉Jenny告诉你-学英语聊美国","albumCoverPath":"storages/be09-audiofreehighqps/54/6E/GKwRIJIG8HemAAk3qQGehGJc.jpeg","anchorId":11119867,"anchorName":"开言英语","ximiVipFreeType":0,"joinXimi":false},{"index":2,"trackId":2968849,"isPaid":false,"tag":0,"title":"潘吉Jenny告诉你--美国血拼一族!","playCount":165942,"showLikeBtn":true,"isLike":false,"showShareBtn":true,"showCommentBtn":true,"showForwardBtn":true,"createDateFormat":"2014-06","url":"/sound/2968849","duration":738,"isVideo":false,"isVipFirst":false,"breakSecond":0,"length":738,"albumId":262212,"albumTitle":"潘吉Jenny告诉你-学英语聊美国","albumCoverPath":"storages/be09-audiofreehighqps/54/6E/GKwRIJIG8HemAAk3qQGehGJc.jpeg","anchorId":11119867,"anchorName":"开言英语","ximiVipFreeType":0,"joinXimi":false

(3) Audio address
The number in the url field (/sound/xxxx) can be used to obtain the audio address. The track page is:

xi__ma.com/sound/2968849

After you click play on this page, it sends a request of the following form:

https://www.xi___mala____ya.com/revision/play/v1/audio?id=2968849&ptype=1
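
The step from a track's url field to this request can be sketched as follows. The helper names track_id_from_url and play_api_url are hypothetical; the host is written exactly as in the request line above. Splitting on the last "/" is one possible way to get the id and does not depend on the length of the "/sound/" prefix.

```python
# Host written exactly as it appears in the post.
PLAY_API = "https://www.xi___mala____ya.com/revision/play/v1/audio"

def track_id_from_url(track_url):
    """Return the numeric id at the end of a "/sound/<id>" path."""
    track_id = track_url.rstrip("/").rsplit("/", 1)[-1]
    if not track_id.isdigit():
        raise ValueError(f"unexpected track url: {track_url!r}")
    return track_id

def play_api_url(track_url):
    """Build the play-info request URL for one track."""
    return f"{PLAY_API}?id={track_id_from_url(track_url)}&ptype=1"

print(play_api_url("/sound/2968849"))
```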

The returned JSON:

{"ret":200,"data":{"trackId":2968849,"canPlay":true,"isPaid":false,"hasBuy":true,"src":"https://aod.cos.tx.xmcdn.com/group10/M0B/3B/BA/wKgDZ1WctZjjfjoAAFsf1AU2p0s242.m4a","albumIsSample":false,"sampleDuration":0,"isBaiduMusic":false,"firstPlayStatus":true,"isVipFree":false,"isXimiAhead":false,"isAlbumTimeLimited":false,"ximiVipFreeType":0,"joinXimi":false}}

The key field is src, which provides the downloadable audio address.
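
A minimal sketch of reading src from such a response, using a shortened copy of the JSON above. The helper audio_src is a made-up name. Returning None when src is absent is a defensive assumption: the post does not say what the API returns for tracks that cannot be played.

```python
import json

# The response shown above, reduced to the fields used here.
SAMPLE_PLAY = (
    '{"ret":200,"data":{"trackId":2968849,"canPlay":true,"isPaid":false,'
    '"src":"https://aod.cos.tx.xmcdn.com/group10/M0B/3B/BA/wKgDZ1WctZjjfjoAAFsf1AU2p0s242.m4a"}}'
)

def audio_src(play_json):
    """Return the src address, or None when the response carries none."""
    data = play_json.get("data") or {}
    return data.get("src") or None

print(audio_src(json.loads(SAMPLE_PLAY)))
```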

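(4) Approach
There is no need to fetch any web pages; two rounds of JSON requests are enough.
The first round fetches the JSON for each page of the track list; the useful fields are title, index and url.
The second round fetches the JSON for each track; the useful field is the audio address (src).
The user only needs to supply the album_id of a multi-page album (plus a page range); everything else is handled automatically.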
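
The two rounds described above can be sketched independently of the network layer. In this sketch the two fetch functions are passed in as parameters, and canned stand-ins (fake_list, fake_sound, with placeholder example.invalid addresses) take the place of real requests so that it runs offline. All names here are illustrative assumptions, not the script's own.

```python
def iter_downloads(pages, fetch_list, fetch_sound):
    """Yield (name, src) for every track on the given list pages.

    fetch_list(page_num) returns the list-page JSON (first round);
    fetch_sound(track_id) returns the play-info JSON (second round).
    """
    for page_num in pages:
        for track in fetch_list(page_num)["data"]["tracks"]:
            track_id = track["url"].rsplit("/", 1)[-1]
            src = fetch_sound(track_id)["data"]["src"]
            yield f'{track["index"]}_{track["title"]}', src

# Stand-in fetchers with canned data, so the sketch runs without network access.
def fake_list(page_num):
    return {"data": {"tracks": [
        {"index": 1, "title": "first", "url": "/sound/2968682"},
        {"index": 2, "title": "second", "url": "/sound/2968849"},
    ]}}

def fake_sound(track_id):
    return {"data": {"src": f"https://example.invalid/{track_id}.m4a"}}

downloads = list(iter_downloads([1], fake_list, fake_sound))
for name, src in downloads:
    print(name, src)
```

Keeping the fetchers as parameters makes it possible to check the naming and id-extraction logic before sending any real request.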

II. The code

# -*- coding: utf-8 -*-
import requests
# pywin32; used to drive the Thunder (Xunlei) download client through COM (Windows only)
from win32com.client import Dispatch

Headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}

def get_list_json(album_id, pageNum):
    # Fetch one page of the album's track list and return the parsed JSON.
    list_url = f'https://www.xi___ma____la___ya.com/revision/album/v1/getTracksList?albumId={album_id}&pageNum={pageNum}&sort=0&pageSize=30'
    wd_data = requests.get(list_url, headers=Headers)
    return wd_data.json()

def get_sound_json(url):
    # Fetch the play-info JSON for a single track.
    wd_data = requests.get(url, headers=Headers)
    return wd_data.json()

def get_m4a(album_id, pagelow, pagehigh):
    # COM interface of the Thunder (Xunlei) download client.
    o = Dispatch("ThunderAgent.Agent64.1")
    for pageNum in range(pagelow, pagehigh):
        print("Page : --------------" + str(pageNum))
        list_json = get_list_json(album_id, pageNum)
        for track in list_json['data']['tracks']:
            # track['url'] looks like "/sound/2968849"; drop the 7-character "/sound/" prefix.
            track_id = track['url'][7:]
            title = track['title']
            index = track['index']
            # New name for the download: "<index>_<title>".
            name = str(index) + "_" + title
            m4a_url = "https://www.xi_____ma_____la____ya.com/revision/play/v1/audio?id=" + track_id + "&ptype=1"
            m4a_json = get_sound_json(m4a_url)
            m4a = m4a_json['data']['src']
            print(m4a)
            # Queue the download under the new name.
            o.AddTask(m4a, name)
    # Hand all queued tasks to the download client.
    o.CommitTasks()
            
if __name__ == '__main__':
    # url = "https://www.xi__malaya.com/album/71718770"
    album_id = "262212"
    get_m4a(album_id, 11, 20)  # pages 11 through 19
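
Because the track title ends up in the file name, titles containing characters that Windows does not accept in file names might cause trouble. The post does not say how the download client handles such names or whether it adds a file extension by itself, so the helper below is only a possible precaution. The name safe_name and the default ".m4a" extension are assumptions made for this sketch.

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters.
_INVALID = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def safe_name(index, title, ext=".m4a"):
    """Build "<index>_<title><ext>" with invalid characters replaced by "_"."""
    cleaned = _INVALID.sub("_", title).strip(" .")
    return f"{index}_{cleaned}{ext}"

print(safe_name(2, 'What is "shopping"? A/B'))
```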

Note 1: the two numbers passed to get_m4a are the first page (inclusive) and the last page (exclusive).
Note 2: page 0 and page 1 return the same content.
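
Taken together, the two notes suggest that the page arguments for a whole album could be derived from trackTotalCount. The sketch below assumes pages are effectively numbered from 1 (page 0 duplicating page 1, as in note 2) and that every page except the last holds pageSize tracks; neither point is confirmed beyond what the post shows. The helper name page_range is hypothetical. Under these assumptions, 1018 tracks at 30 per page give 34 distinct pages; the figure of 35 quoted earlier may count page 0 as well, but that is a guess.

```python
import math

def page_range(track_total, page_size=30):
    """Return the 1-based page numbers that cover track_total tracks."""
    last_page = math.ceil(track_total / page_size)
    return range(1, last_page + 1)

pages = page_range(1018)
print(pages.start, pages.stop)
```

The start and stop of the returned range match the inclusive/exclusive convention of get_m4a, so they could be passed as its two page arguments.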
