Scraping songs from 5sing with Python

Using http://5sing.kugou.com/inory/fc/1.html as an example:

# coding: utf-8
# First script: collect the song pages and build one ffmpeg command per song.
from bs4 import BeautifulSoup
import os
import subprocess
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait

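# Template for the paginated cover list; {count} is the page number.
url_cover = 'http://5sing.kugou.com/inory/fc/{count}.html'
# Song page URLs collected from the list pages.
url_covers = []
# Directory that receives the command log and the error log.
ffsave = 'music_save'
# When False, the ffmpeg commands are only logged, not executed.
isSave = False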

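def save(filename, contents):
	# Overwrite the file; the with block closes it even if the write fails.
	with open(filename, 'w+', encoding='utf-8') as fh:
		fh.write(contents)

def save_append(filename, contents):
	# Append to the file, creating it if it does not exist yet.
	with open(filename, 'a+', encoding='utf-8') as fh:
		fh.write(contents)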

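chromeOptions = webdriver.ChromeOptions()
# Disable download pop-ups; Chrome expects an absolute path for the download directory.
chromeOptions.add_experimental_option("prefs", {'profile.default_content_settings.popups': 0, 'download.default_directory': os.getcwd()})
# Recent Selenium releases take `options=`; the older `chrome_options=` keyword is deprecated.
browser = webdriver.Chrome(options=chromeOptions)
browser.set_page_load_timeout(5)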

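# Walk the cover list pages (1 to 11) and collect the link to every song page.
for i in range(1, 12):
	url_current = url_cover.format(count=i)
	browser.get(url_current)
	# WebDriverWait does nothing until .until() is called; wait for at least one link.
	WebDriverWait(browser, 2).until(lambda d: d.find_elements('tag name', 'a'))
	html = browser.page_source
	soup = BeautifulSoup(html, "lxml")
	links = soup.select('a')
	for link in links:
		# Anchors without an href return None, so fall back to an empty string.
		href = link.get('href') or ''
		if href.startswith('http://5sing.kugou.com/fc/') and href.endswith('html'):
			title = link.get('title')
			url_covers.append(href)
			print(title, href)
	print('')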

if not os.path.isdir(ffsave):
	os.mkdir(ffsave)

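# Give the song pages more time to load than the list pages.
browser.set_page_load_timeout(10)

for url in url_covers:
	try:
		browser.get(url)
		# Wait until the <audio> element is present in the page.
		WebDriverWait(browser, 5).until(lambda d: d.find_elements('tag name', 'audio'))
		html = browser.page_source
		soup = BeautifulSoup(html, "lxml")
		title = soup.select('title')[0].get_text().strip()
		link = soup.select('audio')[0].get('src')
		# Copy the audio stream as-is into an mp3 file named after the page title.
		cmd = 'ffmpeg -i "{url}" -c copy "{filename}.mp3" -y'.format(url=link, filename=title)
		if isSave:
			# shell=True lets the quoted command string run as written on any platform.
			subprocess.Popen(cmd, shell=True)
		save_append('{ffsave}/music_url.txt'.format(ffsave=ffsave), cmd + '\n')
		print(title, link)
		print(cmd)
		print('')
	except Exception as e:
		# Log the failure and continue with the next song.
		print('download {url} failed! ({error})'.format(url=url, error=e))
		save_append('{ffsave}/music_error.txt'.format(ffsave=ffsave), 'download {url} failed!\n'.format(url=url))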

browser.quit()
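
The page title is used directly as the output file name in the ffmpeg command, and a title may contain characters that are not valid in file names on some systems. Below is a minimal sketch of one way to clean the title first. The helper `safe_filename`, its parameters, and the choice of the Windows reserved-character set are assumptions for illustration and are not part of the original script.

```python
import re

# Characters that Windows rejects in file names (assumed target; other
# systems are more permissive), plus line breaks and tabs.
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|\r\n\t]')

def safe_filename(title, replacement='_', max_length=150):
    """Hypothetical helper: turn a page title into a usable file name."""
    cleaned = _INVALID_CHARS.sub(replacement, title)
    # Truncate first, then drop surrounding whitespace and trailing dots,
    # which Windows also rejects at the end of a name.
    cleaned = cleaned[:max_length].strip().rstrip('. ')
    # Fall back to a fixed name if nothing usable is left.
    return cleaned or 'untitled'
```

In the download loop, passing `filename=safe_filename(title)` instead of `filename=title` when building `cmd` would apply it.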

 

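# coding: utf-8
# Second script: read the saved ffmpeg commands and launch each one.
import subprocess
import time

def read(filename):
	with open(filename, 'r', encoding='utf-8') as fh:
		return fh.readlines()

# The first script writes its commands to music_save/music_url.txt;
# copy or rename that file to music_url.bat, or change the path here.
lines = read('music_url.bat')

for line in lines:
	# Drop the trailing line break and skip empty lines.
	cmd = line.strip()
	if not cmd:
		continue
	print(cmd)
	# Popen does not wait, so every command is started almost at once.
	subprocess.Popen(cmd, shell=True)
	# A delay of 0 is effectively no pause; raise it to stagger the launches.
	time.sleep(0)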
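
Because `subprocess.Popen` returns immediately, the loop above starts every download at roughly the same time. Below is a minimal sketch of a sequential alternative. The helpers `parse_commands` and `run_all` are hypothetical (neither appears in the original script), and the sketch assumes each line is a simple double-quoted ffmpeg command like the ones the first script writes.

```python
import shlex
import subprocess

def parse_commands(lines):
    """Hypothetical helper: split each non-empty line into an argument list."""
    commands = []
    for line in lines:
        line = line.strip()
        if line:
            # shlex.split keeps a double-quoted URL or file name as one argument.
            commands.append(shlex.split(line))
    return commands

def run_all(commands):
    """Hypothetical helper: run commands one after another, return the failures."""
    failed = []
    for argv in commands:
        # subprocess.run blocks until the process exits, so only one
        # download is active at a time.
        result = subprocess.run(argv)
        if result.returncode != 0:
            failed.append(argv)
    return failed
```

With the `read` function from the script above, `failed = run_all(parse_commands(read('music_url.bat')))` would run the downloads in order and collect the commands that exited with an error. Note that `subprocess.run` raises `FileNotFoundError` if ffmpeg is not on the PATH.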

 

 
