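Python Data Analysis Starter Project: A NetEase Cloud Music Playlist Analysis System Based on Python Data Visualization (University Programming Assignment, TUST, Tianjin University of Science and Technology, 2022)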

末影小黑xh

Last modified: 2023-08-17 22:33:35

First published: 2023-03-28 17:56:59

Article link: https://blog.csdn.net/qq_40734758/article/details/129820594

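I. Project Overview

I built this NetEase Cloud Music playlist analysis system in Python, using the standard-library time module together with third-party open-source packages such as numpy, pandas, matplotlib, requests, squarify, jieba, wordcloud, and bs4. The system fetches playlist data from NetEase Cloud Music, analyzes it visually, produces charts of playlist comments, favorites, plays, contributors, and distributions as well as a word cloud, and offers suggestions for improving playlists. Working through this first data analysis project consolidated my grasp of Python syntax, gave me practice with each of these packages, and laid a foundation for further study of data analysis in Python.

I wrote this project in my third year of university, and looking back it is quite rough. I am sharing it partly to help beginners and partly in the hope that it frees fellow students from the formalistic, largely worthless assignments that are common at universities today, so that they can spend their time learning solid computer science and mainstream programming techniques more efficiently. Let us carry the open-source spirit forward together and share in the promise of internet technology.

II. Discussion and Learning

The open-source spirit of the internet depends on everyone exchanging ideas, learning from one another, and supporting and contributing to each other. You are welcome to get in touch with me for a friendly exchange.

Add me as a QQ friend to get the full project source code and documentation. Thank you for your support!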

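Python Data Analysis Starter Project: A NetEase Cloud Music Playlist Analysis System Based on Python Data Visualization

I. Project Overview

(1) Project background

With the spread of music apps, a vast amount of related data is being created. In the big-data era, any large body of data can yield great value once it is put to use. There are many examples of using Python to analyze song data in order to uncover user needs and grow the user base further.

Python has mature libraries for data analysis, interactive and exploratory computing, and data visualization, which makes the work practical to carry out. After our group tested feasibility, we decided to use Python to analyze playlists from a music app.

(2) Project process

The project uses Python to fetch playlist data from NetEase Cloud Music and to analyze it visually. It produces charts of playlist comments, favorites, plays, contributors, and distributions as well as a word cloud, and offers suggestions for improving playlists.

A crawler fetches the data, the data is then cleaned, and finally it is visualized. The analysis uses the third-party packages numpy, pandas, matplotlib, requests, squarify, jieba, wordcloud, and bs4, plus the standard-library time module. The results, such as favorite counts and play counts, are presented as bar charts, a word cloud, and a tag chart (a treemap), and are used to suggest how to raise a playlist's play count.

Finally, we implemented the project and tested it.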

Figure 1. Debugging analysis of the playlist index page

Figure 2. Debugging analysis of the playlist detail page

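II. Project Design Flowcharts

(1) Overall architecture of the NetEase Cloud Music playlist analysis system based on Python data visualization

Figure 3. Overall architecture of the NetEase Cloud Music playlist analysis system based on Python data visualization

(2) Fetching playlist index page data

Figure 4. Flowchart for fetching playlist index page data

(3) Fetching playlist detail page data

Figure 5. Flowchart for fetching playlist detail page data

(4) Song occurrence Top 10

Figure 6. Flowchart for the song occurrence Top 10

(5) NetEase Cloud Music Western (欧美) playlist plays Top 10

Figure 7. Flowchart for the NetEase Cloud Music Western playlist plays Top 10

(6) NetEase Cloud Music Western playlist comments Top 10

Figure 8. Flowchart for the NetEase Cloud Music Western playlist comments Top 10

(7) Distribution of Western playlist plays

Figure 9. Flowchart for the distribution of Western playlist plays

(8) NetEase Cloud Music Western playlist tag chart

Figure 10. Flowchart for the NetEase Cloud Music Western playlist tag chart

(9) Playlist description word cloud

Figure 11. Flowchart for the playlist description word cloud

III. Project Implementation Code

(1) netease_cloud_music_data_analysis.py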

import os
import sys

from music_index import get_data_of_music_list_index_page
from music_detail import get_data_of_music_list_detail_page
from top_10_song import data_visualization_of_top_10_song
from top_10_song_up import data_visualization_of_top_10_song_up
from top_10_ea_song_playlists import data_visualization_of_top_10_ea_song_playlists
from top_10_of_ea_song_collection import data_visualization_of_top_10_of_ea_song_collection
from top_10_of_ea_song_comment import data_visualization_of_top_10_of_ea_song_comment
from top_10_ea_song_collection_distribution import data_visualization_of_top_10_ea_song_collection_distribution
from top_10_ea_song_playlists_distribution import data_visualization_of_top_10_ea_song_playlists_distribution
from label_ea_song import data_visualization_of_label_ea_song
from music_wordcloud import data_visualization_of_music_wordcloud


def menu():
    """Print the menu of the NetEase Cloud Music data analysis system."""
    print("Welcome to the NetEase Cloud Music data analysis system! (^▽^ )")
    print("---------------------------------------------")
    print("")
    print("        [NetEase Cloud Music Data Analysis System]")
    print("")
    print("        A. Fetch playlist index page data")
    print("        B. Fetch playlist detail page data")
    print("        C. Generate the song occurrence Top 10 chart")
    print("        D. Generate the playlist contributor (uploader) TOP10 chart")
    print("        E. Generate the Western playlist plays TOP10 chart")
    print("        F. Generate the Western playlist favorites TOP10 chart")
    print("        G. Generate the Western playlist comments TOP10 chart")
    print("        H. Generate the Western playlist favorites distribution chart")
    print("        I. Generate the Western playlist plays distribution chart")
    print("        J. Generate the Western playlist tag chart")
    print("        K. Generate the playlist description word cloud")
    print("")
    print("---------------------------------------------")
    print("Enter the operation to perform (type quit to exit):")


def key_down():
    """Read one menu option and run the matching feature."""
    option = input()

    if option == 'quit' or option == 'QUIT':
        print("Exited!\n\n")
        input()

        # sys.exit is always available, unlike the interactive-only exit()
        sys.exit(0)
    elif option == 'a' or option == 'A':
        # Fetch playlist index page data
        get_data_of_music_list_index_page()

        return
    elif option == 'b' or option == 'B':
        # Fetch playlist detail page data
        get_data_of_music_list_detail_page()

        return
    elif option == 'c' or option == 'C':
        # Generate the song occurrence Top 10 chart
        data_visualization_of_top_10_song()

        return
    elif option == 'd' or option == 'D':
        # Generate the playlist contributor (uploader) TOP10 chart
        data_visualization_of_top_10_song_up()

        return
    elif option == 'e' or option == 'E':
        # Generate the Western playlist plays TOP10 chart
        data_visualization_of_top_10_ea_song_playlists()

        return
    elif option == 'f' or option == 'F':
        # Generate the Western playlist favorites TOP10 chart
        data_visualization_of_top_10_of_ea_song_collection()

        return
    elif option == 'g' or option == 'G':
        # Generate the Western playlist comments TOP10 chart
        data_visualization_of_top_10_of_ea_song_comment()

        return
    elif option == 'h' or option == 'H':
        # Generate the Western playlist favorites distribution chart
        data_visualization_of_top_10_ea_song_collection_distribution()

        return
    elif option == 'i' or option == 'I':
        # Generate the Western playlist plays distribution chart
        data_visualization_of_top_10_ea_song_playlists_distribution()

        return
    elif option == 'j' or option == 'J':
        # Generate the Western playlist tag chart
        data_visualization_of_label_ea_song()

        return
    elif option == 'k' or option == 'K':
        # Generate the playlist description word cloud
        data_visualization_of_music_wordcloud()

        return
    else:
        print("Invalid choice, please try again!\n\n")
        input()

        return


if __name__ == '__main__':
    # Show the menu and handle one option per iteration
    while True:
        menu()
        key_down()

        # Clear the screen ('cls' on Windows, 'clear' elsewhere)
        os.system('cls' if os.name == 'nt' else 'clear')
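
The option handling above is one long if/elif chain. As a possible alternative, the sketch below maps option letters to functions with a dictionary. The names `build_dispatcher` and `dispatch` and the stand-in lambdas are hypothetical and are not part of the original project.

```python
def build_dispatcher(actions):
    """Return a function that maps a typed option to one of the given callables."""
    # Normalise keys once so that 'a' and 'A' select the same action.
    table = {key.lower(): func for key, func in actions.items()}

    def dispatch(option):
        func = table.get(option.strip().lower())
        if func is None:
            return False  # unknown option; the caller can prompt again
        func()
        return True

    return dispatch


# Stand-in actions (hypothetical); the real table would hold the imported functions.
calls = []
dispatch = build_dispatcher({
    'A': lambda: calls.append('index'),
    'B': lambda: calls.append('detail'),
})
dispatch('a')
dispatch(' B ')
handled = dispatch('z')
```

With this layout, adding a menu entry would mean adding one dictionary item rather than another elif branch.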

(2) music_index.py

"""Data acquisition: fetch playlist index page data."""
from bs4 import BeautifulSoup
import os
import requests
import time


def get_data_of_music_list_index_page():
    """Fetch playlist index page data."""
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/63.0.3239.132 Safari/537.36 '
    }

    print("Fetching playlist index page data...")

    # Print a progress bar (cosmetic; it finishes before the work below starts)
    t = 60
    start = time.perf_counter()

    for i in range(t + 1):
        finish = "▓" * i
        need_do = "-" * (t - i)
        progress = (i / t) * 100
        dur = time.perf_counter() - start

        print("\r{:^3.0f}%[{}->{}]{:.2f}s".format(progress, finish, need_do, dur), end="")

        time.sleep(0.02)

    # Make sure the output directory exists before appending to the CSV file
    os.makedirs('./music_data', exist_ok=True)

    # 38 index pages of 35 playlists each (offsets 0, 35, ..., 1295)
    for i in range(0, 1330, 35):
        # print('\r', i, end='', flush=True)

        # Pause between requests to avoid hammering the site
        time.sleep(2)

        url = 'https://music.163.com/discover/playlist/?cat=欧美&order=hot&limit=35&offset=' + str(i)
        response = requests.get(url=url, headers=headers)
        html = response.text
        soup = BeautifulSoup(html, 'html.parser')

        # Select the tags that hold the playlist detail page URLs
        ids = soup.select('.dec a')

        # Select the tags that hold the playlist index information
        lis = soup.select('#m-pl-container li')
        # print('\r', len(lis), end='', flush=True)

        for j in range(len(lis)):
            # Get the playlist detail page path
            url = ids[j]['href']

            # Get the playlist title, replacing ASCII commas (the CSV delimiter) with full-width commas
            title = ids[j]['title'].replace(',', '，')

            # Get the playlist play count
            play = lis[j].select('.nb')[0].get_text()

            # Get the name of the playlist contributor
            user = lis[j].select('p')[1].select('a')[0].get_text()

            # Print the playlist index information
            print('\r', url, title, play, user, end='', flush=True)

            # Append the index information to the CSV file
            with open('./music_data/music_list.csv', 'a+', encoding='utf-8-sig') as f:
                f.write(url + ',' + title + ',' + play + ',' + user + '\n')

    print("\nPlaylist index page data fetched and saved to music_data/music_list.csv")
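
The crawler above pages through the index by concatenating an offset onto a fixed URL string. A minimal sketch of the same pagination using the standard library's `urllib.parse.urlencode`, which also percent-encodes the category name, might look like this. The function name `build_index_urls` and its parameters are assumptions for illustration; only the base URL and query keys come from the code above.

```python
from urllib.parse import urlencode

BASE_URL = 'https://music.163.com/discover/playlist/'


def build_index_urls(category='欧美', page_size=35, total=1330):
    """Build one index-page URL per offset, mirroring range(0, 1330, 35)."""
    urls = []
    for offset in range(0, total, page_size):
        query = urlencode({'cat': category, 'order': 'hot',
                           'limit': page_size, 'offset': offset})
        urls.append(BASE_URL + '?' + query)
    return urls


urls = build_index_urls()
```

Each entry in `urls` could then be passed to `requests.get` in place of the hand-built string.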

(3) music_detail.py

"""Data acquisition: fetch playlist detail page data."""
from bs4 import BeautifulSoup
import pandas as pd
import requests
import time


def get_data_of_music_list_detail_page():
    """Fetch playlist detail page data."""
    # on_bad_lines must be 'error', 'warn', 'skip' or a callable; None is rejected by pandas 2.x,
    # so malformed rows are skipped here (an assumption about the intended behavior)
    df = pd.read_csv('./music_data/music_list.csv', header=None, on_bad_lines='skip', names=['url', 'title', 'play',
                                                                                             'user'])

    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/63.0.3239.132 Safari/537.36 '
    }

    print("Fetching playlist detail page data...")

    # Print a progress bar (cosmetic; it finishes before the work below starts)
    t = 60
    start = time.perf_counter()

    for i in range(t + 1):
        finish = "▓" * i
        need_do = "-" * (t - i)
        progress = (i / t) * 100
        dur = time.perf_counter() - start

        print("\r{:^3.0f}%[{}->{}]{:.2f}s".format(progress, finish, need_do, dur), end="")

        time.sleep(0.02)

    for i in df['url']:
        # Pause between requests to avoid hammering the site
        time.sleep(2)

        url = 'https://music.163.com' + i
        response = requests.get(url=url, headers=headers)
        html = response.text
        soup = BeautifulSoup(html, 'html.parser')

        # Get the playlist title
        title = soup.select('h2')[0].get_text().replace(',', '，')

        # Get the tags
        tags = []
        tags_message = soup.select('.u-tag i')

        for p in tags_message:
            tags.append(p.get_text())

        # Join the tags with '-'; fall back to '无' ("none") when a playlist has no tags
        tag = '-'.join(tags) if tags else '无'

        # Get the playlist description ('无' means "none")
        if soup.select('#album-desc-more'):
            text = soup.select('#album-desc-more')[0].get_text().replace('\n', '').replace(',', '，')
        else:
            text = '无'

        # Get the playlist favorite count
        collection = soup.select('#content-operation i')[1].get_text().replace('(', '').replace(')', '')

        # Playlist play count
        play = soup.select('.s-fc6')[0].get_text()

        # Number of songs in the playlist
        songs = soup.select('#playlist-track-count')[0].get_text()

        # Number of playlist comments
        comments = soup.select('#cnt_comment_count')[0].get_text()

        # Print the playlist detail information
        print('\r', title, tag, text, collection, play, songs, comments, end='', flush=True)

        # Append the detail information to the CSV file
        with open('./music_data/music_detail.csv', 'a+', encoding='utf-8-sig') as f:
            f.write(title + ',' + tag + ',' + text + ',' + collection + ',' + play + ',' + songs + ',' + comments +
                    '\n')

        # Get the names of the songs in the playlist
        li = soup.select('.f-hide li a')

        for j in li:
            with open('./music_data/music_name.csv', 'a+', encoding='utf-8-sig') as f:
                f.write(j.get_text() + '\n')

    print("\nPlaylist detail page data fetched and saved to music_data/music_detail.csv and "
          "music_data/music_name.csv")
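
Both crawlers build CSV rows by hand and replace ASCII commas in titles so that the delimiter stays unambiguous. As one possible alternative, the standard library's `csv.writer` quotes such fields automatically. The sketch below is a hedged illustration on an in-memory buffer; the helper name `append_row` and the sample row are made up.

```python
import csv
import io


def append_row(stream, row):
    """Write one record; csv.writer quotes fields that contain commas or newlines."""
    csv.writer(stream, lineterminator='\n').writerow(row)


# In-memory demonstration; a real run would pass a file opened with
# open(path, 'a', encoding='utf-8-sig', newline='').
buffer = io.StringIO()
append_row(buffer, ['Road trip, vol. 1', '欧美-流行', 'line one\nline two', '12万'])
buffer.seek(0)
rows = list(csv.reader(buffer))
```

Because the comma and the line break survive the round trip, the title and description would not need to be altered before saving.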

(4) top_10_song.py

"""Data visualization: song occurrence Top 10."""
import pandas as pd
import matplotlib.pyplot as plt
import time


def data_visualization_of_top_10_song():
    """Song occurrence Top 10."""
    # Song names are written to music_name.csv by music_detail.py, one per line
    df = pd.read_csv('./music_data/music_name.csv', header=None, names=['title'], encoding='utf-8-sig')

    print("Generating the song occurrence Top 10 chart...")

    # Print a progress bar (cosmetic; it finishes before the work below starts)
    t = 60
    start = time.perf_counter()

    for i in range(t + 1):
        finish = "▓" * i
        need_do = "-" * (t - i)
        progress = (i / t) * 100
        dur = time.perf_counter() - start

        print("\r{:^3.0f}%[{}->{}]{:.2f}s".format(progress, finish, need_do, dur), end="")

        time.sleep(0.02)

    # Group by song title and count occurrences
    place_message = df.groupby(['title'])
    place_com = place_message['title'].agg(['count'])
    place_com.reset_index(inplace=True)
    place_com_last = place_com.sort_index()
    dom = place_com_last.sort_values('count', ascending=False)[0:10]

    # Prepare the data to display (reversed so the largest bar is drawn at the top)
    names = [i for i in dom.title]
    names.reverse()
    nums = [i for i in dom['count']]
    nums.reverse()
    data = pd.Series(nums, index=names)

    # Set the font and font size (Microsoft YaHei renders Chinese text)
    plt.rcParams['font.sans-serif'] = ['Microsoft YaHei']
    plt.rcParams['font.size'] = 10
    plt.rcParams['axes.unicode_minus'] = False

    # Create the figure
    plt.figure(figsize=(16, 8), dpi=80)
    ax = plt.subplot(1, 1, 1)
    ax.patch.set_color('white')

    # Get the current axes
    lines = plt.gca()

    # Set the spine colors
    lines.spines['right'].set_color('none')
    lines.spines['top'].set_color('none')
    lines.spines['left'].set_color((64/255, 64/255, 64/255))
    lines.spines['bottom'].set_color((64/255, 64/255, 64/255))

    # Hide the tick marks
    lines.xaxis.set_ticks_position('none')
    lines.yaxis.set_ticks_position('none')

    # Draw the horizontal bar chart and set the bar color
    data.plot.barh(ax=ax, width=0.7, alpha=0.7, color=(16/255, 152/255, 168/255))

    # Add the title and set its font size
    ax.set_title('NetEase Cloud Music Western Playlist Songs TOP10', fontsize=18, fontweight='light')

    # Add the occurrence count next to each bar
    for x, y in enumerate(data.values):
        plt.text(y+3.5, x-0.12, '%s' % y, ha='center')

    # Save the chart
    plt.savefig('./music_image/top_10_song.png', dpi=None)

    # Show the chart
    plt.show()

    print("\nSong occurrence Top 10 chart generated and saved to music_image/top_10_song.png")
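
The grouping step above counts how often each song title appears and keeps the ten most frequent. For comparison, here is a minimal standard-library sketch of the same counting idea with `collections.Counter`. The function name `top_songs` and the sample titles are illustrative only.

```python
from collections import Counter


def top_songs(names, n=10):
    """Return the n most frequent song names with their counts, most frequent first."""
    cleaned = (name.strip() for name in names if name.strip())
    return Counter(cleaned).most_common(n)


sample = ['Shape of You', 'Hello', 'Shape of You', 'Closer', 'Hello', 'Shape of You']
result = top_songs(sample, n=2)
```

The resulting list of (name, count) pairs could be reversed and handed to the bar chart in the same way as the pandas result.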

(5) top_10_ea_song_playlists.py

"""Data visualization: NetEase Cloud Music Western playlist plays TOP10."""
import pandas as pd
import matplotlib.pyplot as plt
import time


def data_visualization_of_top_10_ea_song_playlists():
    """NetEase Cloud Music Western playlist plays TOP10."""
    df = pd.read_csv('./music_data/music_detail.csv', header=None)
    # Column 4 holds the play count
    df['play'] = df[4]

    print("Generating the Western playlist plays TOP10 chart...")

    # Print a progress bar (cosmetic; it finishes before the work below starts)
    t = 60
    start = time.perf_counter()

    for i in range(t + 1):
        finish = "▓" * i
        need_do = "-" * (t - i)
        progress = (i / t) * 100
        dur = time.perf_counter() - start

        print("\r{:^3.0f}%[{}->{}]{:.2f}s".format(progress, finish, need_do, dur), end="")

        time.sleep(0.02)

    # Sort by play count
    names = df.sort_values(by='play', ascending=False)[0][:10]
    plays = df.sort_values(by='play', ascending=False)['play'][:10]

    # Prepare the data to display (reversed so the largest bar is drawn at the top)
    names = [i for i in names]
    names.reverse()
    plays = [i for i in plays]
    plays.reverse()
    data = pd.Series(plays, index=names)

    # Set the font and font size (Microsoft YaHei renders the Chinese playlist titles)
    plt.rcParams['font.sans-serif'] = ['Microsoft YaHei']
    plt.rcParams['font.size'] = 8
    plt.rcParams['axes.unicode_minus'] = False

    # Create the figure
    plt.figure(figsize=(16, 8), dpi=80)
    ax = plt.subplot(1, 1, 1)
    ax.patch.set_color('white')

    # Get the current axes
    lines = plt.gca()

    # Set the spine colors
    lines.spines['right'].set_color('none')
    lines.spines['top'].set_color('none')
    lines.spines['left'].set_color((64/255, 64/255, 64/255))
    lines.spines['bottom'].set_color((64/255, 64/255, 64/255))

    # Hide the tick marks
    lines.xaxis.set_ticks_position('none')
    lines.yaxis.set_ticks_position('none')

    # Draw the horizontal bar chart and set the bar color
    data.plot.barh(ax=ax, width=0.7, alpha=0.7, color=(136/255, 43/255, 48/255))

    # Add the title and set its font properties
    ax.set_title('NetEase Cloud Music Western Playlist Plays TOP10', fontsize=18, fontweight='light')

    # Add the play count next to each bar, in units of 万 (10,000)
    for x, y in enumerate(data.values):
        num = str(int(y / 10000))
        plt.text(y+1800000, x-0.08, '%s' % (num + '万'), ha='center')

    # Save the chart
    plt.savefig('./music_image/top_10_ea_song_playlists.png', dpi=None)

    # Show the chart
    plt.show()

    print("\nWestern playlist plays TOP10 chart generated and saved to "
          "music_image/top_10_ea_song_playlists.png")
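
The same progress-bar loop is repeated in every module of the project. One way to reduce the repetition, sketched here under the assumption that a shared helper module would be acceptable, is to move the formatting into a single function. The name `render_progress` is hypothetical; the format string is the one used in the listings.

```python
def render_progress(step, total, elapsed):
    """Format one frame of the text progress bar used throughout the project."""
    done = '▓' * step
    todo = '-' * (total - step)
    percent = step / total * 100
    return '\r{:^3.0f}%[{}->{}]{:.2f}s'.format(percent, done, todo, elapsed)


frame = render_progress(3, 6, 0.5)
```

A caller would print each frame with `end=""`, as the existing loops do, so the bar redraws in place.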

(6) top_10_of_ea_song_collection.py

"""Data visualization: NetEase Cloud Music Western playlist favorites TOP10."""
import pandas as pd
import matplotlib.pyplot as plt
import time


def data_visualization_of_top_10_of_ea_song_collection():
    """NetEase Cloud Music Western playlist favorites TOP10."""
    df = pd.read_csv('./music_data/music_detail.csv', header=None)

    print("Generating the Western playlist favorites TOP10 chart...")

    # Print a progress bar (cosmetic; it finishes before the work below starts)
    t = 60
    start = time.perf_counter()

    for i in range(t + 1):
        finish = "▓" * i
        need_do = "-" * (t - i)
        progress = (i / t) * 100
        dur = time.perf_counter() - start

        print("\r{:^3.0f}%[{}->{}]{:.2f}s".format(progress,
              finish, need_do, dur), end="")

        time.sleep(0.02)

    # Data cleaning: expand the unit 万 (10,000) into zeros;
    # str() keeps this working when pandas has already parsed the column as integers
    dom = []
    for i in df[3]:
        dom.append(int(str(i).replace('万', '0000')))

    df['collection'] = dom

    # Sort by favorite count
    names = df.sort_values(by='collection', ascending=False)[0][:10]
    collections = df.sort_values(by='collection', ascending=False)[
        'collection'][:10]

    # Prepare the data to display (reversed so the largest bar is drawn at the top)
    names = [i for i in names]
    names.reverse()
    collections = [i for i in collections]
    collections.reverse()
    data = pd.Series(collections, index=names)

    # Set the font and font size (Microsoft YaHei renders the Chinese playlist titles)
    plt.rcParams['font.sans-serif'] = ['Microsoft YaHei']
    plt.rcParams['font.size'] = 8
    plt.rcParams['axes.unicode_minus'] = False

    # Create the figure
    plt.figure(figsize=(16, 8), dpi=80)
    ax = plt.subplot(1, 1, 1)
    ax.patch.set_color('white')

    # Get the current axes
    lines = plt.gca()

    # Set the spine colors
    lines.spines['right'].set_color('none')
    lines.spines['top'].set_color('none')
    lines.spines['left'].set_color((64/255, 64/255, 64/255))
    lines.spines['bottom'].set_color((64/255, 64/255, 64/255))

    # Hide the tick marks
    lines.xaxis.set_ticks_position('none')
    lines.yaxis.set_ticks_position('none')

    # Draw the horizontal bar chart and set the bar color
    data.plot.barh(ax=ax, width=0.7, alpha=0.7, color=(8/255, 88/255, 121/255))

    # Add the title and set its font properties
    ax.set_title('NetEase Cloud Music Western Playlist Favorites TOP10', fontsize=18, fontweight='light')

    # Add the favorite count next to each bar, in units of 万 (10,000)
    for x, y in enumerate(data.values):
        num = str(y/10000)
        plt.text(y+20000, x-0.08, '%s' % (num + '万'), ha='center')

    # Save the chart
    plt.savefig('./music_image/top_10_of_ea_song_collection.png', dpi=None)

    # Show the chart
    plt.show()

    print("\nWestern playlist favorites TOP10 chart generated and saved to "
          "music_image/top_10_of_ea_song_collection.png")
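
The cleaning step above turns a count such as 12万 into an integer by replacing the character 万 with four zeros, which works only for whole numbers. The sketch below shows a slightly more defensive parser. Whether the site ever returns decimals such as 1.5万 or a bare label is an assumption, and the name `parse_count` is hypothetical.

```python
def parse_count(text):
    """Convert a scraped count such as '1234', '12万' or '1.5万' to an int."""
    text = str(text).strip()
    if text.endswith('万'):
        # 万 means ten thousand; going through float also handles '1.5万'.
        return int(round(float(text[:-1]) * 10000))
    if text.isdigit():
        return int(text)
    # Assumption: a non-numeric label (for example bare button text) means zero.
    return 0


examples = [parse_count(v) for v in ('12万', '1.5万', '1234', '收藏', 56)]
```

Applied to the favorites column, such a function could replace the list-building loop with a single `df[3].map(parse_count)` call.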

(7) top_10_of_ea_song_comment.py

"""Data visualization: NetEase Cloud Music Western playlist comments TOP10."""
import pandas as pd
import matplotlib.pyplot as plt
import time


def data_visualization_of_top_10_of_ea_song_comment():
    """NetEase Cloud Music Western playlist comments TOP10."""
    df = pd.read_csv('./music_data/music_detail.csv', header=None)

    print("Generating the Western playlist comments TOP10 chart...")

    # Print a progress bar (cosmetic; it finishes before the work below starts)
    t = 60
    start = time.perf_counter()

    for i in range(t + 1):
        finish = "▓" * i
        need_do = "-" * (t - i)
        progress = (i / t) * 100
        dur = time.perf_counter() - start

        print("\r{:^3.0f}%[{}->{}]{:.2f}s".format(progress, finish, need_do, dur), end="")

        time.sleep(0.02)

    # Data cleaning: map the label '评论' ("comments") to 0;
    # str() keeps this working when pandas has already parsed the column as integers
    df['love'] = [int(str(i).replace('评论', '0')) for i in df[6]]

    # Sort by comment count
    names = df.sort_values(by='love', ascending=False)[0][:10]
    comments = df.sort_values(by='love', ascending=False)['love'][:10]

    # Prepare the data to display (reversed so the largest bar is drawn at the top)
    names = [i for i in names]
    names.reverse()
    comments = [i for i in comments]
    comments.reverse()
    data = pd.Series(comments, index=names)

    # Set the font and font size (Microsoft YaHei renders the Chinese playlist titles)
    plt.rcParams['font.sans-serif'] = ['Microsoft YaHei']
    plt.rcParams['font.size'] = 8
    plt.rcParams['axes.unicode_minus'] = False

    # Create the figure
    plt.figure(figsize=(16, 8), dpi=80)
    ax = plt.subplot(1, 1, 1)
    ax.patch.set_color('white')

    # Get the current axes
    lines = plt.gca()

    # Set the spine colors
    lines.spines['right'].set_color('none')
    lines.spines['top'].set_color('none')
    lines.spines['left'].set_color((64/255, 64/255, 64/255))
    lines.spines['bottom'].set_color((64/255, 64/255, 64/255))

    # Hide the tick marks
    lines.xaxis.set_ticks_position('none')
    lines.yaxis.set_ticks_position('none')

    # Draw the horizontal bar chart and set the bar color
    data.plot.barh(ax=ax, width=0.7, alpha=0.7, color=(160/255, 102/255, 50/255))
    ax.set_title('NetEase Cloud Music Western Playlist Comments TOP10', fontsize=18, fontweight='light')

    # Add the comment count next to each bar
    for x, y in enumerate(data.values):
        plt.text(y+200, x-0.08, '%s' % y, ha='center')

    # Save the chart
    plt.savefig('./music_image/top_10_of_ea_song_comment.png', dpi=None)

    # Show the chart
    plt.show()

    print("\nWestern playlist comments TOP10 chart generated and saved to "
          "music_image/top_10_of_ea_song_comment.png")

(8) top_10_ea_song_collection_distribution.py

"""Data visualization: distribution of Western playlist favorites."""
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import time


def data_visualization_of_top_10_ea_song_collection_distribution():
    """Distribution of Western playlist favorites."""
    df = pd.read_csv('./music_data/music_detail.csv', header=None)

    print("Generating the Western playlist favorites distribution chart...")

    # Print a progress bar (cosmetic; it finishes before the work below starts)
    t = 60
    start = time.perf_counter()

    for i in range(t + 1):
        finish = "▓" * i
        need_do = "-" * (t - i)
        progress = (i / t) * 100
        dur = time.perf_counter() - start

        print("\r{:^3.0f}%[{}->{}]{:.2f}s".format(progress, finish, need_do, dur), end="")

        time.sleep(0.02)

    # Take the natural log of the favorite counts (万 = 10,000);
    # str() keeps this working when pandas has already parsed the column as integers
    dom = []
    for i in df[3]:
        dom.append(np.log(int(str(i).replace('万', '0000'))))

    df['collection'] = dom

    # Set the font and font size (Microsoft YaHei renders Chinese text)
    plt.rcParams['font.sans-serif'] = ['Microsoft YaHei']
    plt.rcParams['font.size'] = 12
    plt.rcParams['axes.unicode_minus'] = False

    # Create the figure
    plt.figure(figsize=(16, 8), dpi=80)
    ax = plt.subplot(1, 1, 1)
    ax.patch.set_color('white')

    # Get the current axes
    lines = plt.gca()

    # Set the spine colors and hide the tick marks
    lines.spines['right'].set_color('none')
    lines.spines['top'].set_color('none')
    lines.spines['left'].set_color((64/255, 64/255, 64/255))
    lines.spines['bottom'].set_color((64/255, 64/255, 64/255))
    lines.xaxis.set_ticks_position('none')
    lines.yaxis.set_ticks_position('none')

    # Draw the histogram and set its color
    ax.hist(df['collection'], bins=30, alpha=0.7, color=(21/255, 47/255, 71/255))
    ax.set_title('Distribution of Western Playlist Favorites', fontsize=20)

    # Save the chart
    plt.savefig('./music_image/top_10_ea_song_collection_distribution.png', dpi=None)

    # Show the chart
    plt.show()

    print("\nWestern playlist favorites distribution chart generated and saved to "
          "music_image/top_10_ea_song_collection_distribution.png")
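
The histogram above is drawn on the natural log of the counts, so values that span several orders of magnitude fall into comparable bins. A rough standard-library sketch of that log-then-bin idea follows. The function name `log_histogram`, the bin count, and the sample values are illustrative, and zeros are skipped because their logarithm is undefined.

```python
import math


def log_histogram(values, bins=5):
    """Bucket positive counts by natural log into equal-width bins."""
    logs = [math.log(v) for v in values if v > 0]  # skip zeros: log(0) is undefined
    if not logs:
        return [0] * bins
    low, high = min(logs), max(logs)
    width = (high - low) / bins or 1.0  # avoid a zero width when all values are equal
    counts = [0] * bins
    for x in logs:
        index = min(int((x - low) / width), bins - 1)  # clamp the maximum into the last bin
        counts[index] += 1
    return counts


counts = log_histogram([10, 100, 1000, 10000, 100000, 0], bins=4)
```

This mirrors what `ax.hist(..., bins=30)` does on the log-transformed column, only without the plotting.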

(9) top_10_ea_song_playlists_distribution.py

"""Data visualization: distribution of Western playlist plays."""
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import time


def data_visualization_of_top_10_ea_song_playlists_distribution():
    """Distribution of Western playlist plays."""
    df = pd.read_csv('./music_data/music_detail.csv', header=None)

    print("Generating the Western playlist plays distribution chart...")

    # Print a progress bar (cosmetic; it finishes before the work below starts)
    t = 60
    start = time.perf_counter()

    for i in range(t + 1):
        finish = "▓" * i
        need_do = "-" * (t - i)
        progress = (i / t) * 100
        dur = time.perf_counter() - start

        print("\r{:^3.0f}%[{}->{}]{:.2f}s".format(progress, finish, need_do, dur), end="")

        time.sleep(0.02)

    # Take the natural log of the play counts
    dom = []
    for i in df[4]:
        dom.append(np.log(i))

    # The column name 'collection' is reused here; it holds log play counts
    df['collection'] = dom

    # Set the font and font size (Microsoft YaHei renders Chinese text)
    plt.rcParams['font.sans-serif'] = ['Microsoft YaHei']
    plt.rcParams['font.size'] = 12
    plt.rcParams['axes.unicode_minus'] = False

    # Create the figure
    plt.figure(figsize=(16, 8), dpi=80)
    ax = plt.subplot(1, 1, 1)
    ax.patch.set_color('white')

    # Get the current axes
    lines = plt.gca()

    # Set the spine colors and hide the tick marks
    lines.spines['right'].set_color('none')
    lines.spines['top'].set_color('none')
    lines.spines['left'].set_color((64/255, 64/255, 64/255))
    lines.spines['bottom'].set_color((64/255, 64/255, 64/255))
    lines.xaxis.set_ticks_position('none')
    lines.yaxis.set_ticks_position('none')

    # Draw the histogram and set its color
    ax.hist(df['collection'], bins=30, alpha=0.7, color=(255/255, 153/255, 0/255))
    ax.set_title('Distribution of Western Playlist Plays', fontsize=20)

    # Save the chart
    plt.savefig('./music_image/top_10_ea_song_playlists_distribution.png', dpi=None)

    # Show the chart
    plt.show()

    print("\nWestern playlist plays distribution chart generated and saved to "
          "music_image/top_10_ea_song_playlists_distribution.png")

(10) label_ea_song.py

"""Data visualization: NetEase Cloud Music Western playlist tag chart."""
import squarify
import pandas as pd
import matplotlib.pyplot as plt
import time


def data_visualization_of_label_ea_song():
    """NetEase Cloud Music Western playlist tag chart."""
    df = pd.read_csv('./music_data/music_detail.csv', header=None)

    print("Generating the Western playlist tag chart...")

    # Print a progress bar (cosmetic; it finishes before the work below starts)
    t = 60
    start = time.perf_counter()

    for i in range(t + 1):
        finish = "▓" * i
        need_do = "-" * (t - i)
        progress = (i / t) * 100
        dur = time.perf_counter() - start

        print("\r{:^3.0f}%[{}->{}]{:.2f}s".format(progress, finish, need_do, dur), end="")

        time.sleep(0.02)

    # Collect the distinct tags (column 1 holds tags joined with '-')
    tags = []
    dom2 = []

    for i in df[1]:
        c = i.split('-')

        for j in c:
            if j not in tags:
                tags.append(j)

    # Count how many times each tag appears
    for item in tags:
        num = 0

        for i in df[1]:
            type2 = i.split('-')

            for j in range(len(type2)):
                if type2[j] == item:
                    num += 1

        dom2.append(num)

    # Build the data and keep the ten most frequent tags
    data = {'tags': tags, 'num': dom2}
    frame = pd.DataFrame(data)
    df1 = frame.sort_values(by='num', ascending=False)
    name = df1['tags'][:10]
    income = df1['num'][:10]

    # Set the font before plotting; the font size is fixed when each label is created,
    # so setting it after squarify.plot would not affect the treemap labels
    plt.rcParams['font.sans-serif'] = ['Microsoft YaHei']
    plt.rcParams['font.size'] = 8
    plt.rcParams['axes.unicode_minus'] = False

    # Set the label font size to 6
    plt.rc('font', size=6)

    # Plot details
    colors = ['#993333', '#CC9966',  '#333333', '#663366', '#003366', '#009966', '#FF6600', '#FF0033', '#009999',
              '#333366']
    plot = squarify.plot(sizes=income, label=name, color=colors, alpha=1, value=income, edgecolor='white',
                         linewidth=1.5)

    # Set the title and its font size
    plot.set_title('NetEase Cloud Music Western Playlist Tags', fontsize=13, fontweight='light')

    # Hide the axes
    plt.axis('off')

    # Hide the ticks on the top and right borders
    plt.tick_params(top=False, right=False)

    # Save the chart
    plt.savefig('./music_image/label_ea_song.png', dpi=None)

    # Show the chart
    plt.show()

    print("\nWestern playlist tag chart generated and saved to music_image/label_ea_song.png")
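
The tag statistics above first collect the distinct tags and then rescan every row once per tag. The sketch below shows the same count done in a single pass with `collections.Counter`. The function name `count_tags` and the sample rows are illustrative; the '-' separator matches how music_detail.py joins tags.

```python
from collections import Counter


def count_tags(tag_column, n=10):
    """Count individual tags across rows whose tags are joined with '-'."""
    counter = Counter()
    for joined in tag_column:
        counter.update(tag for tag in str(joined).split('-') if tag)
    return counter.most_common(n)


top = count_tags(['欧美-流行', '欧美-摇滚', '欧美'], n=2)
```

The (tag, count) pairs can be unzipped into the `label` and `sizes` arguments that the treemap call expects.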

(11) music_wordcloud.py

"""Data visualization: playlist description word cloud."""
from wordcloud import WordCloud, ImageColorGenerator
import matplotlib.pyplot as plt
import pandas as pd
import jieba
import time


def data_visualization_of_music_wordcloud():
    """Playlist description word cloud."""
    df = pd.read_csv('./music_data/music_detail.csv', header=None)
    text = ''

    print("Generating the playlist description word cloud...")

    # Print a progress bar (cosmetic; it finishes before the work below starts)
    t = 60
    start = time.perf_counter()

    for i in range(t + 1):
        finish = "▓" * i
        need_do = "-" * (t - i)
        progress = (i / t) * 100
        dur = time.perf_counter() - start

        print("\r{:^3.0f}%[{}->{}]{:.2f}s".format(progress, finish, need_do, dur), end="")

        time.sleep(0.02)

    # Segment each description (column 2) into words with jieba
    for line in df[2]:
        # Skip empty descriptions, which pandas reads as NaN rather than str
        if not isinstance(line, str):
            continue

        # The trailing space keeps the last word of one description apart from the first word of the next
        text += ' '.join(jieba.cut(line, cut_all=False)) + ' '

    # Image used both as the word cloud mask and as the color source
    background_image = plt.imread('./music_image/background_image.jpg')

    # Stopwords are matched against the Chinese tokens, so they stay in Chinese
    stopwords = set()
    stopwords.update(
        ['封面', 'none介绍', '介绍', '歌单', '歌曲', '我们', '自己', '没有', '就是', '可以', '知道', '一起', '不是',
         '因为', '什么', '时候', '还是', '如果', '不要', '那些', '那么', '那个', '所有', '一样', '一直', '不会', '现在',
         '他们', '这样', '最后', '这个', '只是', '有些', '其实', '开始', '曾经', '所以', '不能', '你们', '已经', '后来',
         '一切', '一定', '这些', '一些', '只有', '还有'])

    wc = WordCloud(
        background_color='white',
        mask=background_image,
        font_path='./font_resources/STZHONGS.ttf',
        max_words=2000,
        max_font_size=150,
        random_state=30,
        stopwords=stopwords
    )
    wc.generate_from_text(text)

    # Inspect the most frequent words to decide which uninformative ones to add to the stopwords
    process_word = wc.process_text(text)
    top_words = sorted(process_word.items(), key=lambda e: e[1], reverse=True)
    # print(top_words[:50])

    # Recolor the words using the colors of the background image
    img_colors = ImageColorGenerator(background_image)
    wc.recolor(color_func=img_colors)
    plt.imshow(wc)
    plt.axis('off')

    # Save the image
    wc.to_file("./music_image/music_wordcloud.png")

    # Show the image
    plt.show()

    print("\nPlaylist description word cloud generated and saved to music_image/music_wordcloud.png")
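
The word cloud relies on a stopword list to drop uninformative words before frequencies are computed. A minimal standard-library sketch of that filter-then-count step, working on tokens that have already been segmented, is shown below. The function name `word_frequencies` and the sample tokens are illustrative; in the project the tokens would come from `jieba.cut`.

```python
from collections import Counter


def word_frequencies(tokens, stopwords, n=50):
    """Count tokens after dropping stopwords and whitespace-only entries."""
    kept = (t for t in tokens if t.strip() and t not in stopwords)
    return Counter(kept).most_common(n)


tokens = ['歌单', '夏天', '的', '夏天', '旅行', ' ', '歌单']
freq = word_frequencies(tokens, stopwords={'歌单', '的'})
```

Printing the first few entries of such a list is a quick way to spot further candidates for the stopword set.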