Using Python to Review 2024's Global Pop Music: Which Chart Songs Took Over Our Playlists?

2401_86372512

Published 2024-09-08 09:18:14


Tags: python, programming languages

Article link: https://blog.csdn.net/2401_86372512/article/details/142017940


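The metrics involved are fairly specialized and I don't know them well, so I can only analyze them according to my own understanding. Readers who know music are welcome to offer a more expert view.

The analysis tool this time is Python. If you are not comfortable with Python, Tableau works too, and its charts will look nicer.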

I. Data preparation

1. Importing the data

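import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# pyecharts 0.5.x import style; from pyecharts 1.0 on, the chart classes live in pyecharts.charts
from pyecharts import Bar, WordCloud, Pie, Line

# Jupyter magics: draw matplotlib figures inline, rendered as SVG
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

# Load the Top 50 spreadsheet (adjust the path to wherever your copy is saved)
df = pd.read_excel(r'C:\Users\Administrator\Desktop\top50.xlsx')
df.head()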

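These lines need no thought: whenever you open Python for data analysis, write them first or simply paste them in. I keep my commonly used code saved and pull it up when I need it, which saves time.

The column names are all in English, so I used Baidu to translate them: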

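Track.Name - track title;
Artist.Name - artist;
Genre - genre;
Beats Per Minute (BPM) - beats per minute, i.e. the tempo;
Energy - energy: the higher the score, the more energetic the song;
Danceability - danceability: the higher the score, the easier it is to dance to the song;
Loudness (dB) - loudness in decibels: the higher the value, the louder the song, and the lower, the quieter;
Liveness - liveness: the higher the value, the more likely the song is a live recording;
Valence - mood: the higher the value, the more upbeat the song, and the lower, the more downbeat;
Length - duration;
Acousticness - how acoustic the song is;
Speechiness - speechiness: the higher the value, the more spoken word the song contains;
Popularity - how popular the song is.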

2. Renaming the columns

# Reuses the imports from step 1; reload the file, then map the English column names to Chinese
df = pd.read_excel(r'C:\Users\Administrator\Desktop\top50.xlsx')

# 曲名 = track, 歌手 = artist, 类型 = genre, 节奏 = tempo, 能量 = energy, 舞蹈性 = danceability,
# 分贝 = loudness, 现场感 = liveness, 时长 = length, 语言 = speechiness, 火热程度 = popularity
# rename() skips keys that match no column, so compare the keys with df.columns if a name stays unchanged
df = df.rename(columns={'Track.Name': '曲名', 'Artist.Name': '歌手', 'Genre': '类型', 'Beats.Per.Minute': '节奏', 'Energy': '能量',
                        'Danceability': '舞蹈性', 'Loudness…dB…': '分贝', 'Liveness': '现场感', 'Length.': '时长', 'Speechiness': '语言', 'Popularity': '火热程度'})
df.head(10)

Reading English column names never feels natural to me, so we can rename the columns to Chinese.
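One detail of this step is easy to miss: by default, pandas' rename skips any key that matches no column and raises no error, so a misspelled source name simply leaves that column in English. Here is a minimal sketch on a made-up two-row table. The sample rows and the key No.Such.Column are hypothetical and are not taken from top50.xlsx.

```python
import pandas as pd

# Hypothetical two-row sample; the real data comes from top50.xlsx.
sample = pd.DataFrame({
    "Track.Name": ["Song A", "Song B"],
    "Popularity": [90, 85],
})

# "No.Such.Column" is a deliberately wrong key, used only for illustration.
mapping = {"Track.Name": "曲名", "Popularity": "火热程度", "No.Such.Column": "分贝"}

# Default behaviour: keys that match no column are skipped without any error.
renamed = sample.rename(columns=mapping)
print(list(renamed.columns))

# errors="raise" turns the silent skip into a KeyError.
try:
    sample.rename(columns=mapping, errors="raise")
    unmatched_detected = False
except KeyError:
    unmatched_detected = True
print(unmatched_detected)
```

Passing errors="raise" while developing is one way to catch a mistyped column name early; whether that is worth it for a one-off notebook is a matter of taste.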

II. Data analysis

1. Ranking the world's most popular music genres of 2019

# Reuses the imports from step 1 of the data preparation
df = pd.read_excel(r'C:\Users\Administrator\Desktop\top50.xlsx')
df = df.rename(columns={'Track.Name': '曲名', 'Artist.Name': '歌手', 'Genre': '类型', 'Beats.Per.Minute': '节奏', 'Energy': '能量',
                        'Danceability': '舞蹈性', 'Loudness…dB…': '分贝', 'Liveness': '现场感', 'Length.': '时长', 'Speechiness': '语言', 'Popularity': '火热程度'})

# Count the tracks in each genre, most common genre first
genre_counts = df.groupby('类型')['曲名'].count().reset_index()
genre_counts = genre_counts.sort_values(by='曲名', ascending=False).reset_index(drop=True)

# Word cloud sized by track count (pyecharts 0.5.x API)
cloud = WordCloud(title='2019最流行的音乐类型', width=800, height=420)
cloud.add(name='音乐类型', attr=genre_counts['类型'], value=genre_counts['曲名'], word_size_range=(12, 60))
cloud.render('2019全球最流行的音乐类型.html')
cloud

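The word cloud shows that pop music (pop and dance pop) was still the hottest worldwide in 2019. Since I don't recognize the other genres, the analysis below focuses on pop and dance pop and treats them as a single category.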
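To make the counting behind the word cloud concrete, here is a small sketch of the same groupby-and-count step on a made-up four-row table. The track names and the output column name 数量 (count) are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample rows; the real table has 50 tracks.
sample = pd.DataFrame({
    "曲名": ["Song A", "Song B", "Song C", "Song D"],
    "类型": ["pop", "dance pop", "pop", "latin"],
})

# Count tracks per genre, sort from most to least common,
# and name the count column 数量 instead of reusing 曲名.
genre_counts = (
    sample.groupby("类型")["曲名"].count()
    .sort_values(ascending=False)
    .reset_index(name="数量")
)
print(genre_counts)
```

Each row of the result pairs a genre with its number of tracks, which is the shape the word cloud needs: one label column and one size column.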

2. Ranking the global pop songs of 2019

# Reuses the imports from step 1 of the data preparation
df = pd.read_excel(r'C:\Users\Administrator\Desktop\top50.xlsx')
df = df.rename(columns={'Track.Name': '曲名', 'Artist.Name': '歌手', 'Genre': '类型', 'Beats.Per.Minute': '节奏', 'Energy': '能量',
                        'Danceability': '舞蹈性', 'Loudness…dB…': '分贝', 'Liveness': '现场感', 'Length.': '时长', 'Speechiness': '语言', 'Popularity': '火热程度'})

# Relabel dance pop as pop (in the genre column only), then keep just the pop tracks
df['类型'] = df['类型'].replace('dance pop', 'pop')
df = df[df['类型'] == 'pop'].reset_index(drop=True)

With the code above, every dance pop track has been relabelled as pop, and only the pop rows are kept.
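As a quick check of what this relabel-and-filter step does, the sketch below runs it on a made-up table. The rows are hypothetical; the electropop row is included only to show that replace on a column matches whole cell values, so genres that merely contain the word pop are left alone.

```python
import pandas as pd

# Hypothetical sample rows, for illustration only.
sample = pd.DataFrame({
    "曲名": ["Song A", "Song B", "Song C", "Song D"],
    "类型": ["pop", "dance pop", "electropop", "latin"],
})

# Whole-value replacement: only cells equal to "dance pop" change.
sample["类型"] = sample["类型"].replace("dance pop", "pop")

# Keep the pop rows and renumber the index from 0.
pop_only = sample[sample["类型"] == "pop"].reset_index(drop=True)
print(pop_only)
```

If other pop sub-genres should count as pop as well, they would each need their own entry in the replacement, which is a choice the article does not make.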

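# Continuing from the filtered pop-only df built earlier in this section
# Average the numeric columns per track, then sort by popularity, highest first
numeric_cols = df.select_dtypes('number').columns.tolist()
df.pivot_table(values=numeric_cols, index='曲名').sort_values(by='火热程度', ascending=False).reset_index()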


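These are the 15 most popular pop songs worldwide.

Combined with the earlier figure, we can see that these pop songs score low on speechiness, and their lyrics tend to be graceful and evocative. Their lengths are also moderate, mostly around 3 minutes…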
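Observations like "mostly around 3 minutes" can be checked with a few lines of pandas. The sketch below uses made-up numbers, assumes the 时长 column is in seconds, and keeps only the top 3 rows where the article's table has 15; the 150 to 210 second window is an arbitrary choice for "around 3 minutes".

```python
import pandas as pd

# Hypothetical values, for illustration only; 时长 is assumed to be in seconds.
sample = pd.DataFrame({
    "曲名": ["Song A", "Song B", "Song C", "Song D"],
    "火热程度": [92, 90, 88, 85],
    "时长": [175, 190, 201, 260],
    "语言": [4, 6, 5, 30],
})

# Take the most popular rows (3 here, 15 in the article).
top = sample.nlargest(3, "火热程度")

# Median length in minutes, and the share of songs between 2.5 and 3.5 minutes.
median_minutes = top["时长"].median() / 60
share_near_3min = top["时长"].between(150, 210).mean()

# Median speechiness of the same rows.
median_speechiness = top["语言"].median()

print(round(median_minutes, 2), share_near_3min, median_speechiness)
```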

3. Handing out awards by popularity tier

# Reuses the imports from step 1 of the data preparation
df = pd.read_excel(r'C:\Users\Administrator\Desktop\top50.xlsx')
df = df.rename(columns={'Track.Name': '曲名', 'Artist.Name': '歌手', 'Genre': '类型', 'Beats.Per.Minute': '节奏', 'Energy': '能量',
                        'Danceability': '舞蹈性', 'Loudness…dB…': '分贝', 'Liveness': '现场感', 'Length.': '时长', 'Speechiness': '语言', 'Popularity': '火热程度'})

# Relabel dance pop as pop, then keep just the pop tracks
df['类型'] = df['类型'].replace('dance pop', 'pop')
df = df[df['类型'] == 'pop'].reset_index(drop=True)

# One row per track with its (mean) popularity, highest first
df = df.pivot_table(values='火热程度', index='曲名').sort_values(by='火热程度', ascending=False).reset_index()

def grade(火热程度):
    # 年度最热 = hottest of the year, 年度火热 = hot of the year, 年度流行 = popular of the year
    if 火热程度 >= 90:
        return '年度最热'
    elif 火热程度 >= 85:
        return '年度火热'
    else:
        return '年度流行'

# Add the award as a new column
df['授予荣誉'] = df['火热程度'].apply(grade)

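As we know, many media outlets like to publish charts and hand out awards to songs. Such awards are usually scored against several criteria to produce a combined ranking. That is fairly complicated, so here I award by popularity alone: a score of 90 or above is "hottest of the year" (年度最热), 85 up to but not including 90 is "hot of the year" (年度火热), and anything below 85 is "popular of the year" (年度流行). The code is simple: define the tiers, then add a column named 授予荣誉 (award) to the data.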
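The tier boundaries are easiest to verify on a handful of edge values. The sketch below restates the grading function and compares it with pandas' cut, a swapped-in alternative that the article itself does not use; the score list is made up to hit each boundary.

```python
import pandas as pd

def grade(score):
    """Map a popularity score to an award tier (same thresholds as the article)."""
    if score >= 90:
        return "年度最热"
    elif score >= 85:
        return "年度火热"
    else:
        return "年度流行"

# Hypothetical scores chosen to sit on and around the boundaries.
scores = pd.Series([95, 90, 89, 85, 84.5, 70])

awards_fn = scores.apply(grade)

# Alternative with pd.cut: right=False makes each bin closed on the left,
# i.e. [-inf, 85), [85, 90), [90, inf), matching the >= comparisons above.
awards_cut = pd.cut(
    scores,
    bins=[float("-inf"), 85, 90, float("inf")],
    right=False,
    labels=["年度流行", "年度火热", "年度最热"],
)

print(awards_fn.tolist())
print(awards_cut.astype(str).tolist())
```

Both approaches give the same labels here. The function is easier to read for three tiers, while cut keeps the boundaries in a single list if more tiers are added later.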

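# Continuing from the code above (df now has the 授予荣誉 column)
plt.rcParams['font.sans-serif'] = ['SimHei']  # a font with Chinese glyphs, so the labels render
plt.figure(figsize=(8, 4))
# One bar per award tier, in a fixed order
sns.countplot(x='授予荣誉', data=df, order=['年度最热', '年度火热', '年度流行'], palette='muted')
plt.title('2019年全球流行音乐荣誉', loc='left', size=15)
plt.xlabel('授予荣誉', size=15)
plt.ylabel('数量', size=15)
plt.grid(False)
sns.despine(left=False)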
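To read the exact numbers behind the bars, the same counts can be computed directly. This is a minimal sketch on a made-up award column; it keeps the chart's tier order and reports a tier with no songs as 0 instead of dropping it.

```python
import pandas as pd

# Hypothetical award labels, for illustration only.
awards = pd.Series(["年度火热", "年度流行", "年度流行"], name="授予荣誉")

# Same order as the bars in the chart.
order = ["年度最热", "年度火热", "年度流行"]

# value_counts() omits tiers that never occur; reindex restores them with 0.
counts = awards.value_counts().reindex(order, fill_value=0)
print(counts)
```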
