Data Science for DJing

Introduction

Electronic Dance Music (EDM) is a unique genre of music in that live performances rarely involve instruments or singing. Rather, DJs excite the crowd through their mixing techniques and song selection. Thanks to 1001tracklists.com and the Spotify Developer Web API, we have access to data on which songs DJs have played during their sets and what audio characteristics these songs have.

In this article, I will explain how I:

  1. Scraped tracklist data (the list of songs played, in order) from 1001tracklists.com
  2. Passed those songs through the Spotify API to receive the audio features for each track
  3. Cleaned, transformed, and analyzed the combined data to gain insights about how audio features, like tempo and key, change throughout a DJ set

Scraping the DJ set song data

1001tracklists.com is a cool website that crowdsources the sequential list of songs that DJs play at their shows. Fans go on the website and construct tracklists and provide links to the songs and to a recording of the set itself.

Screenshot of a tracklist page from 1001tracklists.com

As an EDM fan / DJ hobbyist, I stumbled across the site and saw the potential for this project. I picked 10 famous DJs and for each of them, scraped the data of their 10 latest tracklists. With the song order and song choice data, I knew I could use the Spotify API (which I had previously heard of) to analyze the DJs’ song choice technique across a set of quantitative metrics.

Note: For details on how I scraped the data, please check out my Jupyter Notebook.

Retrieving song characteristics from the Spotify API

For every song in its catalog, Spotify has pre-calculated what it calls audio features, such as tempo and key (we will get deeper into these in a later section). The main output of the web scraping exercise above was the Spotify ID of each song played by each DJ, so that I could run each ID through the API and receive this response for each song:

Example of the audio features calculated by Spotify for a specific song (screenshot from the Spotify API docs)

With these responses collected, it was easy enough to make a pandas DataFrame containing the ordered list of songs played in each DJ set.
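A minimal sketch of the API step, assuming a valid OAuth access token is already in hand. It batches Spotify track IDs and requests them from the Web API's `/v1/audio-features` endpoint, which accepts up to 100 IDs per call. The helper names (`chunked`, `http_get_json`, `fetch_audio_features`) are hypothetical and chosen for this illustration; the notebook's actual code may differ.

```python
import json
import urllib.parse
import urllib.request

AUDIO_FEATURES_URL = "https://api.spotify.com/v1/audio-features"
MAX_IDS_PER_REQUEST = 100  # the endpoint accepts at most 100 IDs per call


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def http_get_json(url, token):
    """GET `url` with a bearer token and decode the JSON body."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_audio_features(track_ids, token, get_json=http_get_json):
    """Return one audio-features dict per track ID that Spotify recognizes.

    `get_json` is injectable (an assumption of this sketch) so the function
    can be exercised without network access.
    """
    features = []
    for batch in chunked(track_ids, MAX_IDS_PER_REQUEST):
        query = urllib.parse.urlencode({"ids": ",".join(batch)})
        payload = get_json(f"{AUDIO_FEATURES_URL}?{query}", token)
        # IDs the API cannot resolve come back as null entries; skip them.
        features.extend(f for f in payload["audio_features"] if f is not None)
    return features
```

The returned list of dicts can be passed straight to `pandas.DataFrame(...)`, giving one row per song with columns such as `tempo`, `key`, and `danceability`.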

Data Cleaning and Preparation

As is true with all data projects, I had to do some data cleaning and preparation to get the data in a nice and analyzable format.

How the data looked at the beginning of the project

In the picture above, each row represents one song played by a DJ in their set. Position represents the numerical order of when a DJ played the song in that tracklist (so position = 1 would be the first song played in the set). I quickly realized that I needed a way to equally compare all the DJ sets even though they each had a unique number of songs. Also, some songs were missing because they were not available on Spotify.

My idea was to measure the progression of each set in terms of the percent completion relative to the song position. So I divided the position of each track by the number of tracks in the tracklist to get the percent completion for each row. I then inserted rows and forward-filled the song data so that each tracklist had rows going from 0 to 100 percent completion (details on how I did this can be seen in my Jupyter Notebook).
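One way this resampling could look in pandas is sketched below. It is an illustration under stated assumptions, not the notebook's code: the column names `position` and `tempo`, the function name `to_percent_grid`, the rounding to whole percents, and the back-fill of the rows before the first song are all choices made for this example.

```python
import pandas as pd


def to_percent_grid(tracklist: pd.DataFrame, total_tracks: int) -> pd.DataFrame:
    """Resample one tracklist onto a 0-100 percent-completion grid."""
    df = tracklist.sort_values("position").copy()
    # Percent completion = position in the set / number of tracks in the set.
    df["pct_complete"] = (df["position"] / total_tracks * 100).round().astype(int)
    # If several songs round to the same percent, keep the later one (assumption).
    df = df.drop_duplicates("pct_complete", keep="last").set_index("pct_complete")
    # Insert a row for every percent from 0 to 100, then forward-fill so each
    # row carries the most recently played song. Rows before the first song
    # are back-filled from it (assumption).
    grid = df.reindex(range(101)).ffill().bfill()
    grid.index.name = "pct_complete"
    return grid.reset_index()


# A four-song set in which the third song was not available on Spotify.
example = pd.DataFrame({"position": [1, 2, 4], "tempo": [126.0, 128.0, 130.0]})
grid = to_percent_grid(example, total_tracks=4)
print(grid.iloc[[0, 25, 50, 75, 100]])
```

Because every set ends up with exactly 101 rows, sets of different lengths can be averaged or plotted against the same x-axis. The missing third song is simply covered by the song before it.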

Final dataframe used for analysis

With the data in this format, I could now plot and compare how the values for these music features changed over the course of each DJ set.

Diving into the DJ Data

After I plotted the data across all the dimensions, tempo showed some interesting trends. Tempo is measured here in beats per minute (BPM), the same unit we use for heart rate. This metric describes the speed or pace of a song. Let’s take a look at how these DJs manage tempo throughout a set.

Notice how for most DJs, the tempo hovers around 128 BPM. This is a common tempo for EDM. One reason is that many believe it is easy to dance at this tempo, and we can look at the danceability feature from Spotify to see if there is any truth to this.

Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.

Spotify API Documentation

Below we can see the average danceability for each DJ over a set:

We see that for DJs who kept near 128 BPM (like Kaskade, Diplo, Alesso, and Zedd), the average danceability was relatively high (above 0.6 danceability is on the higher end according to Spotify’s calculated distribution for this metric). The main takeaway for an aspiring DJ: keep close to 128 BPM to keep the crowd dancing.

How a DJ picks the next song

So we know what tempo most DJs aim for throughout a set, but how do they pick their next song? Let’s start by understanding the software that a DJ uses.

Screenshot from djay Pro 2 (the DJing software I use)

There is a lot going on in the picture above but to put it simply, the pink box is where a DJ mixes songs together and the yellow box is where the DJ picks the next song. Notice how BPM and key are prominent in both the mixing and song choice. That’s because it’s hard to mix two songs that have extremely different tempos or keys without disrupting the flow of the music. We can use our data to find out how much the DJs in our sample change these features from song-to-song. Let’s look at tempo first:

Median change in tempo (BPM) from song to song

The box plots above show the distribution of BPM changes the DJs made in their sets from song to song. For example, if Zedd played a song at 126 BPM and then played a song at 128 BPM right after, we would calculate a change in tempo of 2 BPM. As you can see, most of the DJs have a median change of tempo below 6 BPM and rarely change the tempo by more than 10 BPM between songs. Advice to aspiring DJs: only make small changes to BPM from song to song (between 0 and 10 BPM).
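The song-to-song calculation described above can be sketched with a grouped `diff` in pandas. This is an illustration only: the column names (`dj`, `tracklist_id`, `position`, `tempo`), the function name `median_tempo_change`, and the use of the absolute difference are assumptions, and the DJs and tempos below are made-up placeholders apart from the 126 to 128 BPM step taken from the example in the text.

```python
import pandas as pd


def median_tempo_change(songs: pd.DataFrame) -> pd.Series:
    """Median absolute BPM change between consecutive songs, per DJ."""
    ordered = songs.sort_values(["dj", "tracklist_id", "position"])
    # diff() within each tracklist, so the last song of one set is never
    # compared with the first song of the next set.
    change = ordered.groupby(["dj", "tracklist_id"])["tempo"].diff().abs()
    # The first song of every set has no predecessor (NaN); median() skips it.
    return change.groupby(ordered["dj"]).median()


songs = pd.DataFrame({
    "dj": ["dj_a"] * 4 + ["dj_b"] * 3,
    "tracklist_id": ["a1", "a1", "a2", "a2", "b1", "b1", "b1"],
    "position": [1, 2, 1, 2, 1, 2, 3],
    "tempo": [126.0, 128.0, 100.0, 104.0, 100.0, 110.0, 90.0],
})
medians = median_tempo_change(songs)
print(medians)
```

Grouping by tracklist before taking the difference matters: without it, the jump between two separate sets would be counted as if it were a transition the DJ actually mixed.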

In music theory, the key of a piece is the group of pitches, or scale, that forms the basis of a music composition in classical, Western art, and Western pop music.

Wikipedia entry for key

Key is another key consideration when a DJ picks the next song. To quantify key, Spotify uses pitch class notation, which labels each key with an integer (C=0, C♯/D♭=1, D=2, etc.). Each song in our data has an integer from 0 to 11 that denotes the song’s key. In theory, mixing two songs that are close together in key should sound more harmonious than mixing two songs that are relatively far apart in key. We can see if our DJs follow this rule by finding the differences in pitch class integers between songs.
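A small sketch of the key comparison follows. `key_change` takes the plain difference in pitch class integers described above. `circular_key_change` is a variant that is not part of the analysis in this article: it treats the twelve pitch classes as a circle, so B (11) and C (0) count as one step apart instead of eleven. Both function names are hypothetical. Spotify reports a key of -1 when no key was detected, so such songs are skipped.

```python
from typing import Optional

NO_KEY_DETECTED = -1  # Spotify's value when it could not detect a key


def key_change(current_key: int, next_key: int) -> Optional[int]:
    """Plain absolute difference between two pitch class integers (0-11)."""
    if NO_KEY_DETECTED in (current_key, next_key):
        return None
    return abs(next_key - current_key)


def circular_key_change(current_key: int, next_key: int) -> Optional[int]:
    """Shortest distance around the 12 pitch classes (variant, 0-6)."""
    diff = key_change(current_key, next_key)
    if diff is None:
        return None
    return min(diff, 12 - diff)


# C (0) to G (7), then B (11) to C (0).
print(key_change(0, 7), circular_key_change(0, 7))    # 7 5
print(key_change(11, 0), circular_key_change(11, 0))  # 11 1
```

The two measures can disagree substantially, which is one reason the plain integer difference should be read as a rough proxy for how far apart two keys sound.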

Median change in key (difference in pitch class integers)

We see that the median change in key between songs ranges from 3 to 5 pitch classes, and that our DJs rarely make huge changes in key from one song to the next. This concept of mixing in key is more complicated than I’m presenting here. Many types of DJ software split the key integers into smaller groups, distinguishing between minor and major keys. Nevertheless, a good rule of thumb for DJs looking at this data is to keep the songs you play in succession as close in key as possible.

Aspects of DJing not addressed by the data

Here I want to address an aspect of DJing that cannot be seen in the data presented. DJs have the ability to change the tempo and key of songs that are playing so that they mix better with the next song. For example, if I’m currently playing a song at 123 BPM in the key of C (pitch class integer of 0) and want to mix in a song that is at 131 BPM in the key of G (pitch class integer of 7), I can use my DJ controller to gradually increase the tempo of the currently playing song and also alter the key of the song so that it’s closer to the key of the next song. This is all executed through the DJ software and controller. However, changing the tempo or key of a song by too much becomes audible and disrupts the flow of the music.
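To put a number on the 123 to 131 BPM example, the sketch below computes how much the playing song would have to be sped up. It also reports the pitch shift in semitones that the same speed change would cause if the audio were simply played faster with no pitch correction, which is an assumption of this example (DJ software can typically hold the key constant while changing tempo). The function name `tempo_adjustment` is hypothetical.

```python
import math


def tempo_adjustment(current_bpm: float, target_bpm: float):
    """Return (percent speed change, semitone shift without pitch correction)."""
    ratio = target_bpm / current_bpm
    percent = (ratio - 1) * 100
    # Playing audio `ratio` times faster raises its pitch by 12 * log2(ratio)
    # semitones, since one octave (12 semitones) is a doubling of frequency.
    semitones = 12 * math.log2(ratio)
    return percent, semitones


percent, semitones = tempo_adjustment(123, 131)
print(f"{percent:.1f}% faster, about {semitones:.2f} semitones if pitch follows speed")
```

An 8 BPM gap therefore works out to a speed change of roughly 6.5 percent, which helps explain why jumps much larger than 10 BPM are hard to hide.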

There are also many effects and edits that DJs add to the underlying songs that make transitions smoother which cannot be seen in the data. So if you were wondering how some DJs we looked at got away with huge shifts in tempo and key (*cough* Porter Robinson), this is how.

Conclusion

I hope you enjoyed this look into DJing using data science! If you’re interested in the code, please check out my GitHub repo for this project. If I had more time and data, I would love to examine the trends of DJs throughout thousands of their sets and create an algorithm that emulates a DJ’s style given a list of songs to play. Also, someday we might have data around DJ transitions and effects through audio analysis. Until then, thanks for taking the time to read my article!

Translated from: https://medium.com/@jcamagong/data-science-for-djing-b4c7a422c197
