Deep Learning Techniques for Music Generation
Performance RNN
MusicVAE
WaveNet
Abstract
Analysis along five dimensions:
- Objective
  - melody
  - polyphony
  - accompaniment
  - counterpoint
- Representation
  - concepts: waveform, spectrogram, note, chord, meter and beat
  - format: MIDI, piano roll or text
  - encoding: scalar, one-hot or many-hot
- Architecture
  - feedforward network
  - recurrent network
  - autoencoder
  - generative adversarial networks
- Challenge
  - variability, interactivity and creativity
- Strategy
  - single-step feedforward, iterative feedforward, sampling or input manipulation
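The three encodings named under Representation can be illustrated with a minimal sketch. It assumes pitches are MIDI note numbers in the range 0 to 127; that vocabulary, the constant `NUM_PITCHES` and the function names are illustrative choices, not something these notes prescribe.

```python
# Assumption: a pitch is a MIDI note number in 0..127.
NUM_PITCHES = 128


def scalar_encode(pitch):
    """Scalar encoding: a single real value, here the pitch scaled to [0, 1]."""
    return pitch / (NUM_PITCHES - 1)


def one_hot_encode(pitch):
    """One-hot encoding: a vector with exactly one 1, at the index of the pitch."""
    vec = [0] * NUM_PITCHES
    vec[pitch] = 1
    return vec


def many_hot_encode(pitches):
    """Many-hot encoding: one 1 per simultaneous note, e.g. for a chord."""
    vec = [0] * NUM_PITCHES
    for pitch in pitches:
        vec[pitch] = 1
    return vec


a4 = 69                      # MIDI number of A4
c_major_chord = [60, 64, 67]  # C4, E4, G4

print(scalar_encode(a4))
print(one_hot_encode(a4).index(1))
print([i for i, bit in enumerate(many_hot_encode(c_major_chord)) if bit])
```

A one-hot vector suits a monophonic melody (one note at a time), while a many-hot vector can represent several simultaneous notes in a single time step.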
Introduction
Type
- Melody
Single-voice monophonic melody
- Polyphony
Single-voice polyphony (also named single-track polyphony), e.g. chords
- Multivoice or Multitrack
Multivoice polyphony (also named multitrack polyphony)
- Accompaniment
  - Counterpoint, composed of one or more melodies (voices)
  - Chord progression, which provides some associated harmony
Destination and Use
- Audio system
play the generated content
- Sequencer software
process the generated content (MIDI)
- Human(s)
music score
Mode
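- Automatic, without human intervention
- With some control interface through which human users exert some interactive control over the generation process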
Style
- coherence
- coverage (versus sparsity)
- scope (specialized versus large breadth)
Representation
Audio
Waveform
Transformed Representations
Spectrogram
A common transformed representation of audio is the spectrum, obtained via a Fourier transform.
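Chromagram
A variation of the spectrogram that is independent of the octave.
The accompanying figure shows the chromagram of a C major scale played on a piano:
- the x axis, shared by the four subfigures (a to d), represents time (in seconds)
- the y axis of (a) represents the note
- the y axis of (b) and (d) represents the chroma (pitch class)
- the y axis of (c) represents the amplitude
- for the chromagrams (b) and (d), a third axis, color, represents the intensity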
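To make the link between spectrum and chromagram concrete, here is a rough sketch that computes the magnitude spectrum of one audio frame with a naive discrete Fourier transform and folds it into 12 pitch classes. It assumes equal temperament with A4 = 440 Hz; the function names, the 8 kHz sample rate and the 400-sample frame are illustrative assumptions. A real chromagram repeats this for successive frames over time, and libraries typically use an FFT and smoother frequency-to-chroma weighting.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Naive O(n^2) DFT; returns the magnitudes of bins 0..n/2."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def pitch_class(freq_hz):
    """Pitch class 0..11 (0 = C), assuming equal temperament and A4 = 440 Hz."""
    midi = round(69 + 12 * math.log2(freq_hz / 440.0))
    return midi % 12


def chroma_vector(mags, sample_rate, frame_len):
    """Fold spectrum bins into 12 pitch classes, discarding the octave."""
    chroma = [0.0] * 12
    for k in range(1, len(mags)):  # skip the DC bin (0 Hz)
        freq = k * sample_rate / frame_len
        chroma[pitch_class(freq)] += mags[k]
    return chroma


# Illustrative frame: a pure 440 Hz sine, 400 samples at 8 kHz (20 Hz per bin).
sample_rate = 8000
frame_len = 400
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(frame_len)]

chroma = chroma_vector(dft_magnitudes(tone), sample_rate, frame_len)
strongest = max(range(12), key=lambda pc: chroma[pc])
print(strongest)  # 9, the pitch class of A
```

Because the octave is discarded by the modulo in `pitch_class`, an 880 Hz tone (A5) lands in the same chroma bin as 440 Hz (A4).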
Main Concepts
Note (Pitch, Duration, Dynamics)
Pitch
- frequency
unit: Hz
- vertical position (height) on a score
- pitch notation
pitch class plus an octave number, e.g. $A_4$ (A440, i.e. 440 Hz, the general pitch-tuning standard)
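The relation between pitch notation and frequency can be sketched as follows, assuming twelve-tone equal temperament with A4 tuned to 440 Hz and octave numbers that change at C (scientific pitch notation). The function names and the sharps-only note spelling are illustrative assumptions.

```python
import math

# Pitch classes within one octave, spelled with sharps only (an assumption).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_to_frequency(name, octave, a4_hz=440.0):
    """Frequency in Hz of a note such as ("A", 4)."""
    # Signed distance in semitones from A4.
    semitones = PITCH_CLASSES.index(name) - PITCH_CLASSES.index("A") + 12 * (octave - 4)
    return a4_hz * 2 ** (semitones / 12)


def frequency_to_note(freq_hz, a4_hz=440.0):
    """Nearest (pitch class, octave) for a frequency in Hz."""
    semitones = round(12 * math.log2(freq_hz / a4_hz))
    index = PITCH_CLASSES.index("A") + semitones  # position relative to C4
    return PITCH_CLASSES[index % 12], 4 + index // 12


print(note_to_frequency("A", 4))   # 440.0
print(frequency_to_note(880.0))    # ('A', 5)
print(round(note_to_frequency("C", 4), 2))
```

Each octave doubles the frequency, which is why the exponent divides the semitone distance by 12.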
Duration
- absolute value, in ms
- relative value, e.g. a quarter note or an eighth note
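The two ways of expressing duration are connected by the tempo. The sketch below converts a relative note value into milliseconds, assuming the tempo is given in beats per minute and that one beat is a quarter note unless stated otherwise; `duration_ms` and its parameters are hypothetical names chosen for this example.

```python
from fractions import Fraction


def duration_ms(note_value, tempo_bpm, beat_value=Fraction(1, 4)):
    """Absolute duration in ms of a relative note value.

    note_value and beat_value are fractions of a whole note,
    e.g. Fraction(1, 4) for a quarter note.
    """
    beats = Fraction(note_value) / beat_value   # how many beats the note lasts
    ms_per_beat = Fraction(60_000) / tempo_bpm  # 60,000 ms in one minute
    return float(beats * ms_per_beat)


print(duration_ms(Fraction(1, 4), 120))  # 500.0, quarter note at 120 BPM
print(duration_ms(Fraction(1, 8), 120))  # 250.0, eighth note at 120 BPM
```

The same quarter note therefore has a different absolute duration at every tempo, which is why scores use relative values.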
Dynamics
- quantitative value (dB)
- qualitative value
an annotation on a score about how to perform the note
{