Assignment | 05-week1 - Improvise a Jazz Solo with an LSTM Network

ZJ_Improve

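This series only adds my personal study notes to the programming assignments of the original course. If you find any mistakes, corrections are welcome. - ZJ

Coursera course | deeplearning.ai | NetEase Cloud Classroom

CSDN: http://blog.csdn.net/JUNJUN_ZHAO/article/details/79420913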

Welcome to your final programming assignment of this week! In this notebook, you will implement a model that uses an LSTM to generate music. You will even be able to listen to your own music at the end of the assignment.

You will learn to:
- Apply an LSTM to music generation.
- Generate your own jazz music with deep learning.

Please run the following cell to load all the packages required in this assignment. This may take a few minutes.

from __future__ import print_function
import IPython
import sys
from music21 import *
import numpy as np
from grammar import *
from qa import *
from preprocess import * 
from music_utils import *
from data_utils import *
from keras.models import load_model, Model
from keras.layers import Dense, Activation, Dropout, Input, LSTM, Reshape, Lambda, RepeatVector
from keras.initializers import glorot_uniform
from keras.utils import to_categorical
from keras.optimizers import Adam
from keras import backend as K

d:\program files\python36\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.

1 - Problem statement

You would like to create a jazz music piece specially for a friend’s birthday. However, you don’t know how to play any instruments or compose music. Fortunately, you know deep learning and will solve this problem using an LSTM network.

You will train a network to generate novel jazz solos in a style representative of a body of performed work.

1.1 - Dataset

You will train your algorithm on a corpus of Jazz music. Run the cell below to listen to a snippet of the audio from the training set:

IPython.display.Audio('./data/30s_seq.mp3')

(The audio clip could not be uploaded to this post.)

We have taken care of preprocessing the musical data to render it in terms of musical “values.” You can informally think of each “value” as a note, which comprises a pitch and a duration. For example, if you press down a specific piano key for 0.5 seconds, then you have just played a note. In music theory, a “value” is actually more complicated than this: it also captures the information needed to play multiple notes at the same time. For example, when playing a music piece, you might press down two piano keys at the same time (playing multiple notes at the same time generates what’s called a “chord”). But we don’t need to worry about the details of music theory for this assignment. All you need to know is that we will obtain a dataset of values and train an RNN model to generate sequences of values.
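
As a rough illustration of what a vocabulary of “values” could look like, the sketch below indexes a toy corpus. The tuple encoding (pitches, duration) and the names `corpus`, `values_indices_toy`, `indices_values_toy` and `n_values_toy` are assumptions made for this example only; the assignment's real encoding comes from the provided preprocessing helpers and is not shown here.

```python
# Hypothetical toy corpus: each "value" is (tuple of pitches, duration).
# This encoding is an assumption for illustration, not the assignment's format.
corpus = [
    (("C4",), 0.5),             # a single note held for 0.5
    (("E4",), 0.25),
    (("C4", "E4", "G4"), 1.0),  # several notes at once, i.e. a chord
    (("C4",), 0.5),             # repeated value, already in the vocabulary
    (("E4",), 0.25),
]

# Build the vocabulary: every distinct value gets one integer index.
unique_values = sorted(set(corpus))
values_indices_toy = {value: i for i, value in enumerate(unique_values)}
indices_values_toy = {i: value for value, i in values_indices_toy.items()}

# The corpus becomes a sequence of integer indices.
encoded = [values_indices_toy[value] for value in corpus]
n_values_toy = len(unique_values)

print("number of unique values:", n_values_toy)
print("encoded corpus:", encoded)
```

In this toy setup a chord is simply one more entry in the vocabulary, which is why the model only ever has to choose among a fixed number of values.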

Our music generation system will use 78 unique values. Run the following code to load the raw music data and preprocess it into values. This might take a few minutes.

X, Y, n_values, indices_values = load_music_utils()
print('shape of X:', X.shape)
print('number of training examples:', X.shape[0])
print('Tx (length of sequence):', X.shape[1])
print('total # of unique values:', n_values)
print('Shape of Y:', Y.shape)
# 60 training examples in total, each a sequence of length 30; the vocabulary of note/chord values has 78 entries

shape of X: (60, 30, 78)
number of training examples: 60
Tx (length of sequence): 30
total # of unique values: 78
Shape of Y: (30, 60, 78)

You have just loaded the following:

- X: This is an (m, $T_x$, 78) dimensional array. We have m training examples, each of which is a snippet of $T_x = 30$ musical values. At each time step, the input is one of 78 different possible values, represented as a one-hot vector. Thus, for example, X[i,t,:] is a one-hot vector representing the value of the i-th example at time t.
- Y: This is essentially the same as X, but shifted one step to the left (to the past). Similar to the Dinosaurus assignment, we’re interested in the network using the previous values to predict the next value, so our sequence model will try to predict $y^{\langle t \rangle}$ given $x^{\langle 1\rangle}, \ldots, x^{\langle t \rangle}$. However, the data in Y is reordered to be of dimension $(T_y, m, 78)$, where $T_y = T_x$. This format makes it more convenient to feed to the LSTM later.
- n_values: The number of unique values in this dataset. This should be 78.
- indices_values: Python dictionary mapping from 0-77 to musical values.
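
The shapes described above can be reproduced on random data. This is a minimal sketch, not the code inside `load_music_utils`: the helper `one_hot_dataset` and the arrays `toy_indices`, `X_toy` and `Y_toy` are hypothetical names, and the construction is simplified (the labels are taken directly as the next index in each sequence).

```python
import numpy as np

def one_hot_dataset(index_seqs, n_values):
    """Build toy X of shape (m, Tx, n_values) and Y of shape (Tx, m, n_values).

    index_seqs: integer array of shape (m, Tx + 1) holding value indices.
    Hypothetical helper for illustration only.
    """
    index_seqs = np.asarray(index_seqs)
    eye = np.eye(n_values, dtype=np.float32)
    X = eye[index_seqs[:, :-1]]  # inputs: values at steps 0 .. Tx-1
    Y = eye[index_seqs[:, 1:]]   # labels: the value one step later
    Y = np.swapaxes(Y, 0, 1)     # reorder (m, Tx, n) -> (Ty, m, n)
    return X, Y

# Random indices stand in for real music; 60 examples of 30 + 1 values.
rng = np.random.default_rng(0)
toy_indices = rng.integers(0, 78, size=(60, 31))
X_toy, Y_toy = one_hot_dataset(toy_indices, n_values=78)

print(X_toy.shape)  # (60, 30, 78)
print(Y_toy.shape)  # (30, 60, 78)

# With time on the first axis, list(Y_toy) splits the labels into one
# (m, 78) array per time step.
per_step_labels = list(Y_toy)
print(len(per_step_labels), per_step_labels[0].shape)
```

Putting the time axis first means that iterating over Y yields one batch of labels per time step, which is presumably the convenience the description refers to.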

1.2 - Overview of our model

Here is the architecture of the model we will use. This is similar to the Dinosaurus model you used in the previous notebook, except that you will implement it in Keras. The architecture is as follows:

[Figure: model architecture]

2 - Building the model

In this part you will build and train a model that will learn musical patterns. To do so, you will need to build a model that takes in X of shape $(m, T_x, 78)$ and Y of shape (T