《MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and A...


weixin_30244681


Original link: http://www.cnblogs.com/punkcure/p/8270031.html


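MuseGAN is a GAN-based model that aims to address three problems in music generation: temporal structure, multi-track interdependency, and polyphony. With its jamming model, composer model, and hybrid model, it can generate music from scratch while preserving harmonic and rhythmic structure. The paper trains on piano-rolls of rock music and evaluates the quality of the generated music with both objective and subjective methods.

(Summary generated automatically by CSDN.)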

Source: AAAI 2018

Source code: https://github.com/salu133445/musegan

Abstract:

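(Well written and worth learning from.) The abstract focuses on how generating music differs from generating images, video, or speech. First, music unfolds as a sequence over time. Second, notes are governed by rules that organize them into chords, arpeggios, melodies, and polyphony. Third, a song is made up of multiple tracks. In short, notes cannot simply be stacked together. Building on GANs, the paper proposes three models for generating music: the jamming model, the composer model, and the hybrid model. The authors train on 100,000 bars selected from rock music and generate piano-rolls of five tracks: bass, drums, guitar, piano, and strings. They also use several intra-track and inter-track objective metrics to measure the quality of the generated music (?).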
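To make the piano-roll idea concrete, here is a minimal sketch in plain Python. It stores each bar as a grid of 0/1 note flags (time steps by pitches), one list of bars per track, and computes one simple intra-track statistic: the fraction of empty bars. The grid sizes, the helper names (`empty_bar`, `add_note`, `empty_bar_ratio`), and the choice of statistic are illustrative assumptions, not details taken from the notes above or from the MuseGAN code.

```python
# The five tracks named in the abstract.
TRACKS = ["bass", "drums", "guitar", "piano", "strings"]

# Illustrative sizes only (assumptions, not values from the paper).
STEPS_PER_BAR = 16
N_PITCHES = 12


def empty_bar():
    """Return one bar as a STEPS_PER_BAR x N_PITCHES grid of 0/1 note flags."""
    return [[0] * N_PITCHES for _ in range(STEPS_PER_BAR)]


def add_note(bar, start, length, pitch):
    """Switch on `pitch` for `length` consecutive time steps from `start`."""
    for step in range(start, min(start + length, STEPS_PER_BAR)):
        bar[step][pitch] = 1


def is_empty(bar):
    """A bar is empty when no pitch is active at any time step."""
    return not any(any(step) for step in bar)


def empty_bar_ratio(bars):
    """Hypothetical intra-track statistic: fraction of bars with no notes."""
    return sum(is_empty(b) for b in bars) / len(bars)


# A four-bar, five-track "song": one list of bars per track.
song = {name: [empty_bar() for _ in range(4)] for name in TRACKS}

# The bass plays in bars 0 and 2 only.
add_note(song["bass"][0], start=0, length=4, pitch=0)
add_note(song["bass"][2], start=8, length=8, pitch=7)

# A piano chord: several pitches active at the same time step (polyphony).
for pitch in (0, 4, 7):
    add_note(song["piano"][1], start=0, length=16, pitch=pitch)

ratios = {name: empty_bar_ratio(bars) for name, bars in song.items()}
```

Because several pitches can be switched on at the same time step and several tracks share the same time axis, this representation covers both polyphony and multiple tracks, which a flat note sequence does not.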

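Introduction:

GANs have achieved great success with text, images, and video, and there has been some progress in music as well, but several problems remain: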

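(1) Music has its own time-based structure, as shown in the figure below:

(2) Music is multi-track and multi-instrument.

A modern orchestra usually has four sections: brass, strings, woodwinds, and percussion. A rock band typically uses bass, a drum set, guitars, and possibly a vocal. Music theory requires these parts to unfold over time in harmony and counterpoint with one another.

(3) Musical notes are often grouped into chords, arpeggios, or melodies. As a result, approaches from monophonic music generation and from NLP cannot be carried over directly to generate polyphonic music.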

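Because of these three problems, much existing work relies on simplifications: generating single-track monophonic music, introducing a chronological ordering of notes for polyphonic music, or combining monophonic parts into polyphonic music. The authors aim to drop these simplifications and model 1) harmonic and rhythmic structure, 2) multi-track interdependency, and 3) temporal structure. The model can generate music from scratch (i.e. without human inputs), and it can also follow the underlying temporal structure of a track given a priori by a human. The authors propose three ways to handle the interaction between tracks: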

(1) Each track is generated independently by its own private generator (one for each track).

(2) All tracks are generated jointly by a single generator.

(3) Building on (1), each track's generator receives additional input so that the tracks stay harmonious and coordinated.
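The three options differ mainly in which generator receives which random input. The sketch below shows only that wiring, with no neural networks involved. The mode names, `Z_DIM`, and the function `generator_inputs` are hypothetical. Modeling the "additional input" in (3) as a random vector shared across tracks and concatenated with a private one is an assumption made for illustration. The three options appear to correspond to the jamming, composer, and hybrid models, in that order.

```python
import random

TRACKS = ["bass", "drums", "guitar", "piano", "strings"]
Z_DIM = 4  # illustrative latent size (assumption)


def sample_z(rng, dim=Z_DIM):
    """Draw one random latent vector from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def generator_inputs(mode, rng):
    """Map each generator name to (latent_vector, tracks_it_outputs)."""
    if mode == "independent":
        # (1) One private generator and one private latent vector per track.
        return {f"G_{t}": (sample_z(rng), [t]) for t in TRACKS}
    if mode == "joint":
        # (2) A single generator outputs all tracks from one latent vector.
        return {"G": (sample_z(rng), list(TRACKS))}
    if mode == "shared_input":
        # (3) Private generators as in (1), but each also sees a vector
        #     shared by all tracks (assumed form of the extra input).
        shared = sample_z(rng)
        return {f"G_{t}": (shared + sample_z(rng), [t]) for t in TRACKS}
    raise ValueError(f"unknown mode: {mode}")


rng = random.Random(0)
independent = generator_inputs("independent", rng)
joint = generator_inputs("joint", rng)
shared_input = generator_inputs("shared_input", rng)
```

In the `shared_input` case the first `Z_DIM` entries of every generator's latent vector are identical, which is what gives the otherwise separate generators a common signal to coordinate on.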

To highlight the grouping property, the authors …
