李宏毅GAN学习(七) feature extraction

iofoGAN

一般的GAN的输入是一个随机的向量,输出是一张图片,不知道输入向量中的某一维对输出有什么影响,iofoGAN就是来做这件事的,infoGAN的网络结构如下。

相比于普通的GAN,iofoGAN多了一个分类器。如果Generator学到c的每一个维度对X的影响,则Classifier可以轻易的根据x中包含的信息推出c;如果Generator没有学到c对X明确的影响,则Classifier很难从x反推出c

个人觉得infoGAN是CGAN的加强版

VAE-GAN

可以看成用VAE加强GAN,也看一看成用GAN加强VAE

详细的算法步骤:

BiGAN

详细的算法步骤:

 

 

# -*- coding: utf-8 -*- import numpy as np import librosa import random def extract_power(y, sr, size=3): """ extract log mel spectrogram feature :param y: the input signal (audio time series) :param sr: sample rate of 'y' :param size: the length (seconds) of random crop from original audio, default as 3 seconds :return: log-mel spectrogram feature """ # normalization y = y.astype(np.float32) normalization_factor = 1 / np.max(np.abs(y)) y = y * normalization_factor # random crop start = random.randint(0, len(y) - size * sr) y = y[start: start + size * sr] # extract log mel spectrogram ##### powerspec = np.abs(librosa.stft(y,n_fft=128, hop_length=1024)) ** 2 #logmelspec = librosa.power_to_db(melspectrogram) return powerspec def extract_logmel(y, sr, size=3): """ extract log mel spectrogram feature :param y: the input signal (audio time series) :param sr: sample rate of 'y' :param size: the length (seconds) of random crop from original audio, default as 3 seconds :return: log-mel spectrogram feature """ # normalization y = y.astype(np.float32) normalization_factor = 1 / np.max(np.abs(y)) y = y * normalization_factor # random crop start = random.randint(0, len(y) - size * sr) y = y[start: start + size * sr] # extract log mel spectrogram ##### melspectrogram = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048, hop_length=1024, n_mels=90) logmelspec = librosa.power_to_db(melspectrogram) return logmelspec def extract_mfcc(y, sr, size=3): """ extract MFCC feature :param y: np.ndarray [shape=(n,)], real-valued the input signal (audio time series) :param sr: sample rate of 'y' :param size: the length (seconds) of random crop from original audio, default as 3 seconds :return: MFCC feature """ # normalization y = y.astype(np.float32) normalization_factor = 1 / np.max(np.abs(y))
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值