Generative Adversarial Networks (GAN)
Original paper: Generative Adversarial Networks
D's objective: max log(D(x)) + log(1 - D(G(z)))
G's objective: max log(D(G(z)))
D's objective has two parts: make D(x) as large as possible and D(G(z)) as small as possible, because real images are labelled 1 and generated images are labelled 0. D's goal is to correctly tell which is which.
G has only one goal: make its generated images look more like real ones, i.e. make D(G(z)) as large as possible.
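As a sanity check on the two objectives, here is a small numpy sketch (the discriminator outputs are made-up numbers) showing how both reduce to binary cross-entropy, with real images labelled 1 and fakes labelled 0:

```python
import numpy as np

def bce(label, p):
    # binary cross-entropy for a single prediction p in (0, 1)
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))

# Hypothetical discriminator outputs: D(x) on a real image, D(G(z)) on a fake.
d_real, d_fake = 0.9, 0.2

# D's loss: real labelled 1, fake labelled 0. Minimizing it is equivalent
# to maximizing log D(x) + log(1 - D(G(z))).
d_loss = bce(1, d_real) + bce(0, d_fake)

# G's (non-saturating) loss: label the fake 1, so minimizing the
# cross-entropy maximizes log D(G(z)).
g_loss = bce(1, d_fake)

print(d_loss, g_loss)
```

This is exactly the trick used in the training loop below: G is trained on fake images paired with the "real" label.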
Architecture:
Main program (Keras):
def __init__(self):
    ...
    ## build the D and G networks
    self.discriminator = self.build_discriminator()
    self.discriminator.compile(loss='binary_crossentropy',
                               optimizer=Adam(0.0002, 0.5),
                               metrics=['accuracy'])
    self.generator = self.build_generator()
    z = Input(shape=(self.latent_dim,))
    img = self.generator(z)
    self.discriminator.trainable = False
    validity = self.discriminator(img)
    ## the combined model maps noise all the way to D's output
    self.combined = Model(z, validity)
    self.combined.compile(loss='binary_crossentropy', optimizer=optimizer)

def train(self, epochs, batch_size=128, sample_interval=50):
    valid = np.ones((batch_size, 1))   ## all-ones labels (real)
    fake = np.zeros((batch_size, 1))   ## all-zeros labels (fake)
    for epoch in range(epochs):
        imgs = X_train[idx]  ## real images
        noise = np.random.normal(0, 1, (batch_size, self.latent_dim))  ## sample noise
        gen_imgs = self.generator.predict(noise)  ## G's output for the noise
        d_loss_real = self.discriminator.train_on_batch(imgs, valid)     ## train D on real images
        d_loss_fake = self.discriminator.train_on_batch(gen_imgs, fake)  ## train D on fakes
        d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)  ## total D loss
        noise = np.random.normal(0, 1, (batch_size, self.latent_dim))
        g_loss = self.combined.train_on_batch(noise, valid)  ## train G: D's trainable is False here, so training the combined model only updates G
**CGAN:** a GAN with labels added.
D's objective: max log(D(x, c)) + log(1 - D(G(z, c), c))
G's objective: max log(D(G(z, c), c))
D's objective again has two parts, as above.
G still has only one goal: given noise plus a label, make D believe the output is real rather than generated.
Architecture:
Main program (Keras):
def __init__(self):
    ...
    ## build the D and G networks
    self.discriminator = self.build_discriminator()
    self.discriminator.compile(loss='binary_crossentropy',
                               optimizer=Adam(0.0002, 0.5),
                               metrics=['accuracy'])
    self.generator = self.build_generator()
    noise = Input(shape=(self.latent_dim,))
    label = Input(shape=(1,))
    img = self.generator([noise, label])
    self.discriminator.trainable = False
    valid = self.discriminator([img, label])
    ## the combined model maps noise (and label) to D's output
    self.combined = Model([noise, label], valid)
    self.combined.compile(loss='binary_crossentropy', optimizer=optimizer)
def build_generator(self):
    ...
    model.summary()
    noise = Input(shape=(self.latent_dim,))
    label = Input(shape=(1,), dtype='int32')
    label_embedding = Flatten()(Embedding(self.num_classes, self.latent_dim)(label))
    model_input = multiply([noise, label_embedding])  ## Keras' multiply layer combines z and c element-wise to form G's input
    img = model(model_input)
    return Model([noise, label], img)
def build_discriminator(self):
    ...
    model.summary()
    img = Input(shape=self.img_shape)
    label = Input(shape=(1,), dtype='int32')
    label_embedding = Flatten()(Embedding(self.num_classes, np.prod(self.img_shape))(label))
    flat_img = Flatten()(img)
    model_input = multiply([flat_img, label_embedding])  ## flatten the input image into a vector, then multiply by the label embedding to form D's input
    validity = model(model_input)
    return Model([img, label], validity)
def train(self, epochs, batch_size=128, sample_interval=50):
    ...
    valid = np.ones((batch_size, 1))   ## all-ones labels (real)
    fake = np.zeros((batch_size, 1))   ## all-zeros labels (fake)
    for epoch in range(epochs):
        imgs, labels = X_train[idx], y_train[idx]
        noise = np.random.normal(0, 1, (batch_size, 100))
        gen_imgs = self.generator.predict([noise, labels])
        d_loss_real = self.discriminator.train_on_batch([imgs, labels], valid)
        d_loss_fake = self.discriminator.train_on_batch([gen_imgs, labels], fake)
        d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
        sampled_labels = np.random.randint(0, 10, batch_size).reshape(-1, 1)  ## sample random labels (0-9 here, for MNIST)
        g_loss = self.combined.train_on_batch([noise, sampled_labels], valid)
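The multiply-based conditioning used in build_generator and build_discriminator above can be illustrated with a small numpy sketch. The shapes and values here are hypothetical, and the random table stands in for Keras' Embedding layer:

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, num_classes, batch = 100, 10, 4

# Stand-in for Embedding(num_classes, latent_dim): one learned row per class.
embedding_table = rng.normal(size=(num_classes, latent_dim))

noise = rng.normal(size=(batch, latent_dim))  # z, shape (4, 100)
labels = np.array([3, 1, 4, 1])               # c, one class index per sample
label_embedding = embedding_table[labels]     # row lookup, shape (4, 100)

# Element-wise product, the same combination that
# multiply([noise, label_embedding]) performs inside build_generator.
model_input = noise * label_embedding
print(model_input.shape)
```

Because the product has the same shape as the noise, the rest of G is unchanged; the label information is folded into the input rather than concatenated.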
**WGAN:** a GAN with the weights constrained (clipped).
The paper analyzes the Kullback–Leibler divergence (KL divergence) and the Jensen–Shannon divergence (JS divergence), the two similarity measures underlying the original GAN loss, and proposes the Wasserstein distance, also called the Earth-Mover (EM) distance (KL and JS divergence will get their own write-up later). Because of the better behavior of the Wasserstein distance, it is used as the generator's loss. So the new objective functions WGAN proposes are:
D's objective: max E[D(x)] - E[D(G(z))], with D (the critic) constrained to be 1-Lipschitz, enforced here by weight clipping
G's objective: max E[D(G(z))]
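The advantage of the Wasserstein distance over KL and JS can be made concrete with SciPy. For two point masses on disjoint supports, KL is infinite and JS is stuck at log 2 no matter how far apart they are, while the EM distance still tracks how far the mass must move, so it can provide a useful gradient. A small sketch with made-up distributions:

```python
import numpy as np
from scipy.stats import entropy, wasserstein_distance
from scipy.spatial.distance import jensenshannon

# Two distributions on {0, 1, 2, 3} with disjoint supports:
# p puts all mass at 0, q puts all mass at theta.
support = np.arange(4)
p = np.array([1.0, 0.0, 0.0, 0.0])

kls, jss, ws = [], [], []
for theta in (1, 2, 3):
    q = np.zeros(4)
    q[theta] = 1.0
    kls.append(entropy(p, q))                # KL(p || q): inf on disjoint supports
    jss.append(jensenshannon(p, q) ** 2)     # JS divergence: constant log 2
    ws.append(wasserstein_distance(support, support, p, q))  # EM distance: theta
print(kls, jss, ws)
```

Only the EM distance changes with theta, which is exactly why it is usable as a training signal when the real and generated distributions barely overlap.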
Architecture:
Main program (Keras):
def __init__(self):
    ...
    ## build the critic (D) and G networks
    self.critic = self.build_critic()
    self.critic.compile(loss=self.wasserstein_loss,
                        optimizer=RMSprop(lr=0.00005),
                        metrics=['accuracy'])  ## both the loss and the optimizer change
    self.generator = self.build_generator()
    z = Input(shape=(self.latent_dim,))
    img = self.generator(z)
    self.critic.trainable = False
    valid = self.critic(img)
    ## the combined model maps noise to the critic's output
    self.combined = Model(z, valid)
    self.combined.compile(loss=self.wasserstein_loss, optimizer=RMSprop(lr=0.00005), metrics=['accuracy'])
def wasserstein_loss(self, y_true, y_pred):
    return K.mean(y_true * y_pred)
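A quick numpy check of the sign convention, using made-up critic scores: with real samples labelled -1 and fakes labelled +1, minimizing wasserstein_loss on both batches is the same as maximizing E[D(x)] - E[D(G(z))]:

```python
import numpy as np

def wasserstein_loss(y_true, y_pred):
    # numpy version of the Keras loss above: mean(y_true * y_pred)
    return np.mean(y_true * y_pred)

# Critic scores for a batch of 4 samples (hypothetical values).
real_scores = np.array([0.8, 0.5, 0.9, 0.7])
fake_scores = np.array([-0.6, -0.4, -0.9, -0.5])

# Labels: -1 for real, +1 for fake, as in train() below. The loss is then
# -mean(real_scores) + mean(fake_scores), so minimizing it pushes real
# scores up and fake scores down.
critic_loss = (wasserstein_loss(-np.ones(4), real_scores)
               + wasserstein_loss(np.ones(4), fake_scores))
print(critic_loss)
```

This is why the labels in the WGAN training loop are -1/+1 rather than the 1/0 of the original GAN.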
def train(self, epochs, batch_size=128, sample_interval=50):
    valid = -np.ones((batch_size, 1))  ## all -1 labels (real)
    fake = np.ones((batch_size, 1))    ## all +1 labels (fake)
    for epoch in range(epochs):
        imgs = X_train[idx]  ## real images
        noise = np.random.normal(0, 1, (batch_size, self.latent_dim))  ## sample noise
        gen_imgs = self.generator.predict(noise)  ## G's output for the noise
        d_loss_real = self.critic.train_on_batch(imgs, valid)     ## train the critic on real images
        d_loss_fake = self.critic.train_on_batch(gen_imgs, fake)  ## train the critic on fakes
        d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)  ## total critic loss
        ### clip the weights
        for l in self.critic.layers:
            weights = l.get_weights()  ## weights of each critic layer
            weights = [np.clip(w, -self.clip_value, self.clip_value) for w in weights]  ## np.clip truncates: values outside the range are forced back to the boundary; self.clip_value is set when the network is built
            l.set_weights(weights)
        g_loss = self.combined.train_on_batch(noise, valid)  ## train G
**infoGAN:** addresses the interpretability of the latent variables, so that images with a chosen attribute can be generated.
Some structure is imposed on the original noise z, splitting it into two parts: one is the original noise z; the other is a latent code c, concatenated from several latent variables. c itself has two kinds of components: discrete variables (e.g. the 0-9 class labels of MNIST) and continuous variables (e.g. stroke thickness or slant of MNIST digits). The entropy of the two kinds is computed differently.
A mutual-information term is added to measure how strongly x and c are related; the more related the better, so the mutual information is maximized as a regularization term, with its weight (a hyperparameter) tuned experimentally.
Mutual information also appears in decision trees; first, the notion of entropy:
Entropy H(X): the uncertainty of a random variable X; the larger the entropy, the greater the uncertainty.
Conditional entropy H(Y|X): the uncertainty of Y given that the random variable X is known.
Mutual information: H(Y) - H(Y|X).
Information gain (mutual information): g(D,A) = H(D) - H(D|A), the reduction in uncertainty about the classification of dataset D due to feature A. This is why, in decision trees, features with larger information gain have stronger discriminative power.
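These definitions can be checked numerically. The sketch below uses the classic play-tennis counts (9 positive, 5 negative examples, purely illustrative) to compute H(D), H(D|A), and the information gain:

```python
import numpy as np

def H(counts):
    # entropy (base 2) of a discrete distribution given by counts
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()
    return -np.sum(p * np.log2(p))

# Toy dataset D: 9 positive and 5 negative examples.
h_d = H([9, 5])  # H(D), about 0.940 bits

# A feature A splitting D into three subsets with these class counts.
splits = [(2, 3), (4, 0), (3, 2)]
n = 14
h_d_given_a = sum((a + b) / n * H([a, b]) for a, b in splits)  # H(D|A)

gain = h_d - h_d_given_a  # g(D, A) = H(D) - H(D|A), about 0.247 bits
print(round(h_d, 3), round(gain, 3))
```

The pure subset (4, 0) contributes zero conditional entropy, which is what makes this feature informative.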
Objective function: min_G max_D V(D,G) - λI(c; G(z,c)), i.e. the original GAN game with a mutual-information regularizer; since I(c; G(z,c)) is intractable, a variational lower bound computed with an auxiliary network Q is maximized instead.
Architecture:
Main program (Keras):
def __init__(self):
    optimizer = Adam(0.0002, 0.5)
    losses = ['binary_crossentropy', self.mutual_info_loss]
    self.discriminator, self.auxilliary = self.build_disk_and_q_net()  ## the D network grows two heads, giving two models: one is the usual GAN discriminator, the other a softmax classifier (the auxiliary, or Q, network)
    self.discriminator.compile(loss=['binary_crossentropy'], optimizer=optimizer, metrics=['accuracy'])  ## D uses the binary classification loss
    self.auxilliary.compile(loss=[self.mutual_info_loss], optimizer=optimizer, metrics=['accuracy'])  ## Q uses the new mutual-information loss
    self.generator = self.build_generator()  ## same G network as in the plain GAN, unchanged
    gen_input = Input(shape=(self.latent_dim,))
    img = self.generator(gen_input)
    self.discriminator.trainable = False
    valid = self.discriminator(img)
    target_label = self.auxilliary(img)
    self.combined = Model(gen_input, [valid, target_label])
    self.combined.compile(loss=losses, optimizer=optimizer)
def mutual_info_loss(self, c, c_given_x):
    eps = 1e-8
    conditional_entropy = K.mean(- K.sum(K.log(c_given_x + eps) * c, axis=1))
    entropy = K.mean(- K.sum(K.log(c + eps) * c, axis=1))
    return conditional_entropy + entropy
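A numpy rendition of the same loss, with a made-up one-hot code and two hypothetical Q outputs, shows that it is small when Q recovers the code and large when it does not. For a one-hot c the entropy term is essentially zero, so the loss is dominated by the conditional cross-entropy:

```python
import numpy as np

def mutual_info_loss(c, c_given_x, eps=1e-8):
    # numpy version of the Keras loss above
    conditional_entropy = np.mean(-np.sum(np.log(c_given_x + eps) * c, axis=1))
    entropy = np.mean(-np.sum(np.log(c + eps) * c, axis=1))
    return conditional_entropy + entropy

# One-hot sampled code c and two candidate Q(c|x) predictions.
c = np.array([[0.0, 1.0, 0.0]])
good = np.array([[0.05, 0.90, 0.05]])  # Q recovers the code well
bad = np.array([[0.80, 0.10, 0.10]])   # Q mostly misses it

loss_good = mutual_info_loss(c, good)
loss_bad = mutual_info_loss(c, bad)
print(loss_good, loss_bad)
```

Minimizing this loss therefore forces the generated image to retain enough information about c for Q to read it back, which is the variational surrogate for maximizing I(c; G(z,c)).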
def build_disk_and_q_net(self):
    ...
    img_embedding = model(img)
    # Discriminator head
    validity = Dense(1, activation='sigmoid')(img_embedding)
    # Recognition (Q) head
    q_net = Dense(128, activation='relu')(img_embedding)
    label = Dense(self.num_classes, activation='softmax')(q_net)
    # Return the discriminator and recognition networks
    return Model(img, validity), Model(img, label)
def sample_generator_input(self, batch_size):
    # Generator inputs: noise plus a one-hot latent code
    sampled_noise = np.random.normal(0, 1, (batch_size, 62))
    sampled_labels = np.random.randint(0, self.num_classes, batch_size).reshape(-1, 1)
    sampled_labels = to_categorical(sampled_labels, num_classes=self.num_classes)
    return sampled_noise, sampled_labels
def train(self, epochs, batch_size=128, sample_interval=50):
    ...
    valid = np.ones((batch_size, 1))
    fake = np.zeros((batch_size, 1))
    for epoch in range(epochs):
        imgs = X_train[idx]
        sampled_noise, sampled_labels = self.sample_generator_input(batch_size)
        gen_input = np.concatenate((sampled_noise, sampled_labels), axis=1)
        gen_imgs = self.generator.predict(gen_input)
        d_loss_real = self.discriminator.train_on_batch(imgs, valid)
        d_loss_fake = self.discriminator.train_on_batch(gen_imgs, fake)
        d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
        g_loss = self.combined.train_on_batch(gen_input, [valid, sampled_labels])