DCGAN and cDCGAN on MNIST in practice

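This post mainly summarizes the details of building the two models. Getting both of them working took three or four days, mostly because I could not get the DCGAN hyperparameters tuned.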

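First, a brief word on how DCGAN works. The core idea is still GAN: a generator network and a discriminator network. Compared with a basic GAN, the difference is that in DCGAN both the generator and the discriminator are convolutional.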

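At first both the generator and the discriminator were fairly deep (around 10 layers each), with LeakyReLU and everything else thrown in, and the results were consistently poor. My guess was that MNIST images are small and their content is simple, while GANs are hard to fit. So I cut the networks down to 2 or 3 layers, switched the activations to tanh, and replaced batch normalization with dropout in most places (one batch-norm layer remains in the generator below). That last change made a big difference, and I don't know why. The results improved.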

Generator network:

import tensorflow as tf          # TensorFlow 1.x API
import tensorflow.contrib.slim as slim

def G_Net(input, reuse=False, Training=True):
    with tf.variable_scope('G_Net', reuse=reuse):
        with tf.variable_scope('dense'):
            net = tf.layers.dense(input, 8*8*256, activation=tf.nn.tanh)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.dropout(net, 0.5)
            net = tf.reshape(net, shape=(-1, 8, 8, 256))

        with tf.variable_scope('deconv_1'):
            net = slim.conv2d_transpose(net, num_outputs=128, kernel_size=[5, 5], stride=2, activation_fn=None,
                                  weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
            net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

        # with tf.variable_scope('conv_1'):
        #     net = slim.conv2d(net, num_outputs=128, kernel_size=[5, 5], stride=1, activation_fn=None,
        #                       weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
        #     net = tf.layers.batch_normalization(net, training=Training)
        #     net = tf.nn.relu(net)

        with tf.variable_scope('deconv_2'):
            net = slim.conv2d_transpose(net, num_outputs=1, kernel_size=[5, 5], stride=2, activation_fn=None,
                                        weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

    return net
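A quick sanity check on the generator's output size, as a minimal sketch. It assumes slim's default `padding='SAME'` for `conv2d_transpose`, under which a stride-s transposed convolution multiplies the spatial size by s. The helper name `same_deconv_size` is made up for this illustration.

```python
def same_deconv_size(size, stride):
    """Spatial size after a transposed convolution with SAME padding."""
    return size * stride

# The dense layer is reshaped to an 8x8 feature map,
# followed by two stride-2 transposed convolutions.
g_size = 8
for stride in (2, 2):
    g_size = same_deconv_size(g_size, stride)

print(g_size)  # 32
```

So the generator as written emits 32*32 images rather than 28*28. The post does not say how the real MNIST images are matched to that size; resizing or padding them to 32*32 before feeding the discriminator would be one option, but that is an assumption.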

Discriminator network:

def D_Net(input, reuse=False, Training=True):
    with tf.variable_scope('D_Net', reuse=reuse):
        with tf.variable_scope('conv_1'):
            net = slim.conv2d(input, kernel_size=[5, 5], num_outputs=64, stride=1, activation_fn=None,
                               weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')

        with tf.variable_scope('conv_2'):
            net = slim.conv2d(net, kernel_size=[5, 5], num_outputs=128, stride=1, activation_fn=None,
                               weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')

        # with tf.variable_scope('conv_3'):
        #     net = slim.conv2d(net, kernel_size=[3, 3], num_outputs=16, stride=2, activation_fn=None,
        #                        weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
        #     net = tf.layers.batch_normalization(net, training=Training)
        #     net = LeakyRelu(net)

        with tf.variable_scope('dense'):
            net = slim.flatten(net)
            net = tf.nn.dropout(net, 0.5)
            net = tf.layers.dense(net, 1024, activation=tf.nn.relu)
            net = tf.nn.dropout(net, 0.5)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.layers.dense(net, 1, activation=None)
            p = tf.nn.sigmoid(net)

    return p
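The post does not show the loss used for this unconditional DCGAN. Since D_Net returns a sigmoid probability, the usual choice would be the standard binary cross-entropy GAN losses; the sketch below shows them in plain Python on scalar probabilities. The function names and the `eps` guard are assumptions for illustration, not the author's code.

```python
import math

def d_loss(p_real, p_fake, eps=1e-8):
    """Discriminator loss: push D(real) toward 1 and D(fake) toward 0."""
    return -math.log(p_real + eps) - math.log(1.0 - p_fake + eps)

def g_loss(p_fake, eps=1e-8):
    """Non-saturating generator loss: push D(fake) toward 1."""
    return -math.log(p_fake + eps)

# When D separates real from fake well, its loss is low
# and the generator's loss is high.
print(d_loss(p_real=0.9, p_fake=0.1))
print(g_loss(p_fake=0.1))
```

In TensorFlow 1.x the same quantities are normally computed from the pre-sigmoid logits with `tf.nn.sigmoid_cross_entropy_with_logits`, which is more stable numerically than taking the log of `p` directly.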

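Results:

The samples look decent, but the generator collapsed onto a few digits (I first thought of this as overfitting, though it is closer to what is usually called mode collapse): almost everything it produces is a 7 or a 9. My analysis: MNIST images are very simple at 28*28, yet the digits still differ a lot from one another. With a simple network the generated digits don't look right, and with a complex one the output collapses like this. So adding a condition should help a GAN on MNIST considerably.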

============================================ Divider ================================================

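cDCGAN is the conditional version of DCGAN, i.e. a conditional GAN. The principle is simple: concat the label onto intermediate layers, which amounts to adding a condition. The original GAN can be seen as a mapping from noise to target, e.g. G(noise) = fake_img and D(img) = p_of_real. With the condition this becomes G(noise | y=label) = fake_img | y=label and D(img | y=label) = p_of_real | y=label. My summary of the formulas isn't great; for a proper explanation of conditional GANs see: http://baijiahao.baidu.com/s?id=1602483050633464000&wfr=spider&for=pc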

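Because the label is 2-D (including the batch dimension), adding the condition in the convolutional part requires reshaping it to (batch_size, 1, 1, num_of_class) first. Note that num_of_class needs one extra class for fake. After the reshape, the label is multiplied by an all-ones tensor with the same spatial size as the current layer and then concatenated. The DCGAN TensorFlow implementation has a helper that does this concat in the convolutional part: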

def conv_cond_concat(x, y):
  """Concatenate conditioning vector on feature map axis."""
  x_shapes = x.get_shape()
  y_shapes = y.get_shape()
  return tf.concat([
    x, y*tf.ones([x_shapes[0], x_shapes[1], x_shapes[2], y_shapes[3]])], 3)
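To make the broadcasting step concrete, here is a NumPy analogue of the same operation, as a small sketch. The name `conv_cond_concat_np` and the toy shapes are made up for this illustration; the 11 classes (10 digits plus fake) follow the post.

```python
import numpy as np

def conv_cond_concat_np(x, y):
    """NumPy analogue: tile y over height and width, then concat on channels."""
    batch, height, width, _ = x.shape
    tiled = y * np.ones((batch, height, width, y.shape[3]), dtype=x.dtype)
    return np.concatenate([x, tiled], axis=3)

x = np.zeros((2, 4, 4, 3), dtype=np.float32)      # toy feature map, NHWC
labels = np.array([3, 10])                         # class ids; 10 is the fake class
y = np.eye(11, dtype=np.float32)[labels].reshape(2, 1, 1, 11)

out = conv_cond_concat_np(x, y)
print(out.shape)  # (2, 4, 4, 14)
```

Every spatial position ends up carrying the same one-hot label in its extra 11 channels. Like the TensorFlow helper, this needs the batch size to be known, which is why the TensorFlow version reads it from the static shape.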

Next, G_net:

def G_Net(input, label, reuse=False, Training=True):
    with tf.variable_scope('G_Net', reuse=reuse):
        with tf.variable_scope('dense'):
            net = tf.concat([input, label], axis=1)
            net = tf.layers.dense(net, 8*8*256, activation=tf.nn.tanh)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.dropout(net, 0.5)
            net = tf.reshape(net, shape=(-1, 8, 8, 256))

        with tf.variable_scope('deconv_1'):
            net = slim.conv2d_transpose(net, num_outputs=128, kernel_size=[5, 5], stride=2, activation_fn=None,
                                  weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
            net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

        # The rest (deconv_2 and the return) is presumably unchanged from the unconditional G_Net above.

G_net concatenates the label only once, at the fully connected layer (on its input, as the code shows).
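What that concat produces can be sketched in NumPy. The noise dimension of 100 and the uniform noise range are assumptions, since the post does not state them; the 11 classes follow the post.

```python
import numpy as np

batch_size, noise_dim, num_classes = 4, 100, 11   # noise_dim=100 is an assumed value

rng = np.random.default_rng(0)
noise = rng.uniform(-1.0, 1.0, size=(batch_size, noise_dim)).astype(np.float32)
digits = np.array([0, 3, 7, 9])
label = np.eye(num_classes, dtype=np.float32)[digits]   # one-hot; last column is the fake class

g_input = np.concatenate([noise, label], axis=1)        # same role as tf.concat(..., axis=1)
print(g_input.shape)  # (4, 111)
```

The dense layer then sees the label as 11 extra input features, so the requested digit can influence every unit of the first feature map.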

Next, D_net:

def D_Net(input, label, reuse=False, Training=True):
    batch_size = label.get_shape()[0]
    label = tf.reshape(label, shape=(batch_size, 1, 1, 11))
    with tf.variable_scope('D_Net', reuse=reuse):
        with tf.variable_scope('conv_1'):
            net = conv_cond_concat(input, label)
            net = slim.conv2d(net, kernel_size=[5, 5], num_outputs=64, stride=1, activation_fn=None,
                               weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')

        with tf.variable_scope('conv_2'):
            net = conv_cond_concat(net, label)
            net = slim.conv2d(net, kernel_size=[5, 5], num_outputs=128, stride=1, activation_fn=None,
                               weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')

        # with tf.variable_scope('conv_3'):
        #     net = slim.conv2d(net, kernel_size=[3, 3], num_outputs=16, stride=2, activation_fn=None,
        #                        weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
        #     net = tf.layers.batch_normalization(net, training=Training)
        #     net = LeakyRelu(net)

        with tf.variable_scope('dense'):
            net = slim.flatten(net)
            net = tf.nn.dropout(net, 0.5)
            net = tf.layers.dense(net, 1024, activation=tf.nn.relu)
            net = tf.nn.dropout(net, 0.5)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.layers.dense(net, 11, activation=None)

    return net

Since this is no longer a plain real/fake decision, D_Net returns logits of length num_of_class, which are then fed to a softmax.
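One way to read those 11 logits, sketched in plain Python: indices 0 to 9 are the digits and index 10 is the fake class, so the probability of "real" is the mass on the first ten classes. The logits below are made up, and this reading is my interpretation rather than something the post states.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one image: indices 0-9 are digits, index 10 is "fake".
logits = [0.1, 0.0, 0.2, 4.0, 0.0, 0.1, 0.0, 0.3, 0.0, 0.1, 1.0]
probs = softmax(logits)

p_fake = probs[10]
p_real = 1.0 - p_fake                  # total mass on the ten digit classes
predicted = probs.index(max(probs))    # most likely class
print(predicted, round(p_real, 3))
```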

The loss functions are as follows:

loss_G = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=logit_fake))
loss_D = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=logit_real) +
                        tf.nn.softmax_cross_entropy_with_logits(labels=fake_label, logits=logit_fake))

fake_label is a label in which every sample is marked as class 10, i.e. everything belongs to the fake class.
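A NumPy sketch of how these two losses behave, assuming one-hot labels (which `softmax_cross_entropy_with_logits` expects). The logits are made up: they describe a discriminator that classifies real digits correctly and flags every generated image as class 10.

```python
import numpy as np

def softmax_cross_entropy(labels, logits):
    """Per-example cross-entropy between one-hot labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

num_classes = 11
digits = np.array([2, 5])
label = np.eye(num_classes)[digits]                  # real digit classes, one-hot
fake_label = np.eye(num_classes)[np.full(2, 10)]     # every row is class 10 (fake)

# Made-up logits: D is confident on real images and spots the fakes.
logit_real = 5.0 * label
logit_fake = 5.0 * fake_label

loss_G = softmax_cross_entropy(label, logit_fake).mean()
loss_D = (softmax_cross_entropy(label, logit_real)
          + softmax_cross_entropy(fake_label, logit_fake)).mean()
print(loss_G > loss_D)  # True
```

In this situation loss_D is small and loss_G is large: the generator is rewarded only when the discriminator assigns its image to the requested digit class rather than to the fake class.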

With this, the generated results are very good:

I'm a beginner, and much of the analysis here is my own understanding. If anything is wrong, I'd appreciate corrections.

 
