This post mainly summarizes the details of building these two models. Finishing them took me three or four days, and the DCGAN hyperparameters were hard to tune.
First, a quick recap of DCGAN. The core idea is still the GAN setup: a generator network and a discriminator network. Compared with the basic GAN, the difference is simply that DCGAN uses convolutions in both the generator and the discriminator.
At first I made both networks fairly deep (about 10 convolutional layers each) and threw in LeakyReLU and the like, but the results stayed poor. My guess is that the MNIST images are small and simple, and a GAN is already hard to fit, so I cut the networks down to 2-3 layers, switched the activations to plain tanh, and replaced batch normalization with dropout (the difference was surprisingly large, and I'm not sure why). That improved the results.
Generator network:
import tensorflow as tf
import tensorflow.contrib.slim as slim

def G_Net(input, reuse=False, Training=True):
    with tf.variable_scope('G_Net', reuse=reuse):
        with tf.variable_scope('dense'):
            net = tf.layers.dense(input, 8*8*256, activation=tf.nn.tanh)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.dropout(net, 0.5)
            net = tf.reshape(net, shape=(-1, 8, 8, 256))
        with tf.variable_scope('deconv_1'):
            # upsample 8x8 -> 16x16
            net = slim.conv2d_transpose(net, num_outputs=128, kernel_size=[5, 5], stride=2, activation_fn=None,
                                        weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
            net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)
        # with tf.variable_scope('conv_1'):
        #     net = slim.conv2d(net, num_outputs=128, kernel_size=[5, 5], stride=1, activation_fn=None,
        #                       weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
        #     net = tf.layers.batch_normalization(net, training=Training)
        #     net = tf.nn.relu(net)
        with tf.variable_scope('deconv_2'):
            # upsample 16x16 -> 32x32, single output channel
            net = slim.conv2d_transpose(net, num_outputs=1, kernel_size=[5, 5], stride=2, activation_fn=None,
                                        weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)
        return net
Discriminator network:
def D_Net(input, reuse=False, Training=True):
    with tf.variable_scope('D_Net', reuse=reuse):
        with tf.variable_scope('conv_1'):
            net = slim.conv2d(input, kernel_size=[5, 5], num_outputs=64, stride=1, activation_fn=None,
                              weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)
            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')
        with tf.variable_scope('conv_2'):
            net = slim.conv2d(net, kernel_size=[5, 5], num_outputs=128, stride=1, activation_fn=None,
                              weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)
            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')
        # with tf.variable_scope('conv_3'):
        #     net = slim.conv2d(net, kernel_size=[3, 3], num_outputs=16, stride=2, activation_fn=None,
        #                       weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
        #     net = tf.layers.batch_normalization(net, training=Training)
        #     net = LeakyRelu(net)
        with tf.variable_scope('dense'):
            net = slim.flatten(net)
            net = tf.nn.dropout(net, 0.5)
            net = tf.layers.dense(net, 1024, activation=tf.nn.relu)
            net = tf.nn.dropout(net, 0.5)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.layers.dense(net, 1, activation=None)
            p = tf.nn.sigmoid(net)
        return p
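The post doesn't show the plain-DCGAN training losses, so here is only a minimal sketch of one way these two networks could be trained against each other. The placeholder names, the latent size of 100, and the 32x32 input size (the generator above upsamples 8 -> 16 -> 32) are my assumptions, and the losses are the standard non-saturating GAN losses applied to the probability that D_Net already returns:

eps = 1e-8                                                 # guards against log(0)
noise = tf.placeholder(tf.float32, [None, 100])            # latent noise (size 100 assumed)
real_img = tf.placeholder(tf.float32, [None, 32, 32, 1])   # assuming MNIST padded/resized to 32x32

fake_img = G_Net(noise)
p_real = D_Net(real_img)                                   # D's probability that the input is real
p_fake = D_Net(fake_img, reuse=True)

loss_D = -tf.reduce_mean(tf.log(p_real + eps) + tf.log(1.0 - p_fake + eps))
loss_G = -tf.reduce_mean(tf.log(p_fake + eps))             # non-saturating generator loss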
Results:
The samples look acceptable, but the outputs are far from diverse: basically everything the generator produces is a 7 or a 9. My analysis: MNIST really is simple (28x28), but simple as it is, the digit classes still differ a lot from each other. With a small network the generated digits don't look convincing; with a bigger network the generator latches onto a few classes. So conditioning the GAN on the label should give a big improvement on MNIST.
================================================================================
cDCGAN is DCGAN with a conditional GAN (cGAN) on top. The principle is simple: concatenate the label onto intermediate layers, which amounts to adding a condition. A plain GAN can be seen as a mapping from noise to the target distribution, e.g. G(noise) = fake_img and D(img) = p_of_real; with the condition added this becomes G(noise | y=label) = fake_img | y=label and D(img | y=label) = p_of_real | y=label. My notation here is loose; for a proper treatment see this write-up on conditional GANs: http://baijiahao.baidu.com/s?id=1602483050633464000&wfr=spider&for=pc
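For reference, the objective of the original conditional GAN (Mirza & Osindero, 2014) just conditions both players on y:

\[
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x \mid y)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z \mid y) \mid y)\right)\right]
\]

(The implementation below deviates from this and instead trains D as an 11-way classifier: the ten digit classes plus one extra "fake" class.)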
Since the label tensor is 2-D (including the batch dimension), to add the condition inside the convolutional part you first need to reshape it to (batch_size, 1, 1, num_of_class). Note that num_of_class needs one extra class for "fake". After the reshape you still have to broadcast it to the spatial size of the current layer by multiplying an all-ones tensor of the matching size, then concat. The official DCGAN implementation has a helper for concatenating the condition onto a feature map:
def conv_cond_concat(x, y):
    """Concatenate conditioning vector on feature map axis."""
    x_shapes = x.get_shape()
    y_shapes = y.get_shape()
    return tf.concat(
        [x, y * tf.ones([x_shapes[0], x_shapes[1], x_shapes[2], y_shapes[3]])], 3)
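Just to make the shapes concrete (the numbers here are illustrative only, assuming a batch of 64 and 11 classes):

# x: (64, 14, 14, 64)   feature map after the first pooling layer
# y: (64, 1, 1, 11)     reshaped one-hot label (10 digits + 1 fake class)
# y * tf.ones([64, 14, 14, 11]) broadcasts the label to every spatial position,
# so conv_cond_concat(x, y) returns a tensor of shape (64, 14, 14, 75).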
Then the G_Net:
def G_Net(input, label, reuse=False, Training=True):
    with tf.variable_scope('G_Net', reuse=reuse):
        with tf.variable_scope('dense'):
            net = tf.concat([input, label], axis=1)
            net = tf.layers.dense(net, 8*8*256, activation=tf.nn.tanh)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.dropout(net, 0.5)
            net = tf.reshape(net, shape=(-1, 8, 8, 256))
        with tf.variable_scope('deconv_1'):
            net = slim.conv2d_transpose(net, num_outputs=128, kernel_size=[5, 5], stride=2, activation_fn=None,
                                        weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
            net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)
        # ... (remaining layers omitted; they are identical to the DCGAN generator above)
The only change in G_Net is a single concat at the fully-connected layer: noise and label are concatenated before the dense layer, and everything after that is unchanged.
Then the D_Net:
def D_Net(input, label, reuse=False, Training=True):
    batch_size = label.get_shape()[0]
    label = tf.reshape(label, shape=(batch_size, 1, 1, 11))
    with tf.variable_scope('D_Net', reuse=reuse):
        with tf.variable_scope('conv_1'):
            net = conv_cond_concat(input, label)
            net = slim.conv2d(net, kernel_size=[5, 5], num_outputs=64, stride=1, activation_fn=None,
                              weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)
            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')
        with tf.variable_scope('conv_2'):
            net = conv_cond_concat(net, label)
            net = slim.conv2d(net, kernel_size=[5, 5], num_outputs=128, stride=1, activation_fn=None,
                              weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)
            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')
        # with tf.variable_scope('conv_3'):
        #     net = slim.conv2d(net, kernel_size=[3, 3], num_outputs=16, stride=2, activation_fn=None,
        #                       weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
        #     net = tf.layers.batch_normalization(net, training=Training)
        #     net = LeakyRelu(net)
        with tf.variable_scope('dense'):
            net = slim.flatten(net)
            net = tf.nn.dropout(net, 0.5)
            net = tf.layers.dense(net, 1024, activation=tf.nn.relu)
            net = tf.nn.dropout(net, 0.5)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.layers.dense(net, 11, activation=None)
        return net
Since the discriminator no longer just makes a real/fake decision, it returns num_of_class logits to be fed into a softmax.
The loss functions:
loss_G = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=logit_fake))
loss_D = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=logit_real) +
                        tf.nn.softmax_cross_entropy_with_logits(labels=fake_label, logits=logit_fake))
fake_label is a label tensor whose class is always 10, i.e. everything is marked as the fake class.
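For completeness, here is a minimal sketch of how the labels and the two optimizers could be wired together. This is my own wiring, not the original code: the placeholder names, the latent size, the learning rate and the 32x32 input size are all assumptions.

y = tf.placeholder(tf.int64, [None])                       # integer digit labels 0-9
noise = tf.placeholder(tf.float32, [None, 100])            # latent noise (size 100 assumed)
real_img = tf.placeholder(tf.float32, [None, 32, 32, 1])   # assuming MNIST padded/resized to the 32x32 generator output

num_of_class = 11                                          # 10 digits + 1 extra "fake" class
label = tf.one_hot(y, depth=num_of_class)                  # real labels, classes 0-9
fake_label = tf.one_hot(tf.fill(tf.shape(y), 10), depth=num_of_class)  # everything marked as class 10

fake_img = G_Net(noise, label)
logit_real = D_Net(real_img, label)
logit_fake = D_Net(fake_img, label, reuse=True)
# loss_G / loss_D as defined above.

# Each optimizer only updates its own network, selected via the variable_scope prefixes.
# (With tf.layers.batch_normalization one would also run the UPDATE_OPS; omitted here.)
g_vars = [v for v in tf.trainable_variables() if v.name.startswith('G_Net')]
d_vars = [v for v in tf.trainable_variables() if v.name.startswith('D_Net')]
train_G = tf.train.AdamOptimizer(2e-4, beta1=0.5).minimize(loss_G, var_list=g_vars)
train_D = tf.train.AdamOptimizer(2e-4, beta1=0.5).minimize(loss_D, var_list=d_vars)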
The generated results are much better this time:
I'm still a beginner, and much of this analysis is just my own understanding, so please point out any mistakes.