Hands-on DCGAN and cDCGAN on MNIST

This post mainly records the details of building these two models. Getting them to work took three or four days; the DCGAN hyperparameters in particular were hard to tune.

First, a quick recap of DCGAN. The core idea is still the plain GAN setup: a generator network and a discriminator network. The difference from the basic GAN is that in DCGAN both the generator and the discriminator are convolutional networks.

At first I used fairly deep convolutional stacks for both networks (around 10 layers each), LeakyReLU and all, and the results were consistently poor. My guess is that MNIST images are small and simple while GANs are already hard to fit, so I cut the networks down to 2-3 layers, switched the activations to plain tanh, and replaced batch normalization with dropout (this made a surprisingly large difference, and I'm not sure why). The results improved noticeably.
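
The commented-out layers in the code below call a LeakyRelu helper that the post never defines; a minimal sketch of such a helper, assuming the usual 0.2 negative slope (the name and the slope value are my assumptions, not from the post):

import tensorflow as tf

def LeakyRelu(x, alpha=0.2):
    # Leaky ReLU: keep positive values, scale negative values by alpha.
    return tf.maximum(alpha * x, x)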

Generator network:

import tensorflow as tf
import tensorflow.contrib.slim as slim

def G_Net(input, reuse=False, Training=True):
    with tf.variable_scope('G_Net', reuse=reuse):
        with tf.variable_scope('dense'):
            net = tf.layers.dense(input, 8*8*256, activation=tf.nn.tanh)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.dropout(net, 0.5)
            net = tf.reshape(net, shape=(-1, 8, 8, 256))

        with tf.variable_scope('deconv_1'):
            net = slim.conv2d_transpose(net, num_outputs=128, kernel_size=[5, 5], stride=2, activation_fn=None,
                                  weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
            net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

        # with tf.variable_scope('conv_1'):
        #     net = slim.conv2d(net, num_outputs=128, kernel_size=[5, 5], stride=1, activation_fn=None,
        #                       weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
        #     net = tf.layers.batch_normalization(net, training=Training)
        #     net = tf.nn.relu(net)

        with tf.variable_scope('deconv_2'):
            net = slim.conv2d_transpose(net, num_outputs=1, kernel_size=[5, 5], stride=2, activation_fn=None,
                                        weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

    return net
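
A quick sanity check on the output shape (the noise dimension of 100 is my assumption; the two stride-2 deconvolutions take the 8x8 feature map up to 32x32, so presumably the 28*28 MNIST images get resized to 32x32 before being fed to the discriminator):

noise = tf.placeholder(tf.float32, shape=(None, 100))  # noise dimension assumed
fake_img = G_Net(noise)
print(fake_img.get_shape())  # (?, 32, 32, 1)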

Discriminator network:

def D_Net(input, reuse=False, Training=True):
    with tf.variable_scope('D_Net', reuse=reuse):
        with tf.variable_scope('conv_1'):
            net = slim.conv2d(input, kernel_size=[5, 5], num_outputs=64, stride=1, activation_fn=None,
                               weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')

        with tf.variable_scope('conv_2'):
            net = slim.conv2d(net, kernel_size=[5, 5], num_outputs=128, stride=1, activation_fn=None,
                               weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')

        # with tf.variable_scope('conv_3'):
        #     net = slim.conv2d(net, kernel_size=[3, 3], num_outputs=16, stride=2, activation_fn=None,
        #                        weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
        #     net = tf.layers.batch_normalization(net, training=Training)
        #     net = LeakyRelu(net)

        with tf.variable_scope('dense'):
            net = slim.flatten(net)
            net = tf.nn.dropout(net, 0.5)
            net = tf.layers.dense(net, 1024, activation=tf.nn.relu)
            net = tf.nn.dropout(net, 0.5)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.layers.dense(net, 1, activation=None)
            p = tf.nn.sigmoid(net)

    return p
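
The post doesn't show the unconditional training objective. Since D_Net returns a probability p rather than logits, a minimal sketch of the usual GAN losses and optimizers under that assumption (the placeholder names, learning rate, and beta1 are mine; 2e-4 / 0.5 are just the DCGAN-paper defaults):

fake_img = G_Net(noise)
p_real = D_Net(real_img)
p_fake = D_Net(fake_img, reuse=True)

eps = 1e-8  # avoid log(0)
loss_D = -tf.reduce_mean(tf.log(p_real + eps) + tf.log(1.0 - p_fake + eps))
loss_G = -tf.reduce_mean(tf.log(p_fake + eps))

# Each loss only updates its own network's variables.
d_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='D_Net')
g_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='G_Net')
# Note: because tf.layers.batch_normalization is used, the UPDATE_OPS collection
# should also be run during training.
train_D = tf.train.AdamOptimizer(2e-4, beta1=0.5).minimize(loss_D, var_list=d_vars)
train_G = tf.train.AdamOptimizer(2e-4, beta1=0.5).minimize(loss_G, var_list=g_vars)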

Results:

The results look passable, but the model overfits in the sense that the outputs collapse onto a few digits, basically 7s and 9s. My take on the cause: MNIST really is simple (28*28 images), yet the digits still differ a lot from one another. With a small network the generated digits don't look right; with a larger network it collapses like this. So adding a condition should give a big boost for a GAN on MNIST.

================================================================================

cDCGAN is the conditional version of DCGAN. The idea is simple: concatenate the label onto intermediate layers, i.e. feed a condition into both networks. The plain GAN can be viewed as a mapping from noise to the target, G(noise) = fake_img and D(img) = p_of_real; with a condition y = label this becomes G(noise | y) = fake_img | y and D(img | y) = p_of_real | y. My notation here is loose; for a proper write-up of conditional GANs see, e.g., http://baijiahao.baidu.com/s?id=1602483050633464000&wfr=spider&for=pc
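
For reference, the standard conditional-GAN objective (from Mirza & Osindero's original cGAN paper, not something derived in this post) is:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x \mid y)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z \mid y) \mid y)\right)\right]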

Since the label tensor is 2-D (batch dimension included), injecting the condition in the convolutional part requires first reshaping it to (batch_size, 1, 1, num_of_class). Note that num_of_class has one extra slot for the fake class. After reshaping, the label is broadcast to the current feature map's spatial size by multiplying it with an all-ones tensor of that size, and then concatenated along the channel axis. The official DCGAN implementation provides a helper for this concatenation on feature maps:

def conv_cond_concat(x, y):
  """Concatenate conditioning vector on feature map axis."""
  x_shapes = x.get_shape()
  y_shapes = y.get_shape()
  return tf.concat([
    x, y*tf.ones([x_shapes[0], x_shapes[1], x_shapes[2], y_shapes[3]])], 3)
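
To make the shapes concrete, a small example (the batch size, spatial size, and channel count here are made up; 11 = 10 digit classes + 1 fake class, as used later in the post):

x = tf.zeros([16, 28, 28, 64])  # example feature map (batch, H, W, C)
y = tf.zeros([16, 1, 1, 11])    # label reshaped to (batch, 1, 1, num_of_class)
out = conv_cond_concat(x, y)
print(out.get_shape())          # (16, 28, 28, 75): label channels appended to the feature map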

Next, the conditional G_Net:

def G_Net(input, label, reuse=False, Training=True):
    with tf.variable_scope('G_Net', reuse=reuse):
        with tf.variable_scope('dense'):
            net = tf.concat([input, label], axis=1)
            net = tf.layers.dense(net, 8*8*256, activation=tf.nn.tanh)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.dropout(net, 0.5)
            net = tf.reshape(net, shape=(-1, 8, 8, 256))

        with tf.variable_scope('deconv_1'):
            net = slim.conv2d_transpose(net, num_outputs=128, kernel_size=[5, 5], stride=2, activation_fn=None,
                                  weights_initializer=tf.truncated_normal_initializer(mean=0.0, stddev=0.02))
            net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)
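
        # ... the remaining layers are the same as in the unconditional DCGAN generator above.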

G_Net only concatenates the label once, at the fully-connected part (the noise and the label are concatenated before the dense layer).

Then the conditional D_Net:

def D_Net(input, label, reuse=False, Training=True):
    batch_size = label.get_shape()[0]
    label = tf.reshape(label, shape=(batch_size, 1, 1, 11))
    with tf.variable_scope('D_Net', reuse=reuse):
        with tf.variable_scope('conv_1'):
            net = conv_cond_concat(input, label)
            net = slim.conv2d(net, kernel_size=[5, 5], num_outputs=64, stride=1, activation_fn=None,
                               weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')

        with tf.variable_scope('conv_2'):
            net = conv_cond_concat(net, label)
            net = slim.conv2d(net, kernel_size=[5, 5], num_outputs=128, stride=1, activation_fn=None,
                               weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.nn.tanh(net)

            net = slim.max_pool2d(net, kernel_size=[2, 2], stride=2, padding='SAME')

        # with tf.variable_scope('conv_3'):
        #     net = slim.conv2d(net, kernel_size=[3, 3], num_outputs=16, stride=2, activation_fn=None,
        #                        weights_initializer=tf.truncated_normal_initializer(mean=0, stddev=0.02))
        #     net = tf.layers.batch_normalization(net, training=Training)
        #     net = LeakyRelu(net)

        with tf.variable_scope('dense'):
            net = slim.flatten(net)
            net = tf.nn.dropout(net, 0.5)
            net = tf.layers.dense(net, 1024, activation=tf.nn.relu)
            net = tf.nn.dropout(net, 0.5)
            # net = tf.layers.batch_normalization(net, training=Training)
            net = tf.layers.dense(net, 11, activation=None)

    return net

Since the discriminator no longer makes a simple real/fake decision, it outputs logits of length num_of_class, on which a softmax is applied.

The loss functions are as follows:

loss_G = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=logit_fake))
loss_D = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=logit_real) +
                        tf.nn.softmax_cross_entropy_with_logits(labels=fake_label, logits=logit_fake))

fake_label marks every example as class 10, i.e. everything in the batch is labeled as the fake class.
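
For completeness, a minimal sketch of how fake_label and the logits above could be wired up (batch_size, noise, and real_img are assumed names, not from the post):

# label: one-hot ground-truth labels of shape (batch_size, 11); class 10 is the fake class.
fake_label = tf.one_hot(tf.fill([batch_size], 10), depth=11)

fake_img = G_Net(noise, label)
logit_real = D_Net(real_img, label)
logit_fake = D_Net(fake_img, label, reuse=True)

So loss_D pushes real images toward their true class and generated images toward the fake class, while loss_G pushes generated images toward the class they were conditioned on.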

The generated results are much better now:

I'm still a beginner, and much of this analysis is just my own understanding; corrections from more experienced readers are welcome.

 
