[Study Notes] Learning GAN Models (5): dcgan_generator

偶遇一只核桃

Category: Study Notes

Article link: https://blog.csdn.net/qq_41799105/article/details/117428646

1 Preface
I did not expect this code to be so hard to follow.

2 Getting started

    import tensorflow as tf  # requires TensorFlow 1.x (tf.contrib was removed in 2.x)

    def dcgan_generator(z, config, training, C, reuse=False, actv=tf.nn.relu, kernel_size=5, upsample_dim=256):
        """
        Upsample noise to concatenate with quantized representation w_bar.
        + z:    Drawn from latent distribution - [batch_size, noise_dim]
        + C:    Bottleneck depth, controls bpp - last dimension of encoder output
        """
        init =  tf.contrib.layers.xavier_initializer()
        kwargs = {'center':True, 'scale':True, 'training':training, 'fused':True, 'renorm':False}
        with tf.variable_scope('noise_generator', reuse=reuse):

            # [batch_size, 4, 8, dim]
            with tf.variable_scope('fc1', reuse=reuse):
                h2 = tf.layers.dense(z, units=4 * 8 * upsample_dim, activation=actv, kernel_initializer=init)  # cifar-10
                h2 = tf.layers.batch_normalization(h2, **kwargs)
                h2 = tf.reshape(h2, shape=[-1, 4, 8, upsample_dim])

            # [batch_size, 8, 16, dim/2]
            with tf.variable_scope('upsample1', reuse=reuse):
                up1 = tf.layers.conv2d_transpose(h2, upsample_dim//2, kernel_size=kernel_size, strides=2, padding='same', activation=actv)
                up1 = tf.layers.batch_normalization(up1, **kwargs)

            # [batch_size, 16, 32, dim/4]
            with tf.variable_scope('upsample2', reuse=reuse):
                up2 = tf.layers.conv2d_transpose(up1, upsample_dim//4, kernel_size=kernel_size, strides=2, padding='same', activation=actv)
                up2 = tf.layers.batch_normalization(up2, **kwargs)
            
            # [batch_size, 32, 64, dim/8]
            with tf.variable_scope('upsample3', reuse=reuse):
                up3 = tf.layers.conv2d_transpose(up2, upsample_dim//8, kernel_size=kernel_size, strides=2, padding='same', activation=actv)  # cifar-10
                up3 = tf.layers.batch_normalization(up3, **kwargs)

            with tf.variable_scope('conv_out', reuse=reuse):
                out = tf.pad(up3, [[0, 0], [3, 3], [3, 3], [0, 0]], 'REFLECT')
                out = tf.layers.conv2d(out, C, kernel_size=7, strides=1, padding='VALID')

        return out

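(1) tf.contrib.layers.xavier_initializer(uniform=True, seed=None, dtype=tf.float32)
Randomly initializes the weights from a uniform distribution (the default, uniform=True) or a normal distribution (uniform=False), with the range scaled by the layer's number of inputs and outputs.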
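
To get a feel for the scale of the initial weights, here is a minimal sketch of the uniform Xavier (Glorot) bound, sqrt(6 / (fan_in + fan_out)), applied to the first dense layer. The helper name `xavier_uniform_limit` and the value `noise_dim = 128` are assumptions for illustration; the post does not state the noise dimension.

```python
import math

def xavier_uniform_limit(fan_in, fan_out):
    """Bound of the range [-limit, limit] for uniform Xavier/Glorot initialization."""
    return math.sqrt(6.0 / (fan_in + fan_out))

noise_dim = 128                      # assumption: not given in the post
upsample_dim = 256                   # default argument of dcgan_generator
fan_out = 4 * 8 * upsample_dim       # units of the first dense layer
limit = xavier_uniform_limit(noise_dim, fan_out)
print(fan_out, limit)
```

The more inputs and outputs a layer has, the narrower the range the weights are drawn from.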

(2) Layer 1 (fully connected) [batch_size, 4, 8, 256]

    def dense(
        inputs, units,
        activation=None,
        use_bias=True,
        kernel_initializer=None,
        bias_initializer=init_ops.zeros_initializer(),
        kernel_regularizer=None,
        bias_regularizer=None,
        activity_regularizer=None,
        kernel_constraint=None,
        bias_constraint=None,
        trainable=True,
        name=None,
        reuse=None):

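This defines a fully connected layer.
Parameters
inputs: the data fed into the layer
units: the output dimensionality; it replaces the last dimension of inputs
activation: the activation function, i.e. the network's nonlinearity
use_bias: whether to use a bias term; True by default, set it to False to disable the bias
kernel_initializer: initializer for the weight matrix (the "kernel"; for a dense layer this is not a convolution kernel)
trainable: whether the layer's parameters take part in training; if True, the variables are added to the graph collection GraphKeys.TRAINABLE_VARIABLES
name: the name of the layer
reuse: whether to reuse the parameters of a previous layer with the same name
Here the input is z, sampled from the noise distribution; the output size is 4 * 8 * upsample_dim, where upsample_dim is a function argument whose default value is 256; the activation function is ReLU, and the weights are initialized with the Xavier initializer from (1).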
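
As a quick check of what this layer does to the shape, the sketch below computes the output shape and the number of trainable parameters of the dense layer. The helper names and `noise_dim = 128` are assumptions for illustration, not values taken from the original project.

```python
def dense_output_shape(input_shape, units):
    """A dense layer replaces the last dimension of its input with `units`."""
    return tuple(input_shape[:-1]) + (units,)

def dense_param_count(input_dim, units, use_bias=True):
    """Weight matrix of size input_dim x units, plus one bias per unit."""
    return input_dim * units + (units if use_bias else 0)

batch_size, noise_dim = 1, 128       # noise_dim is an assumed value
upsample_dim = 256
units = 4 * 8 * upsample_dim

out_shape = dense_output_shape((batch_size, noise_dim), units)
n_params = dense_param_count(noise_dim, units)
print(out_shape, n_params)
```

Under these assumptions the layer maps [1, 128] to [1, 8192], which is exactly the number of elements needed for the later reshape to [1, 4, 8, 256].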

Next:

    def batch_normalization(inputs,
                            axis=-1,
                            momentum=0.99,
                            epsilon=1e-3,
                            center=True,
                            scale=True,
                            beta_initializer=init_ops.zeros_initializer(),
                            gamma_initializer=init_ops.ones_initializer(),
                            moving_mean_initializer=init_ops.zeros_initializer(),
                            moving_variance_initializer=init_ops.ones_initializer(),
                            beta_regularizer=None,
                            gamma_regularizer=None,
                            beta_constraint=None,
                            gamma_constraint=None,
                            training=False,
                            trainable=True,
                            name=None,
                            reuse=None,
                            renorm=False,
                            renorm_clipping=None,
                            renorm_momentum=0.99,
                            fused=None,
                            virtual_batch_size=None,
                            adjustment=None):

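Batch-normalization layer
Parameters
inputs: the input tensor
axis: the axis that should be normalized (usually the feature axis)
momentum=0.99: momentum of the moving average
epsilon=1e-3: small constant added to the variance to avoid division by zero
center=True: if True, the offset beta is added to the normalized tensor
scale=True: if True, the normalized tensor is multiplied by the scale gamma
trainable=True: Boolean; if True, the variables are also added to the graph collection GraphKeys.TRAINABLE_VARIABLES
reuse=None: whether to reuse the weights of a previous layer with the same name
renorm_momentum=0.99: momentum used to update the moving means and standard deviations when renorm is enabled
In this project the layer is called with center=True, scale=True, fused=True and renorm=False, and training is passed in by the caller.
Reference article (very detailed): "TensorFlow学习笔记之批归一化：tf.layers.batch_normalization()函数" (TensorFlow study notes on batch normalization: the tf.layers.batch_normalization() function)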
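
The parameters above fit together as in the following NumPy sketch of the training-mode computation. It is a simplified illustration, not TensorFlow's actual implementation; the function name `batch_norm_train` and the test tensor are made up for the example.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, moving_mean, moving_var,
                     momentum=0.99, epsilon=1e-3):
    """Simplified training-mode batch norm over the last (feature) axis."""
    # Statistics are taken over every axis except the feature axis (axis=-1).
    reduce_axes = tuple(range(x.ndim - 1))
    mean = x.mean(axis=reduce_axes)
    var = x.var(axis=reduce_axes)
    # Normalize, then apply scale (gamma) and center (beta).
    y = gamma * (x - mean) / np.sqrt(var + epsilon) + beta
    # Moving averages, used instead of batch statistics at inference time.
    new_mean = momentum * moving_mean + (1.0 - momentum) * mean
    new_var = momentum * moving_var + (1.0 - momentum) * var
    return y, new_mean, new_var

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(8, 4, 8, 16))   # [batch, h, w, channels]
channels = x.shape[-1]
y, moving_mean, moving_var = batch_norm_train(
    x, np.ones(channels), np.zeros(channels),
    np.zeros(channels), np.ones(channels))
print(y.shape, y.mean(axis=(0, 1, 2)).round(3))
```

With momentum=0.99 the moving statistics change slowly: each step moves them only 1% of the way toward the current batch statistics.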

Finally:

    h2 = tf.reshape(h2, shape=[-1, 4, 8, upsample_dim])

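This reshapes h2 to [-1, 4, 8, 256]. The -1 in the first dimension means that its size is inferred automatically from the total number of elements; here it works out to 1, because our batch_size is 1.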
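
The behaviour of -1 can be checked with NumPy, whose `reshape` follows the same rule as `tf.reshape` for an inferred dimension. This is only a small sketch with zero-filled stand-in arrays.

```python
import numpy as np

upsample_dim = 256

# batch_size = 1, as in these notes
h2 = np.zeros((1, 4 * 8 * upsample_dim))
h2_reshaped = h2.reshape(-1, 4, 8, upsample_dim)
print(h2_reshaped.shape)

# With a larger batch, the same call infers a different first dimension.
h2_batch = np.zeros((16, 4 * 8 * upsample_dim)).reshape(-1, 4, 8, upsample_dim)
print(h2_batch.shape)
```

Because the first dimension is inferred, the same generator code works unchanged for any batch size.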

(3) Layer 2 (transposed convolution, upsampling) [batch_size, 8, 16, 128]

    def conv2d_transpose(inputs,
                         filters,
                         kernel_size,
                         strides=(1, 1),
                         padding='valid',
                         data_format='channels_last',
                         activation=None,
                         use_bias=True,
                         kernel_initializer=None,
                         bias_initializer=init_ops.zeros_initializer(),
                         kernel_regularizer=None,
                         bias_regularizer=None,
                         activity_regularizer=None,
                         kernel_constraint=None,
                         bias_constraint=None,
                         trainable=True,
                         name=None,
                         reuse=None):

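This layer is a transposed convolution layer.

The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e. from something that has the shape of the output of some convolution to something that has the shape of its input, while maintaining a connectivity pattern that is compatible with that convolution.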
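
To make the "opposite direction" idea concrete, here is a minimal 1-D sketch in plain Python: every input element scatters a scaled copy of the kernel into the output, shifted by the stride. The function `conv1d_transpose` is a hypothetical helper written for this note, and it uses 'valid'-style output without cropping.

```python
def conv1d_transpose(x, w, stride):
    """1-D transposed convolution, 'valid' style (no cropping of the output)."""
    out = [0.0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            # Input element i contributes to output positions i*stride .. i*stride+len(w)-1.
            out[i * stride + j] += xi * wj
    return out

y = conv1d_transpose([1.0, 2.0], [1.0, 1.0, 1.0], stride=2)
print(y)  # [1.0, 1.0, 3.0, 2.0, 2.0]
```

Two inputs become five outputs, and the middle value 3.0 is where the two shifted kernel copies overlap. A stride greater than 1 is what makes the layer upsample.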

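Parameters
inputs: the input tensor
filters: dimensionality of the output space (i.e. the number of kernels)
kernel_size: size of the convolution kernel; 5 here
strides=(1, 1): stride of the convolution; a stride of 2 is used here
padding='valid': either 'valid' or 'same'; 'same' is used here
data_format='channels_last': a string, either channels_last (default) or channels_first, giving the ordering of the dimensions in the input. channels_last corresponds to inputs of shape (batch, height, width, channels), while channels_first corresponds to inputs of shape (batch, channels, height, width)
activation=None: the activation function; None means no activation, ReLU is used here
use_bias=True: whether to add a bias term

In this project the input is the reshaped output of the fully connected layer above. The layer uses 128 kernels of size 5*5 with stride 2, and ReLU as the activation function.

Then comes another batch-normalization layer.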
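
The shape comments in the generator can be reproduced with the usual output-size rule for transposed convolutions: with padding 'same' the spatial size is simply multiplied by the stride. The helper `deconv_output_size` below is a sketch written for these notes, not a TensorFlow function.

```python
def deconv_output_size(size, stride, kernel_size, padding):
    """Spatial output size of a transposed convolution."""
    if padding == 'same':
        return size * stride
    if padding == 'valid':
        return size * stride + max(kernel_size - stride, 0)
    raise ValueError("padding must be 'same' or 'valid'")

# (height, width, channels) after the reshape, then three upsampling layers
trace = [(4, 8, 256)]
for _ in range(3):
    h, w, c = trace[-1]
    trace.append((deconv_output_size(h, 2, 5, 'same'),
                  deconv_output_size(w, 2, 5, 'same'),
                  c // 2))                 # each layer halves the channel count

for shape in trace:
    print(shape)
```

Each layer doubles height and width and halves the channels, which gives the [8, 16, 128], [16, 32, 64] and [32, 64, 32] shapes noted in the code.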

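(4) Layer 3 (transposed convolution, upsampling) [batch_size, 16, 32, 64]
Another transposed convolution follows. Its input is the output of the previous transposed convolution layer; it uses 64 kernels of size 5*5 with stride 2 and ReLU as the activation function, followed by another batch-normalization layer.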

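(5) Layer 4 (transposed convolution, upsampling) [batch_size, 32, 64, 32]
Another transposed convolution follows. Its input is the output of the previous transposed convolution layer; it uses 32 kernels of size 5*5 with stride 2 and ReLU as the activation function, followed by another batch-normalization layer.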
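
For a rough idea of how large the three transposed convolution layers are, the sketch below counts their weights. The formula (kernel height * kernel width * input channels * output channels, plus one bias per output channel) is the standard one; the helper name is made up, and batch-norm parameters are not included.

```python
def conv_transpose_params(kernel_size, in_channels, out_channels, use_bias=True):
    """Number of trainable weights in one 2-D (transposed) convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)

layers = [(256, 128), (128, 64), (64, 32)]      # (input channels, output channels)
counts = [conv_transpose_params(5, c_in, c_out) for c_in, c_out in layers]
print(counts)  # [819328, 204864, 51232]
```

The parameter count shrinks by roughly a factor of four per layer, since both the input and the output channel counts are halved.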

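(6) Layer 5
After the pad operation, a convolution with C kernels of size 7*7 and stride 1 is applied. With C equal to 3, the final output has 3 channels, like an image. Note that according to the docstring, C is the bottleneck depth, and this output is meant to be concatenated with the quantized representation w_bar.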
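
The pad of 3 on each side and the 7*7 'VALID' convolution cancel each other out, so the spatial size stays at 32 x 64. A small sketch of that arithmetic (the helper name is made up for this note):

```python
def conv_valid_output_size(size, kernel_size, stride=1):
    """Spatial output size of a convolution with padding='VALID'."""
    return (size - kernel_size) // stride + 1

h, w = 32, 64                     # spatial size of up3
pad = 3                           # rows/columns added on each side by tf.pad
padded = (h + 2 * pad, w + 2 * pad)
out_hw = (conv_valid_output_size(padded[0], 7),
          conv_valid_output_size(padded[1], 7))
print(padded, out_hw)  # (38, 70) (32, 64)
```

So the layer output has shape [batch_size, 32, 64, C]. Padding explicitly and then using 'VALID' gives the same size as padding='same', but with reflected border values instead of zeros.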

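One question remains: in out = tf.pad(up3, [[0, 0], [3, 3], [3, 3], [0, 0]], 'REFLECT'), how do the four pairs of padding values expand the tensor? So far I have only found explanations online for two or three pairs, and I do not yet fully understand the case with a fourth pair.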
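
A likely answer, sketched with NumPy: the paddings argument has one [before, after] pair per dimension of the tensor. up3 is 4-D with layout [batch, height, width, channels], so [0, 0] leaves the batch untouched, the two [3, 3] pairs add 3 rows and 3 columns on each side, and the last [0, 0] leaves the channels untouched. `np.pad` with mode='reflect' mirrors values without repeating the edge, which is how tf.pad's 'REFLECT' mode is documented; the small test tensor is made up for the example.

```python
import numpy as np

paddings = [[0, 0], [3, 3], [3, 3], [0, 0]]   # batch, height, width, channels

# Small stand-in for up3: batch 1, height 4, width 5, 1 channel
small = np.arange(20).reshape(1, 4, 5, 1)
padded = np.pad(small, paddings, mode='reflect')
print(padded.shape)           # (1, 10, 11, 1)
print(padded[0, 3, :, 0])     # original row 0, mirrored left and right

# Same paddings on a tensor with the real shape of up3
full = np.pad(np.zeros((1, 32, 64, 32)), paddings, mode='reflect')
print(full.shape)             # (1, 38, 70, 32)
```

Only height and width grow (by 3 + 3 each); batch size and channel count are unchanged.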
