Paper Notes 6: DCGAN - Unsupervised Representation Learning with Deep Convolutional GANs
Approach
scale up GANs using CNNs to model images
- the all convolutional net
replace deterministic spatial pooling functions (such as max pooling) with strided convolutions
→ allows the network to learn its own spatial downsampling
used in both the generator (fractional-strided convolutions for upsampling) and the discriminator (strided convolutions for downsampling)
- eliminate fully connected layers
for the generator
the uniform noise input Z is projected (a matrix multiplication) and reshaped into a 4-dimensional tensor that starts the convolution stack
for the discriminator
the last convolutional layer is flattened → fed into a single sigmoid output
- Batch Normalization
stabilize learning by normalizing the input to each unit to have zero mean and unit variance
helps deal with the training problems that arise due to poor initialization and helps gradient flow in deeper models
critical for getting deep generators to begin learning, preventing the generator from collapsing all samples to a single point, a common failure mode observed in GANs
applying BN directly to all layers → sample oscillation and model instability
avoided by not applying BN to the generator output layer or the discriminator input layer
- ReLU and LeakyReLU activation
ReLU activation is used in the generator for all layers except the output layer, which uses Tanh
→ a bounded activation lets the model learn more quickly to saturate and cover the color space of the training distribution
within the discriminator, the leaky rectified activation works well, especially for higher-resolution modeling (see the code sketch after this list)
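
A minimal PyTorch sketch (my own illustration, not code from the paper) pulling these guidelines together for a 64×64 model; the latent size of 100 and the feature widths `ngf`/`ndf` = 64 are assumed values, slightly narrower than the paper's figure:

```python
import torch
import torch.nn as nn

nz, ngf, ndf, nc = 100, 64, 64, 3  # latent dim, feature widths, color channels (assumed sizes)

# Generator: project-and-reshape of z, then fractional-strided convs; BN on every
# layer except the Tanh output; ReLU activations; no fully connected layers.
generator = nn.Sequential(
    # z (nz x 1 x 1) -> 4x4; this transposed conv plays the role of the
    # "project and reshape" step
    nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
    nn.BatchNorm2d(ngf * 8),
    nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),  # 4x4 -> 8x8
    nn.BatchNorm2d(ngf * 4),
    nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),  # 8x8 -> 16x16
    nn.BatchNorm2d(ngf * 2),
    nn.ReLU(True),
    nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),      # 16x16 -> 32x32
    nn.BatchNorm2d(ngf),
    nn.ReLU(True),
    nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),           # 32x32 -> 64x64
    nn.Tanh(),  # no BN on the generator output layer
)

# Discriminator: strided convs instead of pooling, LeakyReLU(0.2), BN on all
# layers except the input layer, flattened to a single sigmoid output.
discriminator = nn.Sequential(
    nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),            # 64x64 -> 32x32, no BN on input layer
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),        # 32x32 -> 16x16
    nn.BatchNorm2d(ndf * 2),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),    # 16x16 -> 8x8
    nn.BatchNorm2d(ndf * 4),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),    # 8x8 -> 4x4
    nn.BatchNorm2d(ndf * 8),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),          # 4x4 -> 1x1 "flattened" score
    nn.Sigmoid(),
    nn.Flatten(),
)

z = torch.empty(16, nz, 1, 1).uniform_(-1, 1)  # uniform noise Z, as in the paper
fake = generator(z)                            # (16, 3, 64, 64) images in [-1, 1]
score = discriminator(fake)                    # (16, 1) probabilities
```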
Architecture
no pre-processing besides scaling images to the range of the tanh activation, [-1, 1]
trained with mini-batch stochastic gradient descent with a mini-batch size of 128
all weights were initialized from a zero-centered Normal distribution with standard deviation 0.02
in the LeakyReLU, the slope of the leak is 0.2
use Adam optimizer
learning rate: 0.0002 (the suggested value of 0.001 was too high)
momentum term β1: 0.5 (the suggested value of 0.9 resulted in training oscillation and instability)
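
A short sketch of this training setup, reusing the `generator`/`discriminator` from the sketch above; initializing the BatchNorm scale from N(1, 0.02) follows common DCGAN reference implementations and is an assumption beyond the paper's wording:

```python
import torch
import torch.nn as nn

def weights_init(m):
    # zero-centered Normal initialization with standard deviation 0.02
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(m.weight, mean=0.0, std=0.02)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.normal_(m.weight, mean=1.0, std=0.02)  # assumed BN scale init
        nn.init.zeros_(m.bias)

generator.apply(weights_init)
discriminator.apply(weights_init)

batch_size = 128  # mini-batch SGD with mini-batch size 128

# Adam with the paper's tuned hyperparameters: lr = 0.0002, beta1 = 0.5
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
```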
References
Radford, A., Metz, L., & Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. ICLR 2016. arXiv:1511.06434