GAN, DCGAN, cGAN Paper Reading

GAN: Generative Adversarial Nets

1. Traditional GAN (2014)

1.1 Paper: 

https://arxiv.org/pdf/1406.2661

1.2 Understanding:

  1. E.g. from noise to a real-looking image.
  2. Generator: G(z; θg)
    1. Input: noise z;
    2. Output: generated image G(z).
  3. Discriminator: D(x; θd)
    1. Input: generated image G(z) with label 0, real image x with label 1
    2. Output: predicted probability that the input is real
  4. Loss: log(D(x)) + log(1 - D(G(z)))
    1. D maximizes both terms (classify real as 1, fake as 0); G minimizes the second term, i.e. pushes D(G(z)) toward 1.
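
The minimax objective above can be sketched numerically. This is a minimal illustration (not a training loop): `d_real` and `d_fake` stand in for the discriminator's probability outputs on real and generated batches.

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-8):
    """Value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: D's outputs on real images x (probabilities in (0, 1))
    d_fake: D's outputs on generated images G(z)
    eps: small constant to avoid log(0)
    """
    # D maximizes V, so its loss is -V (to be minimized by a standard optimizer).
    d_loss = -(np.log(d_real + eps) + np.log(1.0 - d_fake + eps)).mean()
    # G only affects the second term and minimizes it.
    g_loss = np.log(1.0 - d_fake + eps).mean()
    return d_loss, g_loss

# A near-perfect discriminator (D(x) ≈ 1, D(G(z)) ≈ 0) drives d_loss toward 0,
# while g_loss is then close to 0 from below (G is losing).
d_loss, g_loss = gan_losses(np.array([0.99]), np.array([0.01]))
```

In practice the original paper also suggests training G to maximize log D(G(z)) instead of minimizing log(1 - D(G(z))), which gives stronger gradients early in training.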

1.3 Pic: 

1.4 Network:

Simple dense layers 

2. Conditional GAN (2014)

2.1 Paper: (unpaired cGAN)

https://arxiv.org/pdf/1411.1784

2.2 Key improvement:

Add a condition y; y can be an image, text, sound, or other side information (e.g. a class label).

2.3 Types (pairing refers to whether the input data come in corresponding pairs):

  1. Un-paired cGAN
  2. Paired cGAN

2.4 Understanding:

  1. E.g. MNIST conditioned on y (a class label from 0-9) ---- unpaired cGAN
  2. Generator: G(z, y; θg)
    1. Input: noise z (100 dims), one-hot label y (10 dims)
      1. Map each to a hidden layer with ReLU: 200 units for z, 1000 units for y
      2. Concatenate these two layers into 1200 units as the input of the following layers
    2. Output: 784 units (an MNIST image is 28*28)
  3. Discriminator: D(x, y; θd)
    1. Input: (real image x, label y) with label 1; (generated image G(z, y), label y) with label 0
      1. x maps to a maxout layer with 240 units and 5 pieces; y maps to a maxout layer with 50 units and 5 pieces
      2. Both map to a joint maxout hidden layer with 240 units and 4 pieces, which feeds a sigmoid
    2. Output: predicted probability that the input is real
  4. Loss: log(D(x, y)) + log(1 - D(G(z, y)))
    1. D maximizes both terms; G minimizes the second term.
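
The generator's shape bookkeeping above can be checked with a quick sketch. The weights here are random placeholders just to verify the layer sizes (100 → 200, 10 → 1000, concat to 1200, out to 784); the sigmoid output activation is an illustrative choice, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(a):
    return np.maximum(a, 0.0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Placeholder weights matching the layer sizes described above:
# z (100) -> 200 ReLU units, y (10, one-hot) -> 1000 ReLU units,
# concatenated (1200) -> 784 output pixels (28x28 MNIST image).
W_z = rng.normal(0.0, 0.02, (100, 200))
W_y = rng.normal(0.0, 0.02, (10, 1000))
W_out = rng.normal(0.0, 0.02, (1200, 784))

z = rng.normal(size=(1, 100))   # noise vector
y = np.eye(10)[[3]]             # one-hot label for digit 3
h = np.concatenate([relu(z @ W_z), relu(y @ W_y)], axis=1)
img = sigmoid(h @ W_out)        # generated 784-pixel image
```

The key point conditioning adds is only the extra one-hot input y, concatenated into the hidden representation; everything else matches the unconditional GAN.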

2.5 Pic:

 

2.6 Network:

Simple dense layers

2.7 There are some papers about paired cGAN:

---- I'll implement this part later.

---- To be continued!

3. DCGAN: Deep Convolutional GAN (2015)

3.1 Paper: 

https://arxiv.org/abs/1511.06434

3.2 Key improvement:

Combine CNNs with the traditional GAN: the layers are convolutional instead of dense.

3.3 Key ideas:

  1. Don't use max or average pooling; downsample with strided convolutions instead (and upsample in G with fractionally-strided convolutions).
  2. Remove fully connected hidden layers; the generator projects the noise vector and reshapes it into a stack of feature maps.
  3. Use batch normalization in both G and D, after each convolution and before the activation (except at G's output and D's input).
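
Point 1 above relies on simple output-size arithmetic: a stride-2 convolution halves the spatial resolution, replacing a pooling layer. A minimal sketch (kernel 4, stride 2, padding 1 is a common DCGAN-style choice, assumed here for illustration):

```python
def conv_out_size(size, kernel, stride, pad):
    """Spatial output size of a convolution:
    floor((size + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1

# A 4x4 kernel with stride 2 and padding 1 halves a 64x64 feature map,
# with no pooling layer needed.
half = conv_out_size(64, kernel=4, stride=2, pad=1)

# Chaining four such layers: 64 -> 32 -> 16 -> 8 -> 4.
s = 64
for _ in range(4):
    s = conv_out_size(s, 4, 2, 1)
```

The generator runs the same arithmetic in reverse with fractionally-strided (transposed) convolutions, growing 4x4 feature maps back up to the image resolution.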

3.4 Architecture pic

3.5 Training details:

  1. Pixels are scaled to [-1, 1], matching the tanh output activation.
  2. Mini-batch SGD with batch size 128;
  3. Weight initialization: zero-centered Normal with standard deviation 0.02;
  4. LeakyReLU with leak slope 0.2;
  5. Adam optimizer with learning rate 0.0002 and momentum term beta_1 = 0.5.
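
Three of these details (pixel scaling, weight init, LeakyReLU) can be sketched directly; a minimal NumPy version:

```python
import numpy as np

rng = np.random.default_rng(42)

def dcgan_init(shape, std=0.02):
    """DCGAN weight init: zero-mean Gaussian with standard deviation 0.02."""
    return rng.normal(0.0, std, shape)

def leaky_relu(a, slope=0.2):
    """LeakyReLU with the paper's leak slope of 0.2."""
    return np.where(a >= 0, a, slope * a)

def scale_pixels(img):
    """Map [0, 255] pixel values to [-1, 1] to match G's tanh output."""
    return img.astype(np.float32) / 127.5 - 1.0

w = dcgan_init((4, 4, 3, 64))                 # e.g. a 4x4 conv kernel, 3 -> 64 channels
x = scale_pixels(np.array([0.0, 127.5, 255.0]))  # -> [-1, 0, 1]
```

The batch size (128) and Adam settings (lr 0.0002, beta_1 0.5) would be passed to the optimizer in whichever framework is used.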