An Introduction to GAN

What is GAN

GAN (Generative Adversarial Nets) is an unsupervised deep learning framework, first proposed by Ian J. Goodfellow et al. in 2014. The main innovation of GAN is to integrate the idea of competition and confrontation into the process of data generation. In practice, GAN achieves this with the help of neural networks.

As GANs are most often used to generate images, we will say 'image' instead of 'data' when explaining the principle of GAN.

The Composition of GAN

There are two models in a GAN, namely the generative model and the discriminative model. Given real images, the goal of the generative model is to create images (fake images) that look the same as the given ones, while the role of the discriminative model is to distinguish the real images from the fake ones. The generative model therefore optimizes its parameters to make the fake images more similar to the real images, to prevent the discriminative model from distinguishing them successfully. The discriminative model, in turn, optimizes its parameters to achieve a higher discriminative ability. After a long competition between the two models, their parameters converge at the point where the discriminative model can no longer tell which image is real and which is fake (this is explained in The Mathematical Foundation of GAN).

The principle of GAN can also be represented by the figure below:
[Figure 1: The principle of GAN]

In this figure, $G$ (the generative model) produces images from random noise, and $D$ (the discriminative model) distinguishes the real images from the fake ones.
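
As a minimal sketch of these two models (my own illustration in TensorFlow/Keras, not the architecture of the original paper; the layer sizes are assumptions), $G$ and $D$ can be written as small fully connected networks:

```python
import tensorflow as tf

# Illustrative sizes only: a 100-dim noise vector and flattened 28x28 images.
NOISE_DIM = 100
IMAGE_DIM = 28 * 28

# Generator G: maps a random noise vector z to a flattened image.
G = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(NOISE_DIM,)),
    tf.keras.layers.Dense(IMAGE_DIM, activation="tanh"),  # pixel values in (-1, 1)
])

# Discriminator D: maps an image to a single probability in (0, 1).
D = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(IMAGE_DIM,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability that the input is real
])
```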

About the Value (Loss) Function

Here we define the value function of GAN; the output of the discriminative model lies in the interval $(0,1)$:

$$\min_G \max_D V(D,G) = \mathbb{E}_{x\sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z\sim p_z(z)}[\log(1-D(G(z)))]$$
Based on the convention that the discriminative model should output 1 for an image judged to be real and 0 for an image judged to be fake, we use cross entropy as the value function. During optimization, the discriminative model aims to maximize the value function, while the generative model aims to do the opposite. Thus, to maximize $V(D,G)$, the discriminative model must try its best to distinguish the real images from the fake ones; to minimize $V(D,G)$, the generative model has to make the fake images more similar to the real ones, as can be inferred from the formula.

We use gradient descent and gradient ascent to update the parameters of the two neural networks. It should be noted that early in training, when $G$ is poor, $D$ can reject fake images with high confidence because they are clearly different from the real images. In this case, $\log(1-D(G(z)))$ saturates, which prevents the parameters of $G$ from being updated. Rather than training $G$ to minimize $\log(1-D(G(z)))$, training $G$ to maximize $\log D(G(z))$ might be a better choice. However, the authors of GAN do not give a mathematical proof that the value function still converges when the generative model is trained to maximize $\log D(G(z))$. Therefore, in the GAN algorithm, the goal of the generative model is still to minimize $\log(1-D(G(z)))$.
The goals of the two models are listed below:

discriminative model:
$$\max_D \; \log D(x) + \log(1-D(G(z)))$$

generative model:
$$\min_G \; \log(1-D(G(z)))$$
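
As an illustration of how these objectives become trainable losses (a hedged sketch; all function names here are my own, not from the paper), note that maximizing the discriminator objective is equivalent to minimizing a binary cross entropy with label 1 for real images and 0 for fake ones:

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()  # cross entropy on probabilities in (0, 1)

def discriminator_loss(d_real, d_fake):
    # Maximizing log D(x) + log(1 - D(G(z))) equals minimizing the cross
    # entropy with label 1 for real outputs and label 0 for fake outputs.
    real_loss = bce(tf.ones_like(d_real), d_real)
    fake_loss = bce(tf.zeros_like(d_fake), d_fake)
    return real_loss + fake_loss

def generator_loss_saturating(d_fake):
    # The original objective: minimize log(1 - D(G(z))).
    return tf.reduce_mean(tf.math.log(1.0 - d_fake))

def generator_loss_non_saturating(d_fake):
    # The heuristic alternative discussed above: maximize log D(G(z)),
    # i.e. minimize cross entropy with label 1 on fake outputs.
    return bce(tf.ones_like(d_fake), d_fake)
```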

The Mathematical Foundation of GAN

An image given to the discriminative model may come from the real images, or it may come from the generative model. We can consider that the real images follow one probability distribution and that the images generated from noise follow another. Once the distribution of the noise is given and the parameters of the generative model are fixed, the probability distribution of the fake images generated from the noise is determined. Call an image given to the discriminative model $x$. Then $x$ comes from the real data with density $p_{data}(x)$ and from the fake data with density $p_g(x)$.

Now we are going to prove two conclusions. First, we will prove that the formula
$$\min_G \max_D V(D,G)$$
reaches its global optimum exactly when $p_g(x) = p_{data}(x)$. After that, we will present the algorithm of GAN and prove that this algorithm drives $p_g(x)$ to converge to $p_{data}(x)$.

The Proof of the First Conclusion

To prove the first conclusion, we first prove that for any given generator $G$, the optimal discriminator $D$ satisfies the following equation:
$$D^*_G(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}$$
Proof:

$$\begin{aligned} V(D,G) &= \mathbb{E}_{x\sim p_{data}}[\log D(x)] + \mathbb{E}_{x\sim p_g}[\log(1-D(x))] \\ &= \int_x p_{data}(x)\log D(x)\,dx + \int_x p_g(x)\log(1-D(x))\,dx \\ &= \int_x \Big[ p_{data}(x)\log D(x) + p_g(x)\log(1-D(x)) \Big]\,dx \end{aligned}$$

As the goal of $D$ is to maximize this value function, and the integrand can be maximized pointwise (by checking the first and second derivatives at the extreme point), $D$ should optimize its parameters to reach
$$D^*_G(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}$$
in order to maximize $V(D,G)$.
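
To make the extreme-point step explicit (a standard filling-in of the derivative calculation, not spelled out in the original post): for fixed $a = p_{data}(x) \ge 0$ and $b = p_g(x) \ge 0$, the integrand has the form $f(y) = a\log y + b\log(1-y)$, and

$$f'(y) = \frac{a}{y} - \frac{b}{1-y} = 0 \;\Rightarrow\; y^* = \frac{a}{a+b}, \qquad f''(y) = -\frac{a}{y^2} - \frac{b}{(1-y)^2} < 0,$$

so $y^*$ is indeed a maximum, which gives $D^*_G(x)$ above.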

Now we are going to prove that, once $D$ has achieved
$$D^*_G(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)},$$
the generator $G$ minimizes the value function if and only if $p_g(x) = p_{data}(x)$.

To prove this, we introduce the JS divergence, a variant of the KL divergence that solves its asymmetry problem. The definition of the JS divergence is:

$$D_{JS}(p\,\|\,q) = \frac{1}{2}D_{KL}\Big(p\,\Big\|\,\frac{p+q}{2}\Big) + \frac{1}{2}D_{KL}\Big(q\,\Big\|\,\frac{p+q}{2}\Big),$$

where
$$D_{KL}(p\,\|\,q) = \int_x p(x)\log\frac{p(x)}{q(x)}\,dx.$$
Here, we first calculate the JS divergence between $p_g(x)$ and $p_{data}(x)$.
$$\begin{aligned} D_{JS}(p_{data}\,\|\,p_g) &= \frac{1}{2}D_{KL}\Big(p_{data}\,\Big\|\,\frac{p_{data}+p_g}{2}\Big) + \frac{1}{2}D_{KL}\Big(p_g\,\Big\|\,\frac{p_{data}+p_g}{2}\Big) \\ &= \frac{1}{2}\Big(\log 2 + \int_x p_{data}(x)\log\frac{p_{data}(x)}{p_{data}(x)+p_g(x)}\,dx\Big) + \frac{1}{2}\Big(\log 2 + \int_x p_g(x)\log\frac{p_g(x)}{p_{data}(x)+p_g(x)}\,dx\Big) \\ &= \frac{1}{2}\big(\log 4 + V(G,D^*)\big) \end{aligned}$$
(This derivation is adapted from a formula found on the Internet, where our $p_{data}$ was written as $p_r$ and our $V(G,D)$ as $L(G,D)$.) It is striking to find that, when calculating the JS divergence between $p_g(x)$ and $p_{data}(x)$, the value function $V(G,D^*)$ appears!

Hence, $V(G,D^*) = 2D_{JS}(p_{data}\,\|\,p_g) - \log 4$.

Since the JS divergence is always non-negative, to minimize $V(G,D^*)$, $D_{JS}(p_{data}\,\|\,p_g)$ must be zero. At that point, $p_g(x) = p_{data}(x)$ and $V(G,D^*) = -\log 4$. Note that at that point $D^*(x) = \frac{1}{2}$, which shows that the discriminative model can no longer successfully distinguish the real images from the fake ones.
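
As a quick numerical sanity check (my own illustration, using discrete distributions in place of densities), we can verify the identity $V(G,D^*) = 2D_{JS}(p_{data}\,\|\,p_g) - \log 4$:

```python
import numpy as np

def kl(p, q):
    # Discrete KL divergence D_KL(p || q) = sum_i p_i * log(p_i / q_i).
    return np.sum(p * np.log(p / q))

def js(p, q):
    # JS divergence via its definition from the two KL terms.
    m = (p + q) / 2
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Two arbitrary discrete distributions standing in for p_data and p_g.
p_data = np.array([0.1, 0.4, 0.5])
p_g    = np.array([0.3, 0.3, 0.4])

# V(G, D*) with the optimal discriminator D* = p_data / (p_data + p_g).
d_star = p_data / (p_data + p_g)
V = np.sum(p_data * np.log(d_star) + p_g * np.log(1 - d_star))

print(np.isclose(V, 2 * js(p_data, p_g) - np.log(4)))  # True
print(np.isclose(js(p_data, p_data), 0.0))             # True: JS(p||p) = 0
```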

The Proof of the Second Conclusion

Here is the algorithm of GAN.

[Algorithm 1 from the original paper: in each training iteration, first update the discriminator by stochastic gradient ascent on minibatches for $k$ steps, then update the generator by stochastic gradient descent for one step.]
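
As a concrete illustration of this loop (a minimal sketch in TensorFlow, reusing the G, D, and loss functions sketched earlier; the optimizer settings and $k=1$ are my own illustrative choices, not prescribed by the paper):

```python
import tensorflow as tf

g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images, batch_size=64):
    # One iteration of the algorithm with k = 1 discriminator step.
    z = tf.random.normal([batch_size, NOISE_DIM])

    # Discriminator step: ascend log D(x) + log(1 - D(G(z)))
    # (implemented as descending the equivalent cross-entropy loss).
    with tf.GradientTape() as tape:
        fake_images = G(z, training=True)
        d_loss = discriminator_loss(D(real_images, training=True),
                                    D(fake_images, training=True))
    d_grads = tape.gradient(d_loss, D.trainable_variables)
    d_opt.apply_gradients(zip(d_grads, D.trainable_variables))

    # Generator step (non-saturating variant, as discussed above).
    z = tf.random.normal([batch_size, NOISE_DIM])
    with tf.GradientTape() as tape:
        g_loss = generator_loss_non_saturating(D(G(z, training=True), training=True))
    g_grads = tape.gradient(g_loss, G.trainable_variables)
    g_opt.apply_gradients(zip(g_grads, G.trainable_variables))
```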
If $G$ and $D$ have enough capacity, and at each step of the algorithm the discriminator is allowed to reach its optimum given $G$, and $p_g$ is updated so as to improve the criterion
$$\mathbb{E}_{x\sim p_{data}}[\log D^*_G(x)] + \mathbb{E}_{x\sim p_g}[\log(1-D^*_G(x))],$$

then $p_g(x)$ converges to $p_{data}(x)$.

Proof:
[Proof figure from the original paper: the argument views $V(G,D)$ as a convex function of $p_g$ and uses the fact that the subderivatives of a supremum of convex functions include the derivative at the point where the maximum is attained, so sufficiently small updates of $p_g$ converge to $p_{data}$.]
I do not understand this proof well; I would appreciate it if somebody could explain it to me.

DCGAN

DCGAN is an improved version of GAN that combines the principles of convolutional neural networks with GAN. DCGAN uses LeakyReLU as the activation function and applies transposed convolutions in the generator, replacing the fully connected networks with convolutional ones. The experiments in the DCGAN paper show that DCGAN has better training stability than the ordinary GAN.

The figure below shows the principle of DCGAN.
[Figure 2: The principle of DCGAN]
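
As an illustrative sketch (the layer sizes follow common DCGAN-style tutorials and are assumptions, not necessarily the original paper's configuration), a generator built from transposed convolutions and a discriminator using LeakyReLU might look like this:

```python
import tensorflow as tf
from tensorflow.keras import layers

# DCGAN-style generator: noise vector -> 28x28x1 image via transposed convolutions.
dcgan_generator = tf.keras.Sequential([
    layers.Dense(7 * 7 * 128, input_shape=(100,)),
    layers.Reshape((7, 7, 128)),
    layers.Conv2DTranspose(64, kernel_size=5, strides=2, padding="same"),
    layers.BatchNormalization(),
    layers.ReLU(),
    layers.Conv2DTranspose(1, kernel_size=5, strides=2, padding="same",
                           activation="tanh"),  # output pixels in (-1, 1)
])

# DCGAN-style discriminator: strided convolutions with LeakyReLU activations.
dcgan_discriminator = tf.keras.Sequential([
    layers.Conv2D(64, kernel_size=5, strides=2, padding="same",
                  input_shape=(28, 28, 1)),
    layers.LeakyReLU(0.2),
    layers.Conv2D(128, kernel_size=5, strides=2, padding="same"),
    layers.LeakyReLU(0.2),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])
```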

Finally, since DCGAN is easy to implement with TensorFlow, it is unnecessary to go into the details here. For more information about the implementation, you can visit https://tensorflow.google.cn/

However, DCGAN does not improve GAN at a fundamental level. The later WGAN solves many of GAN's problems, such as training instability, at their root!
