An Introduction to GAN

What is GAN

GAN (Generative Adversarial Nets) is an unsupervised deep learning framework, first proposed by Ian J. Goodfellow et al. in 2014. The main innovation of GAN is to integrate the idea of competition and confrontation into the process of data generation. In practice, GAN achieves this with the help of neural networks.

As GANs are most often used to generate images, we will say 'image' instead of 'data' when explaining the principle of GAN.

The Composition of GAN

There are two models in a GAN, namely the generative model and the discriminative model. Given real images, the goal of the generative model is to create images (fake images) that look the same as the given ones, while the role of the discriminative model is to distinguish the real images from the fake ones. The generative model therefore optimizes its parameters to make the fake images more similar to the real images, to prevent the discriminative model from distinguishing them successfully. The discriminative model, in turn, optimizes its parameters to achieve a higher discriminative ability. After a long competition between the two models, their parameters converge at the point where the discriminative model can no longer tell which image is real and which is fake (this is explained in The Mathematical Foundation of GAN).

The principle of GAN can also be represented by the figure below:
[Figure 1: The principle of GAN]

In this figure, $G$ (the generative model) produces images from random noise, and $D$ (the discriminative model) distinguishes the real images from the fake ones.
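
As a minimal sketch of these two models (my own illustration in TensorFlow/Keras, not the architecture of the original paper; the layer sizes are assumptions), $G$ and $D$ can be written as small fully connected networks:

```python
import tensorflow as tf

# Illustrative sizes only: a 100-dim noise vector and flattened 28x28 images.
NOISE_DIM = 100
IMAGE_DIM = 28 * 28

# Generator G: maps a random noise vector z to a flattened image.
G = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(NOISE_DIM,)),
    tf.keras.layers.Dense(IMAGE_DIM, activation="tanh"),  # pixel values in (-1, 1)
])

# Discriminator D: maps an image to a single probability in (0, 1).
D = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(IMAGE_DIM,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability that the input is real
])
```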

About the Value (Loss) Function

Here we define the value function of GAN; the output of the discriminative model lies in the interval $(0,1)$:

$$\min_G \max_D V(D,G) = \mathbb{E}_{x\sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z\sim p_z(z)}[\log(1-D(G(z)))]$$
Based on the convention that the discriminative model should output 1 for an image judged to be real and 0 for an image judged to be fake, we use cross entropy as the value function. During optimization, the discriminative model aims to maximize the value function, while the generative model aims to do the opposite. Thus, to maximize $V(D,G)$, the discriminative model must try its best to distinguish the real images from the fake ones; to minimize $V(D,G)$, the generative model has to make the fake images more similar to the real ones, as can be inferred from the formula.

We use gradient descent and gradient ascent to update the parameters of the two neural networks. It should be noted that early in training, when $G$ is poor, $D$ can reject fake images with high confidence because they are clearly different from the real images. In this case, $\log(1-D(G(z)))$ saturates, which prevents the parameters of $G$ from being updated. Rather than training $G$ to minimize $\log(1-D(G(z)))$, training $G$ to maximize $\log D(G(z))$ might be a better choice. However, the authors of GAN do not give a mathematical proof that the value function still converges when the generative model is trained to maximize $\log D(G(z))$. Therefore, in the GAN algorithm, the goal of the generative model is still to minimize $\log(1-D(G(z)))$.
The goals of the two models are listed below:

discriminative model:
$$\max_D \; \log D(x) + \log(1-D(G(z)))$$

generative model:
$$\min_G \; \log(1-D(G(z)))$$
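
As an illustration of how these objectives become trainable losses (a hedged sketch; all function names here are my own, not from the paper), note that maximizing the discriminator objective is equivalent to minimizing a binary cross entropy with label 1 for real images and 0 for fake ones:

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()  # cross entropy on probabilities in (0, 1)

def discriminator_loss(d_real, d_fake):
    # Maximizing log D(x) + log(1 - D(G(z))) equals minimizing the cross
    # entropy with label 1 for real outputs and label 0 for fake outputs.
    real_loss = bce(tf.ones_like(d_real), d_real)
    fake_loss = bce(tf.zeros_like(d_fake), d_fake)
    return real_loss + fake_loss

def generator_loss_saturating(d_fake):
    # The original objective: minimize log(1 - D(G(z))).
    return tf.reduce_mean(tf.math.log(1.0 - d_fake))

def generator_loss_non_saturating(d_fake):
    # The heuristic alternative discussed above: maximize log D(G(z)),
    # i.e. minimize cross entropy with label 1 on fake outputs.
    return bce(tf.ones_like(d_fake), d_fake)
```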

The Mathematical Foundation of GAN

An image given to the discriminative model may come from the real images, or it may come from the generative model. We can consider that the real images follow one probability distribution and that the images generated from noise follow another. Once the distribution of the noise is given and the parameters of the generative model are fixed, the probability distribution of the fake images generated from the noise is determined. Call an image given to the discriminative model $x$. Then $x$ comes from the real data with density $p_{data}(x)$ and from the fake data with density $p_g(x)$.

Now we are going to prove two conclusions. First, we will prove that the formula
$$\min_G \max_D V(D,G)$$
reaches its global optimum exactly when $p_g(x) = p_{data}(x)$. After that, we will present the algorithm of GAN and prove that this algorithm drives $p_g(x)$ to converge to $p_{data}(x)$.

The Proof of the First Conclusion

To prove the first conclusion, we first prove that for any given generator $G$, the optimal discriminator $D$ satisfies the following equation:
$$D^*_G(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}$$
Proof:

$$\begin{aligned} V(D,G) &= \mathbb{E}_{x\sim p_{data}}[\log D(x)] + \mathbb{E}_{x\sim p_g}[\log(1-D(x))] \\ &= \int_x p_{data}(x)\log D(x)\,dx + \int_x p_g(x)\log(1-D(x))\,dx \\ &= \int_x \Big[ p_{data}(x)\log D(x) + p_g(x)\log(1-D(x)) \Big]\,dx \end{aligned}$$

As the goal of $D$ is to maximize this value function, and the integrand can be maximized pointwise (by checking the first and second derivatives at the extreme point), $D$ should optimize its parameters to reach
$$D^*_G(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}$$
in order to maximize $V(D,G)$.
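
To make the extreme-point step explicit (a standard filling-in of the derivative calculation, not spelled out in the original post): for fixed $a = p_{data}(x) \ge 0$ and $b = p_g(x) \ge 0$, the integrand has the form $f(y) = a\log y + b\log(1-y)$, and

$$f'(y) = \frac{a}{y} - \frac{b}{1-y} = 0 \;\Rightarrow\; y^* = \frac{a}{a+b}, \qquad f''(y) = -\frac{a}{y^2} - \frac{b}{(1-y)^2} < 0,$$

so $y^*$ is indeed a maximum, which gives $D^*_G(x)$ above.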

Now we are going to prove that, once $D$ has achieved
$$D^*_G(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)},$$
the generator $G$ minimizes the value function if and only if $p_g(x) = p_{data}(x)$.

To prove this, we introduce the JS divergence, a variant of the KL divergence that solves its asymmetry problem. The definition of the JS divergence is:

$$D_{JS}(p\,\|\,q) = \frac{1}{2}D_{KL}\Big(p\,\Big\|\,\frac{p+q}{2}\Big) + \frac{1}{2}D_{KL}\Big(q\,\Big\|\,\frac{p+q}{2}\Big),$$

where
$$D_{KL}(p\,\|\,q) = \int_x p(x)\log\frac{p(x)}{q(x)}\,dx.$$
Here, we first calculate the JS divergence between $p_g(x)$ and $p_{data}(x)$.
$$\begin{aligned} D_{JS}(p_{data}\,\|\,p_g) &= \frac{1}{2}D_{KL}\Big(p_{data}\,\Big\|\,\frac{p_{data}+p_g}{2}\Big) + \frac{1}{2}D_{KL}\Big(p_g\,\Big\|\,\frac{p_{data}+p_g}{2}\Big) \\ &= \frac{1}{2}\Big(\log 2 + \int_x p_{data}(x)\log\frac{p_{data}(x)}{p_{data}(x)+p_g(x)}\,dx\Big) + \frac{1}{2}\Big(\log 2 + \int_x p_g(x)\log\frac{p_g(x)}{p_{data}(x)+p_g(x)}\,dx\Big) \\ &= \frac{1}{2}\big(\log 4 + V(G,D^*)\big) \end{aligned}$$
(This derivation is adapted from a formula found on the Internet, where our $p_{data}$ was written as $p_r$ and our $V(G,D)$ as $L(G,D)$.) It is striking to find that, when calculating the JS divergence between $p_g(x)$ and $p_{data}(x)$, the value function $V(G,D^*)$ appears!

Hence, $V(G,D^*) = 2D_{JS}(p_{data}\,\|\,p_g) - \log 4$.

Since the JS divergence is always non-negative, to minimize $V(G,D^*)$, $D_{JS}(p_{data}\,\|\,p_g)$ must be zero. At that point, $p_g(x) = p_{data}(x)$ and $V(G,D^*) = -\log 4$. Note that at that point $D^*(x) = \frac{1}{2}$, which shows that the discriminative model can no longer successfully distinguish the real images from the fake ones.
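
As a quick numerical sanity check (my own illustration, using discrete distributions in place of densities), we can verify the identity $V(G,D^*) = 2D_{JS}(p_{data}\,\|\,p_g) - \log 4$:

```python
import numpy as np

def kl(p, q):
    # Discrete KL divergence D_KL(p || q) = sum_i p_i * log(p_i / q_i).
    return np.sum(p * np.log(p / q))

def js(p, q):
    # JS divergence via its definition from the two KL terms.
    m = (p + q) / 2
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Two arbitrary discrete distributions standing in for p_data and p_g.
p_data = np.array([0.1, 0.4, 0.5])
p_g    = np.array([0.3, 0.3, 0.4])

# V(G, D*) with the optimal discriminator D* = p_data / (p_data + p_g).
d_star = p_data / (p_data + p_g)
V = np.sum(p_data * np.log(d_star) + p_g * np.log(1 - d_star))

print(np.isclose(V, 2 * js(p_data, p_g) - np.log(4)))  # True
print(np.isclose(js(p_data, p_data), 0.0))             # True: JS(p||p) = 0
```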

The Proof of the Second Conclusion

Here is the algorithm of GAN.

[Algorithm 1 from the original paper: in each training iteration, first update the discriminator by stochastic gradient ascent on minibatches for $k$ steps, then update the generator by stochastic gradient descent for one step.]
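
As a concrete illustration of this loop (a minimal sketch in TensorFlow, reusing the G, D, and loss functions sketched earlier; the optimizer settings and $k=1$ are my own illustrative choices, not prescribed by the paper):

```python
import tensorflow as tf

g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images, batch_size=64):
    # One iteration of the algorithm with k = 1 discriminator step.
    z = tf.random.normal([batch_size, NOISE_DIM])

    # Discriminator step: ascend log D(x) + log(1 - D(G(z)))
    # (implemented as descending the equivalent cross-entropy loss).
    with tf.GradientTape() as tape:
        fake_images = G(z, training=True)
        d_loss = discriminator_loss(D(real_images, training=True),
                                    D(fake_images, training=True))
    d_grads = tape.gradient(d_loss, D.trainable_variables)
    d_opt.apply_gradients(zip(d_grads, D.trainable_variables))

    # Generator step (non-saturating variant, as discussed above).
    z = tf.random.normal([batch_size, NOISE_DIM])
    with tf.GradientTape() as tape:
        g_loss = generator_loss_non_saturating(D(G(z, training=True), training=True))
    g_grads = tape.gradient(g_loss, G.trainable_variables)
    g_opt.apply_gradients(zip(g_grads, G.trainable_variables))
```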
If $G$ and $D$ have enough capacity, and at each step of the algorithm the discriminator is allowed to reach its optimum given $G$, and $p_g$ is updated so as to improve the criterion
$$\mathbb{E}_{x\sim p_{data}}[\log D^*_G(x)] + \mathbb{E}_{x\sim p_g}[\log(1-D^*_G(x))],$$

then $p_g(x)$ converges to $p_{data}(x)$.

Proof:
[Proof figure from the original paper: the argument views $V(G,D)$ as a convex function of $p_g$ and uses the fact that the subderivatives of a supremum of convex functions include the derivative at the point where the maximum is attained, so sufficiently small updates of $p_g$ converge to $p_{data}$.]
I do not understand this proof well; I would appreciate it if somebody could explain it to me.

DCGAN

DCGAN is an improved version of GAN that combines the principles of convolutional neural networks with GAN. DCGAN uses LeakyReLU as the activation function and applies transposed convolutions in the generator, replacing the fully connected networks with convolutional ones. The experiments in the DCGAN paper show that DCGAN has better training stability than the ordinary GAN.

The figure below shows the principle of DCGAN.
[Figure 2: The principle of DCGAN]
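
As an illustrative sketch (the layer sizes follow common DCGAN-style tutorials and are assumptions, not necessarily the original paper's configuration), a generator built from transposed convolutions and a discriminator using LeakyReLU might look like this:

```python
import tensorflow as tf
from tensorflow.keras import layers

# DCGAN-style generator: noise vector -> 28x28x1 image via transposed convolutions.
dcgan_generator = tf.keras.Sequential([
    layers.Dense(7 * 7 * 128, input_shape=(100,)),
    layers.Reshape((7, 7, 128)),
    layers.Conv2DTranspose(64, kernel_size=5, strides=2, padding="same"),
    layers.BatchNormalization(),
    layers.ReLU(),
    layers.Conv2DTranspose(1, kernel_size=5, strides=2, padding="same",
                           activation="tanh"),  # output pixels in (-1, 1)
])

# DCGAN-style discriminator: strided convolutions with LeakyReLU activations.
dcgan_discriminator = tf.keras.Sequential([
    layers.Conv2D(64, kernel_size=5, strides=2, padding="same",
                  input_shape=(28, 28, 1)),
    layers.LeakyReLU(0.2),
    layers.Conv2D(128, kernel_size=5, strides=2, padding="same"),
    layers.LeakyReLU(0.2),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])
```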

Finally, since DCGAN is easy to implement with TensorFlow, it is unnecessary to go into the details here. For more information about the implementation, you can visit https://tensorflow.google.cn/

However, DCGAN does not improve GAN at a fundamental level. The later WGAN solves many of GAN's problems, such as training instability, at their root!
