Introduction to GANs

1. Auto-Encoder

An auto-encoder is an unsupervised learning algorithm. It encodes a high-dimensional vector or matrix into a low-dimensional code and then decodes it, with the goal of making the output as close to the input as possible.
During training, the encoder and decoder cannot be trained separately; they must be trained jointly. Once training is done, the encoder and the decoder can each be taken out and used on their own.
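As a concrete illustration, here is a minimal fully-connected auto-encoder sketch in the same TensorFlow 1.x style as the code in section 5. The layer sizes (784 -> 64 -> 784) and the random stand-in data are illustrative assumptions, not from the original post.

import tensorflow as tf
import numpy as np

# Encoder: 784-d input -> 64-d code. Decoder: 64-d code -> 784-d reconstruction.
x = tf.placeholder(tf.float32, [None, 784])
w_enc = tf.get_variable('w_enc', [784, 64], initializer=tf.truncated_normal_initializer(stddev=0.02))
b_enc = tf.get_variable('b_enc', [64], initializer=tf.constant_initializer(0))
code = tf.nn.relu(tf.matmul(x, w_enc) + b_enc)          # the low-dimensional code

w_dec = tf.get_variable('w_dec', [64, 784], initializer=tf.truncated_normal_initializer(stddev=0.02))
b_dec = tf.get_variable('b_dec', [784], initializer=tf.constant_initializer(0))
recon = tf.sigmoid(tf.matmul(code, w_dec) + b_dec)      # reconstruction of the input

# Encoder and decoder are trained jointly on the reconstruction error.
loss = tf.reduce_mean(tf.square(recon - x))
train_op = tf.train.AdamOptimizer(0.001).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = np.random.rand(32, 784).astype(np.float32)  # stand-in for real data
    for _ in range(100):
        sess.run(train_op, {x: batch})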
The classical PCA pipeline encodes the input with a single linear transformation and decodes it with another one.
PCA therefore turns a high-dimensional vector into a low-dimensional code through a linear map, and there is effectively only one hidden layer. An auto-encoder, by contrast, allows multiple hidden layers, but the weights then need to be initialized with an RBM (Restricted Boltzmann Machine) to train to a good result.
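PCA's encode/decode step fits in a few lines of numpy; this sketch (toy data and 10 kept components, both arbitrary choices) shows the single linear "hidden layer" the text refers to.

import numpy as np

X = np.random.randn(1000, 100)           # 1000 samples, 100-d toy data
mean = X.mean(axis=0)
Xc = X - mean                            # PCA works on centered data

# Principal components = top right-singular vectors of the centered data.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
W = Vt[:10].T                            # keep the top 10 components: 100 x 10

code = Xc @ W                            # encode: linear projection to 10-d
X_hat = code @ W.T + mean                # decode: linear reconstruction

print(np.mean((X - X_hat) ** 2))         # reconstruction error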
Auto-encoders can be used for text retrieval, and also for image retrieval: compress each image down to, say, a 256-dimensional code and compare the similarity of these 256-d vectors. This works better than comparing the raw images directly.
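A sketch of what retrieval over such codes might look like; the code database, the query code, and the use of cosine similarity are all hypothetical choices for illustration.

import numpy as np

def cosine_similarity(query, db):
    # Cosine similarity between one query code and every database code.
    return (db @ query) / (np.linalg.norm(db, axis=1) * np.linalg.norm(query) + 1e-8)

db_codes = np.random.randn(10000, 256)   # stand-in: 10000 images already encoded
query_code = np.random.randn(256)        # stand-in: code of the query image

sims = cosine_similarity(query_code, db_codes)
top5 = np.argsort(-sims)[:5]             # indices of the 5 most similar images
print(top5)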

De-noising auto-encoder:
Noise is deliberately added to the input images, while the clean image remains the reconstruction target, so the learned representation is more robust.
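The only change relative to a plain auto-encoder is the corruption step at the input. A minimal sketch; the Gaussian noise model and the noise level are assumptions, since the text does not specify the kind of noise.

import numpy as np

def corrupt(batch, noise_std=0.3):
    # Add Gaussian noise to the inputs; the training target stays the clean batch.
    noisy = batch + noise_std * np.random.randn(*batch.shape)
    return np.clip(noisy, 0.0, 1.0)

batch = np.random.rand(32, 784)          # stand-in for a batch of images
noisy_batch = corrupt(batch)             # feed noisy_batch in, reconstruct batch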

Auto-encoders in CNNs:
The decoder performs unpooling and deconvolution operations, with the goal of making the reconstructed image as similar as possible to the input image. (A numpy sketch of both operations follows the list below.)
1. The unpooling operation
Method 1:
Remember the position of each maximum during pooling; during unpooling, put the value back at that position and fill everywhere else with 0.
Method 2:
Do not remember the positions; during unpooling, simply fill every position with the maximum value.
2. The deconvolution operation
Deconvolution is actually just a convolution: when going from a low-resolution map to a high-resolution one, first pad the extra positions with 0, then convolve as usual.
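Here is the promised numpy sketch of both operations on a single-channel feature map. The shapes and the averaging filter are illustrative; in TensorFlow one would typically use tf.nn.conv2d_transpose rather than this hand-rolled loop.

import numpy as np

x = np.random.rand(4, 4)                     # toy feature map

# --- 2x2 max pooling that remembers the argmax positions ---
pooled = np.zeros((2, 2))
mask = np.zeros_like(x)                      # 1 where each max was taken
for i in range(2):
    for j in range(2):
        patch = x[2*i:2*i+2, 2*j:2*j+2]
        r, c = np.unravel_index(patch.argmax(), (2, 2))
        pooled[i, j] = patch[r, c]
        mask[2*i + r, 2*j + c] = 1

# --- Unpooling, method 1: put each max back in place, zeros elsewhere ---
unpooled = mask * np.kron(pooled, np.ones((2, 2)))

# --- "Deconvolution": insert zeros between values, then convolve normally ---
upsampled = np.zeros((4, 4))
upsampled[::2, ::2] = pooled                 # zero-insertion upsampling
kernel = np.ones((3, 3)) / 9.0               # an ordinary 3x3 filter
padded = np.pad(upsampled, 1)
out = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        out[i, j] = np.sum(padded[i:i+3, j:j+3] * kernel)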

2. VAE (Variational Auto-Encoder)

A plain auto-encoder cannot be used directly for generation; that requires a VAE.
Besides learning the means $m_1, m_2, m_3, \ldots, m_n$, the network also learns $\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n$, while $e_1, e_2, e_3, \ldots, e_n$ are sampled from a normal distribution; the code fed to the decoder is then $c_i = e^{\sigma_i} e_i + m_i$.
Left unconstrained, however, training would simply drive the noise scale $e^{\sigma_i}$ toward 0 so that the sampling noise disappears. Therefore another penalty term is added:

$$l = \sum_{i=1}^{n}\left(e^{\sigma_i} - (1+\sigma_i) + m_i^2\right)$$

We want $l$ to be as small as possible; the penalty is minimized at $\sigma_i = 0$, so each $\sigma_i$ is pushed toward 0.
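A numpy sketch of the sampling step and the penalty term $l$ above (dimension $n = 3$ and the particular values of $m$ and $\sigma$ are arbitrary):

import numpy as np

m = np.array([0.5, -0.2, 0.1])        # learned means m_i
sigma = np.array([0.3, -0.1, 0.0])    # learned log noise scales sigma_i
e = np.random.randn(3)                # e_i sampled from a standard normal

c = np.exp(sigma) * e + m             # the code fed to the decoder

# The penalty from the text: each summand is minimized at sigma_i = 0,
# since d/dsigma (exp(sigma) - (1 + sigma)) = exp(sigma) - 1 = 0 there.
l = np.sum(np.exp(sigma) - (1 + sigma) + m ** 2)
print(c, l)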

3. The Basic Structure of a GAN

Since a VAE never directly checks how realistic its generated samples are, the GAN was introduced; it can be seen as an auto-encoder-like structure that keeps evolving through competition. The training process is:
1. First take the decoder of a VAE and use it to produce a batch of fake data, labeled 0; then take some real data, labeled 1, and train a Discriminator on both.
2. Train the next-generation Generator.
At this point the Discriminator can tell fake data from real data, so it outputs a low score for fake data. Now fix the Discriminator's parameters and keep adjusting only the Generator's parameters, until the images the Generator produces can no longer be judged fake by the Discriminator.
3. Connect the Generator and the Discriminator into a single network:

Generator + Discriminator => GAN

4. Note that in each round only one of the Generator and the Discriminator is trained while the other's parameters stay fixed: when training the Generator, the Discriminator's parameters must not move, and when training the Discriminator, the Generator's parameters must not move. A condensed sketch of this alternation follows.
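In TensorFlow 1.x this alternation is usually implemented by giving each optimizer only its own network's variables via var_list, exactly as in the full example in section 5. A self-contained toy sketch on 1-D data (the tiny dense networks are placeholders, not the real architecture):

import tensorflow as tf

# Toy 1-D GAN: the generator maps noise to scalars, the discriminator scores scalars.
z = tf.random_normal([32, 1])
with tf.variable_scope('gen'):
    fake = tf.layers.dense(z, 1, name='g_out')
real = tf.placeholder(tf.float32, [32, 1])

def disc(h, reuse=False):
    with tf.variable_scope('disc', reuse=reuse):
        return tf.layers.dense(h, 1, name='d_out')

d_real, d_fake = disc(real), disc(fake, reuse=True)
d_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=d_real, labels=tf.ones_like(d_real))) + \
         tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=d_fake, labels=tf.zeros_like(d_fake)))
g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=d_fake, labels=tf.ones_like(d_fake)))

# The crucial part: each optimizer only ever updates its own variables,
# so the other network is automatically "frozen" during that step.
d_vars = [v for v in tf.trainable_variables() if v.name.startswith('disc')]
g_vars = [v for v in tf.trainable_variables() if v.name.startswith('gen')]
d_step = tf.train.AdamOptimizer(1e-4).minimize(d_loss, var_list=d_vars)
g_step = tf.train.AdamOptimizer(1e-4).minimize(g_loss, var_list=g_vars)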

4. The Theory Behind GANs

First pick a batch of samples from the existing data, forming $P_{data}(x)$, and train a model $P_G(x;\theta)$ to generate data (for example, a Gaussian mixture model). We want the generated distribution $P_G(x;\theta)$ to be as close as possible to the real one, $P_{data}(x)$, i.e. we maximize the likelihood:

$$L = \prod_{i=1}^{m} P_G(x^i;\theta)$$

So the parameters we need are:

$$\theta^* = \arg\max_{\theta} \prod_{i=1}^{m} P_G(x^i;\theta)$$

Taking the logarithm, we get:

$$\begin{aligned}
\theta^* &= \arg\max_{\theta} \sum_{i=1}^{m} \ln P_G(x^i;\theta), \qquad \{x^1, x^2, \ldots, x^m\} \sim P_{data}(x) \\
&\approx \arg\max_{\theta} E_{x\sim P_{data}}\left[\ln P_G(x;\theta)\right] \\
&= \arg\max_{\theta} \left[\int_x P_{data}(x)\ln P_G(x;\theta)\,dx - \int_x P_{data}(x)\ln P_{data}(x)\,dx\right] \\
&= \arg\min_{\theta} KL\left(P_{data}(x)\,\|\,P_G(x;\theta)\right)
\end{aligned}$$

(The second integral does not depend on $\theta$, so subtracting it does not change the maximizer; it just completes the objective into a KL divergence.) So in the end we need the KL divergence between $P_{data}(x)$ and $P_G(x;\theta)$ to be as small as possible, i.e. the two distributions should be as close as possible.
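A quick numeric illustration of KL divergence between discrete distributions (the values are arbitrary): it is 0 exactly when the two distributions coincide and grows as they move apart.

import numpy as np

def kl(p, q):
    # KL(P || Q) = sum_x P(x) * ln(P(x) / Q(x))
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum(p * np.log(p / q))

p_data = [0.5, 0.3, 0.2]
print(kl(p_data, p_data))            # 0.0: identical distributions
print(kl(p_data, [0.4, 0.4, 0.2]))   # small positive value
print(kl(p_data, [0.1, 0.1, 0.8]))   # larger: the distributions are far apart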
In a GAN, $P_G(x;\theta)$ is a neural network and $\theta$ denotes its parameters.
We can pick a distribution for $z$ and let a neural network turn it into data, again making $P_G(x;\theta)$ as close to $P_{data}(x)$ as possible. Formally:

$$P_G(x) = \int_z P_{prior}(z)\, I[G(z)=x]\, dz$$

This enumerates every possible $z$ and integrates over it, where $P_{prior}(z)$ is the probability of the point $z$ and $I[G(z)=x]$ is an indicator that equals 1 when $G(z) = x$ and 0 otherwise.
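For a toy 1-D generator, the distribution $P_G(x)$ that this integral defines can be approximated by sampling $z$ from the prior, pushing the samples through $G$, and histogramming the outputs. The toy map $G$ below is a made-up illustration.

import numpy as np

def G(z):
    # Hypothetical toy generator: a fixed nonlinear map from z to x.
    return 0.5 * z + np.sin(z)

z = np.random.randn(100000)          # z sampled from the prior P_prior
x = G(z)                             # the generator's outputs

# The normalized histogram approximates the induced density P_G(x).
hist, edges = np.histogram(x, bins=50, density=True)
print(hist[:5])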
But $\int_z P_{prior}(z)\, I[G(z)=x]\, dz$ is intractable to compute, so a GAN uses a Discriminator to measure the divergence instead. The Discriminator yields a value $V(G,D)$ that reflects the difference between the generated data and the real data, and the $G$ we are actually after is:

$$G^* = \arg\min_G \max_D V(G,D)$$

where

$$V(G,D) = E_{x\sim P_{data}}\left[\ln D(x)\right] + E_{x\sim P_G}\left[\ln(1-D(x))\right] = \int_x \left[P_{data}(x)\ln D(x) + P_G(x)\ln(1-D(x))\right] dx$$

Maximizing the integrand pointwise (for constants $a, b > 0$, the function $a\ln D + b\ln(1-D)$ peaks at $D = a/(a+b)$) gives:

$$D^* = \frac{P_{data}(x)}{P_{data}(x)+P_G(x)}$$

Finally, substituting $D^*$ back in for a given $G$:

$$\begin{aligned}
\max_D V(G,D) &= -2\ln 2 + \int_x P_{data}(x)\ln\frac{P_{data}(x)}{\left(P_{data}(x)+P_G(x)\right)/2}\,dx + \int_x P_G(x)\ln\frac{P_G(x)}{\left(P_{data}(x)+P_G(x)\right)/2}\,dx \\
&= -2\ln 2 + KL\left(P_{data}(x)\,\middle\|\,\frac{P_{data}(x)+P_G(x)}{2}\right) + KL\left(P_G(x)\,\middle\|\,\frac{P_{data}(x)+P_G(x)}{2}\right) \\
&= -2\ln 2 + 2\,JSD\left(P_{data}(x)\,\|\,P_G(x)\right)
\end{aligned}$$
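This identity is easy to check numerically for discrete distributions; the two distributions below are arbitrary examples.

import numpy as np

def kl(p, q):
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum(p * np.log(p / q))

p_data = np.array([0.5, 0.3, 0.2])
p_g = np.array([0.2, 0.3, 0.5])
m = (p_data + p_g) / 2

jsd = 0.5 * kl(p_data, m) + 0.5 * kl(p_g, m)
max_v = -2 * np.log(2) + kl(p_data, m) + kl(p_g, m)
print(max_v, -2 * np.log(2) + 2 * jsd)   # the two values agree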

Since the JS divergence is non-negative and reaches 0 only when the two distributions coincide, the optimal $P_G(x)$ is exactly $P_{data}(x)$. We therefore need to solve

$$G^* = \arg\min_G \max_D V(G,D)$$

Define

$$L(G) = \max_D V(G,D)$$

which is minimized by gradient descent on the generator's parameters:

$$\theta_G \leftarrow \theta_G - \eta\,\frac{\partial L(G)}{\partial \theta_G}$$

This yields the desired $G$.
The overall algorithm is:
1. Sample some data $\{x^1, x^2, \ldots, x^m\}$ from the real data distribution $P_{data}(x)$.
2. Sample $\{z^1, z^2, \ldots, z^m\}$ from the prior distribution $P_{prior}(z)$.
3. Obtain generated data $\{\tilde{x}^1, \tilde{x}^2, \ldots, \tilde{x}^m\}$ from $\{z^1, z^2, \ldots, z^m\}$, where $\tilde{x}^i = G(z^i)$.
4. Update the discriminator's parameters $\theta_d$ by gradient ascent so that the following $\tilde{V}$ is maximized:

$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m}\ln D(x^i) + \frac{1}{m}\sum_{i=1}^{m}\ln\left(1-D(\tilde{x}^i)\right), \qquad \theta_d \leftarrow \theta_d + \eta\,\nabla \tilde{V}(\theta_d)$$
5. Repeat step 2 to sample fresh $z$, then update the generator's parameters $\theta_g$ by gradient descent so that the following $\tilde{V}$ is minimized:

$$\tilde{V} = \frac{1}{m}\sum_{i=1}^{m}\ln\left(1-D(\tilde{x}^i)\right), \qquad \theta_g \leftarrow \theta_g - \eta\,\nabla \tilde{V}(\theta_g)$$

6. Repeat steps 4 and 5.
Note
In the formulas above, $D(x^i)$ is the probability that a real image is classified as real (label 1), and $D(\tilde{x}^i)$ is the probability that a generated image is classified as real. The Discriminator therefore tries to push $D(x^i)$ toward 1 and $D(\tilde{x}^i)$ toward 0, while the Generator tries to push $D(\tilde{x}^i)$ toward 1. A small numeric sketch of the batch estimates $\tilde{V}$ follows.
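Here the discriminator outputs are made-up numbers purely to show how the two batch objectives from steps 4 and 5 are computed.

import numpy as np

D_real = np.array([0.9, 0.8, 0.95])   # D(x^i): the Discriminator pushes these toward 1
D_fake = np.array([0.1, 0.3, 0.2])    # D(x~^i): D pushes toward 0, G pushes toward 1

# Step 4: the discriminator ascends this objective.
v_d = np.mean(np.log(D_real)) + np.mean(np.log(1 - D_fake))

# Step 5: the generator descends this objective.
v_g = np.mean(np.log(1 - D_fake))
print(v_d, v_g)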

Reference: Ian Goodfellow's GAN tutorial

5. A Simple GAN Implementation in TensorFlow

import tensorflow as tf #machine learning
import numpy as np #matrix math
import datetime #logging the time for model checkpoints and training
import matplotlib.pyplot as plt #visualize results
%matplotlib inline

#Step 1 - Collect dataset
#MNIST - handwritten character digits ~50K training and validation images + labels, 10K testing
from tensorflow.examples.tutorials.mnist import input_data
#will ensure that the correct data has been downloaded to your 
#local training folder and then unpack that data to return a dictionary of DataSet instances.
mnist = input_data.read_data_sets("MNIST_data/")

def discriminator(x_image, reuse=False):
    if (reuse):
        tf.get_variable_scope().reuse_variables()

    # First convolutional and pool layers
    # These search for 32 different 5 x 5 pixel features
    #We’ll start off by passing the image through a convolutional layer. 
    #First, we create our weight and bias variables through tf.get_variable. 
    #Our first weight matrix (or filter) will be of size 5x5 and will have a output depth of 32. 
    #It will be randomly initialized from a normal distribution.
    d_w1 = tf.get_variable('d_w1', [5, 5, 1, 32], initializer=tf.truncated_normal_initializer(stddev=0.02))
    #tf.constant_init generates tensors with constant values.
    d_b1 = tf.get_variable('d_b1', [32], initializer=tf.constant_initializer(0))
    #tf.nn.conv2d() is the Tensorflow’s function for a common convolution.
    #It takes in 4 arguments. The first is the input volume (our 28 x 28 x 1 image in this case). 
    #The next argument is the filter/weight matrix. Finally, you can also change the stride and 
    #padding of the convolution. Those two values affect the dimensions of the output volume.
    #"SAME" tries to pad evenly left and right, but if the amount of columns to be added is odd, 
    #it will add the extra column to the right,
    #strides = [batch, height, width, channels]
    d1 = tf.nn.conv2d(input=x_image, filter=d_w1, strides=[1, 1, 1, 1], padding='SAME')
    #add the bias
    d1 = d1 + d_b1
    #squash with nonlinearity (ReLU)
    d1 = tf.nn.relu(d1)
    ##An average pooling layer performs down-sampling by dividing the input into 
    #rectangular pooling regions and computing the average of each region. 
    #It returns the averages for the pooling regions.
    d1 = tf.nn.avg_pool(d1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    #As with any convolutional neural network, this module is repeated, 
    # Second convolutional and pool layers
    # These search for 64 different 5 x 5 pixel features
    d_w2 = tf.get_variable('d_w2', [5, 5, 32, 64], initializer=tf.truncated_normal_initializer(stddev=0.02))
    d_b2 = tf.get_variable('d_b2', [64], initializer=tf.constant_initializer(0))
    d2 = tf.nn.conv2d(input=d1, filter=d_w2, strides=[1, 1, 1, 1], padding='SAME')
    d2 = d2 + d_b2
    d2 = tf.nn.relu(d2)
    d2 = tf.nn.avg_pool(d2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

     #and then followed by a series of fully connected layers. 
    # First fully connected layer
    d_w3 = tf.get_variable('d_w3', [7 * 7 * 64, 1024], initializer=tf.truncated_normal_initializer(stddev=0.02))
    d_b3 = tf.get_variable('d_b3', [1024], initializer=tf.constant_initializer(0))
    d3 = tf.reshape(d2, [-1, 7 * 7 * 64])
    d3 = tf.matmul(d3, d_w3)
    d3 = d3 + d_b3
    d3 = tf.nn.relu(d3)

    #The last fully-connected layer holds the output, such as the class scores.
    # Second fully connected layer
    d_w4 = tf.get_variable('d_w4', [1024, 1], initializer=tf.truncated_normal_initializer(stddev=0.02))
    d_b4 = tf.get_variable('d_b4', [1], initializer=tf.constant_initializer(0))

    #At the end of the network, we do a final matrix multiply and 
    #return the activation value. 
    #For those of you comfortable with CNNs, this is just a simple binary classifier. Nothing fancy.
    # Final layer
    d4 = tf.matmul(d3, d_w4) + d_b4
    # d4 dimensions: batch_size x 1

    return d4

#You can think of the generator as being a kind of reverse ConvNet. With CNNs, the goal is to 
#transform a 2 or 3 dimensional matrix of pixel values into a single probability. A generator, 
#however, seeks to take a d-dimensional noise vector and upsample it to become a 28 x 28 image. 
#ReLUs are then used to stabilize the outputs of each layer.
#example of CNN blocks http://cs231n.github.io/convolutional-networks/#fc

#It takes random inputs and eventually maps them to a 28 x 28 image to match the MNIST data shape.
#We begin by generating a dense 56 x 56 set of values, and then run them through a handful of filters of
#varying sizes and numbers of channels.
#The weight matrices get progressively smaller.

def generator(batch_size, z_dim):
    z = tf.truncated_normal([batch_size, z_dim], mean=0, stddev=1, name='z')
    #first deconv block
    g_w1 = tf.get_variable('g_w1', [z_dim, 3136], dtype=tf.float32, initializer=tf.truncated_normal_initializer(stddev=0.02))
    g_b1 = tf.get_variable('g_b1', [3136], initializer=tf.truncated_normal_initializer(stddev=0.02))
    g1 = tf.matmul(z, g_w1) + g_b1
    g1 = tf.reshape(g1, [-1, 56, 56, 1])
    g1 = tf.contrib.layers.batch_norm(g1, epsilon=1e-5, scope='bn1')
    g1 = tf.nn.relu(g1)

    # Generate 50 features
    # z_dim//2: integer division keeps the shape an int under Python 3
    g_w2 = tf.get_variable('g_w2', [3, 3, 1, z_dim//2], dtype=tf.float32, initializer=tf.truncated_normal_initializer(stddev=0.02))
    g_b2 = tf.get_variable('g_b2', [z_dim//2], initializer=tf.truncated_normal_initializer(stddev=0.02))
    g2 = tf.nn.conv2d(g1, g_w2, strides=[1, 2, 2, 1], padding='SAME')
    g2 = g2 + g_b2
    g2 = tf.contrib.layers.batch_norm(g2, epsilon=1e-5, scope='bn2')
    g2 = tf.nn.relu(g2)
    g2 = tf.image.resize_images(g2, [56, 56])

    # Generate 25 features
    g_w3 = tf.get_variable('g_w3', [3, 3, z_dim//2, z_dim//4], dtype=tf.float32, initializer=tf.truncated_normal_initializer(stddev=0.02))
    g_b3 = tf.get_variable('g_b3', [z_dim//4], initializer=tf.truncated_normal_initializer(stddev=0.02))
    g3 = tf.nn.conv2d(g2, g_w3, strides=[1, 2, 2, 1], padding='SAME')
    g3 = g3 + g_b3
    g3 = tf.contrib.layers.batch_norm(g3, epsilon=1e-5, scope='bn3')
    g3 = tf.nn.relu(g3)
    g3 = tf.image.resize_images(g3, [56, 56])

    # Final convolution with one output channel
    g_w4 = tf.get_variable('g_w4', [1, 1, z_dim//4, 1], dtype=tf.float32, initializer=tf.truncated_normal_initializer(stddev=0.02))
    g_b4 = tf.get_variable('g_b4', [1], initializer=tf.truncated_normal_initializer(stddev=0.02))
    g4 = tf.nn.conv2d(g3, g_w4, strides=[1, 2, 2, 1], padding='SAME')
    g4 = g4 + g_b4
    g4 = tf.sigmoid(g4)

    # No batch normalization at the final layer, but we do add
    # a sigmoid activator to make the generated images crisper.
    # Dimensions of g4: batch_size x 28 x 28 x 1

    return g4

sess = tf.Session()

batch_size = 50
z_dimensions = 100

x_placeholder = tf.placeholder("float", shape = [None,28,28,1], name='x_placeholder')
# x_placeholder is for feeding input images to the discriminator

#One of the trickiest parts about understanding GANs is that the loss function is a little bit more complex than that
#of a traditional CNN classifier (for those, a simple MSE or Hinge Loss would do the trick).
#If you think back to the introduction, a GAN can be thought of as a zero sum minimax game. 
#The generator is constantly improving to produce more and more realistic images, while the discriminator is 
#trying to get better and better at distinguishing between real and generated images.
#This means that we need to formulate loss functions that affect both networks. 
#Let’s take a look at the inputs and outputs of our networks.

Gz = generator(batch_size, z_dimensions)
# Gz holds the generated images
#g(z)

Dx = discriminator(x_placeholder)
# Dx hold the discriminator's prediction probabilities
# for real MNIST images
#d(x)

Dg = discriminator(Gz, reuse=True)
# Dg holds discriminator prediction probabilities for generated images
#d(g(z))



#So, let’s first think about what we want out of our networks. We want the generator network to create 
#images that will fool the discriminator. The generator wants the discriminator to output a 1 (positive example).
#Therefore, we want to compute the loss between the Dg and label of 1. This can be done through 
#the tf.nn.sigmoid_cross_entropy_with_logits function. This means that the cross entropy loss will 
#be taken between the two arguments. The "with_logits" component means that the function will operate 
#on unscaled values. Basically, this means that instead of using a softmax function to squish the output
#activations to probability values from 0 to 1, we simply return the unscaled value of the matrix multiplication.
#Take a look at the last line of our discriminator. There's no softmax or sigmoid layer at the end.
#The reduce mean function just takes the mean value of all of the components in the matrix returned 
#by the cross entropy function. This is just a way of reducing the loss to a single scalar value, 
#instead of a vector or matrix.
#https://datascience.stackexchange.com/questions/9302/the-cross-entropy-error-function-in-neural-networks

g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=Dg, labels=tf.ones_like(Dg)))


#Now, let’s think about the discriminator’s point of view. Its goal is to just get the correct labels 
#(output 1 for each MNIST digit and 0 for the generated ones). We’d like to compute the loss between Dx 
#and the correct label of 1 as well as the loss between Dg and the correct label of 0.
#The real labels are set to 0.9 rather than 1.0 below (one-sided label smoothing), which tends to stabilize training.
d_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=Dx, labels=tf.fill([batch_size, 1], 0.9)))
d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=Dg, labels=tf.zeros_like(Dg)))
d_loss = d_loss_real + d_loss_fake

tvars = tf.trainable_variables()

d_vars = [var for var in tvars if 'd_' in var.name]
g_vars = [var for var in tvars if 'g_' in var.name]

# Train the discriminator
# (learning rate changed from 0.001 in the GitHub version)
with tf.variable_scope(tf.get_variable_scope(), reuse=False) as scope:
    #Next, we specify our two optimizers. In today’s era of deep learning, Adam seems to be the
    #best SGD optimizer as it utilizes adaptive learning rates and momentum. 
    #We call Adam's minimize function and also specify the variables that we want it to update.
    d_trainer_fake = tf.train.AdamOptimizer(0.0001).minimize(d_loss_fake, var_list=d_vars)
    d_trainer_real = tf.train.AdamOptimizer(0.0001).minimize(d_loss_real, var_list=d_vars)

    # Train the generator
    # Decreasing from 0.004 in GitHub version
    g_trainer = tf.train.AdamOptimizer(0.0001).minimize(g_loss, var_list=g_vars)

#Outputs a Summary protocol buffer containing a single scalar value.
tf.summary.scalar('Generator_loss', g_loss)
tf.summary.scalar('Discriminator_loss_real', d_loss_real)
tf.summary.scalar('Discriminator_loss_fake', d_loss_fake)

d_real_count_ph = tf.placeholder(tf.float32)
d_fake_count_ph = tf.placeholder(tf.float32)
g_count_ph = tf.placeholder(tf.float32)

tf.summary.scalar('d_real_count', d_real_count_ph)
tf.summary.scalar('d_fake_count', d_fake_count_ph)
tf.summary.scalar('g_count', g_count_ph)

# Sanity check to see how the discriminator evaluates
# generated and real MNIST images.
# Reuse the existing variables: without reuse=True these calls would
# try to re-create variables that already exist and raise an error.
with tf.variable_scope(tf.get_variable_scope(), reuse=True):
    d_on_generated = tf.reduce_mean(discriminator(generator(batch_size, z_dimensions)))
    d_on_real = tf.reduce_mean(discriminator(x_placeholder))

tf.summary.scalar('d_on_generated_eval', d_on_generated)
tf.summary.scalar('d_on_real_eval', d_on_real)

with tf.variable_scope(tf.get_variable_scope(), reuse=True):
    images_for_tensorboard = generator(batch_size, z_dimensions)
tf.summary.image('Generated_images', images_for_tensorboard, 10)
merged = tf.summary.merge_all()
logdir = "tensorboard/gan/"
writer = tf.summary.FileWriter(logdir, sess.graph)
print(logdir)

saver = tf.train.Saver()

sess.run(tf.global_variables_initializer())

#During every iteration, there will be two updates being made, one to the discriminator and one to the generator. 
#For the generator update, we’ll feed in a random z vector to the generator and pass that output to the discriminator
#to obtain a probability score (this is the Dg variable we specified earlier).
#As we remember from our loss function, the cross entropy loss gets minimized, 
#and only the generator’s weights and biases get updated.
#We'll do the same for the discriminator update. We’ll be taking a batch of images 
#from the mnist variable we created way at the beginning of our program.
#These will serve as the positive examples, while the images in the previous section are the negative ones.

gLoss = 0
dLossFake, dLossReal = 1, 1
d_real_count, d_fake_count, g_count = 0, 0, 0

# Build the sampling ops once, outside the loop, with variable reuse.
# Calling generator()/discriminator() inside the loop would try to
# re-create existing variables and keep growing the graph.
with tf.variable_scope(tf.get_variable_scope(), reuse=True):
    sample_images = generator(3, z_dimensions)
    sample_scores = discriminator(x_placeholder)
for i in range(50000):
    real_image_batch = mnist.train.next_batch(batch_size)[0].reshape([batch_size, 28, 28, 1])
    if dLossFake > 0.6:
        # Train discriminator on generated images
        _, dLossReal, dLossFake, gLoss = sess.run([d_trainer_fake, d_loss_real, d_loss_fake, g_loss],
                                                    {x_placeholder: real_image_batch})
        d_fake_count += 1

    if gLoss > 0.5:
        # Train the generator
        _, dLossReal, dLossFake, gLoss = sess.run([g_trainer, d_loss_real, d_loss_fake, g_loss],
                                                    {x_placeholder: real_image_batch})
        g_count += 1

    if dLossReal > 0.45:
        # If the discriminator classifies real images as fake,
        # train discriminator on real values
        _, dLossReal, dLossFake, gLoss = sess.run([d_trainer_real, d_loss_real, d_loss_fake, g_loss],
                                                    {x_placeholder: real_image_batch})
        d_real_count += 1

    if i % 10 == 0:
        real_image_batch = mnist.validation.next_batch(batch_size)[0].reshape([batch_size, 28, 28, 1])
        summary = sess.run(merged, {x_placeholder: real_image_batch, d_real_count_ph: d_real_count,
                                    d_fake_count_ph: d_fake_count, g_count_ph: g_count})
        writer.add_summary(summary, i)
        d_real_count, d_fake_count, g_count = 0, 0, 0

    if i % 1000 == 0:
        # Periodically display a sample image in the notebook
        # (These are also being sent to TensorBoard every 10 iterations)
        images = sess.run(sample_images)
        d_result = sess.run(sample_scores, {x_placeholder: images})
        print("TRAINING STEP", i, "AT", datetime.datetime.now())
        for j in range(3):
            print("Discriminator classification", d_result[j])
            im = images[j, :, :, 0]
            plt.imshow(im.reshape([28, 28]), cmap='Greys')
            plt.show()

    if i % 5000 == 0:
        save_path = saver.save(sess, "models/pretrained_gan.ckpt", global_step=i)
        print("saved to %s" % save_path)

# Build the evaluation ops with variable reuse, as above.
with tf.variable_scope(tf.get_variable_scope(), reuse=True):
    test_image_op = generator(10, z_dimensions)
    test_score_op = discriminator(x_placeholder)

test_images = sess.run(test_image_op)
test_eval = sess.run(test_score_op, {x_placeholder: test_images})

real_images = mnist.validation.next_batch(10)[0].reshape([10, 28, 28, 1])
real_eval = sess.run(test_score_op, {x_placeholder: real_images})

# Show discriminator's probabilities for the generated images,
# and display the images
for i in range(10):
    print(test_eval[i])
    plt.imshow(test_images[i, :, :, 0], cmap='Greys')
    plt.show()

# Now do the same for real MNIST images
for i in range(10):
    print(real_eval[i])
    plt.imshow(real_images[i, :, :, 0], cmap='Greys')
    plt.show()