
GAN-MNIST-Pytorch

In this blog post we’ll implement a generative image model that converts random noise into images of digits! The full code is available here; just clone it to your machine and it’s ready to run. As a former Torch7 user, I attempt to reproduce the results from the Torch7 post.

For this, we employ a Generative Adversarial Network. A GAN consists of two components: a generator, which converts random noise into images, and a discriminator, which tries to distinguish between generated and real images. Here, ‘real’ means that the image came from our training set of images, in contrast to the generated fakes.

To train the model we let the discriminator and generator play a game against each other. We first show the discriminator a mixed batch of real images from our training set and fake images generated by the generator. We then simultaneously optimize the discriminator to answer NO to fake images and YES to real images, and optimize the generator to fool the discriminator into believing that the fake images were real. This corresponds to minimizing the classification error with respect to the discriminator and maximizing it with respect to the generator. With careful optimization both generator and discriminator will improve, and the generator will eventually start generating convincing images.
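The game described above can be sketched as a single training step (a minimal sketch, assuming `D`, `G`, their optimizers, a flattened batch `real_images`, and `latent_size` are defined; the binary-cross-entropy formulation used here is one standard choice):

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()

def train_step(D, G, d_optimizer, g_optimizer, real_images, latent_size):
    batch_size = real_images.size(0)
    real_labels = torch.ones(batch_size, 1)   # "YES" targets for real images
    fake_labels = torch.zeros(batch_size, 1)  # "NO" targets for fakes

    # --- Discriminator: answer YES to real images, NO to fakes ---
    d_loss_real = criterion(D(real_images), real_labels)
    z = torch.randn(batch_size, latent_size)
    fake_images = G(z)
    # detach() so the discriminator update does not backprop into G
    d_loss_fake = criterion(D(fake_images.detach()), fake_labels)
    d_loss = d_loss_real + d_loss_fake
    d_optimizer.zero_grad()
    d_loss.backward()
    d_optimizer.step()

    # --- Generator: fool the discriminator into answering YES ---
    g_loss = criterion(D(fake_images), real_labels)
    g_optimizer.zero_grad()
    g_loss.backward()
    g_optimizer.step()

    return d_loss.item(), g_loss.item()
```

Note that the generator is trained to maximize log D(G(z)) rather than minimize log(1 − D(G(z))); this non-saturating variant gives stronger gradients early in training.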

Implementing a GAN

We implement the generator and discriminator as multilayer perceptrons (MLPs) and train them with stochastic gradient descent.

The discriminator is an MLP built from consecutive blocks of a linear layer followed by a LeakyReLU activation.

import torch.nn as nn

# Example hyperparameters (the exact values are not specified in the post)
image_size = 784    # 28x28 flattened MNIST image
hidden_size = 256
latent_size = 64

D = nn.Sequential(
    nn.Linear(image_size, hidden_size),
    nn.LeakyReLU(0.2),
    nn.Linear(hidden_size, hidden_size),
    nn.LeakyReLU(0.2),
    nn.Linear(hidden_size, 1),
    nn.Sigmoid())

This is a pretty standard architecture. The 28x28 grayscale images of digits are converted into 784x1 vectors by stacking their columns. The discriminator takes a 784x1 vector as input and predicts YES or NO with a single sigmoid output.
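As a quick illustration of the flattening step (variable names here are illustrative, not from the original code):

```python
import torch

images = torch.rand(64, 1, 28, 28)       # a batch as it comes from the MNIST loader
flat = images.view(images.size(0), -1)   # flatten each image -> shape (64, 784)
```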

The generator is an MLP that maps a latent vector through repeated blocks of a linear layer and ReLU activation:

G = nn.Sequential(
    nn.Linear(latent_size, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, image_size),
    nn.Tanh())

To generate an image, we feed the generator noise drawn from N(0, 1). After successful training, the output should be meaningful images!

z = torch.randn(batch_size, latent_size).cuda()  # sample noise on the GPU
fake_images = G(z)  # G must also be on the GPU, e.g. G = G.cuda()
# (torch.autograd.Variable is no longer needed since PyTorch 0.4;
# plain tensors track gradients directly)

Generating digits

We train our GAN using images of digits from the MNIST dataset. After around 5 epochs you should start to see blurry digits, and after 80 epochs the results look pleasant.
(figure: generated samples after 5 epochs)

(figure: generated samples after 100 epochs)

Loss and Discriminator Accuracy

(figures: discriminator accuracy per epoch; discriminator and generator loss per epoch)

We also record the accuracy of the discriminator and the losses of the discriminator and the generator after each epoch. According to the original GAN paper, when the global minimum of the training criterion is achieved, the loss equals -log 4 and the discriminator accuracy is 0.5. Our model does not achieve this ideal result. Moreover, as we can see from the figures, our model starts to overfit at around 80 epochs. The model architecture and the training strategy should be improved in the future.
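One simple way to measure discriminator accuracy on a mixed batch (a sketch; the original post does not show its exact bookkeeping code) is to threshold the sigmoid output at 0.5:

```python
import torch

def discriminator_accuracy(D, real_images, fake_images):
    # Real images should score >= 0.5 (YES), fakes should score < 0.5 (NO).
    real_correct = (D(real_images) >= 0.5).float().sum()
    fake_correct = (D(fake_images) < 0.5).float().sum()
    total = real_images.size(0) + fake_images.size(0)
    return ((real_correct + fake_correct) / total).item()
```

At the theoretical optimum the discriminator cannot tell real from fake, so this value should hover around 0.5.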
