GAN: A First Try

麥、

Article link: https://blog.csdn.net/qq_40549994/article/details/100120424

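Paper (the first one): the original GAN paper

Code collection: GitHub

Running the GAN code

First, run the plain GAN part of the GitHub code linked at the top:
GAN
After copying the code to the server: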

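Problem 1: torchvision was not installed
In the end I installed it with pip, using the install selector on the PyTorch website with CUDA = 9.0.
PyTorch website
Problem 2: segmentation fault (core dumped)

I never actually solved this one: a segmentation fault can have too many possible causes, which makes the real one hard to track down. The workaround was to stop using mai_env, the virtual environment I had created, and run in the main environment instead.
The versions I ended up with were torch 1.2 and torchvision 0.4.0.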
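
When two environments behave differently, a first step is often to compare which package versions each one actually has. The snippet below is a minimal sketch of such a check using only the standard library (Python 3.8+); `installed_version` and `report` are hypothetical helper names for this illustration, not part of the GAN code.

```python
from importlib import metadata
from typing import Dict, Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("torch", "torchvision")) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version (None if absent)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

Running it once inside the virtual environment and once in the main environment shows whether the two actually resolve to the same torch and torchvision builds.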

It runs successfully!

The next job is to read the code: gan

A whole pile of imports

import argparse
import os
import numpy as np
import math

import torchvision.transforms as transforms
from torchvision.utils import save_image

from torch.utils.data import DataLoader
from torchvision import datasets
from torch.autograd import Variable

import torch.nn as nn
import torch.nn.functional as F
import torch

Create an images folder in the same directory as gan.py to store the output

os.makedirs("images", exist_ok=True)

Define the many adjustable global variables (command-line options)

parser = argparse.ArgumentParser()
parser.add_argument("--n_epochs", type=int, default=200, help="number of epochs of training")
parser.add_argument("--batch_size", type=int, default=64, help="size of the batches")
parser.add_argument("--lr", type=float, default=0.0002, help="adam: learning rate")
parser.add_argument("--b1", type=float, default=0.5, help="adam: decay of first order momentum of gradient")
parser.add_argument("--b2", type=float, default=0.999, help="adam: decay of second order momentum of gradient")
parser.add_argument("--n_cpu", type=int, default=8, help="number of cpu threads to use during batch generation")
parser.add_argument("--latent_dim", type=int, default=100, help="dimensionality of the latent space")
parser.add_argument("--img_size", type=int, default=28, help="size of each image dimension")
parser.add_argument("--channels", type=int, default=1, help="number of image channels")
parser.add_argument("--sample_interval", type=int, default=400, help="interval between image samples")
parser.add_argument("--gpu_device", choices=["1", "2", "3"], default="3", help="gpu device number")
opt = parser.parse_args()
print(opt)

Define the shape of the input image; it starts out as (1, 28, 28), i.e. a single-channel 28×28 image

img_shape = (opt.channels, opt.img_size, opt.img_size)

Together, these two lines select which GPU card to use (I added them myself; the original program does not have them)

parser.add_argument("--gpu_device", choices=["1", "2", "3"], default="3", help="gpu device number")
os.environ["CUDA_VISIBLE_DEVICES"] = opt.gpu_device
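
The order of these two lines matters: the option has to be registered before parse_args() runs, and CUDA_VISIBLE_DEVICES should be set before the first CUDA call in the process. The sketch below shows that ordering with the standard library only; `select_gpu` is a hypothetical helper introduced for this illustration and does not exist in gan.py.

```python
import argparse
import os


def select_gpu(argv=None) -> str:
    """Parse --gpu_device and expose only that card to CUDA.

    Hypothetical helper for this sketch; gan.py does this inline.
    """
    parser = argparse.ArgumentParser()
    # The option must be registered before the arguments are parsed.
    parser.add_argument("--gpu_device", choices=["1", "2", "3"], default="3",
                        help="gpu device number")
    # parse_known_args ignores the other gan.py options in this small sketch.
    opt, _unknown = parser.parse_known_args(argv)
    # Set the variable before any CUDA call so the process only sees this card.
    os.environ["CUDA_VISIBLE_DEVICES"] = opt.gpu_device
    return opt.gpu_device


if __name__ == "__main__":
    print("visible GPU:", select_gpu())
```

Once the variable is set, the selected card is the only one the process can see, so inside the program it is addressed as device 0.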

The condition for using CUDA

cuda = torch.cuda.is_available()

Next comes the main event: the definition of the Generator network

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()

        def block(in_feat, out_feat, normalize=True):
            layers = [nn.Linear(in_feat, out_feat)]
            if normalize:
                layers.append(nn.BatchNorm1d(out_feat, 0.8))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.model = nn.Sequential(
            *block(opt.latent_dim, 128, normalize=False),
            *block(128, 256),
            *block(256, 512),
            *block(512, 1024),
            nn.Linear(1024, int(np.prod(img_shape))),
            nn.Tanh()
        )

    def forward(self, z):
        img = self.model(z)
        img = img.view(img.size(0), *img_shape)
        return img

As beginners, let's take it one step at a time and analyze block first

def block(in_feat, out_feat, normalize=True):
    layers = [nn.Linear(in_feat, out_feat)]
    if normalize:
        layers.append(nn.BatchNorm1d(out_feat, 0.8))
    layers.append(nn.LeakyReLU(0.2, inplace=True))
    return layers

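Print layers right after the nn.Linear line to see what it holds:

Four Linear entries show up here because block is called four times in the nn.Sequential below; their dimensions are simply the arguments passed to block.
Next, print the value after the normalize step:

append adds a new object to the end of a list, so when normalize = True a newly created BatchNorm1d follows the Linear layer.
So the final layers look like this (layers at the end of block):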
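
The printouts described here can be reproduced with a small standalone version of block. This is a sketch under a few assumptions: the print calls and their labels are additions for inspection, `latent_dim = 100` is the default of --latent_dim, and the list `all_layers` is only there to collect the results.

```python
import torch.nn as nn


def block(in_feat, out_feat, normalize=True):
    layers = [nn.Linear(in_feat, out_feat)]
    print("after Linear:   ", layers)
    if normalize:
        layers.append(nn.BatchNorm1d(out_feat, 0.8))
        print("after BatchNorm:", layers)
    layers.append(nn.LeakyReLU(0.2, inplace=True))
    print("end of block:   ", layers)
    return layers


latent_dim = 100  # default value of --latent_dim

# The same four calls that appear inside nn.Sequential.
calls = [(latent_dim, 128, False), (128, 256, True),
         (256, 512, True), (512, 1024, True)]
all_layers = [block(i, o, normalize=n) for i, o, n in calls]
```

The first call skips normalization, so its list holds two modules; the other three hold a Linear, a BatchNorm1d and a LeakyReLU each.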

block uses three built-in nn modules; let's take them out and look at each:

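nn.Linear applies a linear transformation y = xW^T + b to the input, the same form as the simplest linear regression
nn.BatchNorm1d is a batch normalization layer; its second positional argument is eps, so the 0.8 passed in block sets eps rather than momentum
nn.LeakyReLU is the leaky rectified linear unit: LeakyReLU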
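
To make the three operations concrete, here is a NumPy sketch of the math behind each one. It is not PyTorch's implementation: the function names are illustrative, the batch norm version uses training-mode batch statistics only and leaves out the learnable scale and shift, and the input values are made up.

```python
import numpy as np


def linear(x, weight, bias):
    """y = x W^T + b, the affine map of a fully connected layer."""
    return x @ weight.T + bias


def batch_norm(x, eps=1e-5):
    """Normalize each feature over the batch dimension."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)  # biased variance over the batch
    return (x - mean) / np.sqrt(var + eps)


def leaky_relu(x, negative_slope=0.2):
    """Keep positive values, scale negative values by negative_slope."""
    return np.where(x > 0, x, negative_slope * x)


x = np.array([[1.0, -2.0], [3.0, 4.0]])                  # batch of 2 samples
weight = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 2 inputs -> 3 outputs
bias = np.zeros(3)

h = linear(x, weight, bias)
print(h)
print(batch_norm(h))            # roughly -1 and +1 in every column
print(batch_norm(h, eps=0.8))   # a large eps visibly shrinks the values
print(leaky_relu(np.array([-1.0, 0.5])))
```

The last two batch_norm calls show why the 0.8 in block is worth noticing: with such a large eps the normalized values no longer have unit variance.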

Then analyze the contents of nn.Sequential

self.model = nn.Sequential(
    *block(opt.latent_dim, 128, normalize=False),
    *block(128, 256),
    *block(256, 512),
    *block(512, 1024),
    nn.Linear(1024, int(np.prod(img_shape))),
    nn.Tanh()
)

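np.prod computes the product of all elements; for a multi-dimensional array you can specify an axis, e.g. axis=1 computes the product of each row of a 2-D array. Here int(np.prod(img_shape)) is 1 × 28 × 28 = 784, the number of pixels in one output image.
nn.Tanh() is simply the mathematical tanh function, which squashes every output value into the range (-1, 1)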
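
A quick NumPy check of both statements, as a small sketch; the matrix `m` and the tanh inputs are arbitrary example values, and `img_shape` uses the default options.

```python
import numpy as np

img_shape = (1, 28, 28)  # default (channels, img_size, img_size)

# Size of the last Linear layer: one output per pixel.
n_pixels = int(np.prod(img_shape))
print(n_pixels)

m = np.array([[1, 2], [3, 4]])
print(np.prod(m))          # product of all elements
print(np.prod(m, axis=1))  # one product per row
print(np.prod(m, axis=0))  # one product per column

# tanh keeps every value between -1 and 1.
t = np.tanh(np.array([-10.0, 0.0, 10.0]))
print(t)
```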

Finally, the result output by the network (from here on the code is all much the same and easy to understand, so I do not annotate it)
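
To see what the Generator produces, it can be built on its own and fed random noise. The sketch below repeats the class from above with the default option values hard-coded instead of read from opt, and the batch size of 16 is an arbitrary choice for the demonstration.

```python
import numpy as np
import torch
import torch.nn as nn

latent_dim = 100         # default of --latent_dim
img_shape = (1, 28, 28)  # default (channels, img_size, img_size)


class Generator(nn.Module):
    def __init__(self):
        super().__init__()

        def block(in_feat, out_feat, normalize=True):
            layers = [nn.Linear(in_feat, out_feat)]
            if normalize:
                layers.append(nn.BatchNorm1d(out_feat, 0.8))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.model = nn.Sequential(
            *block(latent_dim, 128, normalize=False),
            *block(128, 256),
            *block(256, 512),
            *block(512, 1024),
            nn.Linear(1024, int(np.prod(img_shape))),
            nn.Tanh(),
        )

    def forward(self, z):
        img = self.model(z)
        # Reshape the flat 784-vector into (batch, 1, 28, 28).
        return img.view(img.size(0), *img_shape)


generator = Generator()
print(generator)  # lists every layer of self.model in order

z = torch.randn(16, latent_dim)  # a batch of 16 noise vectors
with torch.no_grad():
    imgs = generator(z)
print(imgs.shape)
```

Each noise vector of length 100 becomes one single-channel 28×28 image whose pixel values lie between -1 and 1 because of the final Tanh.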

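Training results
In the first row of five images, the number of training epochs increases step by step from left to right, and the generated images get closer and closer to the original dataset (MNIST).
In the second row of five images, the left-to-right order follows roughly the same epochs as the first row, but this time I loaded the .pth weights produced at the end of the first run into the model, so the generated images are close to the original dataset from the very start. The results show that the generated data does not necessarily keep getting better and that intermediate results fluctuate, but they are still far better than those in the first row.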
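
The post does not show how the .pth file was written or read back. One common PyTorch pattern is to save the model's state_dict and load it into a freshly built model, sketched below. The small stand-in network, the `make_model` helper and the file name generator.pth are assumptions for this illustration; in the real script the model would be the Generator.

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_model():
    """A tiny stand-in for the Generator, used only in this sketch."""
    return nn.Sequential(
        nn.Linear(100, 128),
        nn.LeakyReLU(0.2),
        nn.Linear(128, 784),
        nn.Tanh(),
    )


trained = make_model()  # pretend this model has finished the first run
path = os.path.join(tempfile.mkdtemp(), "generator.pth")  # hypothetical file name
torch.save(trained.state_dict(), path)

resumed = make_model()  # fresh model at the start of the second run
resumed.load_state_dict(torch.load(path, map_location="cpu"))

# Both models now produce identical output for the same noise.
z = torch.randn(4, 100)
with torch.no_grad():
    same = torch.allclose(trained(z), resumed(z))
print(same)
```

Starting the second run from loaded weights is why its first samples already resemble MNIST digits instead of noise.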
