CGAN及代码实现

最新推荐文章于 2024-06-21 01:41:49 发布

长命百岁️

最新推荐文章于 2024-06-21 01:41:49 发布

阅读量5.6k

点赞数 4

分类专栏：深度学习

本文链接：https://blog.csdn.net/qq_52852138/article/details/121682072

版权

CGAN 生成对抗网络 PyTorch 图像生成神经网络

关键词由CSDN通过智能技术生成

深度学习专栏收录该内容

21 篇文章 2 订阅

订阅专栏

前言

本文主要介绍CGAN及其代码实现
阅读本文之前，建议先阅读GAN(生成对抗网络)
本文基于一次课程实验，代码仅上传了需要补充部分

CGAN

全称： $\,Generative\, Adversarial\, Network$
我们知道， $G A N$ 其实又叫做 $Unconditional\, Generative\, Adversarial\, Network$

在基本的 $G A N$ 上对 $G e n e r a t o r$ 和 $D i s c r i m i n a t o r$ 的输入都添加了 $l a b e l s$ ，使得我们可以针对类别训练，控制生成图片的类别，而使得结果不那么随机

在这里插入图片描述

$z$ 为服从一定分布的随机向量
$x$ 为图像
$y$ 为控制类别的 $l a b l e s$

Genarator

输入：潜在空间的一批点（向量）和一批 label
输出：一批图片

代码整体如下

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
		
        # 将label编码成向量
        self.label_embedding = nn.Embedding(opt.n_classes, opt.label_dim) #10 , 50
        ## TODO: There are many ways to implement the model,  one alternative 
        ## architecture is (100+50)--->128--->256--->512--->1024--->(1,28,28)

        ### START CODE HERE
        def block(in_feat, out_feat, normalize=True):
            layers = [nn.Linear(in_feat, out_feat)]
            if normalize:
                layers.append(nn.BatchNorm1d(out_feat, 0.8))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.model = nn.Sequential(
            *block(opt.latent_dim + opt.label_dim, 128, normalize=False),
            *block(128, 256),
            *block(256, 512),
            *block(512, 1024),
            nn.Linear(1024, int(np.prod(img_shape))),
            nn.Tanh()
        )
        ### END CODE HERE

    def forward(self, noise, labels):
       
        ### START CODE HERE
        # Concatenate label embedding and image to produce input
        gen_input = torch.cat((self.label_embedding(labels), noise), -1) #拼接两个向量
        img = self.model(gen_input)
        img = img.view(img.size(0), *img_shape)
        return img
        ### END CODE HERE
        
        return

详细解读

nn.Embedding(num_embeddings , embedding_dim)

将输入信息编码成向量
num_embeddings 代表最多可以编码几个数据
embedding_dim 代表将每个数据编码成一个几维向量

import torch.nn as nn
embedding = nn.Embedding(10, 3)
a =  torch.LongTensor([[1,2,4,5],[4,3,2,9]])
b =  torch.LongTensor([1 , 2 , 3])
print(embedding(a))
>>tensor([[[-0.3592, -2.2254, -1.7580],
         [ 1.7920, -0.6600, -1.1435],
         [-0.8874,  0.2585, -1.0378],
         [ 0.4861,  0.3025, -1.0556]],

        [[-0.8874,  0.2585, -1.0378],
         [-0.0752, -0.1548, -0.7140],
         [ 1.7920, -0.6600, -1.1435],
         [-2.5180,  0.2028, -1.4452]]], grad_fn=<EmbeddingBackward>)
print(embedding(b))
>>tensor([[-0.3592, -2.2254, -1.7580],
        [ 1.7920, -0.6600, -1.1435],
        [-0.0752, -0.1548, -0.7140]], grad_fn=<EmbeddingBackward>)

nn.Linear()
```
 layers = [nn.Linear(in_feat, out_feat)]
```
- nn.Linear()用于设置网络中的全连接层。全连接层的输入与输出可以是多维的
- in_feat 决定了输入张量的 size
- out_feat 决定了输出张量的 size
nn.Sequential()
- 用于设置模型顺序执行的部分
前向传播
- 拼接 label 和 noise 向量，得到输入向量
- 直接调用 self.model，对 gen_inupt 进行系列操作
- 转成二维向量返回

Discriminator

输入：一批图片和图片对应的label
输出：real or fake（1 / 0）

代码整体如下

class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()

        self.label_embedding = nn.Embedding(opt.n_classes, opt.label_dim)#10,50
        ## TODO: There are many ways to implement the discriminator,  one alternative 
        ## architecture is (100+784)--->512--->512--->512--->1
        
        ### START CODE HERE
        self.model = nn.Sequential(
            nn.Linear(opt.label_dim + int(np.prod(img_shape)), 512),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, 512),
            nn.Dropout(0.4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, 512),
            nn.Dropout(0.4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, 1),
        )
        ### END CODE HERE
        
       
    def forward(self, img, labels):
        ### START CODE HERE
        # Concatenate label embedding and image to produce input
        d_in = torch.cat((img.view(img.size(0), -1), self.label_embedding(labels)), -1)
        validity = self.model(d_in)
        ### END CODE HERE
        
        return validity

和Generator类似
- 输入是一批图像和对应标签
- 训练过程写在 nn.Sequential 里
- 返回的结果是一批图像的判断（真为1，假为0）

训练过程

代码讲解写在注释里了

## TODO: implement the training process

for epoch in range(opt.n_epochs):
    for i, (imgs, labels) in enumerate(dataloader):

        batch_size = imgs.shape[0]

        # Adversarial ground truths
        #创建一个大小为(batch_size,1),数值全为 1.0 的 tensor
        valid = FloatTensor(batch_size, 1).fill_(1.0)
        
        #创建一个大小为(batch_size,1),数值全为 0.0 的 tensor
        fake = FloatTensor(batch_size, 1).fill_(0.0)

        # Configure input
        real_imgs = imgs.type(FloatTensor)
        labels = labels.type(LongTensor)
  
        # -----------------
        #  Train Generator
        # -----------------

        ### START CODE HERE
        optimizer_G.zero_grad()

        # Sample noise and labels as generator input
        #生成一批 noise
        z = Variable(FloatTensor(np.random.normal(0, 1, (batch_size, opt.latent_dim))))
        #生成一批 label
        gen_labels = Variable(LongTensor(np.random.randint(0, opt.n_classes, batch_size)))

        # 输入 z 和 gen_labels ，通过生成器，生成一批图片
        gen_imgs = generator(z, gen_labels)

        # Loss measures generator's ability to fool the discriminator
        # 通过判别器，判断生成图像的真假，返回一批图像的判别结果
        validity = discriminator(gen_imgs, gen_labels)
        # 判别为假的产生loss，这里计算生成器的loss
        g_loss = adversarial_loss(validity, valid)
		# BP + 更新
        g_loss.backward()
        optimizer_G.step()
        ### END CODE HERE

        # ---------------------
        #  Train Discriminator
        # ---------------------

        ### START CODE HERE
        optimizer_D.zero_grad()

        # 计算真实图片的 loss
        validity_real = discriminator(real_imgs, labels)
        d_real_loss = adversarial_loss(validity_real, valid)

        # 计算生成图片的 loss
        validity_fake = discriminator(gen_imgs.detach(), gen_labels)
        d_fake_loss = adversarial_loss(validity_fake, fake)

        # Total loss
        d_loss = (d_real_loss + d_fake_loss) / 2
		
        # BP + 更新
        d_loss.backward()
        optimizer_D.step()
        ### END CODE HERE

        print(
            "[Epoch %d/%d] [Batch %d/%d] [D loss: %f] [G loss: %f]"
            % (epoch, opt.n_epochs, i, len(dataloader), d_loss.item(), g_loss.item())
        )
    if (epoch+1) % 20 ==0:
        torch.save(generator.state_dict(), "./cgan_generator %d.pth" % (epoch))

最后测试，生成图像

def generate_latent_points(latent_dim, n_samples, n_classes):
    # Sample noise
    
    ### START CODE HERE
    # 随机生成向量和标签，作为测试使用
    z = Variable(FloatTensor(np.random.normal(0, 1, (n_samples, latent_dim))))
    gen_labels = Variable(LongTensor(np.random.randint(0, n_classes, n_samples)))
    ### END CODE HERE
    
    return z,gen_labels

在这里插入图片描述

长命百岁️

关注

4
点赞
踩
28

收藏

觉得还不错? 一键收藏
打赏
2
评论
CGAN及代码实现

前言本文主要介绍CGAN及其代码实现阅读本文之间，建议先阅读GAN(生成对抗网络)CGANConditionalGenerativeAdversarialNetworkConditional Generative Adversarial NetworkConditionalGenerativeAdversarialNetwork我们知道，GANGANGAN 其实又叫做 UnconditionalGenerativeAdversarialNetworkUnconditional Generati
复制链接

扫一扫