[Paper Notes] Domain Transfer with CycleGAN: Unpaired Image-to-Image Translation

muyijames

Category: Deep Learning. Tags: deep learning, artificial intelligence

Article link: https://blog.csdn.net/muyijames/article/details/120925391

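Contents

1 Overview
2 Network Architecture
3 Results
- 3.1 Effect of the Cycle Consistency Loss
- 3.2 Comparison with Other GANs
4 Source Code Walkthrough
- 4.1 Generator and Discriminator Implementation
- 4.2 Loss Implementation
References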

1 Overview

Today's post covers the 2017 paper "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks". Many blogs already introduce CycleGAN, so I won't repeat the basics here and will focus on the paper's highlights.

Problem addressed: translating images between a source domain and a target domain without one-to-one paired training examples (a form of domain adaptation).

From the paper: "We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples."

Paper:
"Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks"

Code 1: junyanz/pytorch-CycleGAN-and-pix2pix
Code 2: aitorzip/PyTorch-CycleGAN (more readable; the source walkthrough below is based on it)

[Figure: sample results of the network]

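2 Network Architecture

During training, the discriminator and the generator are updated separately, somewhat like the co-evolution of predator and prey:

With the generator fixed, training the discriminator teaches it to tell real images from generated ones more reliably. With the discriminator fixed, training the generator forces it to produce higher-quality images in order to fool the now stronger discriminator. The two improve step by step through this alternation and eventually reach a dynamic equilibrium.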

2.1 Unpaired image data

The paper illustrates the difference between paired and unpaired training data. pix2pix requires paired data, whereas CycleGAN can also be trained on unpaired data.
[Figure: paired vs. unpaired training data]

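2.2 Cycle Consistency Loss

This is the paper's main contribution: it introduces a cycle mapping and a Cycle Consistency Loss.

Adversarial training can learn to produce outputs that follow the same distribution as the target domain Y. However, the ordinary GAN loss alone is not enough to train a useful mapping. With large enough capacity, a network can map the same set of input images to any random permutation of images in the target domain, and every such mapping induces an output distribution that matches the target distribution. The adversarial loss therefore says nothing about which output a particular input should get (in the extreme case, nothing stops the mapping G from sending every x to the same image in Y, which renders the loss meaningless).

We therefore want:

x -> G(x) -> F(G(x)) ≈ x, called forward cycle consistency;
and likewise y -> F(y) -> G(F(y)) ≈ y, called backward cycle consistency.

In other words, after an image from X is translated into Y, it should be possible to translate it back. This prevents the model from mapping all images in X to the same image in Y. The paper explains it as follows:
[Figure: the paper's explanation of cycle consistency]
The cycle consistency loss penalizes the L1 distance between each input and its reconstruction, in both directions.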
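Written out in the paper's notation, with G: X → Y and F: Y → X, the cycle consistency loss and the full objective it enters are shown below. The weight λ is set to 10 in the paper, which matches the factor 10.0 applied to the cycle terms in the training code in section 4.

```latex
\mathcal{L}_{\text{cyc}}(G, F)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\lVert F(G(x)) - x \rVert_1\big]
  + \mathbb{E}_{y \sim p_{\text{data}}(y)}\big[\lVert G(F(y)) - y \rVert_1\big]

\mathcal{L}(G, F, D_X, D_Y)
  = \mathcal{L}_{\text{GAN}}(G, D_Y, X, Y)
  + \mathcal{L}_{\text{GAN}}(F, D_X, Y, X)
  + \lambda \, \mathcal{L}_{\text{cyc}}(G, F)
```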

2.3 Identity Loss

This loss is mentioned in the applications section of the paper:
[Figure: the paper's passage on the identity loss]
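For reference, the identity loss in the paper's notation is given below. It asks each generator to leave an image unchanged when that image already belongs to the generator's output domain; the paper introduces it for painting-to-photo translation, where it helps preserve the color composition of the input. In the training code in section 4 the corresponding terms appear with a weight of 5.0, which reads as half of the cycle weight.

```latex
\mathcal{L}_{\text{identity}}(G, F)
  = \mathbb{E}_{y \sim p_{\text{data}}(y)}\big[\lVert G(y) - y \rVert_1\big]
  + \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\lVert F(x) - x \rVert_1\big]
```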

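2.4 Adversarial Loss

This is the loss that every GAN has. Its form is:
[Figure: the adversarial loss formula]
In the actual implementation, however, the adversarial loss is modified following LSGAN: the negative log-likelihood objective is replaced by a least-squares loss.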
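The two forms can be written side by side for the mapping G: X → Y and its discriminator D_Y (the pair F, D_X is symmetric). The first line is the standard adversarial loss as given in the paper; the last two lines are the least-squares variant, where the generator and the discriminator each minimize their own objective. The training code in section 4 appears to follow the least-squares form, comparing discriminator outputs against targets of 1 (real) and 0 (fake), and additionally scales the discriminator loss by 0.5.

```latex
\begin{align*}
\mathcal{L}_{\text{GAN}}(G, D_Y, X, Y)
  &= \mathbb{E}_{y \sim p_{\text{data}}(y)}\big[\log D_Y(y)\big]
   + \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log\big(1 - D_Y(G(x))\big)\big] \\[4pt]
\min_G \;
  & \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\big(D_Y(G(x)) - 1\big)^2\big] \\
\min_{D_Y} \;
  & \mathbb{E}_{y \sim p_{\text{data}}(y)}\big[\big(D_Y(y) - 1\big)^2\big]
   + \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[D_Y(G(x))^2\big]
\end{align*}
```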

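2.5 Network and Training Details

(1) The generator adopts the architecture from "Perceptual Losses for Real-Time Style Transfer and Super-Resolution": a network built around several residual blocks, with strided convolutions for downsampling and transposed convolutions for upsampling;
(2) The discriminator keeps the 70x70 PatchGAN structure from pix2pix;
(3) The networks use Instance Normalization rather than the Batch Normalization used in the classic DCGAN;
(4) Reflection padding is used instead of ordinary zero padding;
(5) The discriminator is trained on a history of previously generated images, not only on the outputs of the latest generator;
(6) The learning rate is 0.0002. It is held at 0.0002 for the first 100 epochs and then decays linearly to 0 over the next 100 epochs.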
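The schedule in item (6) can be sketched as a multiplier on the base learning rate. This is a minimal illustration, not the repository's code: the function name `lr_factor` and its parameter names are my own, and the defaults simply encode the 100 + 100 epoch split described above. A function of this shape is what one would typically pass as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`.

```python
def lr_factor(epoch, n_epochs=200, decay_start_epoch=100):
    """Multiplier for the base learning rate at a given (0-indexed) epoch.

    Hypothetical helper: returns 1.0 until decay_start_epoch, then falls
    linearly so that it would reach 0.0 at epoch == n_epochs.
    """
    decayed_epochs = max(0, epoch - decay_start_epoch)
    return 1.0 - decayed_epochs / (n_epochs - decay_start_epoch)


BASE_LR = 0.0002  # learning rate stated in the notes above

if __name__ == "__main__":
    for epoch in (0, 50, 100, 150, 199):
        print(epoch, BASE_LR * lr_factor(epoch))
```

Expressing the schedule as a multiplier rather than an absolute value keeps it independent of the base learning rate, so the same function can be shared by the generator and discriminator optimizers.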

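3 Results

3.1 Effect of the Cycle Consistency Loss

[Figure: effect of the cycle consistency loss]

3.2 Comparison with Other GANs

The paper shows results from different GAN methods:
[Figure: results of different GAN methods]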

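4 Source Code Walkthrough

The code below implements the CycleGAN architecture; it is easier to follow when read alongside the network description above.

4.1 Generator and Discriminator Implementation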

import torch.nn as nn
import torch.nn.functional as F
import torch

class ResidualBlock(nn.Module):
    def __init__(self, in_features):
        super(ResidualBlock, self).__init__()
        conv_block = [  nn.ReflectionPad2d(1),
                        nn.Conv2d(in_features, in_features, 3),
                        nn.InstanceNorm2d(in_features),
                        nn.ReLU(inplace=True),
                        nn.ReflectionPad2d(1),
                        nn.Conv2d(in_features, in_features, 3),
                        nn.InstanceNorm2d(in_features)  ]
        self.conv_block = nn.Sequential(*conv_block)

    def forward(self, x):
        return x + self.conv_block(x)

class Generator(nn.Module):
    def __init__(self, input_nc, output_nc, n_residual_blocks=9):
        super(Generator, self).__init__()
        # Initial convolution block       
        model = [   nn.ReflectionPad2d(3),
                    nn.Conv2d(input_nc, 64, 7),
                    nn.InstanceNorm2d(64),
                    nn.ReLU(inplace=True) ]
        # Downsampling
        in_features = 64
        out_features = in_features*2
        for _ in range(2):
            model += [  nn.Conv2d(in_features, out_features, 3, stride=2, padding=1),
                        nn.InstanceNorm2d(out_features),
                        nn.ReLU(inplace=True) ]
            in_features = out_features
            out_features = in_features*2
        # Residual blocks
        for _ in range(n_residual_blocks):
            model += [ResidualBlock(in_features)]
        # Upsampling
        out_features = in_features//2
        for _ in range(2):
            model += [  nn.ConvTranspose2d(in_features, out_features, 3, stride=2, padding=1, output_padding=1),
                        nn.InstanceNorm2d(out_features),
                        nn.ReLU(inplace=True) ]
            in_features = out_features
            out_features = in_features//2
        # Output layer
        model += [  nn.ReflectionPad2d(3),
                    nn.Conv2d(64, output_nc, 7),
                    nn.Tanh() ]
        self.model = nn.Sequential(*model)

    def forward(self, x):
        return self.model(x)


class Discriminator(nn.Module):
    def __init__(self, input_nc):
        super(Discriminator, self).__init__()
        # A bunch of convolutions one after another
        model = [   nn.Conv2d(input_nc, 64, 4, stride=2, padding=1),
                    nn.LeakyReLU(0.2, inplace=True) ]
        model += [  nn.Conv2d(64, 128, 4, stride=2, padding=1),
                    nn.InstanceNorm2d(128), 
                    nn.LeakyReLU(0.2, inplace=True) ]
        model += [  nn.Conv2d(128, 256, 4, stride=2, padding=1),
                    nn.InstanceNorm2d(256), 
                    nn.LeakyReLU(0.2, inplace=True) ]
        model += [  nn.Conv2d(256, 512, 4, padding=1),
                    nn.InstanceNorm2d(512), 
                    nn.LeakyReLU(0.2, inplace=True) ]
        # FCN classification layer
        model += [nn.Conv2d(512, 1, 4, padding=1)]
        self.model = nn.Sequential(*model)

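    def forward(self, x):
        x = self.model(x)
        # Average the patch predictions over the spatial dimensions and flatten
        # to one score per image, shape (N,). Unlike squeeze(), view(-1) keeps
        # the batch dimension even when N == 1.
        return F.avg_pool2d(x, x.size()[2:]).view(-1)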
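As a quick sanity check on the layer settings above, the sketch below computes the discriminator's receptive field and the spatial sizes that flow through both networks, using only the standard size formulas for `nn.Conv2d` and `nn.ConvTranspose2d`. The helper names and the 256x256 input size are assumptions chosen for illustration; the kernel, stride, and padding values are copied from the code above, with each `ReflectionPad2d(3)` followed by an unpadded 7x7 convolution treated as a convolution with padding 3.

```python
# (kernel_size, stride, padding) of the five discriminator convolutions above.
DISCRIMINATOR_CONVS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 1), (4, 1, 1)]


def receptive_field(convs):
    """Receptive field, in input pixels, of one output unit of a conv stack."""
    rf, jump = 1, 1  # jump: spacing in input pixels between adjacent units
    for kernel, stride, _padding in convs:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


def conv_out(size, kernel, stride, padding):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out(size, kernel, stride, padding, output_padding):
    """Output size of a transposed convolution (dilation 1) along one dimension."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def discriminator_map_size(size):
    """Side length of the patch prediction map before average pooling."""
    for kernel, stride, padding in DISCRIMINATOR_CONVS:
        size = conv_out(size, kernel, stride, padding)
    return size


def generator_output_size(size):
    """Side length of the generator output for a square input."""
    size = conv_out(size, 7, 1, 3)            # reflection pad 3 + 7x7 conv
    for _ in range(2):                        # two stride-2 downsampling convs
        size = conv_out(size, 3, 2, 1)
    # The residual blocks keep the spatial size unchanged.
    for _ in range(2):                        # two stride-2 transposed convs
        size = conv_transpose_out(size, 3, 2, 1, 1)
    return conv_out(size, 7, 1, 3)            # reflection pad 3 + 7x7 conv


if __name__ == "__main__":
    print(receptive_field(DISCRIMINATOR_CONVS))  # 70
    print(discriminator_map_size(256))           # 30
    print(generator_output_size(256))            # 256
```

The receptive field of 70 pixels is where the "70x70 PatchGAN" name comes from: each value in the discriminator's output map judges one 70x70 patch of the input, and the final average pooling merges those patch scores into a single score per image.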

4.2 Loss Implementation

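###### Training ######
# Excerpt from the training loop. The networks (netG_*, netD_*), loss criteria,
# optimizers, target tensors, input buffers and replay buffers are assumed to
# have been created earlier in the training script.
iter_num = 0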
for epoch in range(opt.epoch, opt.n_epochs):
    for i, batch in enumerate(dataloader):
        # Set model input
        real_A = Variable(input_A.copy_(batch['A']))
        real_B = Variable(input_B.copy_(batch['B']))

        ###### Generators A2B and B2A ######
        optimizer_G.zero_grad()
      
        # Identity loss: G_A2B(B) should equal B if real B is fed
        same_B = netG_A2B(real_B)
        loss_identity_B = criterion_identity(same_B, real_B) * 5.0
        # G_B2A(A) should equal A if real A is fed
        same_A = netG_B2A(real_A)
        loss_identity_A = criterion_identity(same_A, real_A) * 5.0
        
        # GAN loss: each generator tries to make its discriminator predict "real"
        fake_B = netG_A2B(real_A)
        pred_fake = netD_B(fake_B)
        loss_GAN_A2B = criterion_GAN(pred_fake, target_real)
        fake_A = netG_B2A(real_B)
        pred_fake = netD_A(fake_A)
        loss_GAN_B2A = criterion_GAN(pred_fake, target_real)
        
        # Cycle loss: translating to the other domain and back should recover the input
        recovered_A = netG_B2A(fake_B)
        loss_cycle_ABA = criterion_cycle(recovered_A, real_A) * 10.0
        recovered_B = netG_A2B(fake_A)
        loss_cycle_BAB = criterion_cycle(recovered_B, real_B) * 10.0
        
        # Total loss
        loss_G = loss_identity_A + loss_identity_B + loss_GAN_A2B + loss_GAN_B2A + loss_cycle_ABA + loss_cycle_BAB
        loss_G.backward()
        optimizer_G.step()
        ##################

        ###### Discriminator A ######
        optimizer_D_A.zero_grad()
        # Real loss
        pred_real = netD_A(real_A)
        loss_D_real = criterion_GAN(pred_real, target_real)
        # Fake loss
        fake_A = fake_A_buffer.push_and_pop(fake_A)
        pred_fake = netD_A(fake_A.detach())
        loss_D_fake = criterion_GAN(pred_fake, target_fake)
        # Total loss
        loss_D_A = (loss_D_real + loss_D_fake) * 0.5
        loss_D_A.backward()
        optimizer_D_A.step()
        ###################################

        ###### Discriminator B ######
        optimizer_D_B.zero_grad()
        # Real loss
        pred_real = netD_B(real_B)
        loss_D_real = criterion_GAN(pred_real, target_real)
        # Fake loss
        fake_B = fake_B_buffer.push_and_pop(fake_B)
        pred_fake = netD_B(fake_B.detach())
        loss_D_fake = criterion_GAN(pred_fake, target_fake)
        # Total loss
        loss_D_B = (loss_D_real + loss_D_fake) * 0.5
        loss_D_B.backward()
        optimizer_D_B.step()
        ###################################
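The calls to `fake_A_buffer.push_and_pop(...)` and `fake_B_buffer.push_and_pop(...)` implement the image history from item (5) of section 2.5. The class below is a simplified sketch of how such a buffer might work, not the repository's implementation: it stores one item per call instead of splitting a batch of tensors, and the `rng` parameter is an addition of mine to make the behavior reproducible. The default size of 50 follows the paper, which keeps a history of 50 previously generated images.

```python
import random


class ReplayBuffer:
    """History of generated samples used when training a discriminator.

    Simplified, hypothetical version: items can be any Python objects.
    """

    def __init__(self, max_size=50, rng=None):
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self.max_size = max_size
        self.data = []
        self.rng = rng if rng is not None else random.Random()

    def push_and_pop(self, item):
        """Store `item` and return the sample the discriminator should see."""
        # While the buffer is filling up, keep the new item and return it.
        if len(self.data) < self.max_size:
            self.data.append(item)
            return item
        # Once full: half of the time, swap the new item in and hand back an
        # older one; otherwise return the new item and leave the buffer as is.
        if self.rng.random() > 0.5:
            index = self.rng.randrange(self.max_size)
            old_item = self.data[index]
            self.data[index] = item
            return old_item
        return item


if __name__ == "__main__":
    buffer = ReplayBuffer(max_size=3, rng=random.Random(0))
    for step in range(8):
        print(step, buffer.push_and_pop(step))
```

Mixing older fakes into the discriminator's input keeps it from adapting only to the generator's most recent outputs, which the paper reports as a way to reduce oscillation during training.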

References

1. 一文读懂GAN, pix2pix, CycleGAN和pix2pixHD (Understanding GAN, pix2pix, CycleGAN, and pix2pixHD in One Article)
