Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (Summary)

Abstract

  • Image-to-image translation typically requires a training set of aligned image pairs
  • However, paired training data is often unavailable
  • Our goal: learn a mapping G : X --> Y using an adversarial loss
    Because this mapping is highly under-constrained, couple it with an inverse mapping F : Y --> X
    and introduce a cycle consistency loss to enforce F(G(X)) ≈ X (and vice versa)

1. Introduction

  • learning to “translate” images from one set into the other is described as image-to-image translation
  • Years of research have produced powerful translation systems in the supervised setting,
    where example image pairs are available
  • However, obtaining paired training data can be difficult and expensive,
    so we seek an algorithm that can learn to translate without paired input-output examples:
    we assume there is some underlying relationship between the domains, and seek to learn that relationship
  • training a mapping G : X --> Y with an adversarial loss alone is not enough: in practice it can collapse, with all input images mapping to the same output image
  • translation should be “cycle consistent” ==> a translator G : X -> Y and another translator F : Y -> X
  • we train the mappings G and F simultaneously, using an adversarial loss
    and adding a cycle consistency loss that encourages F(G(x)) ≈ x and G(F(y)) ≈ y.
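The cycle consistency idea can be sketched numerically. A minimal NumPy version, assuming G and F are given as array-to-array functions, using the L1 norm and a weight `lam` for the paper's λ:

```python
import numpy as np

def cycle_consistency_loss(G, F, x, y, lam=10.0):
    """L1 cycle loss: lam * ( ||F(G(x)) - x||_1 + ||G(F(y)) - y||_1 )."""
    forward = np.abs(F(G(x)) - x).mean()   # X -> Y -> X should recover x
    backward = np.abs(G(F(y)) - y).mean()  # Y -> X -> Y should recover y
    return lam * (forward + backward)
```

With perfect inverse mappings the loss is exactly zero; any drift in either direction is penalized symmetrically.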

2. Related work

  • GANs:
    Recent methods adopt GANs for conditional image generation applications.
    The key to GANs’ success is the adversarial loss: it forces the generated images to be indistinguishable from real photos.
    We adopt an adversarial loss so that translated images cannot be distinguished from images in the target domain.
  • Image-to-Image Translation:
    Began with a non-parametric texture model on a single input-output training image pair.
    More recent work: use a dataset to learn a parametric translation function with CNNs.
    Our approach: builds on the “pix2pix” framework, which uses a conditional GAN, but without paired training examples.
  • Unpaired Image-to-Image Translation
    (1) Prior work: use adversarial networks with additional terms
    to enforce the output to be close to the input in a predefined metric space,
    such as class label space, image pixel space, and image feature space
    (2) Ours: does not rely on any task-specific, predefined similarity function between the input and output,
    nor do we assume that the input and output have to lie in the same low-dimensional embedding space.
  • Cycle Consistency
    using transitivity as a way to regularize structured data
    similar to our work: a cycle consistency loss as a way of using transitivity to supervise CNN training
    In this work: we introduce a similar loss to push G and F to be consistent with each other
    Concurrent with our work: others independently use a similar objective for unpaired image-to-image translation, inspired by dual learning in machine translation
  • Neural Style Transfer
    (1) synthesizes a novel image by combining the content of one image with the style of another,
    based on matching the Gram matrix statistics of pre-trained deep features.
    (2) our primary focus: learning the mapping between two image collections, rather than between two specific images,
    by trying to capture correspondences between higher-level appearance structures

3. Formulation


  • two mappings
    G : X --> Y and F : Y --> X.
  • two adversarial discriminators:
    DX aims to distinguish between images { x } and translated images { F(y) };
    DY aims to discriminate between { y } and { G(x) }.
  • objective contains two types of terms:
    adversarial losses: match the distribution of generated images to the data distribution in the target domain;
    cycle consistency losses: prevent the learned mappings G and F from contradicting each other.
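Combining the two loss types gives the full objective L(G, F, DX, DY). A NumPy sketch, assuming the discriminators output probabilities (the negative-log-likelihood form of the GAN loss; the placeholder names here are illustrative, not the authors' code):

```python
import numpy as np

def adversarial_loss(d_real, d_fake):
    """Discriminator NLL objective: -[ log D(real) + log(1 - D(fake)) ].
    The generator is trained with the fake term's labels inverted."""
    eps = 1e-8  # numerical guard against log(0)
    return -(np.log(d_real + eps).mean() + np.log(1.0 - d_fake + eps).mean())

def full_objective(G, F, DX, DY, x, y, lam=10.0):
    """L(G,F,DX,DY) = L_GAN(G,DY) + L_GAN(F,DX) + lam * L_cyc(G,F)."""
    l_gan_g = adversarial_loss(DY(y), DY(G(x)))  # G tries to fool DY
    l_gan_f = adversarial_loss(DX(x), DX(F(y)))  # F tries to fool DX
    l_cyc = np.abs(F(G(x)) - x).mean() + np.abs(G(F(y)) - y).mean()
    return l_gan_g + l_gan_f + lam * l_cyc
```

In training, G and F minimize this objective while DX and DY maximize their adversarial terms.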
  • our model can be viewed as training two “autoencoders”:
    F∘G : X --> X jointly with another G∘F : Y --> Y
    these autoencoders each have special internal structures:
    they map an image to itself via an intermediate representation that is a translation of the image into another domain
    compare “adversarial autoencoders”: use an adversarial loss to train the bottleneck layer of an autoencoder to match an arbitrary target distribution.

4. Implementation

( 1 ) Network Architecture

  • generative networks:
    two stride-2 convolutions,
    several residual blocks, and
    two fractionally-strided convolutions with stride 1/2.
    We use 6 residual blocks for 128x128 images and 9 blocks for 256x256 and higher-resolution training images.
    We use instance normalization.
  • discriminator networks: use 70x70 PatchGANs,
    which aim to classify whether 70x70 overlapping image patches are real or fake.
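The 70x70 figure is the receptive field of a single discriminator output unit. A small helper can check this, assuming the pix2pix-style configuration of five 4x4 convolutions with strides 2, 2, 2, 1, 1 (an assumption here; this summary only states the patch size):

```python
def receptive_field(layers):
    """Receptive field of one output unit for a stack of conv layers.
    layers: list of (kernel_size, stride) tuples, input-to-output order."""
    r, j = 1, 1  # receptive field size, cumulative stride ("jump")
    for k, s in layers:
        r += (k - 1) * j  # each layer widens the field by (k-1) input steps
        j *= s            # later layers step over j input pixels at a time
    return r

# assumed PatchGAN layer stack: five 4x4 convs, strides 2,2,2,1,1
patchgan = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(patchgan))  # → 70
```

This is why the discriminator can be applied convolutionally to images of any size: each output value only ever "sees" a 70x70 patch.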

( 2 ) Training details

  • replace the negative log likelihood objective by a least-squares loss
  • update the discriminators using a history of generated images rather than the ones produced by the latest generators
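Both training tricks can be sketched briefly. Below is a minimal NumPy/pure-Python sketch, not the authors' code: the least-squares discriminator loss, and a small history buffer of generated images in the spirit of the 50-image pool the paper borrows from Shrivastava et al.:

```python
import random
import numpy as np

def lsgan_loss_d(d_real, d_fake):
    """Least-squares discriminator loss: push D(real) -> 1, D(fake) -> 0."""
    return ((d_real - 1.0) ** 2).mean() + (d_fake ** 2).mean()

class ImagePool:
    """Buffer of previously generated images. With probability 0.5 the
    discriminator is shown a historical fake instead of the newest one,
    which reduces oscillation during training."""
    def __init__(self, size=50):
        self.size = size
        self.images = []

    def query(self, image):
        if len(self.images) < self.size:   # still filling: keep and return as-is
            self.images.append(image)
            return image
        if random.random() < 0.5:          # swap the new image for an old one
            idx = random.randrange(self.size)
            old, self.images[idx] = self.images[idx], image
            return old
        return image                       # otherwise show the latest fake
```

The generator's least-squares term is the mirror image, `((d_fake - 1.0) ** 2).mean()`, pushing D(fake) toward 1.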