Abstract
- Image-to-image translation is typically learned from a training set of aligned image pairs
- However, paired training data will often not be available
- Our goal: learn a mapping G : X --> Y using an adversarial loss
- Because the mapping is highly under-constrained, couple it with an inverse mapping F : Y --> X
and introduce a cycle consistency loss to enforce F(G(X)) ≈ X (and vice versa)
1. Introduction
- “translating” an image from one set into the other is described as image-to-image translation
- Years of research have produced powerful translation systems in the supervised setting,
where example image pairs are available
- However, obtaining paired training data can be difficult and expensive
- Seek an algorithm that can learn to translate without paired input-output examples
- Assume there is some underlying relationship between the domains, and seek to learn that relationship
- Training a single mapping G : X --> Y with only an adversarial loss is not enough: in practice, all input images can map to the same output image
- Translation should be “cycle consistent” ==> a translator G : X --> Y and another translator F : Y --> X
- Train both mappings G and F simultaneously, using an adversarial loss
and adding a cycle consistency loss that encourages F(G(x)) ≈ x and G(F(y)) ≈ y
2. Related work
- GANs:
Recent methods adopt GANs for conditional image generation applications.
The key to GANs’ success is the adversarial loss: it forces the generated images to be indistinguishable from real photos.
We adopt an adversarial loss so that translated images cannot be distinguished from images in the target domain.
- Image-to-Image Translation:
Began with a non-parametric texture model on a single input-output training image pair.
More recent approaches: use a dataset to learn a parametric translation function with CNNs.
Our approach: builds on the “pix2pix” framework, which uses a conditional GAN, but without paired training examples.
- Unpaired Image-to-Image Translation
(1) Prior work: use adversarial networks with additional terms
to enforce the output to be close to the input in a predefined metric space,
such as class label space, image pixel space, and image feature space
(2) Ours: does not rely on any task-specific, predefined similarity function between the input and output,
nor do we assume that the input and output have to lie in the same low-dimensional embedding space.
- Cycle Consistency
using transitivity as a way to regularize structured data
similar to our work: use a cycle consistency loss as a way of using transitivity to supervise CNN training.
In this work: we introduce a similar loss to push G and F to be consistent with each other.
Concurrent with our work: others independently use a similar objective for unpaired image-to-image translation, inspired by dual learning in machine translation.
- Neural Style Transfer
(1) synthesizes a novel image by combining the content of one image with the style of another image
based on matching the Gram matrix statistics of pre-trained deep features.
(2) our primary focus: learning the mapping between two image collections, rather than between two specific images,
by trying to capture correspondences between higher-level appearance structures
3. Formulation
- two mappings:
G : X --> Y and F : Y --> X
- two adversarial discriminators:
DX aims to distinguish between images { x } and translated images { F(y) };
DY aims to discriminate between { y } and { G(x) }
- objective contains two types of terms:
adversarial losses: match the distribution of generated images to the data distribution in the target domain;
cycle consistency losses: prevent the learned mappings G and F from contradicting each other
- our model can be viewed as training two “autoencoders”:
F∘G : X --> X jointly with another G∘F : Y --> Y
these autoencoders each have special internal structures:
map image to itself via an intermediate representation that is a translation of the image into another domain
“adversarial autoencoders”:use adversarial loss to train the bottleneck layer of an autoencoder to match an arbitrary target distribution.
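The two loss terms above can be combined into a minimal PyTorch sketch. The single conv layers standing in for G and F are placeholders, not the paper's generators; the weight lambda_cyc = 10 follows the paper's setting:

```python
import torch
import torch.nn as nn

# Stand-ins for the two mapping networks G : X --> Y and F : Y --> X.
# Any image-to-image network with matching input/output shapes fits here;
# these single conv layers are placeholders, not the paper's architecture.
G = nn.Conv2d(3, 3, kernel_size=3, padding=1)
F = nn.Conv2d(3, 3, kernel_size=3, padding=1)

l1 = nn.L1Loss()

def cycle_consistency_loss(x, y, lambda_cyc=10.0):
    """Forward cycle F(G(x)) ≈ x plus backward cycle G(F(y)) ≈ y, L1-penalized."""
    forward = l1(F(G(x)), x)
    backward = l1(G(F(y)), y)
    return lambda_cyc * (forward + backward)

x = torch.randn(2, 3, 64, 64)  # a batch from domain X
y = torch.randn(2, 3, 64, 64)  # a batch from domain Y
loss = cycle_consistency_loss(x, y)
```

During training this term is added to the two adversarial losses (one per discriminator) to form the full objective.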
4. Implementation
(1) Network Architecture
- generator networks:
two stride-2 convolutions,
several residual blocks,
two fractionally-strided convolutions with stride 1/2.
We use 6 residual blocks for 128x128 images and 9 blocks for 256x256 and higher-resolution training images.
We use instance normalization.
- discriminator networks: use 70x70 PatchGANs,
which aim to classify whether 70x70 overlapping image patches are real or fake.
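The generator and discriminator described above can be sketched in PyTorch. Layer counts follow the notes (stride-2 downsampling, residual blocks, fractionally-strided upsampling, instance norm, 70x70 PatchGAN); details such as reflection padding and the exact filter widths are assumptions, not taken from these notes:

```python
import torch
import torch.nn as nn

class ResnetBlock(nn.Module):
    """Residual block used in the generator body (reflection padding assumed)."""
    def __init__(self, dim):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReflectionPad2d(1), nn.Conv2d(dim, dim, kernel_size=3),
            nn.InstanceNorm2d(dim), nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1), nn.Conv2d(dim, dim, kernel_size=3),
            nn.InstanceNorm2d(dim),
        )

    def forward(self, x):
        return x + self.block(x)

class ResnetGenerator(nn.Module):
    """Two stride-2 convs -> n residual blocks -> two fractionally-strided convs."""
    def __init__(self, n_blocks=9):
        super().__init__()
        layers = [nn.ReflectionPad2d(3), nn.Conv2d(3, 64, kernel_size=7),
                  nn.InstanceNorm2d(64), nn.ReLU(inplace=True)]
        dim = 64
        for _ in range(2):  # two stride-2 downsampling convolutions
            layers += [nn.Conv2d(dim, dim * 2, 3, stride=2, padding=1),
                       nn.InstanceNorm2d(dim * 2), nn.ReLU(inplace=True)]
            dim *= 2
        layers += [ResnetBlock(dim) for _ in range(n_blocks)]
        for _ in range(2):  # two fractionally-strided (stride 1/2) convolutions
            layers += [nn.ConvTranspose2d(dim, dim // 2, 3, stride=2,
                                          padding=1, output_padding=1),
                       nn.InstanceNorm2d(dim // 2), nn.ReLU(inplace=True)]
            dim //= 2
        layers += [nn.ReflectionPad2d(3), nn.Conv2d(dim, 3, kernel_size=7), nn.Tanh()]
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)

class PatchGANDiscriminator(nn.Module):
    """70x70 PatchGAN: outputs a grid of real/fake scores, one per image patch."""
    def __init__(self):
        super().__init__()
        def block(cin, cout, stride, norm=True):
            layers = [nn.Conv2d(cin, cout, 4, stride=stride, padding=1)]
            if norm:
                layers.append(nn.InstanceNorm2d(cout))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers
        self.model = nn.Sequential(
            *block(3, 64, 2, norm=False),
            *block(64, 128, 2),
            *block(128, 256, 2),
            *block(256, 512, 1),
            nn.Conv2d(512, 1, 4, stride=1, padding=1),  # per-patch score map
        )

    def forward(self, x):
        return self.model(x)

gen = ResnetGenerator(n_blocks=6)   # 6 blocks for 128x128 inputs, per the notes
disc = PatchGANDiscriminator()
x = torch.randn(1, 3, 128, 128)
fake = gen(x)          # same spatial size as the input
scores = disc(fake)    # one real/fake score per overlapping patch
```

Averaging the score map gives the discriminator's overall decision; training it patch-wise keeps the discriminator small and applicable to arbitrarily sized images.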
(2) Training details
- replace the negative log likelihood objective by a least-squares loss
- update the discriminators using a history of generated images rather than the ones produced by the latest generators
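Both training tricks can be sketched in PyTorch. The pool size of 50 and the 50% swap probability follow the original implementation, though this ImagePool is a simplified, per-tensor version:

```python
import random
import torch
import torch.nn as nn

mse = nn.MSELoss()

def lsgan_d_loss(d_real, d_fake):
    """Least-squares discriminator loss: push real scores to 1, fake scores to 0."""
    return mse(d_real, torch.ones_like(d_real)) + mse(d_fake, torch.zeros_like(d_fake))

def lsgan_g_loss(d_fake):
    """Least-squares generator loss: push fake scores to 1."""
    return mse(d_fake, torch.ones_like(d_fake))

class ImagePool:
    """Buffer of previously generated images.
    With probability 0.5, swap the incoming image for one from the history,
    so the discriminator also sees outputs of earlier generators."""
    def __init__(self, pool_size=50):
        self.pool_size = pool_size
        self.images = []

    def query(self, image):
        if len(self.images) < self.pool_size:
            self.images.append(image)
            return image
        if random.random() < 0.5:
            idx = random.randrange(self.pool_size)
            old = self.images[idx]
            self.images[idx] = image
            return old
        return image

d_real = torch.randn(1, 1, 14, 14)  # placeholder PatchGAN score maps
d_fake = torch.randn(1, 1, 14, 14)
pool = ImagePool(pool_size=50)
buffered = pool.query(torch.randn(1, 3, 64, 64))
```

The least-squares loss keeps gradients alive even when the discriminator is confident, which stabilizes training; the image history reduces oscillation between the generator and discriminator.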