Image-to-Image Translation with Conditional Adversarial Networks
Paper: https://arxiv.org/pdf/1611.07004.pdf
Code: https://github.com/affinelayer/Pix2Pix-tensorflow
Tips: a CVPR 2017 paper.
(Reading notes)
1.Main idea
- Uses a conditional GAN to solve image-to-image translation: "a general-purpose solution to image-to-image translation problems."
- Learns the loss function that trains this mapping, instead of hand-designing it: "learn a loss function to train this mapping."
2.Intro
- By analogy with language translation, the paper defines the task: "we define automatic image-to-image translation as the task of translating one possible representation of a scene into another."
- Although CNNs have already achieved excellent results, they still require a hand-specified objective: "In other words, we still have to tell the CNN what we wish it to minimize." Thanks to GANs, a high-level loss function can instead be learned directly.
- Most prior related work learned structured losses between images; the introduction then reviews the development of conditional GANs.
3.Details
- The objective is close to the original GAN objective, with an added L1 loss:

$$\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\left[\|y - G(x,z)\|_1\right]$$

$$G^* = \arg\min_G \max_D \mathcal{L}_{cGAN}(G,D) + \lambda\,\mathcal{L}_{L1}(G)$$
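As a concrete reading of the two formulas, here is a minimal numpy sketch of the generator-side loss. The function and argument names are illustrative (not from the paper's code); `d_fake` stands for the discriminator's probabilities on generated images, and λ = 100 is the weight used in the paper:

```python
import numpy as np

def generator_loss(d_fake, fake, target, lam=100.0):
    """Generator objective: cGAN term plus lambda times the L1 term."""
    eps = 1e-8  # numerical stability for the log
    # cGAN term: the generator wants D(x, G(x, z)) to be close to 1
    cgan = -np.mean(np.log(d_fake + eps))
    # L1 term: pixel-wise distance between the output and the ground truth
    l1 = np.mean(np.abs(target - fake))
    return cgan + lam * l1
```

When the discriminator is fully fooled and the L1 distance is zero, this loss goes to (approximately) zero; any pixel-wise error is penalized 100 times as strongly as the adversarial term.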
Note that without the noise $z$, the generator would only learn a deterministic function (producing a single fixed output for each input $x$), which is not good enough.
- The generator is U-Net-like: an encoder-decoder with skip connections.
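The skip connections in the U-Net-shaped generator can be illustrated at the shape level. The stand-in functions below only track tensor shapes (NCHW) rather than computing real convolutions, and are not the paper's exact architecture:

```python
import numpy as np

def encoder_step(x):
    # stride-2 conv stand-in: halve spatial dims, double channels
    n, c, h, w = x.shape
    return np.zeros((n, c * 2, h // 2, w // 2))

def decoder_step(x, skip):
    # upsample stand-in, then concatenate the mirrored encoder feature
    # along the channel axis -- this concatenation is the skip connection
    n, c, h, w = x.shape
    up = np.zeros((n, c // 2, h * 2, w * 2))
    return np.concatenate([up, skip], axis=1)

x = np.zeros((1, 64, 128, 128))   # (N, C, H, W)
skips = []
for _ in range(3):                # encoder path: remember each resolution
    skips.append(x)
    x = encoder_step(x)
for skip in reversed(skips):      # decoder path: reuse them in reverse order
    x = decoder_step(x, skip)
print(x.shape)  # spatial size restored to the input's 128x128
```

The skips let low-level structure (edges, layout) shared between input and output bypass the bottleneck, which is why an autoencoder alone is not enough for this task.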
- The discriminator is Markovian (a PatchGAN): rather than judging the whole image at once, it classifies it patch by patch and averages the patch scores into the final result. "This discriminator tries to classify if each $N \times N$ patch in an image is real or fake." This makes it faster with fewer parameters while still working well: it can "produce high quality results; has fewer parameters, runs faster, and can be applied to arbitrarily large images."
- However, the code is implemented just like other GANs, and no explicit patch setting can be found, hence:
"The difference between a PatchGAN and regular GAN discriminator is that the regular GAN maps from a 256x256 image to a single scalar output, which signifies 'real' or 'fake', whereas the PatchGAN maps from 256x256 to an $N \times N$ array of outputs $X$, where each $X_{ij}$ signifies whether the patch $ij$ in the image is real or fake."
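The $N \times N$ output arises because the discriminator is fully convolutional: each output unit only sees a limited receptive field of the input, and that receptive field is the "patch". The sketch below computes it; the layer list assumes the default pix2pix discriminator (three stride-2 4x4 convs, then two stride-1 4x4 convs), which gives the well-known 70x70 PatchGAN:

```python
def receptive_field(layers):
    """Receptive field of one output unit for a stack of conv layers.

    layers: list of (kernel_size, stride) tuples, applied in order.
    """
    rf, jump = 1, 1  # jump = distance between adjacent units in input pixels
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# Assumed default pix2pix discriminator: C64-C128-C256 with stride 2,
# then C512 and the final 1-channel conv with stride 1, all 4x4 kernels.
patchgan = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(patchgan))  # -> 70: each output unit scores a 70x70 patch
```

So the patch size is never set explicitly in the code; it falls out of the convolutional architecture itself.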
Reference: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/issues/39
Maybe it would have been better if we called it a “Fully Convolutional GAN” like in FCNs, it is the same idea.