Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation (CVPR 2020)

o0Helloworld0o

Article link: https://blog.csdn.net/o0Helloworld0o/article/details/108322412

No Independent Component for Encoding (NICE).

Take the discriminator $D_y$ on domain $y$ as an example: $D_y$ consists of an encoder $E_y^D$ and a classifier $C_y$.

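As training proceeds, $D_y$ gets better at judging whether an image belongs to domain $y$, so the features extracted by its encoder $E_y^D$ are highly informative. The $y\rightarrow x$ generator can therefore reuse $E_y^D$ as its own encoder.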
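To make the reuse concrete, here is a minimal NumPy sketch of a discriminator and a generator that hold the same encoder object. The class names, the single linear layer, and the dimensions are all hypothetical stand-ins chosen for illustration; the real networks in the paper are convolutional.

```python
import numpy as np

rng = np.random.default_rng(0)

class Encoder:
    """Toy stand-in for E_y^D: one linear map followed by ReLU (assumption)."""
    def __init__(self, in_dim, feat_dim):
        self.W = rng.normal(scale=0.1, size=(in_dim, feat_dim))

    def __call__(self, x):
        return np.maximum(x @ self.W, 0.0)

class Discriminator:
    """D_y: the encoder followed by a classifier head C_y."""
    def __init__(self, encoder, feat_dim):
        self.encoder = encoder
        self.w_cls = rng.normal(scale=0.1, size=(feat_dim, 1))

    def __call__(self, x):
        return self.encoder(x) @ self.w_cls

class Generator:
    """y -> x generator: a decoder on top of the *same* encoder object."""
    def __init__(self, encoder, feat_dim, out_dim):
        self.encoder = encoder
        self.W_dec = rng.normal(scale=0.1, size=(feat_dim, out_dim))

    def __call__(self, y):
        return np.tanh(self.encoder(y) @ self.W_dec)

enc_y = Encoder(in_dim=16, feat_dim=8)
D_y = Discriminator(enc_y, feat_dim=8)
G_y2x = Generator(enc_y, feat_dim=8, out_dim=16)

y = rng.normal(size=(4, 16))   # a batch of 4 flattened "images" from domain y
logits = D_y(y)                # real/fake scores
fake_x = G_y2x(y)              # translated samples
```

Since both modules point at `enc_y`, there is no independent encoding component in the generator, which is the point of the NICE name.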

Multi-Scale Discriminators $D_x$ and $D_y$ .

The first architectural improvement: the discriminators use a multi-scale structure.

Earlier work also used multi-scale discriminators. There, the image is down-sampled to a series of sizes, and each resized image is fed to its own discriminator.

The approach in this paper is more efficient. The multi-scale discriminator structure is shown in Figure 2 of the paper, with three levels in total: $\left \{ C_x^0, C_x^1, C_x^2 \right \}$

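Put simply, the feature map produced by the encoder is fed to $C_x^0$; a further convolution yields a feature map that is fed to $C_x^1$; and one more convolution yields a feature map that is fed to $C_x^2$.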
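The cascade can be sketched in a few lines of NumPy. This is only an illustration under assumptions: 2x2 average pooling stands in for the stride-2 convolutions, a 1x1 convolution stands in for each classifier $C_x^k$, and the helper names and shapes are hypothetical.

```python
import numpy as np

def downsample(feat):
    """Stand-in for a stride-2 convolution: 2x2 average pooling on a (C, H, W) map."""
    c, h, w = feat.shape
    return feat.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def head(feat, weight):
    """Stand-in for a classifier C^k: a 1x1 convolution giving an (H, W) logit map."""
    return np.tensordot(weight, feat, axes=([0], [0]))

def multi_scale_logits(feat, head_weights):
    """Feed the encoder feature map to C^0, then keep shrinking it for C^1, C^2, ..."""
    outputs = []
    for k, w in enumerate(head_weights):
        if k > 0:
            feat = downsample(feat)   # reuse the previous scale's features
        outputs.append(head(feat, w))
    return outputs

rng = np.random.default_rng(0)
encoder_feat = rng.normal(size=(8, 16, 16))            # E_x(x): 8 channels, 16x16
head_weights = [rng.normal(size=8) for _ in range(3)]  # one head per scale
scale_shapes = [o.shape for o in multi_scale_logits(encoder_feat, head_weights)]
```

The image is encoded once and every scale shares that computation, whereas the image-pyramid approach runs a separate discriminator on each resized copy of the input.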

The second architectural improvement: the CAM attention from U-GAT-IT is upgraded to a residual version.

$E_x(x)$ is the feature map produced by the encoder, and CAM is used to learn a weight $w$. U-GAT-IT uses $w$ to reweight $E_x(x)$, which gives the reweighted feature map (also called the attention map).

This paper instead introduces a trainable parameter $\gamma$ that linearly combines the original $E_x(x)$ with the weighted $E_x(x)$, i.e. $\gamma\times w\times E_x(x) + E_x(x)$
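The difference between the two variants is easy to see in code. The sketch below assumes $w$ holds one weight per channel and the feature map has shape (C, H, W); the function names are hypothetical.

```python
import numpy as np

def cam_attention(feat, w):
    """U-GAT-IT style: scale each channel of E_x(x) by its CAM weight."""
    return w[:, None, None] * feat

def residual_cam_attention(feat, w, gamma):
    """Residual version: gamma * w * E_x(x) + E_x(x)."""
    return gamma * cam_attention(feat, w) + feat

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 8))   # E_x(x): 4 channels, 8x8
w = rng.normal(size=4)              # per-channel CAM weights (assumed shape)

plain = cam_attention(feat, w)
at_zero = residual_cam_attention(feat, w, gamma=0.0)
at_one = residual_cam_attention(feat, w, gamma=1.0)
```

With $\gamma = 0$ the block passes $E_x(x)$ through unchanged, and as $\gamma$ grows the attention term contributes more, so the network can learn how much to rely on the attention.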

The third architectural improvement: spectral normalization is applied to the discriminators.
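As a reminder of what spectral normalization does, here is a minimal NumPy sketch that estimates the largest singular value of a weight matrix by power iteration and divides the weights by it. It is a generic illustration, not the paper's code: the function name is hypothetical, and practical implementations usually run a single power-iteration step per training update with a persistent vector rather than iterating to convergence.

```python
import numpy as np

def spectral_normalize(W, n_iters=100, eps=1e-12, seed=0):
    """Return (W / sigma, sigma), where sigma approximates the top singular value of W."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + eps
        u = W @ v
        u /= np.linalg.norm(u) + eps
    sigma = u @ W @ v   # Rayleigh-quotient style estimate of the spectral norm
    return W / sigma, sigma

W = np.random.default_rng(1).normal(size=(6, 4))   # a toy layer weight
W_sn, sigma = spectral_normalize(W, n_iters=500)
top_singular_value = np.linalg.svd(W_sn, compute_uv=False)[0]
```

After normalization the layer's largest singular value is approximately 1, which bounds how much the layer can stretch its input.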

Because the encoder is reused, "it will incur inconsistency if we apply conventional adversarial training." (The paper gives no theoretical explanation for this.)

To overcome this defect, we decouple the training of $E_x$ from that of the generator $G_{x\rightarrow y}$ .
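One way to picture the decoupling is as a choice of which parameter groups each training phase may update. The sketch below rests on my reading of the paper, namely that the encoder is updated only in the discriminator phase and kept frozen in the generator phase; treat that as an assumption, since the quote above does not spell it out. The group names, the `PHASES` table, and the toy gradients are hypothetical.

```python
import numpy as np

# Which parameter groups each phase is allowed to update (assumed split).
PHASES = {
    "discriminator": ("E_x", "C_x"),      # encoder trains together with the classifier
    "generator": ("G_x2y_decoder",),      # encoder is frozen here
}

def sgd_step(params, grads, phase, lr=0.1):
    """Apply a gradient step only to the groups trainable in `phase`."""
    trainable = PHASES[phase]
    return {
        name: value - lr * grads[name] if name in trainable else value
        for name, value in params.items()
    }

params = {name: np.ones(3) for name in ("E_x", "C_x", "G_x2y_decoder")}
grads = {name: np.full(3, 0.5) for name in params}   # toy gradients

after_d = sgd_step(params, grads, "discriminator")
after_g = sgd_step(params, grads, "generator")
```

In a deep learning framework the same effect is typically obtained by leaving the encoder's parameters out of the generator's optimizer, or by detaching the encoder output in the generator step.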
