3 Disentangled Representation for I2I Translation
two visual domains: $\mathcal{X}\subset\mathbb{R}^{H\times W\times 3}$, $\mathcal{Y}\subset\mathbb{R}^{H\times W\times 3}$
unpaired samples: $x\in\mathcal{X}$, $y\in\mathcal{Y}$
As shown in Fig. 3, the whole framework consists of:
- two content encoders $\left\{ E_\mathcal{X}^c, E_\mathcal{Y}^c \right\}$
- two attribute encoders $\left\{ E_\mathcal{X}^a, E_\mathcal{Y}^a \right\}$
- two generators $\left\{ G_\mathcal{X}, G_\mathcal{Y} \right\}$
- two discriminators $\left\{ D_\mathcal{X}, D_\mathcal{Y} \right\}$
- one content discriminator $D^c$
3.1 Disentangle Content and Attribute Representations
Our approach embeds input images onto a shared content space $\mathcal{C}$, and domain-specific attribute spaces, $\mathcal{A}_\mathcal{X}$ and $\mathcal{A}_\mathcal{Y}$.
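This is a very nice idea: whichever domain an image comes from, its content information is common and domain-independent,
whereas its attributes carry the characteristics of their own domain and are domain-dependent.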
What the four encoders do:
$$
\begin{aligned} &\left \{ z_x^c, z_x^a \right \}=\left \{ E_\mathcal{X}^c(x), E_\mathcal{X}^a(x) \right \}\qquad z_x^c\in\mathcal{C}, z_x^a\in\mathcal{A}_\mathcal{X} \\ &\left \{ z_y^c, z_y^a \right \}=\left \{ E_\mathcal{Y}^c(y), E_\mathcal{Y}^a(y) \right \}\qquad z_y^c\in\mathcal{C}, z_y^a\in\mathcal{A}_\mathcal{Y} \qquad(1) \end{aligned}
$$
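To make Eq. (1) concrete, here is a minimal toy sketch in plain Python. It is not the paper's networks: `encode_content` and `encode_attribute` are hypothetical hand-written stand-ins, and a single function is used for both domains, whereas the framework has a separate encoder per domain. The assumption that a content code keeps the spatial layout while an attribute code is a short vector is only there for illustration.

```python
from statistics import mean

def encode_content(img):
    """Toy stand-in for E^c: per-pixel brightness, keeps the H x W layout."""
    return [[mean(px) for px in row] for row in img]

def encode_attribute(img):
    """Toy stand-in for E^a: one global mean per colour channel."""
    pixels = [px for row in img for px in row]
    return tuple(mean(px[c] for px in pixels) for c in range(3))

# Two tiny 2 x 2 RGB "images", one from each domain (made-up values).
x = [[(0.9, 0.1, 0.1), (0.5, 0.0, 0.1)],
     [(0.3, 0.0, 0.0), (0.6, 0.3, 0.0)]]
y = [[(0.1, 0.2, 0.8), (0.0, 0.1, 0.4)],
     [(0.2, 0.2, 0.9), (0.1, 0.3, 0.7)]]

# Eq. (1): each image is split into a content code and an attribute code.
z_x_c, z_x_a = encode_content(x), encode_attribute(x)
z_y_c, z_y_a = encode_content(y), encode_attribute(y)

print(len(z_x_c), len(z_x_c[0]))  # content code keeps the spatial grid
print(len(z_x_a))                 # attribute code is a 3-vector here
```

Both content codes have the same shape regardless of domain, which is the toy analogue of both living in the shared space $\mathcal{C}$.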
The content encoders $\left\{ E_\mathcal{X}^c, E_\mathcal{Y}^c \right\}$ share their last layer.
The generators $\left\{ G_\mathcal{X}, G_\mathcal{Y} \right\}$ share their first layer.
Through weight sharing, we force the content representation to be mapped onto the same space.
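A minimal sketch of what sharing the last layer means in practice, in plain Python. The `Linear` and `ContentEncoder` classes, the layer sizes, and the ReLU are all assumptions made for illustration; the point is only that both encoders hold a reference to the same final layer object while keeping their own earlier layer.

```python
import random

class Linear:
    """Hypothetical bias-free linear layer on plain Python lists."""
    def __init__(self, n_in, n_out, rng):
        self.w = [[rng.uniform(-1, 1) for _ in range(n_in)]
                  for _ in range(n_out)]

    def __call__(self, v):
        return [sum(wi * vi for wi, vi in zip(row, v)) for row in self.w]

class ContentEncoder:
    """Domain-specific first layer followed by a shared last layer."""
    def __init__(self, private, shared):
        self.private = private  # owned by this encoder only
        self.shared = shared    # same object in both encoders

    def __call__(self, v):
        h = [max(0.0, a) for a in self.private(v)]  # ReLU
        return self.shared(h)

rng = random.Random(0)
shared_last = Linear(4, 2, rng)  # one set of weights, used twice
E_X_c = ContentEncoder(Linear(3, 4, rng), shared_last)
E_Y_c = ContentEncoder(Linear(3, 4, rng), shared_last)

print(E_X_c.shared is E_Y_c.shared)    # True: last layer is shared
print(E_X_c.private is E_Y_c.private)  # False: first layers differ
```

Any update to `shared_last` affects both encoders at once, so their outputs are produced by the same final mapping. The generators would mirror this with the shared layer placed first instead of last.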
Q: It is unclear whether the authors ran a comparison experiment to verify the benefit of this sharing.
To further encourage the content space to be common to both domains, a content discriminator $D^c$ is introduced to distinguish between $\left\{ z_x^c, z_y^c \right\}$, which gives the following content adversarial loss:
$$
L_{adv}^{content}= \qquad(2)
$$
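Note: placing a discriminator on the content space eventually pushes the content distributions of the two domains toward each other, which is what makes the content domain-independent. This is the same idea as in my own ACM MM 2018 paper, which also describes it from an information perspective.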
3.2 Cross-cycle Consistency Loss
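Using the encoders, images $x$ and $y$ are each decomposed into a content component and an attribute component, and the components are then swapped between the two images ("grafting" one image's attribute onto the other's content).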
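The swap can be illustrated with a toy sketch in plain Python. Everything here is an assumption made for illustration rather than the paper's model: `encode` treats the global mean colour as the attribute and the per-pixel residual as the content, `generate` simply adds them back, and the names `u` and `v` for the swapped outputs are chosen here.

```python
from statistics import mean

def encode(img):
    """Toy split: attribute = global mean colour, content = residual."""
    pixels = [px for row in img for px in row]
    attr = tuple(mean(px[c] for px in pixels) for c in range(3))
    content = [[tuple(px[c] - attr[c] for c in range(3)) for px in row]
               for row in img]
    return content, attr

def generate(content, attr):
    """Toy generator: add the attribute back onto the content."""
    return [[tuple(px[c] + attr[c] for c in range(3)) for px in row]
            for row in content]

# Two tiny 2 x 2 RGB "images", one from each domain (made-up values).
x = [[(0.9, 0.1, 0.1), (0.5, 0.0, 0.1)],
     [(0.3, 0.0, 0.0), (0.6, 0.3, 0.0)]]
y = [[(0.1, 0.2, 0.8), (0.0, 0.1, 0.4)],
     [(0.2, 0.2, 0.9), (0.1, 0.3, 0.7)]]

z_x_c, z_x_a = encode(x)
z_y_c, z_y_a = encode(y)

# Swap: content of one image combined with the attribute of the other.
u = generate(z_y_c, z_x_a)  # content of y, attribute of x
v = generate(z_x_c, z_y_a)  # content of x, attribute of y
```

In this toy setting, re-encoding `u` gives back the attribute of `x` and the content of `y` (up to floating-point error), which is the property the swap is meant to have.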