III. ATTRIBUTE GAN (ATTGAN)
Premise: all attributes are binary-valued.
A. Testing Formulation
Denote the input image as $\mathbf{x^a}$, which carries $n$ attributes $\mathbf{a}=\left[ a_1, \cdots, a_n \right]$.
The encoder network $G_{enc}$ encodes $\mathbf{x^a}$ into a latent representation $\mathbf{z}$:

$$\mathbf{z} = G_{enc}(\mathbf{x^a}) \qquad(3)$$
The target attributes are denoted $\mathbf{b}=\left[ b_1, \cdots, b_n \right]$.
The decoder network $G_{dec}$ takes $\mathbf{z}$ and $\mathbf{b}$ as input and generates the image $\mathbf{x^{\hat{b}}}$:

$$\mathbf{x^{\hat{b}}} = G_{dec}(\mathbf{z}, \mathbf{b}) \qquad(4)$$
Combining Eqs. (3) and (4):

$$\mathbf{x^{\hat{b}}} = G_{dec}(G_{enc}(\mathbf{x^a}), \mathbf{b}) \qquad(5)$$
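As a shape-level illustration of Eqs. (3)–(5), the sketch below composes toy linear stand-ins for $G_{enc}$ and $G_{dec}$. The real networks are convolutional; all shapes, names, and the attribute count here are illustrative assumptions, not the paper's code.

```python
import numpy as np

# Toy linear stand-ins for the encoder/decoder (hypothetical shapes):
# a flattened 8x8 "image" (64-d), a 16-d latent, and 4 binary attributes.
rng = np.random.default_rng(0)
W_enc = rng.standard_normal((16, 64))       # image -> latent
W_dec = rng.standard_normal((64, 16 + 4))   # latent + attributes -> image

def G_enc(x_a):
    """Eq. (3): encode image x^a into latent z."""
    return W_enc @ x_a

def G_dec(z, b):
    """Eq. (4): decode z conditioned on target attributes b."""
    return W_dec @ np.concatenate([z, b])

x_a = rng.standard_normal(64)               # input image x^a (flattened)
b = np.array([1.0, 0.0, 1.0, 1.0])          # binary target attributes
x_b_hat = G_dec(G_enc(x_a), b)              # Eq. (5): edited image
print(x_b_hat.shape)                        # (64,)
```

The point is purely structural: editing at test time is a single encode–decode pass in which only the attribute vector changes, not the latent $\mathbf{z}$.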
B. Training Formulation
The whole training process is unsupervised, since the ground truth $\mathbf{x^b}$ is unknown.
Reconstruction Loss
We want to edit only the attribute-relevant parts while keeping all other attributes unchanged, so reconstruction learning is introduced (the paper gives two justifications, both of which feel somewhat forced).
Setting $\mathbf{b}=\mathbf{a}$ yields the reconstructed image $\mathbf{x^{\hat{a}}}$:

$$\mathbf{x^{\hat{a}}} = G_{dec}(\mathbf{z}, \mathbf{a}) \qquad(6)$$
Then $\mathbf{x^{\hat{a}}}$ should closely approximate $\mathbf{x^a}$, so the reconstruction loss for the generator $G$ is defined as

$$\underset{G_{enc},G_{dec}}{\min}\ \mathcal{L}_{rec}=\mathbb{E}_{\mathbf{x^a}\sim p_{data}} \left \| \mathbf{x^a}-\mathbf{x^{\hat{a}}} \right \|_1 \qquad(11)$$
The $\ell_1$ loss is used instead of the $\ell_2$ loss because it is less prone to producing blurry results.
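A minimal sketch of Eq. (11), assuming images are flattened into vectors and the per-image $\ell_1$ norm is averaged over a batch (the batching convention is my assumption):

```python
import numpy as np

def rec_loss(x_a, x_a_hat):
    """L1 reconstruction loss of Eq. (11): per-image L1 norm,
    averaged over the batch dimension."""
    return float(np.mean(np.sum(np.abs(x_a - x_a_hat), axis=-1)))

x_a = np.array([[1.0, 2.0], [3.0, 4.0]])      # batch of 2 "images"
x_a_hat = np.array([[1.0, 1.0], [2.0, 4.0]])  # reconstructions
print(rec_loss(x_a, x_a_hat))                  # 1.0
```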
Attribute Classification Constraint
The generated image $\mathbf{x^{\hat{b}}}$ should actually carry the attributes $\mathbf{b}$, so an attribute classifier $C$ is introduced.
The attribute classification constraint on the generator $G$ is then defined as

$$\underset{G_{enc}, G_{dec}}{\min}\ \mathcal{L}_{cls_g}=\mathbb{E}_{\mathbf{x^a}\sim p_{data}, \mathbf{b}\sim p_{attr}}\left [ \ell_g\left ( \mathbf{x^a}, \mathbf{b} \right ) \right ] \qquad(7)$$
$$\ell_g(\mathbf{x^a}, \mathbf{b})=\sum_{i=1}^{n}-b_i\log C_i\left ( \mathbf{x^{\hat{b}}} \right )-(1-b_i)\log\left ( 1-C_i\left ( \mathbf{x^{\hat{b}}} \right ) \right ) \qquad(8)$$
The attribute classifier $C$ itself is trained with the objective

$$\underset{C}{\min}\ \mathcal{L}_{cls_c}=\mathbb{E}_{\mathbf{x^a}\sim p_{data}}\left [ \ell_r(\mathbf{x^a}, \mathbf{a}) \right ] \qquad(9)$$
$$\ell_r(\mathbf{x^a}, \mathbf{a})=\sum_{i=1}^{n}-a_i\log C_i\left ( \mathbf{x^a} \right )-(1-a_i)\log\left ( 1-C_i\left ( \mathbf{x^a} \right ) \right ) \qquad(10)$$
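Eqs. (8) and (10) are the same sum of per-attribute binary cross-entropies, just evaluated on different images and labels. A minimal sketch, assuming the classifier outputs $C_i$ are already probabilities in $(0,1)$:

```python
import numpy as np

def attr_bce(probs, labels, eps=1e-12):
    """Sum of per-attribute binary cross-entropies, as in Eqs. (8)/(10).
    probs : classifier outputs C_i, each in (0, 1)
    labels: binary attribute vector (b for Eq. 8, a for Eq. 10)"""
    probs = np.clip(probs, eps, 1 - eps)  # guard against log(0)
    return float(np.sum(-labels * np.log(probs)
                        - (1 - labels) * np.log(1 - probs)))

# A maximally uncertain classifier pays log(2) per attribute:
print(attr_bce(np.array([0.5, 0.5]), np.array([1.0, 0.0])))  # ~1.386 = 2*ln 2
```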
Adversarial Loss
The WGAN-GP form of the adversarial loss is used; the objectives of the discriminator $D$ and the generator $G$ are, respectively,
$$\underset{\left \| D \right \|_L\leqslant 1}{\min}\ \mathcal{L}_{adv_{d}}=-\mathbb{E}_{\mathbf{x^a}\sim p_{data}}D(\mathbf{x^a})+\mathbb{E}_{\mathbf{x^a}\sim p_{data},\mathbf{b}\sim p_{attr}}D\left ( \mathbf{x^{\hat{b}}} \right ) \qquad(12)$$
$$\underset{G_{enc},G_{dec}}{\min}\ \mathcal{L}_{adv_g}=-\mathbb{E}_{\mathbf{x^a}\sim p_{data},\mathbf{b}\sim p_{attr}}D\left ( \mathbf{x^{\hat{b}}} \right ) \qquad(13)$$
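Given critic scores on real and edited images, Eqs. (12) and (13) reduce to simple means; the sketch below omits the gradient penalty that WGAN-GP adds to enforce the Lipschitz constraint $\left\| D \right\|_L \leqslant 1$:

```python
import numpy as np

def d_loss(d_real, d_fake):
    """Critic loss of Eq. (12): -E[D(x^a)] + E[D(x^b_hat)].
    (Gradient penalty term omitted for brevity.)"""
    return float(-np.mean(d_real) + np.mean(d_fake))

def g_loss(d_fake):
    """Generator adversarial loss of Eq. (13): -E[D(x^b_hat)]."""
    return float(-np.mean(d_fake))

# Critic scoring real images at 2 and edited images at 1:
print(d_loss([2.0, 2.0], [1.0, 1.0]))  # -1.0
print(g_loss([1.0, 1.0]))              # -1.0
```

Note that, as Wasserstein losses, these are raw critic scores rather than log-probabilities, which is why no sigmoid or log appears.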
Overall Objective
The objective of the generator $G$ is

$$\underset{G_{enc},G_{dec}}{\min}\ \mathcal{L}_{enc,dec}=\lambda_1\mathcal{L}_{rec}+\lambda_2\mathcal{L}_{cls_g}+\mathcal{L}_{adv_g} \qquad(14)$$
The objective of the discriminator $D$ and the attribute classifier $C$ is

$$\underset{D,C}{\min}\ \mathcal{L}_{dis,cls}=\lambda_3\mathcal{L}_{cls_c}+\mathcal{L}_{adv_d} \qquad(15)$$
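Once the individual losses are computed, the two overall objectives in Eqs. (14) and (15) are plain weighted sums. In this sketch the $\lambda$ default values are illustrative placeholders, not the paper's reported settings:

```python
def generator_objective(l_rec, l_cls_g, l_adv_g, lam1=10.0, lam2=1.0):
    """Eq. (14): weighted sum of reconstruction, attribute-classification,
    and adversarial losses for G_enc/G_dec. Lambda defaults are placeholders."""
    return lam1 * l_rec + lam2 * l_cls_g + l_adv_g

def critic_objective(l_cls_c, l_adv_d, lam3=1.0):
    """Eq. (15): weighted sum of classifier and critic losses for D and C."""
    return lam3 * l_cls_c + l_adv_d
```

In practice $D$ and $C$ share most layers and are optimized jointly against Eq. (15), alternating with generator updates against Eq. (14), as is standard for GAN training.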
C. Why are attribute-excluding details preserved?
AttGAN performs multi-task learning: one task is face reconstruction, the other is attribute editing.
The authors argue that the two tasks are highly similar and the transferability gap between them is very small, so the detail-preservation ability learned from the face reconstruction task can be easily transferred to the attribute editing task.
D. Extension for Attribute Style Manipulation
Following references [28] and [26], a set of style controllers $\theta=\left [ \theta_1, \cdots, \theta_i, \cdots, \theta_n \right ]$ is introduced, and the mutual information between the controllers and the output images is maximized to make them highly correlated.
Specifically, as shown in Figure 3, an extra style predictor $Q$ is introduced, and the decoder network $G_{dec}$ additionally takes $\theta$ as input, generating an image $\mathbf{x^{\hat{\theta}\hat{b}}}$ that carries both the target attributes $\mathbf{b}$ and the style specified by $\theta$:

$$\mathbf{x^{\hat{\theta}\hat{b}}}=G_{dec}\left ( G_{enc}(\mathbf{x^a}), \theta, \mathbf{b} \right ) \qquad(16)$$
The mutual information between the style controllers $\theta$ and the generated image $x^*$ is expressed via the variational form

$$I\left ( \theta;x^* \right )=\underset{Q}{\max}\ \mathbb{E}_{\theta\sim p(\theta), x^*\sim p(x^*|\theta)}\left [ \log Q(\theta|x^*) \right ] + const \qquad(17)$$
The generator $G$ therefore gains an additional objective term:

$$\underset{G_{enc}, G_{dec}}{\max}\ I\left ( \theta;\mathbf{x^*} \right ) \qquad(18)$$
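If $Q(\theta|x^*)$ is parameterized as a factorized Gaussian over the continuous controllers (an InfoGAN-style choice; the paper's exact parameterization may differ), the expectation in Eq. (17) can be Monte-Carlo estimated from $Q$'s predicted mean and log-std:

```python
import numpy as np

def mi_lower_bound_term(theta, q_mean, q_logstd):
    """Monte-Carlo estimate of E[log Q(theta | x*)] in Eq. (17),
    assuming Q predicts a factorized Gaussian over the controllers.
    theta    : sampled controllers, shape (batch, n)
    q_mean   : Q's predicted means for the generated images, shape (batch, n)
    q_logstd : Q's predicted log standard deviations, shape (batch, n)"""
    var = np.exp(2 * q_logstd)
    log_q = -0.5 * (np.log(2 * np.pi) + 2 * q_logstd
                    + (theta - q_mean) ** 2 / var)  # per-dim Gaussian log-pdf
    return float(np.mean(np.sum(log_q, axis=-1)))   # sum dims, average batch
```

Maximizing this term with respect to both $Q$ and the generator pushes the output images to encode $\theta$ recoverably, which is what ties each controller to a visible style variation. With unit variance, the $(\theta - q_{mean})^2$ term makes this equivalent to minimizing a squared error between the sampled controllers and $Q$'s predictions, up to constants.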