2. Method
The goal is to learn an unconditional generative model that captures the internal statistics of a single training image $x$.
Unlike texture generation, the images considered here are general natural images.
2.1. Multi-scale architecture
For an input image $x$, build a pyramid $\left \{ x_0,\cdots,x_N \right \}$ with a matching set of generators $\left \{ G_0,\cdots,G_N \right \}$. Here $x_n$ is $x$ downsampled by a factor of $r^n$, where $r\gt1$ is a hyperparameter, and each $G_n$ is paired with its own discriminator $D_n$.
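The spatial sizes implied by the definition of $x_n$ can be sketched as follows. This is a minimal illustration rather than the paper's code: the function name `pyramid_sizes`, the rounding rule, and the example value $r=2$ are assumptions (the notes only state $r\gt1$).

```python
def pyramid_sizes(height, width, r, num_scales):
    """Return [(h_0, w_0), ..., (h_N, w_N)] with N = num_scales.

    Scale n has the input size divided by r**n. Rounding to the nearest
    integer and clamping to at least 1 pixel are assumptions made here.
    """
    if r <= 1:
        raise ValueError("r must be greater than 1")
    sizes = []
    for n in range(num_scales + 1):
        factor = r ** n
        h = max(1, round(height / factor))
        w = max(1, round(width / factor))
        sizes.append((h, w))
    return sizes


# n = 0 is the finest scale (the original size), n = N the coarsest.
sizes = pyramid_sizes(256, 256, r=2, num_scales=3)
print(sizes)  # [(256, 256), (128, 128), (64, 64), (32, 32)]
```

A value of $r$ closer to 1 gives more scales between the same coarsest and finest sizes, so each generator has a smaller gap to bridge.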
Training starts at the coarsest scale, $x_N$, where $G_N$ maps Gaussian white noise $z_N$ to an image $\tilde{x}_N$:
$$\tilde{x}_N=G_N(z_N) \qquad(1)$$
$\tilde{x}_N$ captures the general layout of the image and the global structure of its objects; the subsequent generators $G_n$ ($n\lt N$) progressively add finer details.
As shown in Figure 5, $G_n$ takes two inputs: (1) Gaussian white noise $z_n$, and (2) the upsampled version of the image generated at the previous, coarser scale, $\left ( \tilde{x}_{n+1} \right )\uparrow^r$:
$$\tilde{x}_n=G_n\left ( z_n, \left ( \tilde{x}_{n+1} \right )\uparrow^r \right ), \quad n\lt N \qquad(2)$$
More specifically, $G_n$ performs a residual operation:
$$\tilde{x}_n=\left ( \tilde{x}_{n+1} \right )\uparrow^r+\psi_n\left ( z_n+\left ( \tilde{x}_{n+1} \right )\uparrow^r \right ) \qquad(3)$$
where $\psi_n$ is a ConvNet made of 5 blocks, each of the form Conv(3x3)-BatchNorm-LeakyReLU.
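The coarse-to-fine chain of Eqs. (1)-(3) can be sketched with NumPy as below. This is an illustration under stated assumptions, not the paper's implementation: `psi` is a toy stand-in for the 5-block ConvNet, nearest-neighbour upsampling replaces whatever interpolation is actually used, images are single-channel, and feeding an all-zero "previous image" to $G_N$ is one assumed way of reusing the residual form at the coarsest scale. All function names are hypothetical.

```python
import numpy as np


def upsample_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an (H, W) array to (out_h, out_w)."""
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]


def make_generator(psi):
    """Wrap a refinement function psi into the residual form of Eq. (3)."""
    def G(z, prev_up):
        return prev_up + psi(z + prev_up)
    return G


def sample(generators, sizes, rng):
    """Generate one image; index n = 0 is the finest scale, n = N the coarsest."""
    N = len(generators) - 1
    x = None
    for n in range(N, -1, -1):
        h, w = sizes[n]
        z = rng.normal(0.0, 1.0, size=(h, w))  # Gaussian white noise z_n
        if n == N:
            prev_up = np.zeros((h, w))  # assumption: no coarser image exists
        else:
            prev_up = upsample_nearest(x, h, w)
        x = generators[n](z, prev_up)
    return x


# Toy stand-in for psi_n; the real psi_n is a trained ConvNet.
toy_psi = lambda u: 0.1 * np.tanh(u)

sizes = [(32, 32), (16, 16), (8, 8)]  # n = 0, 1, 2
generators = [make_generator(toy_psi) for _ in sizes]
x0 = sample(generators, sizes, np.random.default_rng(0))
print(x0.shape)  # (32, 32)
```

Because each $G_n$ only adds a residual to the upsampled coarser image, the layout fixed at scale $N$ carries through to the final sample, and finer scales can only refine it.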
2.2. Training
Training proceeds from the coarsest scale to the finest. Once a GAN has been trained, it is kept fixed.
For the $n$-th GAN, the loss consists of an adversarial term and a reconstruction term:
$$\underset{G_n}{\min}\ \underset{D_n}{\max}\ \mathcal{L}_{adv}(G_n,D_n)+\alpha\mathcal{L}_{rec}(G_n) \qquad(4)$$
Adversarial loss
The WGAN-GP loss is used.
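These notes only name the loss. For reference, the standard WGAN-GP objective, written in the max-over-$D_n$ convention of Eq. (4), is usually stated as below. The penalty weight $\lambda$ and the interpolated sample $\hat{x}$ (drawn uniformly along straight lines between a real and a generated sample) come from the general WGAN-GP formulation and are not defined in these notes; how the discriminator output is aggregated over the image is likewise not specified here.

$$\mathcal{L}_{adv}(G_n,D_n)=\mathbb{E}\left [ D_n(x_n) \right ]-\mathbb{E}\left [ D_n(\tilde{x}_n) \right ]-\lambda\,\mathbb{E}_{\hat{x}}\left [ \left ( \left \| \nabla_{\hat{x}}D_n(\hat{x}) \right \|_2-1 \right )^2 \right ]$$

$D_n$ maximizes this quantity, while $G_n$ affects only the second term and therefore tries to raise $D_n(\tilde{x}_n)$.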
Reconstruction loss
There must exist a specific set of noise maps that reconstructs the original image $x$.
To that end, a set $\left \{ z_N^{rec},z_{N-1}^{rec},\cdots,z_0^{rec} \right \}=\left \{ z^*,0,\cdots,0 \right \}$ is fixed in advance, where $z^*$ is a noise map that is kept fixed; feeding it through the generators yields $\left \{ \tilde{x}_N^{rec},\tilde{x}_{N-1}^{rec},\cdots,\tilde{x}_0^{rec} \right \}$.
Then, for $n\lt N$:
$$\mathcal{L}_{rec}=\left \| G_n\left ( 0,\left ( \tilde{x}_{n+1}^{rec} \right )\uparrow^r \right ) -x_n\right \|^2 \qquad(5)$$
For $n=N$, $\mathcal{L}_{rec}=\left \| G_N(z^*)-x_N \right \|^2$.
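The reconstruction term at every scale can be sketched as follows, using the fixed noise set $\{z^*,0,\cdots,0\}$. This is a hedged illustration: the generators are toy callables with an assumed signature `G(z, prev_up)`, upsampling is nearest-neighbour, and the squared norm is taken as a plain sum of squares (an implementation might use a mean instead). In actual training only $G_n$ is updated by the loss at scale $n$, since coarser generators are already fixed; the function below simply evaluates all scales in one pass.

```python
import numpy as np


def upsample_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of an (H, W) array to (out_h, out_w)."""
    h, w = img.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]


def reconstruction_losses(generators, pyramid, z_star):
    """Return {n: L_rec at scale n}; n = N (coarsest) uses z_star, n < N uses zero noise."""
    N = len(generators) - 1
    losses = {}
    # n = N: || G_N(z*) - x_N ||^2
    x_rec = generators[N](z_star, None)
    losses[N] = float(np.sum((x_rec - pyramid[N]) ** 2))
    # n < N: || G_n(0, upsampled x_rec_{n+1}) - x_n ||^2, as in Eq. (5)
    for n in range(N - 1, -1, -1):
        h, w = pyramid[n].shape
        prev_up = upsample_nearest(x_rec, h, w)
        x_rec = generators[n](np.zeros((h, w)), prev_up)
        losses[n] = float(np.sum((x_rec - pyramid[n]) ** 2))
    return losses


# Toy setup (hypothetical): G_1 returns its noise input, G_0 adds noise to the
# upsampled image, and the finer image is exactly the upsampled coarser one.
x1 = np.array([[1.0, 2.0], [3.0, 4.0]])
x0 = upsample_nearest(x1, 4, 4)
generators = [lambda z, prev: prev + z, lambda z, prev: z]
losses = reconstruction_losses(generators, [x0, x1], z_star=x1)
print(losses)  # {1: 0.0, 0: 0.0}
```

With zero noise at every scale below $N$, the reconstruction path is deterministic, so a perfect fit at each scale means the chain reproduces $x$ exactly.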