StyleGAN Generator Code Walkthrough
PART 2 of the SeFa study:
a code walkthrough of the StyleGAN generator part.
- StyleGAN paper explanation: [reference](StyleGAN-基于样式的生成对抗网络(论文阅读总结) - 知乎 (zhihu.com))
- The principle and role of each module in the code: [reference](StyleGAN 和 StyleGAN2 的深度理解 - 知乎 (zhihu.com))
- StyleGAN2 adds some parts on top of this; StyleGAN's synthesis part follows PGGAN
- The notes below mainly trace the flow of the code
The code splits into two main parts:
- Mapping network
- Synthesis network

The main building block of the mapping network is `class DenseBlock()`.
The main building block of the synthesis network is `class ConvBlock()`, which adapts its behavior according to the input argument `position`.
They are finally assembled into `class StyleGANGenerator(nn.Module)`.
StyleGAN part
Resolution ranges over $2^3 \sim 2^{10}$.
The latent space is mapped: $z \in Z\ (dim=512) \rightarrow w \in W\ (dim=512)$
$num\_layers = \log_2(\frac{2 \cdot res}{4}) \times 2$
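The layer-count formula can be checked with a tiny helper (the function name is mine, not from the source): two synthesis layers per resolution block gives the familiar 18 layers at 1024×1024.

```python
import math

def num_layers(res):
    # Two synthesis layers per resolution block, from 4x4 up to `res`.
    return int(math.log2(2 * res / 4)) * 2

layers_4 = num_layers(4)        # smallest resolution
layers_1024 = num_layers(1024)  # largest resolution
```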
- Input $z$: a 2-D tensor of shape $[z.shape[0], 512]$
- It first passes through the mapping layers, yielding the dict {'z': z, 'label': None, 'w': w},
  where $z = pixel\_norm(z)$,
  then $z$ goes through 8 FC layers (Linear + ReLU, dimension unchanged) to become $w$.
  Input and output channels stay at 512, so $w: [z.shape[0], 512]$
```python
self.add_module(f'dense{i}',
                DenseBlock(in_channels=in_channels,
                           out_channels=out_channels,
                           use_wscale=self.use_wscale,  # True
                           lr_mul=self.lr_mul))         # 0.01
```
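A rough sketch of the mapping path described above, using plain `nn.Linear` in place of DenseBlock's wscale/lr_mul machinery (class and function names here are mine):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pixel_norm(x, eps=1e-8):
    # Normalize each sample's feature vector to unit average magnitude.
    return x * torch.rsqrt(torch.mean(x ** 2, dim=1, keepdim=True) + eps)

class MappingSketch(nn.Module):
    """Hypothetical stand-in for the 8 DenseBlock layers (no wscale / lr_mul)."""
    def __init__(self, dim=512, num_layers=8):
        super().__init__()
        self.fcs = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_layers))

    def forward(self, z):
        w = pixel_norm(z)          # z is pixel-normalized first
        for fc in self.fcs:
            w = F.relu(fc(w))      # Linear + ReLU, dimension unchanged
        return w

z = torch.randn(4, 512)
w = MappingSketch()(z)
```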
Two optional tricks:
- truncation (with self.training=True the statistics are learned online; the truncation trick is then applied to the $w$ obtained at each layer)
  PS: the all_gather function used here comes from sync_op.py; it gathers the $w$ computed across distributed workers and averages them into $\bar w$
- style_mix (with probability 0.9, after a randomly chosen mapping layer, the $w$ produced by a fresh $z$ replaces the original $w$ for all subsequent layers)
One truncation layer:
- $layer\_num = \log_2(\frac{2 \cdot resolution}{4}) \times 2$; a repeat reshapes the mapping output into $w: [w.shape[0], layer\_num, 512]$
A constant $4 \times 4 \times 512$ tensor then passes through the synthesis layers, with noise added, to generate the image.

Synthesis: layer by layer
PS: the function `def get_nf(self, res): return min(4**2 * 2**10 // res, 512)` gives the number of convolutional feature maps at each resolution.
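Evaluating get_nf over the resolutions used here makes the channel schedule concrete, and shows why the final block works with 16 channels:

```python
def get_nf(res):
    # Number of feature maps at each resolution, capped at 512.
    return min(4 ** 2 * 2 ** 10 // res, 512)

nf_per_res = [get_nf(2 ** i) for i in range(2, 11)]  # res = 4 ... 1024
```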
Here lod (level of detail) starts from 0 and adds detail progressively.
The network structure is then (per-layer computation is detailed in the ConvBlock section below):
- Layer 0 (layer0): initialize x and bring the channel count to 512
- Loop over the following blocks:
  - Even layers (absent in the 0th block): channels are halved (via conv2d) and the resolution is doubled (via upsample), then noise and style code are applied, i.e. B + AdaIN + A
  - Odd layers: keep the resolution and only do B + AdaIN + A
- Finally the 16 channels are converted to 3 channels to produce the image
Layer 0: res = init_res = $2^2$, get_nf(res) = 512
```python
# 'Const'
self.add_module(layer_name,  # layer0
                ConvBlock(in_channels=self.get_nf(res),
                          out_channels=self.get_nf(res),
                          resolution=self.init_res,  # 4
                          w_space_dim=self.w_space_dim,
                          position='const_init',
                          use_wscale=self.use_wscale))
```
Even layers: res = $2^3, 2^4, \dots, 2^{10}$, layer_name = layer2, 4, ..., 16; the resolution doubles at each of these layers.
```python
# 'Conv0_up'
self.add_module(layer_name,
                ConvBlock(in_channels=self.get_nf(res // 2),
                          out_channels=self.get_nf(res),
                          resolution=res,
                          w_space_dim=self.w_space_dim,
                          upsample=True,
                          fused_scale=fused_scale,
                          use_wscale=self.use_wscale))
```
Odd layers: res = $2^2, 2^3, \dots, 2^{10}$, layer1, 3, ..., 17; convolve again while keeping the resolution.
```python
# the first of these is named 'Conv', the later ones 'Conv1'
self.add_module(layer_name,
                ConvBlock(in_channels=self.get_nf(res),
                          out_channels=self.get_nf(res),
                          resolution=res,
                          w_space_dim=self.w_space_dim,
                          use_wscale=self.use_wscale))
```
Output layers: output0, 1, 2, ..., 8
```python
self.add_module(f'output{block_idx}',
                ConvBlock(in_channels=self.get_nf(res),
                          out_channels=self.image_channels,  # 3
                          resolution=res,
                          w_space_dim=self.w_space_dim,
                          position='last',
                          kernel_size=1,
                          padding=0,
                          use_wscale=self.use_wscale,
                          wscale_gain=1.0,
                          activation_type='linear'))
```
DenseBlock part (main component of the mapping network)
Each layer applies wscale $= \sqrt{\frac{2}{kernel\_size \times kernel\_size \times channels}}$ to balance the effect of parameter magnitude in the layer.
- Input: x is reshaped to the format $[x.shape[0], -1]$
- Pass through F.linear
- ReLU activation
- Output: $[x.shape[0], out\_channels]$
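The steps above can be sketched as follows; the class name, default lr_mul, and the exact placement of the runtime rescaling are assumptions based on these notes, not a copy of the repo code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseBlockSketch(nn.Module):
    """Sketch of DenseBlock: equalized-lr linear (wscale) + ReLU."""
    def __init__(self, in_channels, out_channels, lr_mul=0.01):
        super().__init__()
        # Store weights at a larger scale and rescale at runtime (equalized learning rate).
        self.weight = nn.Parameter(torch.randn(out_channels, in_channels) / lr_mul)
        self.bias = nn.Parameter(torch.zeros(out_channels))
        # wscale = sqrt(2 / fan_in); kernel_size is effectively 1 for a dense layer.
        self.scale = (2.0 / in_channels) ** 0.5 * lr_mul
        self.lr_mul = lr_mul

    def forward(self, x):
        x = x.view(x.shape[0], -1)  # flatten to [N, -1]
        x = F.linear(x, self.weight * self.scale, self.bias * self.lr_mul)
        return F.relu(x)            # activation

x = torch.randn(4, 512)
y = DenseBlockSketch(512, 512)(x)
```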
Truncation part (trick 1)
The principle is explained clearly in the reference links at the top.
class TruncationModule(nn.Module):
its forward reshapes the data into the form $[w.shape[0], num\_layer, 512]$.
What it does is shrink the $w$ values of the first trunc_layers layers toward the center by the factor trunc_psi; the mean $\bar w$ itself is not recomputed here.
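A minimal functional sketch of this truncation step, assuming the argument names (`trunc_psi`, `trunc_layers`) and a precomputed mean `w_avg`:

```python
import torch

def truncate(w, w_avg, num_layers=18, trunc_psi=0.7, trunc_layers=8):
    """Sketch of TruncationModule.forward; names and defaults are assumptions."""
    wp = w.unsqueeze(1).repeat(1, num_layers, 1)  # [N, num_layers, 512]
    coefs = torch.ones(1, num_layers, 1)
    coefs[:, :trunc_layers] = trunc_psi           # only the early layers are truncated
    return w_avg + (wp - w_avg) * coefs           # shrink toward the mean by psi

w = torch.randn(2, 512)
wp = truncate(w, w_avg=torch.zeros(512))
```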
ConvBlock part (main component of the synthesis network)
Each layer applies wscale $= \sqrt{\frac{2}{kernel\_size \times kernel\_size \times channels}}$ to balance the effect of parameter magnitude in the convolutions.
parameter: $weight.shape = [out\_channels, in\_channels, ker, ker]$
The convolutions here use the F.conv2d interface (see its documentation).
position='const_init': B + AdaIN + A
- Initialize x as $ones[w.shape[0], 512, 4, 4]$
- Apply noise, bias, ReLU, and pixel_norm to x
- style(x, w), returning x and style
position=None, upsample=True:
- out_channels = get_nf(res) (half of in_channels once past the 512 cap)
- upsample doubles the resolution: $x: [x.shape[0], in\_channels, 2 \cdot res, 2 \cdot res]$
- $conv2d: x \circ weight([out\_channels, in\_channels, ker=3, ker=3]) \rightarrow [x.shape[0], out\_channels, 2 \cdot res, 2 \cdot res]$
- then blur: the same convolution is applied once to all feature maps
- apply noise, bias, ReLU, pixel_norm to x (B + AdaIN + A)
- style(x, w), returning x and style
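The shape bookkeeping of this upsample-then-conv path can be traced with random tensors; the concrete sizes and the nearest-neighbour mode are assumptions for illustration (blur is omitted since it keeps the shape):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 512, 8, 8)                        # [N, in_channels, res, res]
weight = torch.randn(256, 512, 3, 3)                 # [out_channels, in_channels, 3, 3]
x = F.interpolate(x, scale_factor=2, mode='nearest') # res -> 2*res
x = F.conv2d(x, weight, padding=1)                   # channels change, resolution kept
```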
position=None, upsample=None:
- same as the case above minus the upsample, with out_channels = in_channels
position='last':
- upsample = nn.Identity()
- $conv2d: x \circ weight([out\_channels, in\_channels, ker=1, ker=1]) \rightarrow [x.shape[0], out\_channels, res, res]$
- x + bias, return x
-
class StyleModLayer
This layer fuses the $w$ obtained from the mapping layers into the image features.
- A: first a Linear maps $w: [w.shape[0], 512] \rightarrow [w.shape[0], out\_channels \times 2]$
- The result is split along $out\_channels$ into two parts $y_1, y_2: [w.shape[0], 1, out\_channels, 1, 1]$
- x is scaled and shifted: $x = x \cdot (1 + y_1) + y_2$
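The three steps above can be sketched like this, with plain `nn.Linear` standing in for the wscale version (the class name is mine):

```python
import torch
import torch.nn as nn

class StyleModSketch(nn.Module):
    """Sketch of StyleModLayer (the 'A' branch)."""
    def __init__(self, w_space_dim=512, out_channels=512):
        super().__init__()
        self.fc = nn.Linear(w_space_dim, out_channels * 2)

    def forward(self, x, w):
        style = self.fc(w)                           # [N, 2*out_channels]
        style = style.view(-1, 2, x.shape[1], 1, 1)  # split into y1 (scale), y2 (shift)
        y1, y2 = style[:, 0], style[:, 1]
        return x * (y1 + 1) + y2, style              # scale and shift the features

x = torch.randn(2, 512, 4, 4)
w = torch.randn(2, 512)
out, style = StyleModSketch()(x, w)
```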
class BlurLayer
- The BlurLayer part initializes a $3 \times 3$ kernel and broadcasts it over the channels, giving $kernel.shape = [channels, 1, 3, 3]$
- It is passed to class Blur, whose forward implements conv2d: $x: [x.shape[0], channels, res, res] \circ kernel \rightarrow y: [x.shape[0], channels, res, res]$, and whose backward computes the gradient (see the torch.autograd.Function usage reference)
- The output y uses the same kernel for every channel
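This per-channel (depthwise) blur can be written directly with `F.conv2d` and `groups`; the [1, 2, 1] separable kernel is an assumption (the standard StyleGAN choice), and autograd handles the backward automatically in this sketch:

```python
import torch
import torch.nn.functional as F

def blur(x, kernel=(1.0, 2.0, 1.0)):
    """Depthwise 3x3 blur: the same normalized kernel is applied to every channel."""
    k = torch.tensor(kernel)
    k = k[:, None] * k[None, :]                  # outer product -> 3x3
    k = k / k.sum()                              # normalize to preserve brightness
    channels = x.shape[1]
    k = k.view(1, 1, 3, 3).repeat(channels, 1, 1, 1)   # [channels, 1, 3, 3]
    return F.conv2d(x, k, padding=1, groups=channels)  # resolution unchanged

x = torch.randn(1, 16, 8, 8)
y = blur(x)
c = blur(torch.ones(1, 1, 5, 5))  # interior of a constant image stays constant
```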
class NoiseApplyingLayer
- Input: $x = [x.shape[0], channels, res, res]$
- $Noise: randn(1, 1, res, res) \cdot weight(1, channels, 1, 1)$ # the same noise map for every sample
- parameter: $weight$
- Output: $x + noise$
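A minimal sketch of this layer (the 'B' branch); the zero initialization of the learned scale and the class name are assumptions:

```python
import torch
import torch.nn as nn

class NoiseSketch(nn.Module):
    """Sketch of NoiseApplyingLayer: one learned per-channel scale for the noise."""
    def __init__(self, channels, res):
        super().__init__()
        self.res = res
        self.weight = nn.Parameter(torch.zeros(1, channels, 1, 1))  # learned scale

    def forward(self, x):
        noise = torch.randn(1, 1, self.res, self.res)  # one map, shared by all samples
        return x + noise * self.weight

x = torch.randn(2, 512, 4, 4)
out = NoiseSketch(512, 4)(x)
```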
class UpsamplingLayer
- Input x: [x.shape[0], channels, res, res]
- interpolate upsampling -> [x.shape[0], channels, 2*res, 2*res]
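The upsampling is one call to `F.interpolate`; the 'nearest' mode is an assumption (the function also supports 'bilinear', etc.):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 512, 4, 4)
y = F.interpolate(x, scale_factor=2, mode='nearest')  # doubles the spatial resolution
```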