StarGAN reproduction notes and problems encountered

Ellie进化中的程序猿

Last modified 2022-07-13 16:54:04

Category: close readings of machine learning and deep learning articles. Tags: artificial intelligence, python, algorithms

First published 2022-06-29 10:34:48

Original link: https://blog.csdn.net/weixin_41770169/article/details/81033806

StarGAN v2 paper walkthrough
paper: https://arxiv.org/abs/1912.01865
code: https://github.com/clovaai/stargan-v2
Goals: 1) diversity of generated images
2) scalability over multiple domains.
Problem

An ideal image-to-image translation method should be able to
synthesize images considering the diverse styles in each domain.
However, designing and learning such models become complicated as
there can be arbitrarily large number of styles and domains in the
dataset.

  When there are many styles and domains, too many models have to be trained, which makes the problem complicated.
  Example:

having K domains, these methods require to train K(K-1) generators to handle translations between each and every domain, limiting their practical usage.
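
To make the K(K-1) figure concrete, here is a small sketch that counts the generators a pairwise approach would need versus the single generator StarGAN v2 aims for. The function name is my own and only illustrates the arithmetic in the quote above.

```python
def pairwise_generators(num_domains):
    """Generators needed when every ordered pair of domains gets its own model."""
    return num_domains * (num_domains - 1)


# Compare against the single shared generator that StarGAN v2 trains.
for k in (2, 3, 10):
    print(f"K={k}: pairwise={pairwise_generators(k)}, single-generator=1")
```

The count grows quadratically with the number of domains, which is why a single conditional generator scales better.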

Solution

our goal is to train a single generator G that can generate diverse images of each domain y that corresponds to the image x.

Train a single generator that, for an input image x, can produce the corresponding image in every domain y.

Network architecture
generator G
Input: an image x and a style code s, which is provided by the mapping network F or the style encoder E
Output: the image G(x, s). s is one particular style of domain y and x is the input image; G generates x rendered in style s.

mapping network F (for the new domain of x)

The mapping network learns to transform random Gaussian noise into a style code
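Input: a latent code z (dimension 16) and a domain y
Output: s = F_y(z). F is a multi-branch MLP, with one output branch per domain, so it can reflect the different styles of all domains.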
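
The multi-branch idea can be sketched in PyTorch roughly as follows. This is a simplified illustration, not the official implementation: the class name, the layer count, `hidden_dim=512` and `style_dim=64` are my assumptions; only the 16-dimensional latent code comes from the notes above.

```python
import torch
import torch.nn as nn


class MappingNetworkSketch(nn.Module):
    """Shared MLP trunk followed by one output branch per domain (assumed layout)."""

    def __init__(self, latent_dim=16, style_dim=64, num_domains=2, hidden_dim=512):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # One linear head per domain; each head produces that domain's style code.
        self.branches = nn.ModuleList(
            [nn.Linear(hidden_dim, style_dim) for _ in range(num_domains)]
        )

    def forward(self, z, y):
        h = self.shared(z)
        # Stack every branch output: (batch, num_domains, style_dim).
        all_styles = torch.stack([branch(h) for branch in self.branches], dim=1)
        # Keep only the branch that matches each sample's domain label y.
        idx = torch.arange(y.size(0), device=y.device)
        return all_styles[idx, y]


mapping = MappingNetworkSketch()
z = torch.randn(4, 16)            # batch of latent codes
y = torch.tensor([0, 1, 1, 0])    # target domain of each sample
s = mapping(z, y)                 # style codes, one per sample
print(s.shape)
```

Every branch is computed and then one row is selected per sample, which keeps the code short; the style encoder E can select its branch in the same way.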

style encoder E (for the domain that x belongs to)

the encoder learns to extract the style code from a given reference image.

Input: an image x and its corresponding domain y; reference images are used as x

Output: s = E_y(x)

Considering multiple domains, both modules have multiple output branches, each
of which provides style codes for a specific domain

discriminator

The discriminator distinguishes between real and fake images from multiple domains. Note that all modules except the generator contain multiple output branches, one of which is selected when training the corresponding domain

loss
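Adversarial loss: x is the image, y its domain, D the discriminator. G tries to make this loss as small as possible, and D tries to make it as large as possible.
In plain words: D pushing the objective up means the discriminator works really well, and even against such a strong discriminator the images from my generator G should still give a small loss.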
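
For reference, the adversarial loss as I read it in the StarGAN v2 paper can be written as below. Here the target domain is \tilde{y}, the target style code is \tilde{s} = F_{\tilde{y}}(z), and D_y denotes the output branch of D for domain y.

```latex
\mathcal{L}_{adv} =
  \mathbb{E}_{x,y}\left[\log D_y(x)\right]
  + \mathbb{E}_{x,\tilde{y},z}\left[\log\left(1 - D_{\tilde{y}}\big(G(x,\tilde{s})\big)\right)\right],
\qquad \tilde{s} = F_{\tilde{y}}(z)
```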

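Style reconstruction loss:
a comparison between the style code s produced by the mapping network and the style code that the style encoder recovers, in the target domain, from the generated image G(x, s). The smaller the better.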
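
Written out, following the StarGAN v2 paper as I understand it, the style reconstruction loss is an L1 distance between the target style code \tilde{s} = F_{\tilde{y}}(z) and the code that E extracts from the generated image:

```latex
\mathcal{L}_{sty} =
  \mathbb{E}_{x,\tilde{y},z}\left[\left\lVert \tilde{s} - E_{\tilde{y}}\big(G(x,\tilde{s})\big) \right\rVert_1\right]
```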
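Cycle loss: checks that changing the style does not change anything else. If the generated image is translated back using the style code of the original image, the result should be x again, so the style-independent content of x is preserved. The smaller the better.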
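
In formula form, again following the paper as I read it, the cycle consistency loss translates x to the target style \tilde{s} and back with the original style code \hat{s} = E_y(x):

```latex
\mathcal{L}_{cyc} =
  \mathbb{E}_{x,y,\tilde{y},z}\left[\left\lVert x - G\big(G(x,\tilde{s}),\hat{s}\big) \right\rVert_1\right],
\qquad \hat{s} = E_y(x)
```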

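Notation: s with a hat is the style code that the style encoder E estimates from the input image x in its own domain; s with a tilde is the target style code, which during training is sampled through the mapping network F.
Style diversification loss: the bigger the difference between the images generated from two different style codes, the better, so this term is subtracted in the total loss.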
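
As a sketch of how the terms fit together, this is the diversity term and the full objective as I read them in the StarGAN v2 paper. The \lambda weights are hyperparameters, and \tilde{s}_1, \tilde{s}_2 come from two latent codes z_1, z_2 through the mapping network.

```latex
\mathcal{L}_{ds} =
  \mathbb{E}_{x,\tilde{y},z_1,z_2}\left[\left\lVert G(x,\tilde{s}_1) - G(x,\tilde{s}_2) \right\rVert_1\right],
\qquad \tilde{s}_i = F_{\tilde{y}}(z_i)

\min_{G,F,E}\;\max_{D}\;
  \mathcal{L}_{adv}
  + \lambda_{sty}\,\mathcal{L}_{sty}
  - \lambda_{ds}\,\mathcal{L}_{ds}
  + \lambda_{cyc}\,\mathcal{L}_{cyc}
```

The minus sign in front of \mathcal{L}_{ds} is what turns "larger difference is better" into something the generator can minimize.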

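That's the update for now.
What follows are the problems I ran into while reproducing it, plus the new things I learned along the way.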

nn.DataParallel is used for multi-GPU acceleration
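
A minimal sketch of wrapping a model in nn.DataParallel, assuming a toy nn.Linear model rather than the actual StarGAN v2 networks. It only wraps the model when more than one GPU is visible, so it also runs on a single GPU or on CPU.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy stand-in model; the parameters must live on the first visible device.
model = nn.Linear(16, 64).to(device)

# DataParallel splits each batch along dim 0 across the visible GPUs
# and gathers the outputs back on the first one.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

x = torch.randn(8, 16, device=device)
out = model(x)
print(out.shape, "GPUs visible:", torch.cuda.device_count())
```

Which GPUs are visible is controlled from outside the script, for example with the CUDA_VISIBLE_DEVICES environment variable, and printing torch.cuda.device_count() is a quick way to check what the process actually sees.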
https://zhuanlan.zhihu.com/p/102697821

In the mapping network:
nn.Linear
https://blog.csdn.net/qq_42079689/article/details/102873766
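nn.Linear() defines a fully connected layer in the network. Note that in 2D image-processing tasks the input and output of a fully connected layer are usually 2D tensors, typically of shape [batch_size, size], unlike convolutional layers, which require 4D tensors as input and output.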
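
A short shape check makes the difference concrete. The sizes below are arbitrary values chosen for illustration.

```python
import torch
import torch.nn as nn

# Fully connected layer: 2-D input [batch_size, in_features].
fc = nn.Linear(in_features=16, out_features=64)
x = torch.randn(8, 16)
out = fc(x)                             # -> [8, 64]

# Convolutional features are 4-D [batch, channels, height, width];
# flatten everything except the batch dimension before a linear layer.
feat = torch.randn(8, 32, 4, 4)
flat = feat.flatten(start_dim=1)        # -> [8, 32 * 4 * 4]
head = nn.Linear(32 * 4 * 4, 10)
logits = head(flat)                     # -> [8, 10]

print(out.shape, flat.shape, logits.shape)
```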


Problem: on the server there are no src and ref images.
This is because on the server the folder that is supposed to hold them is empty.
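
A quick check like the one below would have caught this before launching the run. It is a sketch using only the standard library; the function names and the two example paths are hypothetical, so substitute whatever src and ref directories your command actually points at.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def count_images(root):
    """Count image files under root (recursively); 0 if the folder is missing."""
    root = Path(root)
    if not root.is_dir():
        return 0
    return sum(
        1 for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def empty_or_missing(*dirs):
    """Return the directories that are missing or contain no images."""
    return [str(d) for d in dirs if count_images(d) == 0]


# Hypothetical paths, for illustration only.
problems = empty_or_missing("data/example/src", "data/example/ref")
if problems:
    print("No images found in:", problems)
```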

OSError: image file is truncated: how to approach and fix it.
https://blog.csdn.net/weixin_41770169/article/details/81033806
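
The workaround I know for this error is Pillow's ImageFile.LOAD_TRUNCATED_IMAGES flag, which tells PIL to decode as much of a truncated file as it can instead of raising. A minimal sketch, assuming the images are loaded through PIL; the helper name load_rgb is my own.

```python
from PIL import Image, ImageFile

# Tolerate truncated files: PIL decodes what is there instead of raising OSError.
# Set this once, before any image is loaded.
ImageFile.LOAD_TRUNCATED_IMAGES = True


def load_rgb(path):
    """Open an image file and return it as an RGB PIL image."""
    with Image.open(path) as img:
        return img.convert("RGB")
```

This only hides the symptom. If the file was cut off during upload to the server, copying it again is the cleaner fix.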
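conda_device (the CUDA device selection): several GPUs cannot be used together. Strangely, three fail but two are okay. Why?
My guess is that it is because nn.DataParallel is used: the copy to the first card does not finish in time, so the data never gets read in.
Update: two cards no longer work either 😂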