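[Paper Reading] GAN Inversion: A Survey. Key Sections Translated, with Reading Notes

A person's significance lies not in what they have achieved, but in what they aspire to achieve.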

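Preface


This post is unfinished. If you are interested, please bookmark it.

I have gradually come to realize that translating this paper is a foolish exercise: I can already read it, so translating it again is simply a waste of time.
From Section 2 onward I record only the parts that matter.

GAN research has been booming in recent years. This survey, which cites 240 papers, takes stock of GAN inversion. This post summarizes the key parts of the original without changing their meaning. Discussion in the comments is welcome.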

GAN Inversion: A Survey.

Weihao Xia, Yulun Zhang, Yujiu Yang, Jing-Hao Xue, Bolei Zhou, Ming-Hsuan Yang.

GitHub link: [Papers on Generative Modeling] arxiv 2021. [PDF]


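Survey

Abstract

GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model, so that the generator can faithfully reconstruct the image from the inverted code. As an emerging technique for bridging the real and fake image domains, GAN inversion plays an essential role in enabling pretrained GAN models such as StyleGAN and BigGAN to be used for real image editing. It also helps us interpret the GAN latent space and understand how realistic images are generated.

This paper gives an overview of GAN inversion with a focus on its recent algorithms and applications. It covers the important techniques of GAN inversion and their applications to image restoration and image manipulation. We further elaborate on trends and challenges for future directions.

Index Terms - generative adversarial networks, interpretable machine learning, image reconstruction, image manipulation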

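1. Introduction

The generative adversarial network (GAN) framework is a deep learning architecture that estimates how data points are generated in a probabilistic framework. It consists of two interacting neural networks, a generator G and a discriminator D, which are trained jointly through an adversarial process. G is trained to synthesize fake data that resemble real data, while D is trained to distinguish real data from fake data. Through adversarial training, G learns to generate fake data that match the real data distribution. In recent years, GANs have been applied to a wide range of tasks, from image translation and image manipulation to image restoration.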
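For reference, the adversarial process described above is usually written as a two-player minimax game. The formula below is the standard objective from the original GAN formulation, not something quoted from this survey. The symbols $p_{\text{data}}$ (the real data distribution) and $p_z$ (the noise prior) are notation assumed here.

```latex
\min_G \max_D \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

D maximizes $V$ by assigning high scores to real samples and low scores to generated ones, while G minimizes it by producing samples that D scores as real.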

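Numerous GAN models, such as PGGAN, BigGAN, and StyleGAN, have been developed to synthesize images of high quality and diversity from random noise input. Recent studies show that GANs effectively encode rich semantic information in their intermediate features and latent spaces. These methods can synthesize images with a variety of attributes by varying the latent code. However, because GANs lack an inference capability and an encoder, such manipulation applies only to images generated by the GAN, not to real images.

In contrast, GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model, after which the generator can faithfully reconstruct the image from the inverted code. GAN inversion makes the controllable directions found in existing latent spaces applicable to real image editing, without requiring ad-hoc supervision or expensive optimization. As shown in Fig. 1, once a real image has been inverted into the latent space, we can vary its code along a specific direction to edit the corresponding attribute of the image. With the rapid development of GANs and interpretable machine learning techniques, GAN inversion not only provides an alternative, flexible image editing framework but also helps us understand the internal mechanisms of deep generative models.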

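Fig. 1. Illustration of GAN inversion. Unlike the conventional sampling and generation process of the generator $G$, GAN inversion maps a given image $x$ to the latent space and obtains the latent code $z^*$. The reconstructed image $x^*$ is then obtained by $x^* = G(z^*)$. By varying the latent code $z^*$ along different interpretable directions, e.g., $z^* + n_1$ and $z^* + n_2$, where $n_1$ and $n_2$ model age and smile in the latent space respectively, we can edit the corresponding attributes of the real image.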
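The editing step in Fig. 1 can be sketched in a few lines. This is a toy illustration only: the linear generator `G`, the weights `W`, the step size `alpha`, and the direction `n_1` are all hypothetical stand-ins, not a real pretrained GAN or a learned semantic direction.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 3))  # toy linear "generator" weights (assumption)

def G(z):
    """Toy generator: maps a 3-d latent code to a 6-d 'image'."""
    return W @ z

def edit(z_star, direction, alpha=1.0):
    """Move the inverted code z* along a latent direction."""
    return z_star + alpha * direction

z_star = rng.normal(size=3)       # pretend this came from an inversion step
n_1 = np.array([1.0, 0.0, 0.0])   # hypothetical "age" direction
x_star = G(z_star)                # reconstruction x* = G(z*)
x_edit = G(edit(z_star, n_1, alpha=2.0))  # edited image G(z* + 2 n_1)
```

With a real generator the change in the output is nonlinear, but the workflow is the same: invert, shift the code, regenerate.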

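This paper presents the first comprehensive survey of GAN inversion. First, we provide a comprehensive and systematic review of all aspects of GAN inversion, organized hierarchically and structurally, together with an analysis of the underlying principles. Second, we provide a comparative summary of GAN inversion methods. Third, we discuss the challenges and open problems and identify trends for future research.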

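2. Preliminaries

2.1 Problem Definition

The generator of an unconditional GAN learns the mapping $G: Z \to X$. When $z_1, z_2 \in Z$ are close in the $Z$ space, the corresponding images $x_1, x_2 \in X$ also look similar. GAN inversion maps data $x$ back to a latent representation $z^*$ or, equivalently, finds an image $x^*$ that can be entirely synthesized by the well-trained generator $G$ and remains close to the real image $x$.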

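Denote the signal to be inverted as $x \in \mathbb{R}^n$, the generative model as $G: \mathbb{R}^{n_0} \to \mathbb{R}^n$, and the latent vector as $\mathbf{z} \in \mathbb{R}^{n_0}$. This gives the following inversion problem: $$\mathbf{z}^* = \arg\min_{\mathbf{z}} \ell(G(\mathbf{z}), x),$$ where $\ell(\cdot)$ is a distance metric in the image or feature space and $G$ is a feed-forward neural network. Typically, $\ell(\cdot)$ can be based on $\ell_1$, $\ell_2$, perceptual, or LPIPS metrics. Because $G(\mathbf{z})$ is non-convex, this is generally regarded as a non-convex problem. GAN inversion studies focus on reconstructing image content.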
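The inversion problem above can be attacked directly with gradient descent on the latent code. The sketch below is a minimal illustration under stated assumptions: the generator is a toy single-layer network `G(z) = tanh(W z)` with random weights (not a pretrained GAN), the distance is the squared $\ell_2$ metric, and the names `invert`, `steps`, and `lr` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n0, n = 2, 8                                 # latent and signal dimensions
W = rng.normal(size=(n, n0)) / np.sqrt(n)    # toy generator weights (assumption)

def G(z):
    """Toy feed-forward generator R^{n0} -> R^{n}."""
    return np.tanh(W @ z)

def loss(z, x):
    """Squared l2 distance between G(z) and the target x."""
    r = G(z) - x
    return float(r @ r)

def grad(z, x):
    """Analytic gradient of the loss with respect to z."""
    s = G(z)
    return W.T @ (2.0 * (s - x) * (1.0 - s**2))

def invert(x, steps=2000, lr=0.02):
    """Plain gradient descent on z, starting from the origin."""
    z = np.zeros(n0)
    for _ in range(steps):
        z = z - lr * grad(z, x)
    return z

x = G(rng.normal(size=n0))        # a target that the generator can produce
z_star = invert(x)
initial_loss = loss(np.zeros(n0), x)
final_loss = loss(z_star, x)
```

Because the objective is non-convex, gradient descent only guarantees a decrease in the loss, not a global minimum. With a real generator one would typically use automatic differentiation and a perceptual or LPIPS term instead of a hand-written $\ell_2$ gradient.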

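2.2 Trained GAN Models and Datasets

Recent work covers DCGAN [31], WGAN [32], PGGAN [14], BigGAN [15], StyleGAN [16], and StyleGAN2 [17]. The datasets include face datasets (CelebAHQ [14], FFHQ [16], [17], AnimeFaces [33], AnimalFace [34]), scene datasets (LSUN [35]), and object datasets (LSUN [35], ImageNet [36]).

2.2.1 GAN Models

DCGAN uses convolutions in the discriminator and fractionally-strided convolutions in the generator.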
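To make "fractionally-strided convolution" concrete, here is a minimal 1-D sketch of the operation (also known as transposed convolution). It is a hand-rolled illustration, not DCGAN's actual layer: the function name `transposed_conv1d` and the example kernel are assumptions, and real implementations work on multi-channel 2-D feature maps.

```python
import numpy as np

def transposed_conv1d(x, w, stride=2):
    """1-D fractionally-strided (transposed) convolution, no padding.

    Each input value scatters a scaled copy of the kernel into the output,
    with consecutive copies offset by `stride`, so the output is longer
    than the input: (len(x) - 1) * stride + len(w).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    out = np.zeros((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        out[i * stride : i * stride + len(w)] += xi * w
    return out

y = transposed_conv1d([1, 2, 3], [1, 1, 1], stride=2)
# y has length 7: [1, 1, 3, 2, 5, 3, 3]
```

This upsampling behavior is why the generator can grow a low-resolution feature map into a full image, while the discriminator's ordinary strided convolutions shrink it back down.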


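Resources

The authors' GitHub repository collects a large amount of material and is recommended reading. I list the original GAN papers below for readers who have trouble accessing the original links.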

inverted pretrained model

StyleGAN2-Ada: Training Generative Adversarial Networks with Limited Data.

Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, Timo Aila.

NeurIPS 2020. [PDF] [Github] [Steam StyleGAN2-ADA]

StyleGAN2: Analyzing and Improving the Image Quality of StyleGAN.

Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila.

CVPR 2020. [PDF] [Official TF] [PyTorch] [Unofficial Tensorflow 2.0]

A Style-Based Generator Architecture for Generative Adversarial Networks.

Tero Karras, Samuli Laine, Timo Aila.

CVPR 2019. [PDF] [Official TF]

Progressive Growing of GANs for Improved Quality, Stability, and Variation.

Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen.

ICLR 2018. [PDF] [Official TF]

inversion paper

Only a Matter of Style: Age Transformation Using a Style-Based Regression Model.

Yuval Alaluf, Or Patashnik, Daniel Cohen-Or.

arxiv 2021. [PDF] [Github]

e4e: Designing an Encoder for StyleGAN Image Manipulation.

Omer Tov, Yuval Alaluf, Yotam Nitzan, Or Patashnik, Daniel Cohen-Or.

arxiv 2021. [PDF] [Github]

Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search.

Federico A. Galatolo, Mario G.C.A. Cimino, Gigliola Vaglini.

arxiv 2021. [PDF]

Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation.

Peiye Zhuang, Oluwasanmi Koyejo, Alexander G. Schwing.

ICLR 2021. [PDF]

Exploring Adversarial Fake Images on Face Manifold.

Dongze Li, Wei Wang, Hongxing Fan, Jing Dong.

arxiv 2021. [PDF]

OSTeC: One-Shot Texture Completion.

Baris Gecer, Jiankang Deng, Stefanos Zafeiriou.

arxiv 2021. [PDF] [Github]

Improved StyleGAN Embedding: Where are the Good Latents?

Peihao Zhu, Rameen Abdal, Yipeng Qin, Peter Wonka.

arxiv 2020. [PDF]

Learning a Deep Reinforcement Learning Policy Over the Latent Space of a Pre-trained GAN for Semantic Age Manipulation.

Kumar Shubham, Gopalakrishnan Venkatesh, Reijul Sachdev, Akshi, Dinesh Babu Jayagopi, G. Srinivasaraghavan.

arxiv 2020. [PDF]

Lifting 2D StyleGAN for 3D-Aware Face Generation.

Yichun Shi, Divyansh Aggarwal, Anil K. Jain.

arxiv 2020. [PDF]

Navigating the GAN Parameter Space for Semantic Image Editing.

Anton Cherepkov, Andrey Voynov, Artem Babenko.

arxiv 2020. [PDF] [Github]

Augmentation-Interpolative AutoEncoders for Unsupervised Few-Shot Image Generation.

Davis Wertheimer, Omid Poursaeed, Bharath Hariharan.

arxiv 2020. [PDF]

Mask-Guided Discovery of Semantic Manifolds in Generative Models.

Mengyu Yang, David Rokeby, Xavier Snelgrove.

Workshop on Machine Learning for Creativity and Design (NeurIPS) 2020. [PDF] [Github]

Unsupervised Discovery of Disentangled Manifolds in GANs.

Yu-Ding Lu, Hsin-Ying Lee, Hung-Yu Tseng, Ming-Hsuan Yang.

arxiv 2020. [PDF]

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation.

Zongze Wu, Dani Lischinski, Eli Shechtman.

arxiv 2020. [PDF]

DeepLandscape: Adversarial Modeling of Landscape Videos.

E. Logacheva, R. Suvorov, O. Khomenko, A. Mashikhin, and V. Lempitsky.

ECCV 2020. [PDF] [Github] [Project]

DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by Transferring from GANs.

Yaxing Wang, Lu Yu, Joost van de Weijer.

NeurIPS 2020. [PDF] [Github]

GAN Steerability without optimization.

Nurit Spingarn-Eliezer, Ron Banner, Tomer Michaeli.

arxiv 2020. [OpenReview] [PDF]

On The Inversion Of Deep Generative Models (When and How Can Deep Generative Models be Inverted?).

Aviad Aberdam, Dror Simon, Michael Elad.

ICLR 2021. [PDF] [OpenReview]

PIE: Portrait Image Embedding for Semantic Control.

A. Tewari, M. Elgharib, M. BR, F. Bernard, H-P. Seidel, P. Pérez, M. Zollhöfer, C. Theobalt.

SIGGRAPH Asia 2020. [PDF] [Project]

Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation.

Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or.

CVPR 2021. [PDF] [Github] [Project]

Understanding the Role of Individual Units in a Deep Neural Network.

David Bau, Jun-Yan Zhu, Hendrik Strobelt, Agata Lapedriza, Bolei Zhou, Antonio Torralba.

National Academy of Sciences 2020. [PDF] [Github] [Project]

Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network.

Jialu Huang, Jing Liao, Sam Kwong.

arxiv 2020. [PDF]

SeFa: Closed-Form Factorization of Latent Semantics in GANs.

Yujun Shen, Bolei Zhou.

arxiv 2020. [PDF] [Github] [Project]

Collaborative Learning for Faster StyleGAN Embedding.

Shanyan Guan, Ying Tai, Bingbing Ni, Feida Zhu, Feiyue Huang, Xiaokang Yang.

arxiv 2020. [PDF]

Disentangling in Latent Space by Harnessing a Pretrained Generator.

Yotam Nitzan, Amit Bermano, Yangyan Li, Daniel Cohen-Or.

arxiv 2020. [PDF]

Face Identity Disentanglement via Latent Space Mapping.

Yotam Nitzan, Amit Bermano, Yangyan Li, Daniel Cohen-Or.

arxiv 2020. [PDF] [Github]

Transforming and Projecting Images into Class-conditional Generative Networks.

Minyoung Huh, Richard Zhang, Jun-Yan Zhu, Sylvain Paris, Aaron Hertzmann.

arxiv 2020. [PDF] [Github] [Project]

Interpreting the Latent Space of GANs via Correlation Analysis for Controllable Concept Manipulation.

Ziqiang Li, Rentuo Tao, Hongjing Niu, Bin Li.

arxiv 2020. [PDF]

GANSpace: Discovering Interpretable GAN Controls.

Erik Härkönen, Aaron Hertzmann, Jaakko Lehtinen, Sylvain Paris.

arxiv 2020. [PDF] [Github]

MimicGAN: Robust Projection onto Image Manifolds with Corruption Mimicking.

Rushil Anirudh, Jayaraman J. Thiagarajan, Bhavya Kailkhura, Timo Bremer.

IJCV 2020. [PDF]

StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows.

Rameen Abdal, Peihao Zhu, Niloy Mitra, Peter Wonka.

Siggraph Asia 2020. [PDF] [Github]

Rewriting a Deep Generative Model.

David Bau, Steven Liu, Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba.

ECCV 2020. [PDF] [Github]

StyleGAN2 Distillation for Feed-forward Image Manipulation.

Yuri Viazovetskyi, Vladimir Ivashkin, Evgeny Kashin.

ECCV 2020. [PDF] [Github]

In-Domain GAN Inversion for Real Image Editing.

Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou.

ECCV 2020. [PDF] [Project] [Github]

Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation.

Xingang Pan, Xiaohang Zhan, Bo Dai, Dahua Lin, Chen Change Loy, Ping Luo.

ECCV 2020. [PDF] [Github]

On the “steerability” of generative adversarial networks.

Ali Jahanian, Lucy Chai, Phillip Isola.

ICLR 2020. [PDF] [Project]

Unsupervised Discovery of Interpretable Directions in the GAN Latent Space.

Andrey Voynov, Artem Babenko.

ICML 2020. [PDF] [Github]

Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models.

Giannis Daras, Augustus Odena, Han Zhang, Alexandros G. Dimakis.

CVPR 2020. [PDF]

A Disentangling Invertible Interpretation Network for Explaining Latent Representations.

Patrick Esser, Robin Rombach, Björn Ommer.

CVPR 2020. [PDF] [Project] [Github]

Editing in Style: Uncovering the Local Semantics of GANs.

Edo Collins, Raja Bala, Bob Price, Sabine Süsstrunk.

CVPR 2020. [PDF] [Github]

Image Processing Using Multi-Code GAN Prior.

Jinjin Gu, Yujun Shen, Bolei Zhou.

CVPR 2020. [PDF] [Project] [Github]

Interpreting the Latent Space of GANs for Semantic Face Editing.

Yujun Shen, Jinjin Gu, Xiaoou Tang, Bolei Zhou.

CVPR 2020. [PDF] [Project] [Github]

Image2StyleGAN++: How to Edit the Embedded Images?

Rameen Abdal, Yipeng Qin, Peter Wonka.

CVPR 2020. [PDF]

Semantic Photo Manipulation with a Generative Image Prior.

David Bau, Hendrik Strobelt, William Peebles, Jonas Wulff, Bolei Zhou, Jun-Yan Zhu, Antonio Torralba.

SIGGRAPH 2019. [PDF]

Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?

Rameen Abdal, Yipeng Qin, Peter Wonka.

ICCV 2019. [PDF]

Seeing What a GAN Cannot Generate.

David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, Antonio Torralba.

ICCV 2019. [PDF]

GAN-based Projector for Faster Recovery with Convergence Guarantees in Linear Inverse Problems.

Ankit Raj, Yuqi Li, Yoram Bresler.

ICCV 2019. [PDF]

Inverting Layers of a Large Generator.

David Bau, Jun-Yan Zhu, Jonas Wulff, William Peebles, Hendrik Strobelt, Bolei Zhou, Antonio Torralba.

ICCV 2019. [PDF]

Inverting The Generator Of A Generative Adversarial Network (II).

Antonia Creswell, Anil A Bharath.

TNNLS 2018. [PDF] [Github]

Invertibility of Convolutional Generative Networks from Partial Measurements.

Fangchang Ma, Ulas Ayaz, Sertac Karaman.

NeurIPS 2018. [PDF] [Github]

Metrics for Deep Generative Models.

Nutan Chen, Alexej Klushyn, Richard Kurle, Xueyan Jiang, Justin Bayer, Patrick van der Smagt.

AISTATS 2018. [PDF]

Towards Understanding the Invertibility of Convolutional Neural Networks.

Anna C. Gilbert, Yi Zhang, Kibok Lee, Yuting Zhang, Honglak Lee.

IJCAI 2017. [PDF]

One Network to Solve Them All - Solving Linear Inverse Problems using Deep Projection Models.

J. H. Rick Chang, Chun-Liang Li, Barnabas Poczos, B. V. K. Vijaya Kumar, Aswin C. Sankaranarayanan.

ICCV 2017. [PDF]

Inverting The Generator Of A Generative Adversarial Network.

Antonia Creswell, Anil Anthony Bharath.

NIPSW 2016. [PDF]

Generative Visual Manipulation on the Natural Image Manifold.

Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, Alexei A. Efros.

ECCV 2016. [PDF]

application: Compressed Sensing

Generator Surgery for Compressed Sensing.

Niklas Smedemark-Margulies, Jung Yeon Park, Max Daniels, Rose Yu, Jan-Willem van de Meent, Paul Hand.

arxiv 2021. [PDF] [Github]

Task-Aware Compressed Sensing with Generative Adversarial Networks.

Maya Kabkab, Pouya Samangouei, Rama Chellappa.

AAAI 2018. [PDF]

