Introduction to GANs, Part 2

chenhch8

Article link: https://blog.csdn.net/deepinC/article/details/89309727

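These notes are compiled from the GAN video lectures by Prof. Hung-yi Lee of National Taiwan University. I recommend watching the lectures directly (they are hosted on YouTube, so a VPN may be needed).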

Generative Adversarial Network

Review

Auto-encoder (AE)

Simple, but the generated images are not realistic: they look blurry, more like the result of averaging several images.

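Variational Auto-encoder (VAE)

The NN Encoder produces two vectors $m, \sigma \in R^3$. A noise vector $e \in R^3$ is then sampled from a normal distribution, and $c=\exp{(\sigma)} \odot e + m \in R^3$ is computed. Finally, $c$ is fed into the NN Decoder to generate the image. The training objective is to make the output as close as possible to the input. In addition, a regularization term is introduced to push $\sigma$ and $m$ as close to 0 as possible, so that at test time the NN Encoder can be dropped and $e$ used directly as the input to the NN Decoder.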
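The sampling step $c=\exp{(\sigma)} \odot e + m$ can be sketched in a few lines of NumPy. This is only an illustration, not the lecture's code: the function name `sample_code` and the example values of `m` and `sigma` are made up, and the encoder and decoder networks are left out entirely.

```python
import numpy as np

def sample_code(m, sigma, rng):
    """Return c = exp(sigma) * e + m, where e is standard normal noise."""
    e = rng.standard_normal(m.shape)   # noise e sampled from N(0, I)
    return np.exp(sigma) * e + m       # element-wise scale, then shift by m

rng = np.random.default_rng(0)
m = np.array([0.5, -1.0, 2.0])         # hypothetical encoder output m
sigma = np.array([0.1, 0.0, -0.3])     # hypothetical encoder output sigma
c = sample_code(m, sigma, rng)         # this vector would go to the NN Decoder

# With m = 0 and sigma = 0, exp(sigma) = 1, so c is just the noise e.
# Two generators with the same seed make the comparison reproducible.
c0 = sample_code(np.zeros(3), np.zeros(3), np.random.default_rng(1))
e0 = np.random.default_rng(1).standard_normal(3)
print(np.allclose(c0, e0))
```

The last lines illustrate why the regularization matters: when $\sigma$ and $m$ are near 0, the code $c$ is distributed like the noise $e$, which is what makes it reasonable to feed $e$ straight into the decoder at test time.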

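Problems with AE/VAE:

The NN Decoder does not produce truly realistic data. Take the two "7"s in the figure, which differ by only a single pixel: a person would prefer the left one, but to this model both are exactly one pixel away from the real "7", so neither is better than the other.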
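A tiny numeric sketch makes this concrete. The 5x5 "7" below and the two perturbed copies are invented for illustration (they are not the images from the lecture figure), and mean squared error stands in for whatever pixel-wise reconstruction loss is actually used.

```python
import numpy as np

def pixel_mse(a, b):
    """Mean squared error over all pixels."""
    return float(np.mean((a - b) ** 2))

# Hypothetical 5x5 target image of a "7": a top bar plus a vertical stem.
target = np.zeros((5, 5))
target[0, 1:4] = 1.0
target[1:5, 3] = 1.0

# Candidate A: the top bar is one pixel longer (still looks like a "7").
plausible = target.copy()
plausible[0, 0] = 1.0

# Candidate B: one stray pixel far from the strokes (looks like noise).
implausible = target.copy()
implausible[4, 0] = 1.0

loss_a = pixel_mse(plausible, target)
loss_b = pixel_mse(implausible, target)
print(loss_a, loss_b)  # the two losses are identical
```

Both candidates differ from the target in exactly one pixel, so a pixel-wise loss cannot tell them apart even though a person would judge them very differently.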

GAN

Maximum Likelihood Estimation

Given a data distribution $P_{data}(x)$
We have a distribution $P_G(x; \theta)$ parameterized by $\theta$
- E.g., if $P_G(x; \theta)$ is a Gaussian Mixture Model, then $\theta$ consists of the means and variances of the Gaussian components.
- goal: find $\theta$ such that $P_G(x; \theta)$ is close to $P_{data}(x)$
- note: $\color{#F00}{\text{the }x\text{ here denotes data sampled from (i.e., the output of) the distribution }P_{data}\text{ (or }P_G\text{), not the input}}.$
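As a rough illustration of the Gaussian Mixture Model example, the sketch below evaluates $P_G(x; \theta)$ on samples $x$ drawn from a stand-in data distribution. Everything specific here is an assumption made for the example: the one-dimensional setting, the two-component mixture, the fixed mixing weights, and the helper name `gmm_pdf`.

```python
import numpy as np

def gmm_pdf(x, weights, means, variances):
    """Density of a 1-D Gaussian mixture evaluated at each point of x."""
    x = np.asarray(x, dtype=float)[:, None]              # shape (n, 1)
    comp = np.exp(-(x - means) ** 2 / (2 * variances))   # shape (n, k)
    comp = comp / np.sqrt(2 * np.pi * variances)
    return comp @ weights                                # shape (n,)

rng = np.random.default_rng(0)
# Stand-in for sampling from P_data: x is data, not a network input.
x = np.concatenate([rng.normal(-2.0, 0.5, 500), rng.normal(3.0, 1.0, 500)])

weights = np.array([0.5, 0.5])                 # held fixed (assumption)
variances = np.array([0.25, 1.0])
means_close = np.array([-2.0, 3.0])            # theta close to the data
means_far = np.array([0.0, 1.0])               # theta far from the data

avg_logp_close = np.mean(np.log(gmm_pdf(x, weights, means_close, variances)))
avg_logp_far = np.mean(np.log(gmm_pdf(x, weights, means_far, variances)))
print(avg_logp_close > avg_logp_far)
```

The parameters that match the data assign it a higher average log-density, which is one way to read "$P_G(x; \theta)$ close to $P_{data}(x)$" in practice.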

Sampling $\{x^1,x^2,\dots,x^m\}$ from $P_{data}(x)$, we can compute $P_G(x^i;\theta)$ for $i=1,2,\dots,m$. The likelihood of generating these samples is then $L=\prod_{i=1}^mP_G(x^i;\theta)$. Our goal is to find the $\theta^*$ that maximizes the likelihood: