CS231n
Lecture 13: Generative Models
Unsupervised Learning
Compared with supervised learning, which learns a mapping $f: x \mapsto y$ from labeled training data, unsupervised learning aims to learn the hidden structure of unlabeled data. Examples: clustering (K-means), dimensionality reduction (PCA), feature learning (autoencoders), density estimation.
Generative Models: Given training data, generate new samples from same distribution
This is really a density estimation problem (learn $p_{\text{model}}(x) \approx p_{\text{data}}(x)$), and it comes in two flavors:
- Explicit density: explicitly define and solve for $p_{\text{model}}(x)$
- Implicit density: learn a model that can sample from $p_{\text{model}}(x)$ without explicitly defining it
Applications
- artwork, super-resolution, colorization, etc
- Generative models of time-series data can be used for simulation and planning (reinforcement learning applications!)
- Training generative models can also enable inference of latent representations that can be useful as general features (a nice idea)
Taxonomy
PixelRNN
Start from the top-left corner of the image and visit every pixel in turn, left to right and top to bottom, using an RNN to model the dependency of each pixel on the previously visited ones.
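By the chain rule, the explicit likelihood that both Pixel models maximize decomposes into a product of per-pixel conditionals:

$$p(x) = \prod_{i=1}^{n} p(x_i \mid x_1, \ldots, x_{i-1})$$

Each conditional distribution over pixel values is what the RNN (or, below, the CNN) models; training maximizes the likelihood of the training images.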
PixelCNN
Same as PixelRNN, except the dependencies on the already-visited context region are modeled with a CNN (masked convolutions).
Training is faster than PixelRNN
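A minimal sketch of the masked-convolution idea (PyTorch; layer sizes are illustrative assumptions, not the papers' exact architecture). Because the mask enforces the raster-scan ordering, training can process all pixels of an image in parallel, which is why it is faster than PixelRNN:

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Convolution whose kernel is zeroed to the right of and below the
    center position, so each output depends only on earlier pixels."""
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ('A', 'B')  # 'A' (first layer) also hides the center pixel
        self.register_buffer('mask', torch.ones_like(self.weight))
        _, _, kh, kw = self.weight.shape
        self.mask[:, :, kh // 2, kw // 2 + (mask_type == 'B'):] = 0  # right of center
        self.mask[:, :, kh // 2 + 1:] = 0                            # below center

    def forward(self, x):
        self.weight.data *= self.mask   # enforce the raster-scan dependency
        return super().forward(x)

# Training sees whole images at once -- no sequential loop over pixels.
net = nn.Sequential(
    MaskedConv2d('A', 1, 64, kernel_size=7, padding=3), nn.ReLU(),
    MaskedConv2d('B', 64, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 256, kernel_size=1),  # 256-way softmax over pixel intensities
)
```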
Advantages of explicit modeling:
- Can explicitly compute the likelihood $p(x)$
- Explicit likelihood of training data gives a good evaluation metric
- Good samples
Drawback: sequential generation ⇒ slow
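Generation must produce one pixel per forward pass, each conditioned on everything generated so far. A sketch using the illustrative `net` above:

```python
import torch

@torch.no_grad()
def sample(net, h=28, w=28):
    img = torch.zeros(1, 1, h, w)
    for i in range(h):          # n^2 sequential forward passes:
        for j in range(w):      # this loop is why sampling is slow
            logits = net(img)[0, :, i, j]            # 256-way distribution
            pixel = torch.multinomial(logits.softmax(0), 1)
            img[0, 0, i, j] = pixel.item() / 255.0   # write back, continue
    return img
```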
Variational Auto-Encoder
Autoencoder: $x \xrightarrow{\text{encoder}} z \xrightarrow{\text{decoder}} \hat{x}$, trained with the reconstruction loss $L(x) = \lVert x - \hat{x} \rVert^2$; it learns a lower-dimensional feature representation $z$ from unlabeled training data. After training, throw away the decoder; the encoder can be used to initialize a supervised model.
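A minimal sketch of this pipeline in PyTorch (layer sizes and optimizer settings are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Illustrative sizes: 784-d input (e.g. flattened 28x28 image), 32-d code z.
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)

def train_step(x):                      # x: (batch, 784); no labels needed
    z = encoder(x)                      # lower-dimensional feature representation
    x_hat = decoder(z)
    loss = ((x - x_hat) ** 2).mean()    # L(x) = ||x - x_hat||^2
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# After training: throw away `decoder`; reuse `encoder` to initialize a
# supervised model, e.g. nn.Sequential(encoder, nn.Linear(32, 10)).
```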
Try generating new images from an autoencoder ⇒ VAE
- Choose prior $p(z)$ to be simple, e.g. Gaussian: $z \sim \mathcal{N}(0, I)$
- Conditional $p(x|z)$ is complex (generates image) ⇒ represent it with a neural network
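In symbols, the generative (sampling) process, with the decoder network producing the Gaussian parameters (the $\mu_{x|z}$, $\Sigma_{x|z}$ notation matches the encoder/decoder description below):

$$z \sim \mathcal{N}(0, I), \qquad x \mid z \sim \mathcal{N}\big(\mu_{x|z}(z),\, \Sigma_{x|z}(z)\big)$$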
Train
In principle we would maximize the likelihood $p_\theta(x) = \int p_\theta(z)\, p_\theta(x|z)\, dz$, but this integral is intractable: we cannot evaluate $p_\theta(x|z)$ for every possible $z$, and the posterior $p_\theta(z|x) = p_\theta(x|z)\, p_\theta(z) / p_\theta(x)$ is therefore intractable as well.
Solution: on top of the VAE decoder, define an additional encoder network $q_\phi(z|x)$ to approximate the intractable posterior $p_\theta(z|x)$.
Both the encoder and the decoder are probabilistic: passing an input $x$ through the encoder yields $z|x \sim \mathcal{N}(\mu_{z|x}, \Sigma_{z|x})$, the representation in latent space; the decoder $p_\theta(x|z)$ then maps the latent variable $z$ back to the input space, yielding $x|z \sim \mathcal{N}(\mu_{x|z}, \Sigma_{x|z})$. For this reason the encoder and decoder are also called the recognition/inference network and the generation network.
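A minimal PyTorch sketch of the two probabilistic networks, assuming diagonal covariances and illustrative layer sizes; the reparameterization $z = \mu_{z|x} + \sigma_{z|x} \odot \epsilon$ keeps the sampling step differentiable:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    """Probabilistic encoder q_phi(z|x) and decoder p_theta(x|z)."""
    def __init__(self, x_dim=784, z_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, z_dim)       # mu_{z|x}
        self.logvar = nn.Linear(128, z_dim)   # log of diagonal Sigma_{z|x}
        self.dec = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(),
                                 nn.Linear(128, x_dim))  # logits for p_theta(x|z)

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        eps = torch.randn_like(mu)            # epsilon ~ N(0, I)
        z = mu + (0.5 * logvar).exp() * eps   # reparameterized sample of z|x
        return self.dec(z), mu, logvar
```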
Derivation
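The derivation takes $\log p_\theta(x)$, introduces an expectation over $z \sim q_\phi(z|x)$, and applies Bayes' rule; dropping the nonnegative KL between $q_\phi(z|x)$ and the true posterior $p_\theta(z|x)$ leaves the tractable evidence lower bound (ELBO):

$$\log p_\theta(x) \;\ge\; \mathbb{E}_{z \sim q_\phi(z|x)}\big[\log p_\theta(x|z)\big] \;-\; D_{\mathrm{KL}}\big(q_\phi(z|x) \,\|\, p(z)\big)$$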
The KL term is the encoder's loss (it keeps $q_\phi(z|x)$ close to the prior $p(z)$), and the reconstruction term is the decoder's loss; the two networks are trained jointly by maximizing this bound. Once training is done, the decoder alone can be used to generate data.
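A sketch of this loss for the `VAE` module above, assuming (as an illustration) binarized pixels with a Bernoulli decoder, so the decoder output is treated as logits; the KL term against $\mathcal{N}(0, I)$ has the closed form below for a diagonal Gaussian:

```python
import torch
import torch.nn.functional as F

def vae_loss(x, x_logits, mu, logvar):
    # Decoder term: reconstruction, -E_q[log p_theta(x|z)] for Bernoulli pixels
    recon = F.binary_cross_entropy_with_logits(x_logits, x, reduction='sum')
    # Encoder term: KL(q_phi(z|x) || N(0, I)), closed form for diagonal Gaussians
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl        # minimizing this maximizes the ELBO

# After training, generation uses only the decoder:
# z = torch.randn(16, 32)                  # z ~ p(z) = N(0, I)
# images = torch.sigmoid(model.dec(z))     # model: trained VAE from above
```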