CS231n Lecture 13: Generative Models

Unsupervised Learning

In contrast to supervised learning, which learns a mapping $f: x \mapsto y$ from labeled training data, unsupervised learning aims to learn the hidden structure of unlabeled data. Examples include clustering (K-means), dimensionality reduction (PCA), feature learning (autoencoders), and density estimation.

Generative Models: Given training data, generate new samples from same distribution
This is essentially a density-estimation problem (learn $p_{model}(x) \sim p_{data}(x)$), approached in one of two ways:

  • Explicit density: explicitly define and solve for $p_{model}(x)$
  • Implicit density: learn a model that can sample from $p_{model}(x)$ without explicitly defining it

Applications

  • artwork, super-resolution, colorization, etc
  • Generative models of time-series data can be used for simulation and planning (reinforcement learning applications!)
  • Training generative models can also enable inference of latent representations that can be useful as general features (a promising idea)

Taxonomy

Generative models
  • Explicit density
      • Tractable density: PixelRNN/CNN
      • Approximate density
          • Variational: Variational Autoencoder
          • Markov chain: Boltzmann Machine
  • Implicit density
      • Direct: GAN
      • Markov chain: GSN

PixelRNN

Starting from the top-left corner of the image, traverse all pixels left-to-right, top-to-bottom, and model each pixel's dependency on the previously generated pixels with an RNN.
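This raster-scan ordering is what makes the density tractable: by the chain rule, the likelihood of an image $x$ factorizes over its $n$ pixels as

$$p(x) = \prod_{i=1}^{n} p(x_i \mid x_1, \dots, x_{i-1}),$$

where each conditional is modeled by the RNN, and training maximizes this likelihood over the training images.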

PixelCNN

Same as PixelRNN, except that a CNN over the already-generated context region models the dependencies.
Training is faster than PixelRNN, since the context pixels are known from the training images and the convolutions can be parallelized (generation is still sequential). A minimal sketch of the masked convolution behind this idea follows below.
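As an illustration, here is a minimal PyTorch sketch of a masked convolution (my own sketch, not the exact architecture from the PixelCNN paper; the layer sizes are arbitrary assumptions). The mask zeroes out kernel weights at and after the current pixel, so each output depends only on already-generated context:

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Conv2d whose kernel is zeroed at and after the current pixel, so the
    output at each location depends only on pixels above / to the left."""
    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ('A', 'B')  # 'A' also hides the center pixel
        kh, kw = self.kernel_size
        mask = torch.ones(kh, kw)
        center = 0 if mask_type == 'A' else 1
        mask[kh // 2, kw // 2 + center:] = 0  # center (for 'A') and right of it
        mask[kh // 2 + 1:, :] = 0             # everything below the center row
        self.register_buffer('mask', mask[None, None])  # broadcast over channels

    def forward(self, x):
        self.weight.data *= self.mask  # re-apply mask so training cannot leak
        return super().forward(x)

# Tiny single-channel PixelCNN: layer 1 uses mask 'A', later layers 'B';
# the 1x1 head outputs a 256-way distribution over each pixel's intensity.
pixelcnn = nn.Sequential(
    MaskedConv2d('A', 1, 64, kernel_size=7, padding=3), nn.ReLU(),
    MaskedConv2d('B', 64, 64, kernel_size=7, padding=3), nn.ReLU(),
    nn.Conv2d(64, 256, kernel_size=1),
)
```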
Advantages of explicit density modeling

  • Can explicitly compute the likelihood $p(x)$
  • Explicit likelihood of training data gives a good evaluation metric
  • Good samples

Drawback: sequential generation is slow.

Variational Auto-Encoder

Autoencoder: $x \xrightarrow{\text{encoder}} z \xrightarrow{\text{decoder}} \hat{x}$, trained with $L(x) = \lVert x - \hat{x} \rVert^2$; it learns a lower-dimensional feature representation $z$ from unlabeled training data $x$. After training, throw away the decoder; the encoder can be used to initialize a supervised model.
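As a concrete sketch (assuming PyTorch and placeholder dimensions such as 784-d flattened inputs and a 32-d code, neither of which is specified in the lecture):

```python
import torch.nn as nn

# Encoder compresses x (784-d) to a low-dimensional code z (32-d);
# decoder reconstructs x_hat from z. Trained with L2 reconstruction loss.
encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 32))
decoder = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 784))

def loss_fn(x):                   # x: (batch, 784) float tensor
    z = encoder(x)                # lower-dimensional feature representation
    x_hat = decoder(z)            # reconstruction
    return ((x - x_hat) ** 2).sum(dim=1).mean()  # L(x) = ||x - x_hat||^2
```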
Can we generate new images from an autoencoder? This motivates the Variational Autoencoder (VAE):

$$z \sim p_{\theta^*}(z) \xrightarrow{\text{decoder}} x \sim p_{\theta^*}(x \mid z)$$

  • Choose the prior $p(z)$ to be simple, e.g. Gaussian: $p(z) = \mathcal{N}(0, I)$
  • The conditional $p(x \mid z)$ is complex (it generates the image); represent it with a neural network

Training
Ideally we would maximize the data likelihood $p_\theta(x) = \int p_\theta(z)\, p_\theta(x \mid z)\, dz$, but the integral is intractable: we cannot evaluate $p_\theta(x \mid z)$ for every possible $z$. The posterior

$$p_\theta(z \mid x) = \frac{p_\theta(x \mid z)\, p_\theta(z)}{p_\theta(x)}$$

is intractable as well, since it involves the intractable $p_\theta(x)$.
Solution: alongside the VAE decoder, define an additional encoder $q_\phi(z \mid x)$ to approximate the intractable posterior $p_\theta(z \mid x)$.
Both the encoder and the decoder are probabilistic. Given an input $x$, the encoder produces $q_\phi(z \mid x)$, i.e. $z \mid x \sim \mathcal{N}(\mu_{z|x}, \Sigma_{z|x})$, the representation in latent space; the decoder $p_\theta(x \mid z)$ then maps the latent variable $z$ back to input space, giving $x \mid z \sim \mathcal{N}(\mu_{x|z}, \Sigma_{x|z})$. For this reason the encoder and decoder are also called the recognition/inference and generation networks.
Derivation
$$\log p_\theta(x) = \mathbb{E}_{z \sim q_\phi(z \mid x)}\!\left[\log p_\theta(x)\right] = \mathbb{E}_z\!\left[\log \frac{p_\theta(x \mid z)\, p_\theta(z)}{p_\theta(z \mid x)}\right] = \mathbb{E}_z\!\left[\log p_\theta(x \mid z)\right] - D_{KL}\!\left(q_\phi(z \mid x)\,\|\,p_\theta(z)\right) + D_{KL}\!\left(q_\phi(z \mid x)\,\|\,p_\theta(z \mid x)\right)$$

The last KL term is intractable but nonnegative, so we maximize the tractable lower bound (ELBO):

$$\max_{\theta, \phi}\; \mathbb{E}_z\!\left[\log p_\theta(x \mid z)\right] - D_{KL}\!\left(q_\phi(z \mid x)\,\|\,p_\theta(z)\right)$$

The first term rewards accurate reconstruction (it trains the decoder, with $z$ sampled from the encoder), and the second pushes the encoder's approximate posterior toward the prior; the two networks are trained jointly. After training, the decoder alone can be used to generate new data: sample $z \sim p(z)$, then $x \sim p_\theta(x \mid z)$.
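To make this concrete, a minimal PyTorch training sketch under the Gaussian assumptions above (dimensions like 784/20 are hypothetical; the reparameterization trick $z = \mu + \sigma \odot \epsilon$ from the VAE paper keeps the sampling step differentiable):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=20, h_dim=256):
        super().__init__()
        self.enc = nn.Linear(x_dim, h_dim)
        self.mu, self.logvar = nn.Linear(h_dim, z_dim), nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)   # q_phi(z|x) = N(mu, diag(sigma^2))
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        return self.dec(z), mu, logvar

def elbo_loss(x, x_hat, mu, logvar):
    # Reconstruction term E_z[log p_theta(x|z)] (Bernoulli decoder -> BCE)
    recon = F.binary_cross_entropy(x_hat, x, reduction='sum')
    # KL term D_KL(q_phi(z|x) || N(0, I)), closed form for two Gaussians
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl  # minimizing this maximizes the ELBO

# After training, generation uses the decoder alone:
#   z = torch.randn(n, 20); samples = model.dec(z)
```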
