Generative models
- Auto-regressive Model (Component-by-Component)
- What is the best order for the component?
- Slow generation
- Variational Auto-encoder
- Optimizing the lower bound.
- Generative Adversarial Network
- Unstable training
- Flow-based Generative Model
Flow-based Generative Model
A generator $G$ is a network. The network defines a probability distribution $p_G$:

$z \sim \pi(z)$

$x = G(z) \sim p_G(x)$

$p_G(x) = \pi(z)\left|\det(J_{G^{-1}})\right|$, where $z = G^{-1}(x)$
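A minimal 1-D sketch of this change-of-variables formula; the affine generator $G(z) = 2z + 1$ and the standard-normal prior are assumptions chosen for illustration, not part of the lecture:

```python
import numpy as np

def pi_pdf(z):
    # Prior distribution pi(z): standard normal N(0, 1)
    return np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)

def p_G(x):
    # Hypothetical invertible generator G(z) = 2z + 1,
    # so z = G^{-1}(x) = (x - 1) / 2 and |det(J_{G^-1})| = 1/2.
    z = (x - 1.0) / 2.0
    return pi_pdf(z) * 0.5

# Pushing N(0, 1) through G yields N(1, 2^2); p_G is exactly that density.
print(p_G(1.0))  # density at the mode of the output distribution
```

Because $G$ is invertible, $p_G$ integrates to one automatically: the Jacobian factor $1/2$ compensates for the stretching by $G$.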
$G^* = \arg\max_G \sum_{i=1}^m \log p_G(x^i)$, where $\{x^1, x^2, \dots, x^m\}$ are sampled from $p_{data}(x)$

$= \arg\max_G \sum_{i=1}^m \left( \log \pi(G^{-1}(x^i)) + \log \left| \det(J_{G^{-1}}) \right| \right)$

$\approx \arg\min_G KL(p_{data} \| p_G)$
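A toy run of this maximum-likelihood objective, assuming a hypothetical 1-D flow $G(z) = s \cdot z$ with a standard-normal prior; the data distribution $N(0, 3^2)$, the learning rate, and the iteration count are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(0.0, 3.0, size=10_000)  # x^i sampled from p_data = N(0, 3^2)

def log_likelihood(s, x):
    # Flow G(z) = s*z with prior pi = N(0, 1):
    # log p_G(x) = log pi(G^{-1}(x)) + log|det(J_{G^-1})| = log pi(x/s) - log s
    z = x / s
    return np.sum(-0.5 * z**2 - 0.5 * np.log(2 * np.pi) - np.log(s))

# Gradient ascent on u = log s (parameterizing by log s keeps s positive).
u = 0.0
for _ in range(500):
    s = np.exp(u)
    # d/du of sum_i [-0.5 (x_i / s)^2 - u] = sum_i [(x_i / s)^2 - 1]
    u += 1e-5 * np.sum((data / s) ** 2 - 1.0)

s = np.exp(u)
print(s)  # close to 3, the true data scale
```

Maximizing the likelihood recovers the scale of $p_{data}$, matching the KL interpretation above: the optimum makes $p_G$ as close to $p_{data}$ as the flow family allows.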
Chain of generators $G_h,\ h = 1, \dots, K$ (so $x^i = G_K(\cdots G_1(z^i) \cdots)$):

$\log p(x^i) = \log \pi(z^i) + \sum_{h=1}^K \log \left| \det(J_{G_h^{-1}}) \right|$
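The log-determinant terms of the chain simply add up as each layer is inverted in turn, which a small sketch can show; the 1-D affine maps $G_h(z) = a_h z + b_h$ and their parameters are assumptions for illustration:

```python
import numpy as np

# Hypothetical chain of K = 3 invertible 1-D maps G_h(z) = a_h * z + b_h.
# Inverting x = G_K(...G_1(z)...) peels the maps off in reverse order,
# and each step contributes log|det(J_{G_h^-1})| = -log|a_h|.
params = [(2.0, 1.0), (0.5, -3.0), (4.0, 0.0)]  # (a_h, b_h) for h = 1..K

def log_p(x):
    log_det_sum = 0.0
    for a, b in reversed(params):        # apply G_K^{-1} first
        x = (x - b) / a                  # G_h^{-1}
        log_det_sum += -np.log(abs(a))   # log|det(J_{G_h^-1})|
    z = x                                # z^i = G_1^{-1}(...G_K^{-1}(x^i)...)
    log_pi = -0.5 * z**2 - 0.5 * np.log(2 * np.pi)  # prior N(0, 1)
    return log_pi + log_det_sum

print(log_p(-10.0))
```

Here the composed map works out to $x = 4z - 10$, so `log_p` agrees with the log-density of $N(-10, 4^2)$, confirming that summing per-layer log-determinants gives the likelihood of the whole chain.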
to be continued…