Paper Summaries
## AdaGAN: Boosting Generative Models
## A Connection Between GAN, IRL and EBM
### Motivation
Then maximizing likelihood will lead to a distribution which “covers” all of the modes, but puts most of its mass in parts of the space that have negligible density under the data distribution.
A generator trained adversarially will instead try to “fill in” as many of the modes as it can.
A complex multimodal distribution is hard to fit by maximizing likelihood: the learned $P_G$ tries to cover all of $P_{data}$, so much of its mass ends up in the empty regions between the modes. A GAN instead tries to fill in as many modes of $P_{data}$ as it can, so $P_G$ stays closer to $P_{data}$.
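This mode-covering behavior can be seen in a toy example: fitting a single Gaussian (a stand-in for $P_G$; the setup and numbers below are my own illustration, not from the paper) to a bimodal mixture by maximum likelihood puts the fitted mean in the empty region between the modes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Bimodal "data" distribution: equal mixture of N(-3, 0.5^2) and N(3, 0.5^2).
modes = rng.choice([-3.0, 3.0], size=100_000)
data = modes + 0.5 * rng.standard_normal(100_000)

# Maximum-likelihood fit of a single Gaussian P_G is just the sample mean/std.
mu, sigma = data.mean(), data.std()

def gauss_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

# The ML fit centers its mass near x = 0, where the data density is negligible.
p_data_at_0 = 0.5 * gauss_pdf(0, -3, 0.5) + 0.5 * gauss_pdf(0, 3, 0.5)
p_g_at_0 = gauss_pdf(0, mu, sigma)
print(mu, p_data_at_0, p_g_at_0)  # mu ≈ 0; P_G(0) is orders of magnitude above P_data(0)
```

An adversarially trained generator would instead concentrate its samples on one or both of the modes rather than on the gap between them.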
### Summary
IRL methods are in fact mathematically equivalent to GANs. In particular, a sample-based algorithm for maximum entropy IRL is equivalent to a GAN whose discriminator takes a special form.
#### Definition
Boltzmann distribution

$$p_{\theta}(\tau)=\frac{1}{Z}e^{-E_{\theta}(\tau)}$$
Partition function

$$Z=\int e^{-E_{\theta}(x)}\,dx$$
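For a discrete state space the two definitions above can be computed directly, with the integral becoming a sum; a minimal sketch with arbitrary example energies:

```python
import numpy as np

# Example energies E_theta(x) over a small discrete state space (arbitrary values).
E = np.array([0.0, 1.0, 2.0, 3.0])

# Partition function: Z = sum_x exp(-E(x)).
Z = np.exp(-E).sum()

# Boltzmann distribution: p(x) = exp(-E(x)) / Z.
p = np.exp(-E) / Z

print(p)  # lower energy -> higher probability; probabilities sum to 1
```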
Discriminator loss

$$\mathcal{L}_{discriminator}(D)=\mathbb{E}_{x \sim p}[-\log D(x)]+\mathbb{E}_{x \sim G}[-\log(1-D(x))]$$
Generator loss

$$\mathcal{L}_{generator}(G)=\mathbb{E}_{x \sim G}[-\log D(x)]+\mathbb{E}_{x \sim G}[\log(1-D(x))]$$
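Both objectives are expectations over discriminator outputs, so they can be estimated as Monte Carlo averages; a minimal NumPy sketch (the helper names are mine). At the optimum $D=\frac{1}{2}$, where the generator matches the data, the discriminator loss is $2\log 2$ and the generator loss vanishes:

```python
import numpy as np

# Monte Carlo estimates of the two losses from discriminator outputs
# on data samples (d_data) and generator samples (d_gen).
def discriminator_loss(d_data, d_gen):
    # L_D = E_{x~p}[-log D(x)] + E_{x~G}[-log(1 - D(x))]
    return np.mean(-np.log(d_data)) + np.mean(-np.log(1.0 - d_gen))

def generator_loss(d_gen):
    # L_G = E_{x~G}[-log D(x)] + E_{x~G}[log(1 - D(x))]
    return np.mean(-np.log(d_gen)) + np.mean(np.log(1.0 - d_gen))

# Sanity check at the optimal discriminator D = 1/2:
d_half = np.full(4, 0.5)
print(discriminator_loss(d_half, d_half))  # 2*log(2) ≈ 1.386
print(generator_loss(d_half))              # 0
```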
#### Calculation
Write the energy $E_{\theta}$ as a cost function $c_{\theta}$:

$$Z=\int e^{-c_{\theta}(\tau)}\,d\tau$$
Estimate $Z$ by importance sampling with a sampling distribution $q(\tau)$:

$$\mathcal{L}_{cost}(\theta)=\mathbb{E}_{\tau \sim p}[c_{\theta}(\tau)]+\log\left(\mathbb{E}_{\tau \sim q}\left[\frac{e^{-c_{\theta}(\tau)}}{q(\tau)}\right]\right)$$
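The importance-sampling estimate of $Z$ inside $\mathcal{L}_{cost}$ can be checked on a toy 1-D problem where $Z$ has a closed form; a sketch, assuming an arbitrary quadratic cost and a Gaussian proposal $q$ (both my choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cost c_theta(tau) on a 1-D "trajectory" space (arbitrary choice).
def c(tau):
    return 0.5 * tau ** 2

# Ground truth for this cost: Z = ∫ exp(-tau^2/2) dtau = sqrt(2*pi).
Z_true = np.sqrt(2 * np.pi)

# Importance sampling with a Gaussian proposal q = N(0, 2^2):
sigma_q = 2.0
tau = sigma_q * rng.standard_normal(200_000)
q_pdf = np.exp(-0.5 * (tau / sigma_q) ** 2) / (sigma_q * np.sqrt(2 * np.pi))

# Z ≈ E_{tau~q}[ exp(-c(tau)) / q(tau) ]
Z_est = np.mean(np.exp(-c(tau)) / q_pdf)
print(Z_est, Z_true)
```

Choosing $q$ wider than $e^{-c_{\theta}}$ keeps the importance weights bounded; this is why the quality of the sampler $q$ matters for the cost update.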
$q(\tau)$ is usually updated by minimizing its KL divergence to $\frac{1}{Z}e^{-c_{\theta}(\tau)}$, which is equivalent to minimizing the learned cost while maximizing entropy (minimizing cross-entropy is equivalent to minimizing KL divergence; Goodfellow et al., *Deep Learning*, p. 49):

$$\mathcal{L}_{sampler}(q)=\mathbb{E}_{\tau \sim q}[c_{\theta}(\tau)]+\mathbb{E}_{\tau \sim q}[\log q(\tau)]$$
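Both terms of $\mathcal{L}_{sampler}$ are expectations under $q$, so they can be estimated from samples; for a Gaussian $q$ both have closed forms ($\mathbb{E}_q[c]=\sigma^2/2$ for the quadratic toy cost, and $\mathbb{E}_q[\log q]=-\frac{1}{2}\log(2\pi e\sigma^2)$, the negative entropy), which makes the Monte Carlo estimate easy to sanity-check. The toy cost and sampler below are my own illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy quadratic cost and a Gaussian sampler q = N(0, sigma^2) (arbitrary choices).
def c(tau):
    return 0.5 * tau ** 2

sigma = 1.5
tau = sigma * rng.standard_normal(200_000)
log_q = -0.5 * (tau / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

# L_sampler(q) = E_q[c(tau)] + E_q[log q(tau)]  (expected cost minus entropy)
L_sampler = np.mean(c(tau)) + np.mean(log_q)

# Closed form for comparison:
L_exact = 0.5 * sigma ** 2 - 0.5 * np.log(2 * np.pi * np.e * sigma ** 2)
print(L_sampler, L_exact)
```

Minimizing the first term alone would collapse $q$ onto the minimum-cost trajectories; the negative-entropy term is what keeps the sampler spread out.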
To prevent