Proofs of Properties of the Gumbel Reparameterization

The Gumbel sampling procedure:

$$z = \arg\max_i \{g_i + \log(\pi_i)\}, \quad g_i = -\log(-\log(u_i)), \quad u_i \sim U(0, 1)$$
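The sampling rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `gumbel_max_sample`, the example probabilities, and the seed are all assumptions chosen for the example, and every probability is assumed to be strictly positive so that its logarithm exists.

```python
import math
import random

def gumbel_max_sample(probs, rng):
    """Draw one index via z = argmax_i { g_i + log(pi_i) } (hypothetical helper)."""
    scores = []
    for p in probs:
        u = rng.random()
        # random() can return exactly 0.0, where log(u) is undefined; redraw.
        while u == 0.0:
            u = rng.random()
        g = -math.log(-math.log(u))      # g ~ Gumbel(0, 1)
        scores.append(g + math.log(p))   # h = g + log(pi), assumes p > 0
    return max(range(len(probs)), key=lambda i: scores[i])

rng = random.Random(0)                   # fixed seed, only for reproducibility
probs = [0.2, 0.5, 0.3]                  # illustrative class probabilities
sample = gumbel_max_sample(probs, rng)
print(sample)
```

The returned value is an index into `probs`; the properties proved below describe how that index is distributed.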

The sampled random variables follow these distributions:

$$g_i \sim \mathrm{Gumbel}(0, 1) \quad (1)$$

$$h_i = g_i + \log(\pi_i) \sim \mathrm{Gumbel}(\log(\pi_i), 1) \quad (2)$$

Proof:

$$P(u) = P(U \le u) = u, \quad u \in (0, 1)$$

$$G = -\log(-\log(U)), \quad U \in (0, 1)$$

$$P(g) = P(G \le g) = P(-\log(-\log(U)) \le g)$$

$$= P(U \le \exp(-\exp(-g)))$$

$$= \exp(-\exp(-g))$$

This is the CDF of the standard Gumbel distribution:

$$P(g) = \exp(-\exp(-g))$$

$$g_i \sim \mathrm{Gumbel}(0, 1)$$

Adding the constant $\log(\pi_i)$ only shifts the location parameter, so

$$h_i = g_i + \log(\pi_i) \sim \mathrm{Gumbel}(\log(\pi_i), 1)$$
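Properties (1) and (2) can be checked empirically by comparing the empirical CDF of simulated draws with the closed form $\exp(-\exp(-(x - \mu)))$. The sketch below assumes a location $\mu = \log(0.3)$, a fixed seed, and a sample size picked only for illustration; `sample_gumbel` and `gumbel_cdf` are hypothetical helper names.

```python
import math
import random

def sample_gumbel(rng, loc=0.0):
    """Inverse-CDF draw: loc - log(-log(u)) with u ~ U(0, 1)."""
    u = rng.random()
    while u == 0.0:                      # avoid log(0)
        u = rng.random()
    return loc - math.log(-math.log(u))

def gumbel_cdf(x, loc=0.0):
    """CDF of Gumbel(loc, 1): exp(-exp(-(x - loc)))."""
    return math.exp(-math.exp(-(x - loc)))

rng = random.Random(0)
n = 200_000                              # illustrative sample size
loc = math.log(0.3)                      # plays the role of log(pi_i)
draws = [sample_gumbel(rng, loc) for _ in range(n)]

# Absolute gap between empirical and theoretical CDF at a few test points.
points = (-2.0, -1.0, 0.0, 1.0)
gaps = [abs(sum(d <= x for d in draws) / n - gumbel_cdf(x, loc)) for x in points]
print(max(gaps))
```

With this many samples the largest gap should be small relative to the CDF values, which is consistent with (2) but of course does not replace the proof.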

$$P(Z = z) = \pi_z \quad (3)$$

Proof:

Condition on $U_z = u_z$, so that $g_z = -\log(-\log(u_z))$ is fixed:

$$P(Z = z \mid U_z = u_z) = \prod_{i \ne z} P(H_i < g_z + \log(\pi_z))$$

$$= \prod_{i \ne z} P(G_i + \log(\pi_i) < g_z + \log(\pi_z))$$

$$= \prod_{i \ne z} P(U_i < u_z^{\pi_i/\pi_z})$$

$$= \prod_{i \ne z} u_z^{\pi_i/\pi_z} = u_z^{1/\pi_z - 1}$$
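The step from the Gumbel inequality to the uniform inequality is compressed above. It can be expanded as follows, substituting $G_i = -\log(-\log(U_i))$ and $g_z = -\log(-\log(u_z))$, writing $\pi_i$ for the class probabilities, and assuming every $\pi_i > 0$:

$$-\log(-\log U_i) + \log \pi_i < -\log(-\log u_z) + \log \pi_z$$

$$\iff \frac{-\log U_i}{\pi_i} > \frac{-\log u_z}{\pi_z}$$

$$\iff \log U_i < \frac{\pi_i}{\pi_z} \log u_z \iff U_i < u_z^{\pi_i/\pi_z}$$

The final exponent uses $\sum_{i \ne z} \pi_i = 1 - \pi_z$:

$$\sum_{i \ne z} \frac{\pi_i}{\pi_z} = \frac{1 - \pi_z}{\pi_z} = \frac{1}{\pi_z} - 1$$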

Integrating over $u_z$, where the density of $U_z$ equals 1 on $(0, 1)$:

$$P(Z = z) = \int_0^1 P(Z = z \mid U_z = u_z) \, f_{U_z}(u_z) \, du_z$$

$$= \int_0^1 u_z^{1/\pi_z - 1} \cdot 1 \, du_z$$

$$= \left. \frac{1}{1/\pi_z} u_z^{1/\pi_z} \right|_0^1$$

$$= \pi_z$$
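Property (3) says the argmax index is distributed according to the class probabilities. A Monte Carlo sketch can make this concrete; the probabilities, seed, and sample size below are illustrative assumptions, and `gumbel_max_sample` is a hypothetical helper that assumes strictly positive probabilities.

```python
import math
import random
from collections import Counter

def gumbel_max_sample(probs, rng):
    """Return argmax_i { -log(-log(u_i)) + log(pi_i) } for fresh uniforms u_i."""
    best_index, best_score = -1, -math.inf
    for i, p in enumerate(probs):
        u = rng.random()
        while u == 0.0:                  # avoid log(0)
            u = rng.random()
        score = -math.log(-math.log(u)) + math.log(p)
        if score > best_score:
            best_index, best_score = i, score
    return best_index

rng = random.Random(1)
probs = [0.1, 0.6, 0.3]                  # illustrative pi
n = 100_000
counts = Counter(gumbel_max_sample(probs, rng) for _ in range(n))
freqs = [counts[i] / n for i in range(len(probs))]

for p, f in zip(probs, freqs):
    print(f"pi = {p:.2f}, empirical frequency = {f:.3f}")
```

The empirical frequencies should land close to `probs`, with deviations on the order of $1/\sqrt{n}$.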

$$H_i - H_j \sim \mathrm{Logistic}(\log(\pi_i) - \log(\pi_j), 1) \quad (4)$$

Proof:

Let $X, Y$ denote $H_i, H_j$, the independent shifted Gumbel variables from (2), with $\pi_x = \pi_i$ and $\pi_y = \pi_j$.

$$F_X(x) = \exp(-\exp(-(x - \log(\pi_x))))$$

$$f_X(x) = \exp(-(x - \log(\pi_x)) - \exp(-(x - \log(\pi_x))))$$

$$F_Y(y) = \exp(-\exp(-(y - \log(\pi_y))))$$

$$f_Y(y) = \exp(-(y - \log(\pi_y)) - \exp(-(y - \log(\pi_y))))$$

Define the auxiliary variable $Z = -Y$ (in this proof $Z$ no longer denotes the argmax sample):

$$Z = -Y$$

$$F(z) = P(Z < z) = P(-Y < z) = P(Y > -z) = 1 - P(Y < -z) = 1 - \exp(-\exp(-(-z - \log(\pi_y))))$$

$$= 1 - \exp(-\exp(z + \log(\pi_y)))$$

To simplify the expressions, let $a = \log(\pi_x)$ and $b = \log(\pi_y)$. Then:

$$f_Z(z) = F'(z) = -\exp(-\exp(z + b)) \cdot (-\exp(z + b)) = \exp(z + b - \exp(z + b))$$

$$Q = X - Y = X + Z$$

$$f_Q(q) = \int_{-\infty}^{+\infty} f(x, q - x) \, dx$$

$$= \int_{-\infty}^{+\infty} f_X(x) f_Z(q - x) \, dx$$

$$= \int_{-\infty}^{+\infty} \exp(-(x - a) - \exp(-(x - a))) \exp(q - x + b - \exp(q - x + b)) \, dx$$

Substitute $x = -\log(-\log(u)) + a$ with $u \in (0, 1)$. Then

$$dx = -\frac{1}{-\log u} \cdot \frac{-1}{u} \, du = -\frac{1}{u \log u} \, du$$

Continuing the derivation, note that $x \to -\infty$ as $u \to 0$ and $x \to +\infty$ as $u \to 1$:

$$f_Q(q) = \int_{-\infty}^{+\infty} \exp(\log(-\log u) - \exp(\log(-\log u))) \exp(\log(-\log u) - a + q + b - \exp(\log(-\log u) - a + q + b)) \, dx$$

$$= \int_0^1 (-\log u)(u)(-\log u) \exp(-a + q + b) \exp[\log u \cdot \exp(-a + q + b)] \, \frac{-1}{u \log u} \, du$$

Let $c = \exp(q - (a - b))$. Then

$$f_Q(q) = \int_0^1 (-\log u)(u)(-\log u)(c)(u^c) \, \frac{-1}{u \log u} \, du$$

$$= -c \int_0^1 u^c \log u \, du$$

$$= -c \left[ u^{c+1} \left( \frac{\log u}{c + 1} - \frac{1}{(c + 1)^2} \right) \right]_0^1$$

$$= -c \left[ -\frac{1}{(1 + c)^2} - 0 \right]$$

$$= \frac{c}{(1 + c)^2}$$

$$= \frac{\exp(q - (a - b))}{(1 + \exp(q - (a - b)))^2}$$

$$= \frac{\exp(-(q - (a - b)))}{(1 + \exp(-(q - (a - b))))^2}$$

The last line is the density of the logistic distribution with location $a - b$ and scale 1, so

$$Q \sim \mathrm{Logistic}(a - b, 1)$$
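Property (4) can also be checked by simulation: the difference of two independent shifted Gumbel draws should match the logistic CDF $1 / (1 + \exp(-(q - (a - b))))$, whose derivative is the density derived above. The sketch below assumes illustrative values $\pi_x = 0.6$ and $\pi_y = 0.3$, a fixed seed, and hypothetical helper names.

```python
import math
import random

def sample_gumbel(rng, loc):
    """Draw from Gumbel(loc, 1) as loc - log(-log(u)), u ~ U(0, 1)."""
    u = rng.random()
    while u == 0.0:                      # avoid log(0)
        u = rng.random()
    return loc - math.log(-math.log(u))

def logistic_cdf(q, loc):
    """CDF of Logistic(loc, 1)."""
    return 1.0 / (1.0 + math.exp(-(q - loc)))

rng = random.Random(2)
a, b = math.log(0.6), math.log(0.3)      # a = log(pi_x), b = log(pi_y)
n = 200_000
diffs = [sample_gumbel(rng, a) - sample_gumbel(rng, b) for _ in range(n)]

# Compare empirical and logistic CDFs at a few test points.
points = (-2.0, 0.0, 1.0, 3.0)
gaps = [abs(sum(d <= q for d in diffs) / n - logistic_cdf(q, a - b)) for q in points]
print(max(gaps))
```

A small maximum gap is consistent with $Q \sim \mathrm{Logistic}(a - b, 1)$; as a special case, $P(Q > 0) = \pi_x / (\pi_x + \pi_y)$, which is the two-class version of property (3).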
