Mathematical Derivation: The Normalized Exponential Function (softmax)

Core formula
$$S\left( y_i \right) = \frac{e^{y_i}}{\sum_{j} e^{y_j}}$$
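As a rough illustration of the formula, here is a minimal pure-Python sketch. The function name `softmax` and the max-subtraction step are assumptions added here, not part of the original post; subtracting the maximum does not change the result mathematically and only guards against overflow in `exp`.

```python
import math

def softmax(y):
    """Return the softmax of a list of real numbers."""
    # Assumption (not in the original post): shift by the maximum.
    # The shift cancels between numerator and denominator.
    m = max(y)
    exps = [math.exp(v - m) for v in y]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)       # three positive values, largest for the largest input
print(sum(probs))  # close to 1.0
```

The outputs are positive and sum to 1, which is why they can be read as probabilities.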
Derivative:
$$p_i = \frac{e^{a_i}}{\sum_{k=1}^{N} e^{a_k}}$$
$$\frac{\partial p_i}{\partial a_j} = \frac{\partial \frac{e^{a_i}}{\sum_{k=1}^{N} e^{a_k}}}{\partial a_j}$$

Treat the numerator and the denominator as functions of $x = a_j$:
$$g\left( x \right) = e^{a_i}$$
$$h\left( x \right) = \sum_{k=1}^{N} e^{a_k}$$

Apply the quotient rule to $f(x) = g(x) / h(x)$:
$$f'\left( x \right) = \frac{g'\left( x \right) h\left( x \right) - h'\left( x \right) g\left( x \right)}{h^2\left( x \right)}$$
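Using this formula requires the derivatives of the numerator $e^{a_i}$ and the denominator $\sum_{k=1}^{N} e^{a_k}$ with respect to $a_j$. A short worked step, assuming $x$ stands for $a_j$ (the post does not say so explicitly):

$$
g'\left( x \right) = \frac{\partial e^{a_i}}{\partial a_j} =
\begin{cases}
e^{a_i}, & i = j \\
0, & i \neq j
\end{cases}
\qquad
h'\left( x \right) = \frac{\partial}{\partial a_j} \sum_{k=1}^{N} e^{a_k} = e^{a_j}
$$

Only the numerator depends on whether $i$ equals $j$, which is why the derivation splits into two cases.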

Case analysis
When $i = j$ (the result is positive):
$$
\begin{aligned}
\frac{\partial \frac{e^{a_i}}{\sum_{k=1}^{N} e^{a_k}}}{\partial a_j}
&= \frac{e^{a_i} \sum_{k=1}^{N} e^{a_k} - e^{a_j} e^{a_i}}{\left( \sum_{k=1}^{N} e^{a_k} \right)^2} \\
&= \frac{e^{a_i} \left( \sum_{k=1}^{N} e^{a_k} - e^{a_j} \right)}{\left( \sum_{k=1}^{N} e^{a_k} \right)^2} \\
&= \frac{e^{a_j}}{\sum_{k=1}^{N} e^{a_k}} \times \frac{\sum_{k=1}^{N} e^{a_k} - e^{a_j}}{\sum_{k=1}^{N} e^{a_k}} \\
&= p_j \left( 1 - p_j \right)
\end{aligned}
$$

When $i \neq j$ (the result is negative):
$$
\begin{aligned}
\frac{\partial \frac{e^{a_i}}{\sum_{k=1}^{N} e^{a_k}}}{\partial a_j}
&= \frac{0 \times \sum_{k=1}^{N} e^{a_k} - e^{a_j} e^{a_i}}{\left( \sum_{k=1}^{N} e^{a_k} \right)^2} \\
&= \frac{e^{a_i}}{\sum_{k=1}^{N} e^{a_k}} \times \frac{-e^{a_j}}{\sum_{k=1}^{N} e^{a_k}} \\
&= -p_i p_j
\end{aligned}
$$
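The two cases, $p_i(1 - p_i)$ on the diagonal and $-p_i p_j$ off it, can be checked numerically. The sketch below compares them against central finite differences; the helper names `softmax_jacobian` and `numerical_jacobian`, the step size `eps`, and the test input are all hypothetical choices made for illustration.

```python
import math

def softmax(a):
    # Shifting by the maximum is an added safeguard against overflow.
    m = max(a)
    exps = [math.exp(v - m) for v in a]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(a):
    """J[i][j] = dp_i/da_j, built from the two cases of the derivation."""
    p = softmax(a)
    n = len(p)
    return [[p[i] * (1.0 - p[i]) if i == j else -p[i] * p[j]
             for j in range(n)]
            for i in range(n)]

def numerical_jacobian(a, eps=1e-6):
    """Approximate dp_i/da_j with central finite differences."""
    n = len(a)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(a)
        down = list(a)
        up[j] += eps
        down[j] -= eps
        p_up, p_down = softmax(up), softmax(down)
        for i in range(n):
            jac[i][j] = (p_up[i] - p_down[i]) / (2 * eps)
    return jac

a = [0.5, -1.0, 2.0]
analytic = softmax_jacobian(a)
numeric = numerical_jacobian(a)
max_err = max(abs(analytic[i][j] - numeric[i][j])
              for i in range(len(a)) for j in range(len(a)))
print(max_err)  # should be tiny if the closed-form cases are right
```

Each row of the Jacobian sums to zero, since the outputs always sum to 1 and so their changes must cancel.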

Note:

  • A small gripe: CSDN's LaTeX support is patchy, and a lot of syntax is buggy.