1 GRU (Gated Recurrent Unit)
Here $\sigma$ is the logistic sigmoid and $\circ$ is the element-wise product.
Update gate:
$$z_t=\sigma(W^{(z)}x_t+U^{(z)}h_{t-1})$$
Reset gate:
$$r_t=\sigma(W^{(r)}x_t+U^{(r)}h_{t-1})$$
New memory state:
$$\hat{h}_t=\tanh(Wx_t+r_t\circ Uh_{t-1})$$
Final hidden state:
$$h_t=z_t\circ h_{t-1}+(1-z_t)\circ \hat{h}_t$$
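The four GRU equations can be sketched as a single time-step function. This is a minimal NumPy illustration, not a reference implementation: the names `gru_step`, `init_gru_params`, and the parameter dictionary keys are hypothetical, the weights are random, and bias terms are omitted because the equations above do not include them.

```python
import numpy as np

def sigmoid(a):
    # Logistic sigmoid, applied element-wise.
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, p):
    """One GRU time step following the equations above (no bias terms)."""
    z_t = sigmoid(p["W_z"] @ x_t + p["U_z"] @ h_prev)         # update gate
    r_t = sigmoid(p["W_r"] @ x_t + p["U_r"] @ h_prev)         # reset gate
    h_hat = np.tanh(p["W"] @ x_t + r_t * (p["U"] @ h_prev))   # new memory state
    return z_t * h_prev + (1.0 - z_t) * h_hat                 # final hidden state

def init_gru_params(input_dim, hidden_dim, seed=0):
    # Hypothetical helper: small random weights, for illustration only.
    rng = np.random.default_rng(seed)
    params = {}
    for name in ("W_z", "W_r", "W"):
        params[name] = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
    for name in ("U_z", "U_r", "U"):
        params[name] = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
    return params

params = init_gru_params(input_dim=4, hidden_dim=3)
inputs = np.random.default_rng(1).normal(size=(5, 4))  # toy sequence of 5 inputs
h = np.zeros(3)
for x in inputs:
    h = gru_step(x, h, params)
print(h.shape)  # (3,)
```

Because the final hidden state is a convex combination of the previous state and a `tanh` output, each component of `h` stays inside (-1, 1) when the initial state is zero. When the update gate is close to 1, the previous state is copied through almost unchanged.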
2 LSTM (Long Short-Term Memory)
Input gate:
$$i_t=\sigma(W^{(i)}x_t+U^{(i)}h_{t-1})$$
Forget gate:
$$f_t=\sigma(W^{(f)}x_t+U^{(f)}h_{t-1})$$
Output gate:
$$o_t=\sigma(W^{(o)}x_t+U^{(o)}h_{t-1})$$
New memory cell:
$$\hat{c}_t=\tanh(W^{(c)}x_t+U^{(c)}h_{t-1})$$
Final memory cell:
$$c_t=f_t\circ c_{t-1}+i_t\circ \hat{c}_t$$
Final output (hidden state):
$$h_t=o_t\circ \tanh(c_t)$$
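The LSTM equations translate into code in the same way. The sketch below is a rough NumPy illustration under stated assumptions: `lstm_step`, `init_lstm_params`, and the dictionary keys are hypothetical names, the weights are random, and bias terms are omitted to match the equations above.

```python
import numpy as np

def sigmoid(a):
    # Logistic sigmoid, applied element-wise.
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following the equations above (no bias terms)."""
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev)    # input gate
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev)    # forget gate
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev)    # output gate
    c_hat = np.tanh(p["W_c"] @ x_t + p["U_c"] @ h_prev)  # new memory cell
    c_t = f_t * c_prev + i_t * c_hat                     # final memory cell
    h_t = o_t * np.tanh(c_t)                             # final output
    return h_t, c_t

def init_lstm_params(input_dim, hidden_dim, seed=0):
    # Hypothetical helper: small random weights, for illustration only.
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("i", "f", "o", "c"):
        params["W_" + gate] = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
        params["U_" + gate] = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
    return params

params = init_lstm_params(input_dim=4, hidden_dim=3)
inputs = np.random.default_rng(1).normal(size=(5, 4))  # toy sequence of 5 inputs
h, c = np.zeros(3), np.zeros(3)
for x in inputs:
    h, c = lstm_step(x, h, c, params)
print(h.shape, c.shape)  # (3,) (3,)
```

Unlike the GRU, the LSTM carries two states between steps: the memory cell `c` and the hidden state `h`. The output gate only controls how much of `tanh(c)` is exposed as `h`, so each component of `h` stays inside (-1, 1), while `c` is not bounded in the same way.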
cs224n Lecture 9 (GRU-LSTM)