Bi-LSTM


weixin_40200315

Categories: Machine Learning, Python, Deep Learning Theory. Tags: Bi-LSTM

Article link: https://blog.csdn.net/weixin_40200315/article/details/97887473


https://blog.csdn.net/vivian_ll/article/details/88974691
https://blog.csdn.net/jerr__y/article/details/70471066

[Figure: Bi-LSTM network structure]

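The rough idea behind a Bi-LSTM is as follows. Look at the input layer at the bottom of the figure, and suppose one sample (a sentence) has 10 timesteps (characters) of input, $x_1, x_2, \dots, x_{10}$. There are two separate LSTMCells:

For the forward fw_cell, the sample is fed into the cell in the order $x_1, x_2, \dots, x_{10}$, producing the first set of state outputs $\{\overrightarrow{h_1}, \overrightarrow{h_2}, \dots, \overrightarrow{h_{10}}\}$;
For the backward bw_cell, the sample is fed into the cell in the reverse order $x_{10}, x_9, \dots, x_1$, producing the second set of state outputs $\{\overleftarrow{h_{10}}, \overleftarrow{h_9}, \dots, \overleftarrow{h_1}\}$;
Each element of the two sets of state outputs is a vector of length hidden_size (in general, $\overrightarrow{h_1}$ and $\overleftarrow{h_1}$ have the same length). The two sets are then concatenated timestep by timestep: $\{[\overrightarrow{h_1}, \overleftarrow{h_1}], [\overrightarrow{h_2}, \overleftarrow{h_2}], \dots, [\overrightarrow{h_{10}}, \overleftarrow{h_{10}}]\}$.
Finally, for the input $x_t$ at each timestep, we obtain a state output of length 2*hidden_size, $H_t = [\overrightarrow{h_t}, \overleftarrow{h_t}]$. From this point on, processing is the same as for a unidirectional LSTM.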
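The forward pass, backward pass, and per-timestep concatenation described above can be sketched in plain NumPy. This is only an illustrative sketch, not the author's TensorFlow code: the helper names (`init_lstm_params`, `run_lstm`, `bi_lstm_outputs`), the gate ordering, the random weights, and the toy sizes are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_lstm_params(input_size, hidden_size, rng):
    """Random weights for one LSTM cell (hypothetical helper, illustration only)."""
    W = rng.normal(scale=0.1, size=(input_size + hidden_size, 4 * hidden_size))
    b = np.zeros(4 * hidden_size)
    return W, b

def run_lstm(xs, params, hidden_size):
    """Run one LSTM cell over xs of shape [timesteps, input_size]; return every h_t."""
    W, b = params
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    outputs = []
    for x in xs:
        z = np.concatenate([x, h]) @ W + b
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)  # shape [timesteps, hidden_size]

def bi_lstm_outputs(xs, fw_params, bw_params, hidden_size):
    """Return H of shape [timesteps, 2 * hidden_size], with H[t] = [fw h_t, bw h_t]."""
    fw = run_lstm(xs, fw_params, hidden_size)  # reads x_1 ... x_T
    # The backward cell reads x_T ... x_1; flip its outputs so row t lines up with x_t.
    bw = run_lstm(xs[::-1], bw_params, hidden_size)[::-1]
    return np.concatenate([fw, bw], axis=1)

rng = np.random.default_rng(0)
timesteps, input_size, hidden_size = 10, 8, 4  # assumed toy sizes
xs = rng.normal(size=(timesteps, input_size))
fw_params = init_lstm_params(input_size, hidden_size, rng)
bw_params = init_lstm_params(input_size, hidden_size, rng)

H = bi_lstm_outputs(xs, fw_params, bw_params, hidden_size)
print(H.shape)  # (10, 8), i.e. [timesteps, 2 * hidden_size]
```

Note that the two cells have separate parameters, and that the backward outputs are flipped back before concatenation so that both halves of each row describe the same timestep.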

# Assumes TensorFlow 1.x, with:
#   import tensorflow as tf
#   from tensorflow.contrib import rnn
def bi_lstm(X_inputs):
    """Build the bi-LSTM network. Return y_pred."""
    # *** 0. Char embedding. Make sure you understand how embeddings work; anyone doing NLP must know this. ***
    embedding = tf.get_variable("embedding", [vocab_size, embedding_size], dtype=tf.float32)
    # X_inputs.shape = [batchsize, timestep_size] -> inputs.shape = [batchsize, timestep_size, embedding_size]
    inputs = tf.nn.embedding_lookup(embedding, X_inputs)
    # *** 1. LSTM layer ***
    lstm_fw_cell = rnn.BasicLSTMCell(hidden_size, forget_bi…
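The shape change performed by the embedding lookup in step 0 can be illustrated without TensorFlow. The sketch below uses NumPy integer-array indexing as a stand-in for `tf.nn.embedding_lookup` on a single embedding table; the sizes and the random table are assumptions chosen only for the example.

```python
import numpy as np

vocab_size, embedding_size = 50, 6   # assumed toy sizes
batch_size, timestep_size = 2, 10

rng = np.random.default_rng(0)
# One row per vocabulary entry, analogous to the "embedding" variable.
embedding = rng.normal(size=(vocab_size, embedding_size))
# Integer character ids, shape [batch_size, timestep_size].
X_inputs = rng.integers(0, vocab_size, size=(batch_size, timestep_size))

# Each id is replaced by its row of the table.
inputs = embedding[X_inputs]
print(inputs.shape)  # (2, 10, 6), i.e. [batch_size, timestep_size, embedding_size]
```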
