Keras Bidirectional LSTM
A bidirectional LSTM also exploits future context, so on many text-classification and sequence-prediction problems it achieves better results than a unidirectional LSTM. Compared with a plain LSTM, a BiLSTM adds a second pass over the sequence in the reverse direction and combines it with the forward pass to produce the final output. For the forward computation of the LSTM itself, see here.
This post simply records how Keras counts the parameters of a BiLSTM. The training code is as follows:
from keras.models import Sequential
from keras.layers import (LSTM, Bidirectional, Dropout, BatchNormalization,
                          TimeDistributed, Dense)

model = Sequential()
input_shape = (149, 40)  # (time steps, features)
model.add(Bidirectional(LSTM(units=20, return_sequences=True),
                        input_shape=input_shape))
model.add(Dropout(0.5))
model.add(BatchNormalization())
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
# model.add(Dense(1, activation='sigmoid'))

# LSTM parameter count: the concatenation [h_{t-1}; x_t], the hidden units,
# and the biases of the four gates:
# (20 + 40) * units * 4 + 20 * 4

batch_size = 64
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X_training, Y_training,
          batch_size=batch_size,
          epochs=30,
          validation_data=(x_test, y_test),
          verbose=1)
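Before counting parameters, the bidirectional idea itself can be sketched without Keras. The toy step function below is not a real LSTM (just a single tanh recurrence), but it shows what `Bidirectional` with the default `merge_mode='concat'` does: run the recurrence forward, run it again over the time-reversed input, re-align the backward outputs, and concatenate, which doubles the output width.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_in, n_units = 5, 4, 3
x = rng.standard_normal((T, n_in))
W = rng.standard_normal((n_units, n_in))     # input weights
U = rng.standard_normal((n_units, n_units))  # recurrent weights

def run(seq):
    """Toy recurrence h_t = tanh(W x_t + U h_{t-1}); stands in for an LSTM."""
    h = np.zeros(n_units)
    out = []
    for x_t in seq:
        h = np.tanh(W @ x_t + U @ h)
        out.append(h)
    return np.stack(out)

fwd = run(x)              # forward pass over t = 0 .. T-1
bwd = run(x[::-1])[::-1]  # backward pass, re-reversed so time steps line up
bi = np.concatenate([fwd, bwd], axis=-1)
print(bi.shape)  # (5, 6): output width doubles, just as units=20 becomes 40
```

This is why the first layer in the summary below reports a last dimension of 40 even though `units=20`.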
Passing return_sequences=True to the LSTM makes it emit an output at every time step. The trained model looks like this:
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
bidirectional_1 (Bidirection (None, 149, 40) 9760
_________________________________________________________________
dropout_1 (Dropout) (None, 149, 40) 0
_________________________________________________________________
batch_normalization_1 (Batch (None, 149, 40) 160
_________________________________________________________________
time_distributed_1 (TimeDist (None, 149, 1) 41
=================================================================
Total params: 9,961
Trainable params: 9,881
Non-trainable params: 80
_________________________________________________________________
Compared with a unidirectional LSTM, a BiLSTM is simply a factor-of-two relationship. The first layer's output in the summary is (None, 149, 40), so each single-direction LSTM outputs (None, 149, 20), matching units=20 in the training code. The per-direction parameter count is therefore identical to a unidirectional LSTM's, and the RNN layer in the model above has ((40 + 20) * 20 * 4 + 20 * 4) * 2 = 9760 parameters.
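The whole summary can be reproduced with a few lines of arithmetic. The BatchNormalization figures follow the same logic: it keeps 4 values per feature (gamma and beta are trainable; the moving mean and variance are not), and the TimeDistributed Dense(1) has one weight per input feature plus a bias.

```python
# Parameter-count check for the summary above (units=20, 40 input features).
units, n_in = 20, 40

# One direction: 4 gates, each with input kernel, recurrent kernel, and bias.
lstm_one_dir = (n_in + units) * units * 4 + units * 4
bilstm = lstm_one_dir * 2
print(bilstm)  # 9760

width = 2 * units                 # BiLSTM output width after concatenation
bn = 4 * width                    # gamma, beta, moving mean, moving variance
bn_trainable = 2 * width          # only gamma and beta are trainable
dense = width * 1 + 1             # TimeDistributed Dense(1): weights + bias
print(bn, dense)                  # 160 41

print(bilstm + bn + dense)            # 9961 total params
print(bilstm + bn_trainable + dense)  # 9881 trainable (80 non-trainable)
```

The 80 non-trainable parameters in the summary are exactly the moving mean and variance of the BatchNormalization layer (2 * 40).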