- Padding sequences
https://tensorflow.google.cn/beta/guide/keras/masking_and_padding
import tensorflow as tf

raw_inputs = [
    [83, 91, 1, 645, 1253, 927],
    [73, 8, 3215, 55, 927],
    [711, 632, 71],
]
# By default, this pads with 0s; the fill value is configurable via the
# "value" parameter.
# Note that you can choose "pre" padding (at the beginning) or
# "post" padding (at the end).
# "post" padding is recommended when working with RNN layers
# (so that the CuDNN implementation of the layers can be used).
padded_inputs = tf.keras.preprocessing.sequence.pad_sequences(raw_inputs,
                                                              padding='post')
print(padded_inputs)
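To make the difference between "pre" and "post" padding concrete, here is a dependency-free sketch of the same idea. `pad_to_longest` is a hypothetical helper written for illustration, not part of Keras; the real `pad_sequences` defaults to `padding='pre'` and also supports `maxlen` and `truncating`, which this sketch leaves out.

```python
def pad_to_longest(sequences, padding="post", value=0):
    """Pad every sequence with `value` up to the length of the longest one.

    Hypothetical helper for illustration; it mimics only the `padding`
    and `value` options of Keras' pad_sequences.
    """
    if padding not in ("pre", "post"):
        raise ValueError("padding must be 'pre' or 'post'")
    max_len = max((len(seq) for seq in sequences), default=0)
    padded = []
    for seq in sequences:
        fill = [value] * (max_len - len(seq))
        # "pre" puts the fill before the tokens, "post" puts it after them.
        padded.append(fill + list(seq) if padding == "pre" else list(seq) + fill)
    return padded


raw_inputs = [
    [83, 91, 1, 645, 1253, 927],
    [73, 8, 3215, 55, 927],
    [711, 632, 71],
]
post_padded = pad_to_longest(raw_inputs, padding="post")
pre_padded = pad_to_longest(raw_inputs, padding="pre")
print(post_padded[2])  # [711, 632, 71, 0, 0, 0]
print(pre_padded[2])   # [0, 0, 0, 711, 632, 71]
```

With "post" padding the real tokens stay at the start of each row, which is the layout the note above recommends for RNN layers.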
- Mask
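After padding, all samples have the same length, so the model must be told which parts of the data are padding and should be ignored. This mechanism is called masking.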
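As a rough illustration of what a mask is, the sketch below builds one by hand in plain Python: a boolean grid with the same shape as the padded batch, `True` for real tokens and `False` for padding. It assumes ID 0 is reserved for padding; `PAD_ID` and `masked_mean` are names made up for this example.

```python
padded_inputs = [
    [83, 91, 1, 645, 1253, 927],
    [73, 8, 3215, 55, 927, 0],
    [711, 632, 71, 0, 0, 0],
]
PAD_ID = 0  # assumption: ID 0 never occurs as a real token

# True marks a real token, False marks padding.
mask = [[token != PAD_ID for token in row] for row in padded_inputs]

# The true length of each sequence is the number of True entries in its row.
lengths = [sum(row_mask) for row_mask in mask]
print(lengths)  # [6, 5, 3]


def masked_mean(row, row_mask):
    """Average only the positions that the mask marks as real (hypothetical helper)."""
    kept = [token for token, keep in zip(row, row_mask) if keep]
    return sum(kept) / len(kept)


# Ignoring the padding changes the result: the plain mean of the last row
# divides by 6, the masked mean divides by 3.
print(sum(padded_inputs[2]) / len(padded_inputs[2]))
print(masked_mean(padded_inputs[2], mask[2]))
```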
There are three ways to introduce input masks in Keras models:
- Add a keras.layers.Masking layer.
- Configure a keras.layers.Embedding layer with mask_zero=True.
- Pass a mask argument manually when calling layers that support this argument (e.g. RNN layers).
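The three options listed above can be sketched as follows. This is a minimal example rather than a full model: the vocabulary size (5000), embedding width (16), and LSTM size (8) are arbitrary values chosen for illustration, and the padded batch is hard-coded so the snippet stands alone.

```python
import numpy as np
import tensorflow as tf

padded_inputs = np.array([
    [83, 91, 1, 645, 1253, 927],
    [73, 8, 3215, 55, 927, 0],
    [711, 632, 71, 0, 0, 0],
])

# Option 1: a Masking layer marks timesteps whose features all equal mask_value.
# It works on feature tensors, so the IDs are expanded to shape (3, 6, 1) here.
masking = tf.keras.layers.Masking(mask_value=0.0)
features = tf.cast(tf.expand_dims(padded_inputs, axis=-1), tf.float32)
masking_mask = masking.compute_mask(features)  # boolean, shape (3, 6)

# Option 2: an Embedding layer with mask_zero=True treats ID 0 as padding.
embedding = tf.keras.layers.Embedding(input_dim=5000, output_dim=16, mask_zero=True)
embedded = embedding(padded_inputs)  # shape (3, 6, 16)
embedding_mask = embedding.compute_mask(padded_inputs)  # boolean, shape (3, 6)

# Option 3: pass the mask explicitly to a layer that accepts a mask argument.
lstm = tf.keras.layers.LSTM(8)
outputs = lstm(embedded, mask=embedding_mask)  # shape (3, 8)

print(np.asarray(embedding_mask))
print(outputs.shape)
```

Both masks come out identical here because the padding value is 0 in each case. Note that `mask_zero=True` reserves index 0 for padding, so real tokens should start at index 1.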