Char_Level_CNN_Model
Model Overview
Parameter Summary
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
sent_input (InputLayer) (None, 1014) 0
_________________________________________________________________
embedding_14 (Embedding) (None, 1014, 128) 8960
_________________________________________________________________
conv1d_30 (Conv1D) (None, 1008, 256) 229632
_________________________________________________________________
activation_20 (Activation) (None, 1008, 256) 0
_________________________________________________________________
max_pooling1d_13 (MaxPooling (None, 336, 256) 0
_________________________________________________________________
conv1d_31 (Conv1D) (None, 330, 256) 459008
_________________________________________________________________
activation_21 (Activation) (None, 330, 256) 0
_________________________________________________________________
max_pooling1d_14 (MaxPooling (None, 110, 256) 0
_________________________________________________________________
conv1d_32 (Conv1D) (None, 108, 256) 196864
_________________________________________________________________
activation_22 (Activation) (None, 108, 256) 0
_________________________________________________________________
conv1d_33 (Conv1D) (None, 106, 256) 196864
_________________________________________________________________
activation_23 (Activation) (None, 106, 256) 0
_________________________________________________________________
conv1d_34 (Conv1D) (None, 104, 256) 196864
_________________________________________________________________
activation_24 (Activation) (None, 104, 256) 0
_________________________________________________________________
conv1d_35 (Conv1D) (None, 102, 256) 196864
_________________________________________________________________
activation_25 (Activation) (None, 102, 256) 0
_________________________________________________________________
max_pooling1d_15 (MaxPooling (None, 34, 256) 0
_________________________________________________________________
flatten_5 (Flatten) (None, 8704) 0
_________________________________________________________________
dense_13 (Dense) (None, 1024) 8913920
_________________________________________________________________
dropout_9 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_14 (Dense) (None, 1024) 1049600
_________________________________________________________________
dropout_10 (Dropout) (None, 1024) 0
_________________________________________________________________
dense_15 (Dense) (None, 4) 4100
=================================================================
Total params: 11,452,676
Trainable params: 11,452,676
Non-trainable params: 0
_________________________________________________________________
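The model has a large number of parameters (about 11.45 million), so it needs a correspondingly large dataset. Training on that much data with a CPU would be inconvenient, but if the dataset is too small, the model will easily overfit.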
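The figures in the summary can be reproduced by hand. The sketch below is a minimal check, not the original code: the kernel sizes (7, 7, 3, 3, 3, 3), the pool size of 3, and the use of 'valid' padding with stride 1 are assumptions inferred from the output shapes (for example, 1014 - 7 + 1 = 1008 and 1008 / 3 = 336), and the helper names are hypothetical.

```python
# Kernel sizes and pool size are inferred from the output shapes in the
# summary; they are assumptions, not values taken from the original code.

def conv1d_params(kernel_size, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units

def conv1d_length(length, kernel_size):
    """Output length of a stride-1 convolution with 'valid' padding."""
    return length - kernel_size + 1

# (kernel_size, pool_size or None) for the six convolution blocks
conv_layers = [(7, 3), (7, 3), (3, None), (3, None), (3, None), (3, 3)]

length, channels = 1014, 128   # sequence length and embedding size
total = 70 * 128               # embedding table
for kernel_size, pool_size in conv_layers:
    total += conv1d_params(kernel_size, channels, 256)
    channels = 256
    length = conv1d_length(length, kernel_size)
    if pool_size is not None:
        length //= pool_size

flat_units = length * channels  # size of the Flatten output
for in_units, out_units in [(flat_units, 1024), (1024, 1024), (1024, 4)]:
    total += dense_params(in_units, out_units)

print(length, flat_units, total)  # 34 8704 11452676
```

Most of the total comes from the first dense layer (8704 * 1024 + 1024 = 8,913,920), which is why the flattened size matters so much for the parameter count.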
Layer-by-Layer Walkthrough
Input Layer
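input_size = 1014, which means each input sample is a sequence of 1014 characters. The input layer has no parameters. Longer texts are truncated and shorter ones are padded, so every row has exactly 1014 entries.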
# Input
inputs = Input(shape=(input_size,), name='sent_input', dtype='int64') # shape=(?, 1014)
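The truncate-or-pad rule can be sketched in plain Python. This is only an illustration: the function name pad_or_truncate, the use of 0 as the padding index, and padding or truncating at the end of the sequence are assumptions, not details from the original code.

```python
def pad_or_truncate(indices, input_size=1014, pad_value=0):
    """Return a list of exactly input_size integers.

    Sequences longer than input_size are cut at the end; shorter ones
    are filled with pad_value (assumed to be 0 here).
    """
    clipped = list(indices[:input_size])
    return clipped + [pad_value] * (input_size - len(clipped))

short_row = pad_or_truncate([5, 3, 9])        # padded up to 1014
long_row = pad_or_truncate(range(1, 2001))    # cut down to 1014
print(len(short_row), len(long_row))  # 1014 1014
```

In a Keras pipeline the same result is usually obtained with pad_sequences, passing maxlen=1014 together with padding='post' and truncating='post' if end-side behaviour is wanted.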
Embedding Layer
The alphabet contains 69 characters, 'abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:\'"/\\|_@#$%^&*~+-=<>()[]{}', and any character outside it is replaced by <UNK>, so the model distinguishes 70 symbols in total. With embedding_size = 128, this layer has 128 × 70 = 8960 parameters.
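As a rough illustration of how characters might be mapped to rows of the embedding table, the sketch below reserves index 0 for unknown characters and numbers the alphabet from 1. The short alphabet, the encode function, the lowercasing step, and the index-0 convention are all assumptions made for the example, not the author's code; only the final product uses the sizes given above.

```python
# Hypothetical short alphabet, used only to keep the example readable.
alphabet = "abc0123-,;"
UNK_INDEX = 0  # assumed index for characters outside the alphabet

# Alphabet characters get indices 1..len(alphabet).
char_to_index = {ch: i + 1 for i, ch in enumerate(alphabet)}

def encode(text):
    """Map each character to its index, falling back to UNK_INDEX."""
    return [char_to_index.get(ch, UNK_INDEX) for ch in text.lower()]

print(encode("Cab-9"))  # [3, 1, 2, 8, 0]

# One embedding row per symbol: 69 alphabet characters plus one unknown.
alphabet_size, embedding_size = 69, 128
embedding_params = (alphabet_size + 1) * embedding_size
print(embedding_params)  # 8960
```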
# Embedding layer
conv = Embedding(alphabet_size+1, embedding_size, input_length=input_size)(