Keras Input layer and Embedding layer, fully explained

Welcome to follow my recommender-system WeChat public account: Tiany_RecoSystem

An example of defining a custom self-attention layer in Keras:

import pandas as pd
from keras import backend as K
from keras.datasets import imdb
from keras.preprocessing import sequence
from keras.layers import Layer, Input, Embedding, GlobalAveragePooling1D, Dropout

class Self_Attention(Layer):
    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(Self_Attention, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create the trainable weights for this layer.
        # input_shape = (batch_size, seq_len, embed_size)
        self.kernel = self.add_weight(name='kernel',
                                      shape=(3, input_shape[2], self.output_dim),
                                      initializer='uniform',
                                      trainable=True)
        super(Self_Attention, self).build(input_shape)  # Must be called at the end

    def call(self, x):
        # Project the input into Query, Key and Value with the three kernel slices
        Query = K.dot(x, self.kernel[0])
        Key = K.dot(x, self.kernel[1])
        Value = K.dot(x, self.kernel[2])
        print("Query.shape", Query.shape)
        print("K.permute_dimensions(Key, [0, 2, 1]).shape", K.permute_dimensions(Key, [0, 2, 1]).shape)
        # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
        QK = K.batch_dot(Query, K.permute_dimensions(Key, [0, 2, 1]))
        QK = QK / (self.output_dim ** 0.5)
        QK = K.softmax(QK)
        print("QK.shape", QK.shape)
        Z = K.batch_dot(QK, Value)
        return Z

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[1], self.output_dim)
max_features = 20000  # total number of words/features in the vocabulary
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
# Convert the labels to one-hot encoding
y_train, y_test = pd.get_dummies(y_train), pd.get_dummies(y_test)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')

#%% Pad every review to the same length
maxlen = 64
print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

batch_size = 32
embed_size = 128
S_inputs = Input(shape=(maxlen,), dtype='int32')
# Embedding(max_features, embed_size): a 20000 x 128 embedding_matrix
embeddings_S_inputs = Embedding(max_features, embed_size)(S_inputs)
# embeddings_S_inputs: (None, maxlen, embed_size)
# S_inputs holds the integer index of each word; each index is looked up in the embedding matrix
O_seq = Self_Attention(embed_size)(embeddings_S_inputs)
O_seq = GlobalAveragePooling1D()(O_seq)
O_seq = Dropout(0.5)(O_seq)
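The snippet above stops at the Dropout layer. Below is a minimal sketch of how the model could be completed and trained; the 2-way softmax head, the adam optimizer and the epoch count are illustration assumptions, not part of the original code:

from keras.layers import Dense
from keras.models import Model

# Hypothetical completion: a classification head on top of the pooled attention output
outputs = Dense(2, activation='softmax')(O_seq)
model = Model(inputs=S_inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

model.fit(x_train, y_train.values,
          batch_size=batch_size,
          epochs=5,
          validation_data=(x_test, y_test.values))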

[Figure: model summary showing the output shape of each layer]

Setting batch_size aside, the input is really a 64-dimensional vector of word indices, which corresponds to the maxlen above.

The Embedding(max_features, embed_size) layer we build holds a 20000 x 128 embedding_matrix.

S_inputs stores the integer index of each word, and each index is used to look up a row of that matrix, so embeddings_S_inputs has shape (None, maxlen, embed_size).

In the model summary this shows up as: embedding_1 (Embedding) with output shape (None, 64, 128).
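If you want to confirm these shapes in code, K.int_shape on the tensors built above prints them directly (just a quick check, not part of the original post):

print(K.int_shape(S_inputs))             # (None, 64): one word index per position
print(K.int_shape(embeddings_S_inputs))  # (None, 64, 128): each index replaced by a 128-dim vector
print(K.int_shape(O_seq))                # (None, 128) after GlobalAveragePooling1D and Dropout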

Therefore:

The job of the embedding layer is to turn positive-integer indices into fixed-size dense vectors, e.g. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]. Note that each index maps to exactly one vector, so the embedding layer is in essence a table lookup: we use the index to retrieve its corresponding vector.
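As a minimal, self-contained illustration of this lookup behaviour (the vocabulary size, output dimension and input indices below are made-up values, not from the article):

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

# Toy embedding table: 100 possible indices, each mapped to a 2-dim vector
toy = Sequential()
toy.add(Embedding(input_dim=100, output_dim=2, input_length=2))

# Two samples of two indices each; the layer simply looks up rows 4, 20, 7, 31
data = np.array([[4, 20], [7, 31]])
vectors = toy.predict(data)
print(vectors.shape)   # (2, 2, 2): (samples, sequence length, embedding dimension)
print(vectors[0][0])   # the 2-dim vector that index 4 maps to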
