- tf.reduce_sum()
- tf.nn.embedding_lookup()
- tf.matmul()
Functions involved in the normalization step:
tf.reduce_sum() computes the sum of a tensor's elements along a given dimension, and can drop that dimension after summing. For a deeper dive, see the article "彻底理解 tf.reduce_sum()" ("Thoroughly understanding tf.reduce_sum()").
```python
tf.reduce_sum(
    input_tensor,
    axis=None,
    keepdims=None,
    name=None,
    reduction_indices=None,
    keep_dims=None)
```
- input_tensor: the tensor to sum over;
- axis: the dimension(s) to reduce; if not specified, all elements are summed;
- keepdims: whether to retain the reduced dimensions with length 1. If True, the result keeps the input tensor's rank; if False, the reduced dimensions are dropped. If not passed, the default is False;
- name: a name for the operation;
- reduction_indices: the old name for axis in earlier versions; deprecated;
- keep_dims: the old name for keepdims in earlier versions; deprecated.
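To build intuition for axis and keepdims without starting a session, note that NumPy's np.sum follows the same semantics; a minimal sketch (NumPy stand-in, not TensorFlow code):

```python
import numpy as np

x = np.array([[1, 1, 1],
              [1, 1, 1]])

print(np.sum(x))                         # no axis: sum everything -> 6
print(np.sum(x, axis=0))                 # column sums -> [2 2 2]
print(np.sum(x, axis=1))                 # row sums -> [3 3]
print(np.sum(x, axis=1, keepdims=True))  # row sums, rank kept -> shape (2, 1)
```

With keepdims=True the result stays two-dimensional, which is exactly what makes the row-wise division in the normalization code below broadcast correctly.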
```python
import math
import tensorflow as tf

vocabulary_size = 10
embedding_size = 4
# Wrap in tf.Variable so the random values are drawn once; a bare
# tf.truncated_normal op would be re-sampled on every session.run().
embeddings = tf.Variable(
    tf.truncated_normal([vocabulary_size, embedding_size],
                        stddev=1.0 / math.sqrt(embedding_size)))
# Row-wise L2 norm; keepdims=True keeps shape (10, 1) so the division
# below broadcasts across each row.
norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keepdims=True))
normalized_embeddings = embeddings / norm
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    print(session.run(norm))
    print("####", norm.shape, embeddings.shape)
    print(session.run(normalized_embeddings))
```
Result: (output not shown)
The tf.nn.embedding_lookup function
tf.nn.embedding_lookup selects the elements of a tensor at the given indices. In tf.nn.embedding_lookup(tensor, ids), tensor is the input tensor and ids are the indices into it; both tensor and ids may be NumPy arrays or TensorFlow tensors.
```python
valid_dataset = tf.constant([1, 2, 3], dtype=tf.int32)
b = tf.nn.embedding_lookup(embeddings, valid_dataset)  # rows 1, 2 and 3
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(b))
    print("------------------")
    print(sess.run(embeddings))
```
Result: (output not shown)
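For a single, unpartitioned parameter tensor, tf.nn.embedding_lookup amounts to row selection, which in NumPy is just fancy indexing; a minimal sketch of the same lookup (the params values here are made up for illustration):

```python
import numpy as np

# 10 "embedding" vectors of size 4, filled with arbitrary values.
params = np.arange(40, dtype=np.float32).reshape(10, 4)
ids = np.array([1, 2, 3])

looked_up = params[ids]  # rows 1, 2 and 3 -> shape (3, 4)
print(looked_up.shape)
```

This equivalence only holds for the simple case; when params is a list of partitioned tensors, embedding_lookup also handles the partitioning strategy.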
Similarity computation. Here, vector similarity means cosine similarity. Because every row of both matrices has already been L2-normalized, the cosine similarity reduces to a plain dot product. Given a of shape (10, 4) and b of shape (3, 4), the similarity of every vector in a against every vector in b is c of shape (10, 3); b must be transposed so the matrix product is (10, 4) × (4, 3).
The function involved is tf.matmul().
```python
import math
import tensorflow as tf

vocabulary_size = 10
embedding_size = 4
embeddings = tf.Variable(
    tf.truncated_normal([vocabulary_size, embedding_size],
                        stddev=1.0 / math.sqrt(embedding_size)))
norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keepdims=True))
normalized_embeddings = embeddings / norm  # shape (10, 4)
valid_dataset = tf.constant([1, 2, 3], dtype=tf.int32)
valid_embeddings = tf.nn.embedding_lookup(
    normalized_embeddings, valid_dataset)  # shape (3, 4)
# transpose_b=True turns (10, 4) x (3, 4) into (10, 4) x (4, 3).
similarity = tf.matmul(
    normalized_embeddings, valid_embeddings, transpose_b=True)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(similarity))  # shape (10, 3)
    print(similarity.shape)
```
Result: (output not shown)
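The whole pipeline can be cross-checked in NumPy. This sketch (using made-up random data in place of the trained embeddings) normalizes the rows, selects rows 1-3 as the validation set, and verifies that the resulting (10, 3) similarity matrix contains 1.0 wherever a validation vector meets itself:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(10, 4))  # stand-in for the embedding matrix

# Same normalization as the TensorFlow code: divide each row by its L2 norm.
norm = np.sqrt(np.sum(a ** 2, axis=1, keepdims=True))
a_norm = a / norm             # every row now has unit length

b_norm = a_norm[[1, 2, 3]]    # the "valid" vectors, shape (3, 4)
sim = a_norm @ b_norm.T       # cosine similarities, shape (10, 3)

print(sim.shape)
# Row 1 vs. valid vector 0, row 2 vs. valid vector 1, etc. are self-matches.
print(np.allclose(sim[[1, 2, 3], [0, 1, 2]], 1.0))
```

Because all rows are unit length, every entry of sim lies in [-1, 1], which is a quick sanity check on the normalization step.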