利用已经训练好的word embedding,进行判断词相似度、word analogy、以及除去性别歧视偏差等操作
cosine相似度
# GRADED FUNCTION: cosine_similarity
def cosine_similarity(u, v):
"""
Cosine similarity reflects the degree of similariy between u and v
Arguments:
u -- a word vector of shape (n,)
v -- a word vector of shape (n,)
Returns:
cosine_similarity -- the cosine similarity between u and v defined by the formula above.
"""
distance = 0.0
### START CODE HERE ###
# Compute the dot product between u and v (≈1 line)
dot = u.dot(v)
# Compute the L2 norm of u (≈1 line)
norm_u = np.linalg.norm(u)
# Compute the L2 norm of v (≈1 line)
norm_v = np.linalg.norm(v)
# Compute the cosine similarity defined by formula (1) (≈1 line)
cosine_similarity = dot / (norm_u * norm_v)
### END CODE HERE ###
return cosine_similarity
就是求两个向量的夹角余弦,利用np的范数很容易
word analogy
进行词类比,(男 女 国王)推出皇后。原理就是遍历整个字典,寻找embedding - ‘国王’ 与 ’女‘ - ’男‘ 最相似的单词。
# GRADED FUNCTION: complete_analogy
def complete_analogy(word_a, word_b, word_c, word_to_vec_map):
"""
Performs the word analogy task as explained above: a is to b as c is to ____.
Arguments:
word_a -- a word, string
word_b -- a word, string
word_c -- a word, string
word_to_vec_map -- dictionary that maps