Word Embeddings: Similarity Computation

YingJingh

Article link: https://blog.csdn.net/Hekena/article/details/119891499

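```python
# Word embeddings from a model trained on Wikipedia
import tensorflow_hub as hub

# Load the pretrained text embedding model from TensorFlow Hub.
embed = hub.load("https://tfhub.dev/google/Wiki-words-500/2")

# Embed two short example sentences.
embeddings = embed(["cat is on the mat", "dog is in the fog"])

# Embed three English texts of different lengths.
english_sentences = ["dog", "Puppies are nice.", "I enjoy taking long walks along the beach with my dog."]
english_embedding = embed(english_sentences)

print(embeddings)
print(english_embedding)
print(english_embedding.shape)  # one row per input sentence
```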
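
The snippet above only prints the embeddings; it does not compare them. As a minimal sketch of the similarity step, the block below computes a cosine similarity matrix with NumPy. The helper name `cosine_similarity_matrix` and the toy vectors are assumptions made for illustration; they are not part of the original post and do not come from the model. The sketch also assumes that no input row is all zeros, since such a row cannot be normalized.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Return the cosine similarity between every row of a and every row of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Scale each row to unit length so the inner product equals the cosine.
    # Assumption: no row is all zeros (that would divide by zero).
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.inner(a_unit, b_unit)


# Hypothetical 3-dimensional vectors standing in for real model output.
toy_a = np.array([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
toy_b = np.array([[2.0, 0.0, 0.0], [0.0, 0.0, 5.0]])

sim = cosine_similarity_matrix(toy_a, toy_b)
print(sim.shape)         # (2, 2): rows of toy_a by rows of toy_b
print(np.round(sim, 3))  # entry [i, j] compares toy_a[i] with toy_b[j]
```

With real embeddings, passing `english_embedding.numpy()` for both arguments should give a 3 × 3 matrix whose entry `[i, j]` compares sentence `i` with sentence `j`. Normalizing first keeps the scores in the range -1 to 1 regardless of vector length, which a plain inner product does not guarantee.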


```python
# Second embedding model: multilingual Universal Sentence Encoder

import tensorflow_hub as hub
import numpy as np
# Imported for its side effect: it registers the text ops this model needs.
import tensorflow_text

# Some texts of different lengths.
english_sentences = ["dog", "Puppies are nice.", "I enjoy taking long walks along the beach with my dog."]
italian_sentences = ["cane", "I cuccioli sono carini.", "Mi piace fare lunghe passeggiate lungo la spiaggia con il mio cane."]
chinese_sentences = ['狗','狗是友好的','我喜欢和狗狗一起散步']

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual/3")

# Compute embeddings.
en_result = embed(english_sentences)
it_result = embed(italian_sentences)
ch_result = embed(chinese_sentences)

# Compute similarity matrix. Higher score indicates greater similarity.
# Entry [i, j] compares English sentence i with sentence j of the other language.
similarity_matrix_it = np.inner(en_result, it_result)
similarity_matrix_ch = np.inner(en_result, ch_result)  # was named "_ja", but the input is Chinese

print(similarity_matrix_it)
print(similarity_matrix_ch)
```