Building Smart Embeddings with Jina Embeddings: A Complete Guide from Text to Images

sjufgwgfhoia

Published 2024-10-03 00:48:42

Tags: jina python

Article link: https://blog.csdn.net/sjufgwgfhoia/article/details/142687361

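# Introduction

At the intersection of natural language processing and computer vision, embedding models turn text and images into numeric vectors. Jina Embeddings exposes this through a simple API, so developers can generate text and image embeddings with a few calls. This article shows how to use Jina Embeddings through LangChain to embed text and images and then compare the results.

# Main content

## Install the required libraries

First, install `langchain-community`, which provides the `JinaEmbeddings` class used throughout this article.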

```bash
pip install -U langchain-community
```

## Import the required libraries

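Before writing any code, import the Python libraries used in the examples. `PIL` comes from the Pillow package; install it separately (`pip install pillow`) if it is not already available.

```python
import requests  # downloads the example image
from langchain_community.embeddings import JinaEmbeddings
from numpy import dot  # dot product, used for cosine similarity
from numpy.linalg import norm  # vector length, used for cosine similarity
from PIL import Image  # opens and previews the example image
```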

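## Text embeddings with the Jina Embeddings API

Jina Embeddings converts text into numeric vectors. `embed_query` embeds a single string and returns one vector, while `embed_documents` takes a list of strings and returns one vector per string.

```python
text_embeddings = JinaEmbeddings(
    jina_api_key="jina_*",  # placeholder: replace with your own Jina API key
    model_name="jina-embeddings-v2-base-en",
)

text = "This is a test document."

# Embed a single query string; the result is one vector (a list of floats).
query_result = text_embeddings.embed_query(text)

print(query_result)

# Embed a list of documents; the result is a list with one vector per document.
doc_result = text_embeddings.embed_documents([text])

print(doc_result)
```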
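Hardcoding the key as in `jina_api_key="jina_*"` is fine for a quick test, but a real key usually should not live in source code. The sketch below reads it from an environment variable instead. The variable name `JINA_API_KEY` and the helper `load_jina_api_key` are assumptions made for illustration; the original article does not specify either.

```python
import os


def load_jina_api_key(env_var: str = "JINA_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing.

    Both this helper and the default variable name are illustrative
    assumptions, not part of the original article.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

The returned value can then be passed as `jina_api_key=load_jina_api_key()` when constructing `JinaEmbeddings`.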

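## Image and text embeddings with Jina CLIP

The Jina CLIP model embeds both images and text into the same vector space, so an image can be compared directly with a text description.

```python
multimodal_embeddings = JinaEmbeddings(
    jina_api_key="jina_*",  # placeholder: replace with your own Jina API key
    model_name="jina-clip-v1",
)

image = "https://avatars.githubusercontent.com/u/126733545?v=4"
description = "Logo of a parrot and a chain on green background"

# Download the image and open it in the default viewer as a visual check.
im = Image.open(requests.get(image, stream=True).raw)
print("Image:")
im.show()

# Embed the image by URL; the result is a list with one vector per image.
image_result = multimodal_embeddings.embed_images([image])

print(image_result)

# Embed the description with the same model so the two vectors are comparable.
description_result = multimodal_embeddings.embed_documents([description])

print(description_result)
```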
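Because `embed_images` and `embed_documents` each return one vector per input, it can help to confirm that the counts and vector lengths line up before comparing anything. The following is a minimal sketch that uses short made-up vectors in place of real API output; the helper name `check_embeddings` is hypothetical.

```python
from typing import List, Sequence


def check_embeddings(inputs: Sequence[str], vectors: List[List[float]]) -> int:
    """Validate one vector per input and a uniform length; return that length."""
    if len(inputs) != len(vectors):
        raise ValueError(f"{len(inputs)} inputs but {len(vectors)} vectors")
    dims = {len(v) for v in vectors}
    if len(dims) != 1:
        raise ValueError(f"inconsistent vector lengths: {sorted(dims)}")
    return dims.pop()


# Made-up 3-dimensional vectors standing in for real embeddings.
image_dim = check_embeddings(["image-uri"], [[0.1, 0.2, 0.3]])
text_dim = check_embeddings(["a description"], [[0.3, 0.1, 0.2]])
print(image_dim == text_dim)
```

With real results, the same check could be applied to `image_result` and `description_result` before any similarity is computed.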

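## Compute cosine similarity

The cosine similarity between the image embedding and the description embedding indicates how closely the description matches the image; values closer to 1 mean a closer match.

```python
# Cosine similarity = dot product divided by the product of the vector lengths.
cosine_similarity = dot(image_result[0], description_result[0]) / (
    norm(image_result[0]) * norm(description_result[0])
)

print(cosine_similarity)
```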
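The same formula can be wrapped in a small function and used to rank several candidate descriptions against one image vector. This is only a sketch with made-up 3-dimensional vectors rather than real embeddings, and `cosine_similarity` and `rank_by_similarity` are illustrative names.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)


def rank_by_similarity(query_vec, candidates):
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [
        (label, cosine_similarity(query_vec, vec))
        for label, vec in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up vectors standing in for an image embedding and text embeddings.
image_vec = [1.0, 0.0, 0.0]
candidates = {
    "same direction": [2.0, 0.0, 0.0],
    "orthogonal": [0.0, 1.0, 0.0],
    "opposite": [-1.0, 0.0, 0.0],
}
ranking = rank_by_similarity(image_vec, candidates)
for label, score in ranking:
    print(f"{label}: {score:.2f}")
```

Scores range from -1 (opposite direction) through 0 (unrelated) to 1 (same direction), and vector length does not affect the result.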