基于Amazon Bedrock打造Claude3 Opus智能助理

近期,Anthropic 发布了其最新的大模型 Claude3。截止本文撰写时,Claude3 Opus、Claude3 Sonnet、Claude3 Haiku 均已在 Amazon Bedrock 可用,随着 Amazon Bedrock 可提供越来越多的大模型,您可以在您的应用场景里将其落地,以便扩展您丰富的业务功能。

在此博客里,我们将通过构建一个前端示例(基于开源项目 ChatGPT-Next-Web)并通过 Amazon CDK 部署,来演示如何将您的 AI Chat 助手接入 Amazon Bedrock,其中根据不同场景快速提问(预置多种 Prompt)并实现多模态、流式输出(打字机效果)等能力。

我们在该方案主要使用到如下服务和组件:

  • Amazon Elastic Container Service(Amazon ECS)是一项完全托管的容器编排服务,可帮助您更有效地部署、管理和扩展容器化的应用程序。
  • Amazon Elastic Container Registry(Amazon ECR)是完全托管式容器注册表,提供高性能托管,让您能在任何地方可靠地部署应用程序映像和构件。
  • Amazon Bedrock 是一项完全托管的服务,通过单个 API 提供来自 AI21 Labs、Anthropic、Cohere、Meta、Stability AI 和 Amazon 等领先人工智能公司的高性能基础模型(FM),以及通过安全性、隐私性和负责任的 AI 构建生成式人工智能应用程序所需的一系列广泛功能。
  • Amazon Cognito Customer Identity And Access Management.

事先准备:

  • 您需要确保您已有亚马逊云科技 Global 账号,能够访问美东 1 区(us-east-1)、美西 2 区(us-west-2)、东京区(ap-northeast-1)、新加坡区(ap-southeast-1)其中任意一区,且能够在 Amazon Bedrock 控制台申请 Claude3 的权限,请参考:Amazon Bedrock endpoints and quotas – AWS General ReferenceAmazon Bedr…。如您需要 Claude3 Opus 相关功能,截止本文撰写时,Claude3 Opus 仅支持美西 2 区(us-west-2)使用。
  • 您需要准备亚马逊云科技账号上述区域的 Access Key、Secret Access Key,且此账号拥有足够 IAM 权限能够调用 Amazon Bedrock。
  • 您需要准备一台已安装Docker、Node.js(v>=18)、AWS SDK环境的电脑/EC2服务器,将上述步骤的Access Key、Secret Access Key 通过aws configure配置,上述所有服务均启动。

演示

架构设计

阐述:

  1. 项目代码通过 Next.js 实现(React.js 的一种 SSR 框架)。
  2. Next.js 的 UI 层负责页面渲染,UI 逻辑实现,向服务端发请求,预置 Prompt 和模型参数等功能。
  3. Next.js API 层负责暴露 API 供 UI 层调用,Server 层负责拼装请求大模型的参数,实现 Agent,通过 AWS SDK for javascript 调用 Amazon Bedrock。
  4. Amazon Cognito 负责鉴权,与 UI 层代码嵌入,具体实现逻辑请参考 Build & connect backend – JavaScript – AWS Amplify Documentation。您也可以根据您实际情况修改 app/components/home.tsx 中的代码来修改全局渲染逻辑,接入不同的鉴权场景。

请求流程:

  1. 用户在浏览器打开部署好的项目网址(通过 NLB 暴露)。
  2. 流量先到 Amazon ECS 判断未登录后,重定向到 Amazon Cognito 的鉴权页面,登录成功后请求流量打到 Amazon ECS,返回正常的页面。
  3. 输入问题之后,UI 层的 JS 代码会将历史对话上下文,参数配置,发送给 Server 端 API,Server 端拼装参数并请求 Amazon Bedrock。
  4. 得到返回后直接将 Bedrock response 中的 body 转成 ReadableStream,返回给 UI 层。UI 层实现流式响应、上屏逻辑。
  5. 如果您在设置中选择压缩历史记录上下文,每次对话正确响应后,UI 层会将聊天上下文自动发请求到服务端,并要求大模型对其内容进行总结归纳。

完整代码原文传送门:基于Amazon Bedrock打造Claude3 Opus智能助理-国外VPS网站icon-default.png?t=N7T8https://www.vps911.com/vpscp/1657.html

实现向 Bedrock 请求

UI 层修改

此项目的开源代码默认不支持接入 Claude3,或接入 Amazon Bedrock 托管的 Claude3,需要在代码中手动修改。

首先您需要在 app/constant.ts 文件中,添加 Claude3 的模型。

export enum ServiceProvider {
  Claude = "Claude",
  ......
}

export enum ModelProvider {
  AmazonClaude = "AmazonClaude",
  ......
}

export const SUMMARIZE_MODEL = "anthropic-claude3-sonnet";

export const AmazonPath = {
  ChatPath: "v1/chat/completions",
};

export const DEFAULT_MODELS = [
  {
    name: "Claude 3 Opus",
    available: true,
    provider: {
      id: "anthropic",
      providerName: "Amazon Bedrock",
      providerType: "anthropic-claude3-opus",
    },
  },
  {
    name: "Claude 3 Sonnet",
    available: true,
    provider: {
      id: "anthropic",
      providerName: "Amazon Bedrock",
      providerType: "anthropic-claude3-sonnet",
    },
  },
  {
    name: "Claude 3 Haiku",
    available: true,
    provider: {
      id: "anthropic",
      providerName: "Amazon Bedrock",
      providerType: "anthropic-claude3-haiku",
    },
  },
  {
    name: "Claude 2.1",
    available: true,
    provider: {
      id: "anthropic",
      providerName: "Amazon Bedrock",
      providerType: "anthropic-claude21",
    },
  },
  {
    name: "Claude 2",
    available: true,
    provider: {
      id: "anthropic",
      providerName: "Amazon Bedrock",
      providerType: "anthropic-claude2",
    },
  },
] as const;

app/client/platforms 下新建 claude.ts 逻辑可参考此文件夹下其他文件。

【可选】由于项目默认不支持设置 Claude 相关的 top_p,top_k 等参数,需要对设置场景进行修改。

app/components/model-config.tsx
<ListItem
        title={Locale.Settings.TopP.Title}
        subTitle={Locale.Settings.TopP.SubTitle}
      >
        <InputRange
          value={(props.modelConfig.top_p ?? 1).toFixed(1)}
          min="0"
          max="1"
          step="0.1"
          onChange={(e) => {
            props.updateConfig(
              (config) =>
                (config.top_p = ModalConfigValidator.top_p(
                  e.currentTarget.valueAsNumber
                ))
            );
          }}
        ></InputRange>
      </ListItem>
      {/* 新增 */}
      <ListItem
        title={Locale.Settings.TopK.Title}
        subTitle={Locale.Settings.TopK.SubTitle}
      >
        <InputRange
          value={(props.modelConfig.top_k ?? 250).toFixed(1)}
          min="0"
          max="500"
          step="1"
          onChange={(e) => {
            props.updateConfig(
              (config) =>
                (config.top_k = ModalConfigValidator.top_k(
                  e.currentTarget.valueAsNumber
                ))
            );
          }}
        ></InputRange>
      </ListItem>
app/store/config.ts
......

modelConfig: {
    model: "Claude 3 Sonnet" as ModelType,
    temperature: 0.5,
    top_p: 0.9,
    top_k: 250,
    max_tokens: 2048,
    presence_penalty: 0,
    frequency_penalty: 0,
    sendMemory: true,
    historyMessageCount: 4,
    compressMessageLengthThreshold: 1000,
    enableInjectSystemPrompts: true,
    template: DEFAULT_INPUT_TEMPLATE,
  },
  
......

export const ModalConfigValidator = {
  ......
  temperature(x: number) {
    return limitNumber(x, 0, 2, 1);
  },
  top_p(x: number) {
    return limitNumber(x, 0, 1, 1);
  },
  // 新增
  top_k(x: number) {
    return limitNumber(x, 0, 500, 1);
  },
};
app/client/api.ts
export interface LLMConfig {
  model: string;
  temperature?: number;
  top_p?: number;
  stream?: boolean;
  presence_penalty?: number;
  frequency_penalty?: number;
  top_k?: number;
  max_token?: number;
}

Server 层

新建 API,app/api 下新建 2 层文件夹:claude/[…path],随后在 app/api/claude/[…path]下新建文件 route.ts,接受 UI 层传入的请求信息。

创建处理请求的逻辑:新建 app/api/cluadeServices.ts,参考 AWS SDK for JavaScript v3

import { AWSBedrockAnthropicStream, AWSBedrockStream } from "ai";

export const requestAmazonClaude = async (req: NextRequest) => {
  const controller = new AbortController();
  const timeoutId = setTimeout(
    () => {
      controller.abort();
    },
    10 * 60 * 1000 // 10分钟超时
  );
  const requestBodyStr = await streamToString(req.body);
  const requestBody = JSON.parse(requestBodyStr);
  // 创建Bedrock Client
  const bedrockruntime = new BedrockRuntimeClient(AWS_PARAM);
  let response;
  // 发起Bedrock请求
  try {
    response = await bedrockResponse(
      bedrockruntime,
      requestBody.model.includes(CLAUDE3_KEY)
        ? requestBody.messages
        : getClaude2Prompt(requestBody.messages),
      requestBody.temperature,
      toInt(requestBody.max_tokens, 8192),
      requestBody.top_p,
      requestBody.top_k,
      requestBody.model
    );
  } finally {
    clearTimeout(timeoutId);
  }
  if (typeof response === "string") {
    return new NextResponse(response, {
      headers: {
        "Content-Type": "text/plain",
      },
    });
  }
  // 处理Claude3响应 https://sdk.vercel.ai/docs/guides/providers/aws-bedrock
  if (requestBody.model.includes(CLAUDE3_KEY)) {
    const stream = AWSBedrockStream(
      response,
      undefined,
      (chunk) => chunk.delta?.text
    );
    return new NextResponse(stream, {
      headers: {
        "Content-Type": "application/octet-stream",
      },
    });
  }
  // 处理Claude2响应
  const stream = AWSBedrockAnthropicStream(response);

  return new NextResponse(stream, {
    headers: {
      "Content-Type": "application/octet-stream",
    },
  });
};


// 具体拼装参数的逻辑
const bedrockResponse = async (
  bedrockruntime: { send: (arg0: any) => any },
  bodyData: any,
  temperature: number,
  max_tokens_to_sample: number,
  top_p: number,
  top_k: number,
  model: string
) => {
  if (model.includes(CLAUDE2_KEY)) {
    const requestBodyPrompt = {
      prompt: bodyData,
      max_tokens_to_sample,
      temperature, // 控制随机性或创造性,0.1-1 推荐0.3-0.5,太低没有创造性,一点输入错误/语意不明确就会产生比较大偏差,且过低不会有纠错机制
      top_p, // 采样参数,从 tokens 里选择 k 个作为候选,然后根据它们的 likelihood scores 来采样
      top_k, // 设置越大,生成的内容可能性越大;设置越小,生成的内容越固定;
    };
    console.log("<----- Request Claude2 Body -----> ", requestBodyPrompt);
    const command = new InvokeModelWithResponseStreamCommand({
      body: JSON.stringify(requestBodyPrompt),
      modelId:
        model === "anthropic-claude21"
          ? "anthropic.claude-v2:1"
          : "anthropic.claude-v2",
      contentType: "application/json",
      accept: "application/json",
    });
    const response = await bedrockruntime.send(command);
    return response;
  }
  if (model.includes(CLAUDE3_KEY)) {
    const requestBody = await changeMsgToStand(bodyData, true);
    const requestBodyPrompt = {
      anthropic_version:
        (ClaudeEnum as any)[model]?.knowledge_date ||
        ClaudeEnum["anthropic-claude3-sonnet"].knowledge_date,
      messages: requestBody,
      max_tokens: max_tokens_to_sample,
      temperature,
      top_p,
      top_k,
    };
    console.log("<----- Request Claude3 Body -----> ", requestBodyPrompt);
    const command = new InvokeModelWithResponseStreamCommand({
      body: JSON.stringify(requestBodyPrompt),
      modelId:
        (ClaudeEnum as any)[model]?.model_id ||
        ClaudeEnum["anthropic-claude3-sonnet"].model_id,
      contentType: "application/json",
      accept: "application/json",
    });
    const response = await bedrockruntime.send(command);
    // 注意:Claude 3 流式返回的数据结果与Claude2不同,请注意处理
    return response;
  }
};


export const ClaudeEnum = {
  "anthropic-claude3-sonnet": {
    knowledge_date: "bedrock-2023-05-31",
    model_id: "anthropic.claude-3-sonnet-20240229-v1:0",
  },
  "anthropic-claude3-haiku": {
    knowledge_date: "bedrock-2023-05-31",
    model_id: "anthropic.claude-3-haiku-20240307-v1:0",
  },
  "anthropic-claude3-opus": {
    knowledge_date: "bedrock-2023-05-31",
    model_id: "anthropic.claude-3-opus-20240229-v1:0",
  },
};

export const CLAUDE3_KEY = "anthropic-claude3";
export const CLAUDE2_KEY = "anthropic-claude2";

流式输出(Streaming output)

服务端:得到响应时,不要直接解析。逻辑如下:

  1. 调用 InvokeModelWithResponseStream API,请求 Amazon Bedrock。
  2. 得到响应response,但此时请不要直接解析 response 的 body(是一个流),否则还是在中间 Server 层串行等待,且性能很差,如需一次性返回全部响应 body,请使用 InvokeModel API。注意:Amazon Bedrock 上各模型的响应的 Body 类型各不相同,Claude2 与 Claude3 响应也不同。详情请参考文档:AWS SDK for JavaScript v3。
  3. 响应 response 的 Body 需要进行流透传,返回给浏览器端,这一步可以手动将 response 的 Body 解析成流,也可以使用第三方的库。
  4. 这里推荐使用 npm 的 ai 库(AWS Bedrock – Vercel AI SDK),但部分场景(如 Claude3 的响应)仍需手动处理。参考如下:
    // Server
    
    import { AWSBedrockAnthropicStream, AWSBedrockStream } from "ai";
    
    ......
    
    // Claude 2/2.1
    const claude2Stream = AWSBedrockAnthropicStream(claude2Response);
    return new NextResponse(claude2Stream, {
      headers: {
        "Content-Type": "application/octet-stream",
      },
    });
    
    // Claude 3 改了响应格式
    const claude3Stream = AWSBedrockStream(
          claude3Response,
          undefined,
          (chunk) => chunk.delta?.text
        );
    return new NextResponse(claude3Stream, {
      headers: {
        "Content-Type": "application/octet-stream",
      },
    });
    
    // Bedrock 上其他大模型类似

    UI 层(接受流式返回)

    ......
            fetchEventSource(chatPath, {
              ...chatPayload,
              async onopen(res) {
                clearTimeout(requestTimeoutId);
                const contentType = res.headers.get("content-type");
                console.log(
                  "[Claude] request response content type: ",
                  contentType
                );
                if (contentType?.startsWith("text/plain")) {
                  responseText = await res.clone().text();
                  return finish();
                }
                if (
                  res.body &&
                  contentType?.startsWith("application/octet-stream")
                ) {
                  try {
                    const reader = res.clone().body?.getReader();
                    const decoder = new TextDecoder("utf-8");
                    let result = (await reader?.read()) || { done: true };
                    while (!result.done) {
                      responseText += decoder.decode(result.value, {
                        stream: true,
                      });
                      continueMsg();
                      try {
                        result = (await reader?.read()) || { done: true };
                      } catch {
                        break;
                      }
                    }
                  } finally {
                  }
                  return finish();
                }
              
    ......

Amazon Bedrock Limit

Amazon Bedrock 每个账号在每个 region 请求上限为每分钟 60 次,如果您需要更多的请求量,请联系 Amazon 的支持人员,协助您提升上限。具体请参考:Quotas for Amazon Bedrock – Amazon Bedrock

如您仅希望在本地实现限流,可以通过 Nginx 反向代理、NLB 请求限制等方式实现,以下是通过 Next.js 的中间层代码进行限制:

根目录下创建 middleware.ts

您可以参考 Next.js 的官方文档,实现您的 API 拦截、请求限制等功能:Routing: Middleware | Next.js

import { NextRequest, NextResponse } from "next/server";

import rateLimit from "./app/utils/rate-limiter";

const requestCount = 0; // 初始化总请求计数
const lastResetTime = Date.now(); // 初始化上次重置时间
export async function middleware(req: NextRequest) {
  const path = req.nextUrl.pathname;
  // 只对 /api/claude 路由进行限流
  if (path.includes("api/claude/")) {
    try {
      const result = await rateLimit(
        requestCount,
        lastResetTime,
        60,
        60 * 1000
      );
      // 每分钟最多 60 次请求 bedrock限制一分钟最多60次请求 如有需要可以提Case提高上限
      // 参考https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase
      if (!result.success) {
        return NextResponse.json(
          { error: "Too many requests, please try again later." },
          { status: 429 }
        );
      }
    } catch (err) {
      console.error("Error rate limiting", err);
      return NextResponse.json(
        { error: "Error rate limiting" },
        { status: 500 }
      );
    }
  }

  return NextResponse.next();
}

export const config = {
  matcher: "/api/claude/:path*",
};
let requestCount = 0;
let lastResetTime = 0;

export async function rateLimit(
  requestCount: number,
  lastResetTime: number,
  limit: number,
  duration: number,
) {
  const now = Date.now();
  // 如果距离上次重置已经超过了一分钟,则重置计数
  if (now - lastResetTime > duration) {
    requestCount = 1;
    lastResetTime = now;
  } else {
    requestCount++;
  }
  if (requestCount > limit) {
    return { success: false, count: requestCount };
  }
  return { success: true, count: requestCount };
}

实现多模态

目前 Claude3 各模型支持解析图片,您可参考官方文档:Vision

【代码改动】:您可以在 app/utils.ts 文件的 isVisionModel 方法中,设置对模型是否支持图片的判断,来放开项目对多模态的支持。

注意:本项目中 message 对图片 base64 的 type/对象名称为 image_url,与 Claude3type 名为 image 不同,需要您手动转换。

对于 PDF 等其他类型文件,可以通过第三方插件(如 react-pdf:react-pdf – npm)将其在浏览器端转换成图片。出于性能和安全性考虑,这里我们也推荐您在浏览器端实现文件解析和转换,而不是在 Next.js 的服务端。

react-pdf 在构建时需要确保您的构建环境有正确的 canvas 版本(或采用有图形化界面的环境进行构建),且 next.config.js 已按其 npm 文档中配置。

部署

可参考以下 CDK,或您通过 PM2 在 EC2 上运行。

export interface ApplicationLoadBalancerProps {
  readonly internetFacing: boolean;
}

export interface NetworkProps {
  readonly vpc: IVpc;
  readonly subnets?: SubnetSelection;
}

export interface AiChatECSProps {
  readonly networkProps: NetworkProps;
  readonly region: string;
  readonly accountId: string;
  readonly cognitoInfo: any;
  readonly accessKey?: string;
  readonly secretAccessKey?: string;
  readonly bucketName?: string;
  readonly bucketArn?: string;
  readonly partition: string;
}


export class EcsStack extends Construct {
  readonly securityGroup: ISecurityGroup;
  public ecsIP: string;

  constructor(scope: Construct, id: string, props: AiChatECSProps) {
    super(scope, id);
    const repositoryName = "commonchatui-docker-repo";

    // create ecr
    const repository = new Repository(this, "CommonchatuiRepository", {
      repositoryName: "commonchatui-docker-repo",
      imageTagMutability: TagMutability.MUTABLE,
      removalPolicy: RemovalPolicy.DESTROY,
    });
    // deploy front docker image
    const ui_image = new DockerImageAsset(this, "commonchatui_image", {
      directory: path.join(__dirname, "your_next_app_dic"), // 改成您前端应用的代码目录地址
      file: "Dockerfile",
      platform: Platform.LINUX_AMD64,
    });
    const imageTag = "latest";
    const dockerImageUri = `${props.accountId}.dkr.ecr.${props.region}.amazonaws.com/${repositoryName}:${imageTag}`;

    // upload front docker image to ecr
    const ecrDeploy = new ECRDeployment(this, "commonchat_image_deploy", {
      src: new DockerImageName(ui_image.imageUri),
      dest: new DockerImageName(dockerImageUri),
    });

    new CfnOutput(this, "ECRRepositories", {
      description: "ECR Repositories",
      value: ecrDeploy.node.addr,
    }).overrideLogicalId("ECRRepositories");

    new CfnOutput(this, "ECRImageUrl", {
      description: "ECR image url",
      value: dockerImageUri,
    }).overrideLogicalId("ECRImageUrl");

    // create ecs security group
    this.securityGroup = new SecurityGroup(this, "commonchat_sg", {
      vpc: props.networkProps.vpc,
      description: "Common chart security group",
      allowAllOutbound: true,
    });
    // add inter rule
    this.securityGroup.addIngressRule(
      Peer.anyIpv4(),
      Port.tcp(443),
      "Default ui 443 port"
    );
    this.securityGroup.addIngressRule(
      Peer.anyIpv4(),
      Port.tcp(80),
      "Default ui 80 port"
    );
    // add endpoint
    props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCECREP", {
      service: InterfaceVpcEndpointAwsService.ECR,
      privateDnsEnabled: true,
      securityGroups: [this.securityGroup],
    });
    props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCECRDockerEP", {
      service: InterfaceVpcEndpointAwsService.ECR_DOCKER,
      privateDnsEnabled: true,
      securityGroups: [this.securityGroup],
    });
    props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCLogEP", {
      service: InterfaceVpcEndpointAwsService.CLOUDWATCH_LOGS,
      privateDnsEnabled: true,
      securityGroups: [this.securityGroup],
    });

    props.networkProps.vpc.addGatewayEndpoint("CommonChatVPCS3", {
      service: GatewayVpcEndpointAwsService.S3,
    });
    props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCLogECS", {
      service: InterfaceVpcEndpointAwsService.ECS,
      privateDnsEnabled: true,
      securityGroups: [this.securityGroup],
    });
    props.networkProps.vpc.addInterfaceEndpoint("CommonChatVPCLogECSAgent", {
      service: InterfaceVpcEndpointAwsService.ECS_AGENT,
      privateDnsEnabled: true,
      securityGroups: [this.securityGroup],
    });
    props.networkProps.vpc.addInterfaceEndpoint(
      "CommonChatVPCLogECSTelemetry",
      {
        service: InterfaceVpcEndpointAwsService.ECS_TELEMETRY,
        privateDnsEnabled: true,
        securityGroups: [this.securityGroup],
      }
    );

    // create ecr service
    const ecsService = this.createECSGroup(props, imageTag, repository);

    // create lb to ecs
    this.createNlbEndpoint(props, ecsService, [80]);
  }
  private createECSGroup(
    props: AiChatECSProps,
    imageTag: string,
    repository: IRepository
  ) {
    const ecsClusterName = "CommonchatUiCluster";
    const cluster = new Cluster(this, ecsClusterName, {
      clusterName: "commonchat-ui-front",
      vpc: props.networkProps.vpc,
      enableFargateCapacityProviders: true,
    });

    const taskDefinition = new FargateTaskDefinition(
      this,
      "commonchatui_deploy",
      {
        cpu: 2048,
        memoryLimitMiB: 4096,
        runtimePlatform: {
          operatingSystemFamily: OperatingSystemFamily.LINUX,
          cpuArchitecture: CpuArchitecture.of("X86_64"),
        },
        family: "CommonchatuiDeployTask",
        taskRole: this.getTaskRole(this, "CommonchatuiDeployTaskRole"),
        executionRole: this.getExecutionTaskRole(
          this,
          "CommonchatuiDeployExecutionTaskRole"
        ),
      }
    );
    const portMappings = [
      {
        containerPort: 80,
        hostPort: 80,
        protocol: Protocol.TCP,
        appProtocol: AppProtocol.http,
        name: "app_port",
      },
      {
        containerPort: 443,
        hostPort: 443,
        protocol: Protocol.TCP,
        appProtocol: AppProtocol.http,
        name: "app_port_443",
      },
    ];
    const envConfig: any = {
      DEFAULT_REGION: props.region,
      BUCKET_NAME: props.bucketName,
      USER_POOL_ID: props.cognitoInfo.userPoolId,
      USER_POOL_CLIENT_ID: props.cognitoInfo.userPoolClientId,
    };
    if (props.accessKey && props.secretAccessKey) {
      envConfig.ACCESS_KEY = props.accessKey;
      envConfig.SECRET_ACCESS_KEY = props.secretAccessKey;
    }
    taskDefinition.addContainer("CommonchatuiContainer", {
      containerName: "commonchatui_container",
      image: ContainerImage.fromEcrRepository(repository, imageTag),
      essential: true,
      cpu: 2048,
      memoryLimitMiB: 4096,
      portMappings: portMappings,
      environment: envConfig,
      logging: LogDriver.awsLogs({
        streamPrefix: "commonchat_ui",
      }),
    });
    // 给ECS容器的角色配置权限
    taskDefinition.addToTaskRolePolicy(
      new PolicyStatement({
        effect: Effect.ALLOW,
        actions: [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream",
        ],
        resources: ["*"],
      })
    );

    return new FargateService(this, "CommonchatuiService", {
      serviceName: "commonchat-ui-service",
      cluster: cluster,
      taskDefinition: taskDefinition,
      desiredCount: 1,
      assignPublicIp: false,
      platformVersion: FargatePlatformVersion.LATEST,
      securityGroups: [this.securityGroup],
      vpcSubnets: { subnetType: SubnetType.PRIVATE_WITH_EGRESS },
      capacityProviderStrategies: [{ capacityProvider: "FARGATE", weight: 2 }],
      propagateTags: PropagatedTagSource.TASK_DEFINITION,
      maxHealthyPercent: 100,
      minHealthyPercent: 0,
    });
  }
  private getExecutionTaskRole(self: this, roleId: string): IRole {
    // throw new Error('Method not implemented.');
    return new Role(self, roleId, {
      assumedBy: new ServicePrincipal("ecs-tasks.amazonaws.com"),
    });
  }
  private getTaskRole(self: this, roleId: string): IRole {
    return new Role(self, roleId, {
      assumedBy: new ServicePrincipal("ecs-tasks.amazonaws.com"),
      managedPolicies: [
        ManagedPolicy.fromAwsManagedPolicyName(
          "service-role/AmazonECSTaskExecutionRolePolicy"
        ),
      ],
    });
  }

  // 如果使用自有证书 选择ALB
  private createNlbEndpoint(
    props: AiChatECSProps,
    ecsService: FargateService,
    servicePorts: Array<number>
  ) {
    const nlb = new NetworkLoadBalancer(this, "CommonchatUiLoadBalancer", {
      loadBalancerName: "commonchat-ui-service",
      internetFacing: true,
      crossZoneEnabled: false,
      vpc: props.networkProps.vpc,
      vpcSubnets: { subnetType: SubnetType.PUBLIC } as SubnetSelection,
    });
    servicePorts.forEach((itemPort) => {
      const listener = nlb.addListener(`CommonchatUiLBListener-${itemPort}`, {
        port: itemPort,
        protocol: LBProtocol.TCP_UDP,
      });
      const targetGroup = new NetworkTargetGroup(
        this,
        `CommonchatUiLBTargetGroup-${itemPort}`,
        {
          targetGroupName: "commonchat-ui-service-target",
          vpc: props.networkProps.vpc,
          port: itemPort,
          protocol: LBProtocol.TCP_UDP,
          targetType: TargetType.IP,
          healthCheck: {
            enabled: true,
            interval: Duration.seconds(180),
            healthyThresholdCount: 2,
            unhealthyThresholdCount: 2,
            port: itemPort.toString(),
            protocol: LBProtocol.TCP,
            timeout: Duration.seconds(10),
          },
        }
      );
      listener.addTargetGroups(
        `CommonChatUiLBTargetGroups-${itemPort}`,
        targetGroup
      );

      targetGroup.addTarget(
        // targetGroups
        ecsService.loadBalancerTarget({
          containerName: "commonchattui_container",
          containerPort: itemPort,
        })
      );
    });
    new CfnOutput(this, "FrontUiUrlDefault", {
      description: "Common chart ui url by default",
      value: `http://${nlb.loadBalancerDnsName}`, // alb
    }).overrideLogicalId("FrontUiUrlDefault");
  }
}

总结

本文中,我们介绍了如何使用开源框架 ChatGPT-Next-Web,接入您 AWS 账号下的 Amazon Bedrock [Claude 3 Sonnet(Text Image)、Claude 3 Haiku(Text Image)]大模型,实现一个轻量级的多模态聊天机器人,并实现流式输出和部署。

更多相关内容:国外VPS资讯网

  • 13
    点赞
  • 20
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值