Deploying a Natural JS Inference Model to AWS Lambda

This article is a comprehensive guide that explains how to deploy inference models trained with Natural JS as a key-protected API using AWS Lambda & API Gateway. By the end of this article, you should be able to:

  • Understand what an inference model is.

  • Have a basic understanding of ML pipelines and how they relate to inference models.

  • Know the advantages of deploying inference models inside serverless APIs.

  • Build a serverless API with an embedded inference model.

  • Deploy a Lambda function protected by an API gateway, a usage plan, rate limits, and API keys.

  • Automate common tasks related to Lambda function development.

If you’re keen to know how the model used in this article was trained, I recommend reading the previous article in this ML blog series.

Machine Learning Pipeline & Inference Model

[Figure: a typical machine learning pipeline that produces inference model(s)]

An inference model is the final product of a machine learning pipeline. A typical ML pipeline is composed of several steps that aim to produce inference models. These pipelines vary on a case-by-case basis and may produce one or more models at the end of the process.

The ML pipeline that I personally like to use on my projects is composed of the following steps:

  • Data Collection — the step where machine learning engineers and data scientists collect and collate data from different sources so that it can be used for model training.

  • Data Pre-processing — the step where the collected data is cleansed and transformed. This step is optional but often improves the quality and accuracy of the results produced by inference models.

  • Algorithm Selection — the step where machine learning engineers pick one or more algorithms that perform well in solving the problem statement.

  • Validation — “performing well” is determined by validation techniques such as K-fold cross-validation and 80/20 splits. I will write another article that showcases this specific step to improve our inference model’s current accuracy (71%).

  • Final Training — the step where you train the model(s) that will be embedded within your production APIs.

At the end of these pipelines, inference models are used in business applications to predict results from a given input. In the context of our solution, the inference model is built to classify a piece of text into one of the following labels:

  • Hateful Speech

  • Offensive Speech

  • Neither

Deploying the Inference Model in an API

Embedding machine learning models inside APIs is the most common way of serving their capabilities. It is better to expose a model through an API rather than embedding it in your client-side applications because:

  • Model weights are often large and not ideal for transfer over the network, which slows down the loading of frontend applications.

  • It allows you to protect your models from weight-poisoning attacks on the client side.

  • It allows you to protect your model from plagiarism.

  • It allows you to apply security measures (rate limiting, usage quotas & API keys) to your machine learning models.

  • It allows you to serve the model to different consumers.

Why AWS Lambda?

Individuals and organizations want to deploy machine learning models to AWS Lambda because it offers the following advantages:

  • Quick prototyping without an upfront cost

  • Easy decommissioning if experiments fail

  • Easy scaling once the prototype is successful

  • No charge for idle compute time

  • Requires little attention and management from IT & security teams

  • Easier implementation of an API key protection mechanism with the help of AWS API Gateway

  • Native canary deployments for model performance comparisons with the help of SAM templates

You can find more information about the advantages of using serverless architectures from this link.

AI-Infused API Solution

[Figure: a high-level overview of this project’s architecture]

Now that we’ve discussed why ML models are better served inside an API, let’s walk through a high-level overview of our ML-powered API design.

In the diagram above:

  • Developers build the AI-infused API package and upload it to an S3 bucket.

  • The Lambda service retrieves the deployment artifact and CloudFormation templates from the S3 bucket.

  • The model is embedded inside a Lambda-based API.

  • The Lambda-based API is protected by an AWS API Gateway.

  • The API Gateway is protected by an API key.

  • Consumers invoke endpoints on the API Gateway.

Clone the Source Code

To run and test the inference endpoints, clone the whole implementation onto your machine from this folder of my book’s code repository on GitHub.

[Figure: the scaffold of our AI-powered serverless API]

Once the cloning process completes, you should see the folder structure (scaffold) shown above on your machine.

Install API Dependencies

[Figure: CLI output of the dependency installers]

To install our API’s dependencies (Node modules for Natural JS), run the script named 001_install_dependencies.sh in your terminal.

The script looks for all folders inside the project root and installs their node modules if they don’t exist yet on your development machine.

I’ve automated dependency installation this way because a real-life serverless project will contain more than one Lambda-based API, and installing node modules across all those endpoints manually can eat up a good amount of your time.
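
The real script lives in the repository; as a rough idea of what this kind of helper does, a minimal version could look like the sketch below (the folder layout is an assumption, not the repo’s exact structure):

```bash
#!/bin/bash
# Minimal sketch of an install helper: loop over every first-level folder
# that contains a package.json and install its node modules if missing.
for dir in */ ; do
  if [ -f "${dir}package.json" ] && [ ! -d "${dir}node_modules" ]; then
    echo "Installing dependencies for ${dir}"
    (cd "${dir}" && npm install)
  fi
done
```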

Inspecting the API Code

[Gist: the inference Lambda handler code]

To make the code shown here a little more readable, I’ve stripped out some of the comments that you’ll find in the source code on GitHub.

Here is the logic breakdown of this code (a stripped-down sketch follows the list):

  • The module starts by importing the buildResponse & loadFrozenModel functions from their utility files.

  • We define a constant for the path of the model’s weights.

  • We define a variable called model and assign null to it. Its purpose is to prevent unnecessary reloading of the inference model’s weights from disk on warm starts.

  • We respond with HTTP 200 when the function is invoked using the OPTIONS verb.

  • We respond with HTTP 422 (Unprocessable Entity) when the body or the text to classify is not provided.

  • We load the inference model if the request is a cold start.

  • If all parameters and conditions are valid, we run a classification using the inference model.

  • We then respond with HTTP 200 and the text classification label.
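
To make the breakdown concrete, here is a stripped-down sketch of what such a handler could look like. It is not the exact code from the repository: the utility module paths, the model file location, and the classifier call are assumptions based on the breakdown above.

```javascript
// Minimal sketch of the inference handler (assumed shapes, not the repo's exact code)
const { buildResponse } = require('./utils/build-response');      // assumed path
const { loadFrozenModel } = require('./utils/load-frozen-model'); // assumed path

const MODEL_PATH = './model/weights.json'; // assumed location of the model weights

// Kept outside the handler so warm starts reuse the already-loaded model
let model = null;

exports.execute = async (event) => {
  // Respond to CORS pre-flight requests
  if (event.httpMethod === 'OPTIONS') {
    return buildResponse(200, {});
  }

  const body = event.body ? JSON.parse(event.body) : null;
  if (!body || !body.text) {
    return buildResponse(422, { message: 'Missing text to classify' });
  }

  // Cold start: load the model weights from disk once
  if (!model) {
    model = await loadFrozenModel(MODEL_PATH);
  }

  // Run the classification and return the predicted label
  const label = model.classify(body.text);
  return buildResponse(200, { label });
};
```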

Defining the API Gateway using SAM

[Figure: our API gateway’s SAM template — nothing special here]

To aggregate Lambda functions under a single contact point from the consumer’s point of view, we will use an API gateway. The template above defines our API Gateway.
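
The template itself is in the repository; a minimal sketch of what such a SAM-defined gateway could look like is shown below (the logical ID, the parameters, and the stage name are illustrative assumptions):

```yaml
# Minimal sketch of a SAM-defined API gateway (names and values are illustrative)
Resources:
  InferenceApiGateway:
    Type: AWS::Serverless::Api
    Properties:
      Name: !Sub "${EnvironmentCode}-${AppName}-inference-APIs"
      StageName: !Ref EnvironmentCode
      Auth:
        ApiKeyRequired: true   # every method requires an API key by default
```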

[Figure: defining an API key and usage plan, and associating both with the API Gateway]

We then define an API key that consuming parties must supply to authenticate incoming HTTP requests. We’ve also defined a usage plan that enforces the following (a sketch of these resources follows the list):

  • A quota that limits incoming requests to 10,000 calls per month

  • A throttle setting of 1,000 calls per second for steady-state rate limiting

  • A throttle burst limit that prevents the API gateway from being overwhelmed (set to 1,000)
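
The original template isn’t reproduced here; as a rough sketch under assumed logical IDs (InferenceApiGateway, EnvironmentCode), resources that enforce these limits could look like:

```yaml
# Minimal sketch of the API key and usage plan resources
# (limit values mirror the text above; names and references are illustrative)
Resources:
  InferenceApiKey:
    Type: AWS::ApiGateway::ApiKey
    Properties:
      Enabled: true
      StageKeys:
        - RestApiId: !Ref InferenceApiGateway
          StageName: !Ref EnvironmentCode

  InferenceUsagePlan:
    Type: AWS::ApiGateway::UsagePlan
    Properties:
      ApiStages:
        - ApiId: !Ref InferenceApiGateway
          Stage: !Ref EnvironmentCode
      Quota:
        Limit: 10000        # 10,000 calls per month
        Period: MONTH
      Throttle:
        RateLimit: 1000     # steady-state requests per second
        BurstLimit: 1000    # burst capacity

  InferenceUsagePlanKey:
    Type: AWS::ApiGateway::UsagePlanKey
    Properties:
      KeyId: !Ref InferenceApiKey
      KeyType: API_KEY
      UsagePlanId: !Ref InferenceUsagePlan
```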

Lambda Function Definition

[Figure: SAM template definition for the Lambda function]

In this template (a rough sketch follows this list):

  • We define a dynamic function name that combines the environment code, the app name, and our Lambda function’s name (infer-hate-speech).

  • Specify the code URI of the endpoint (the infer directory).

  • Specify the Lambda function’s execution entry point (app.execute).

  • Specify a maximum request timeout in seconds (30).

  • Specify the function runtime (nodejs10.x).

  • Specify the memory size (256 MB).

  • Specify environment variables used for debugging in later tutorials.

  • Attach the API gateway’s POST & OPTIONS verb events as triggers.

  • Tag it for resource group management.
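
The exact template lives in the repository; a minimal sketch of such a function resource (the logical IDs, the API reference, the /infer path, and the LOG_LEVEL variable are assumptions for illustration) might look like:

```yaml
# Minimal sketch of the inference function definition (illustrative names and references)
Resources:
  InferHateSpeechFunction:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: !Sub "${EnvironmentCode}-${AppName}-infer-hate-speech"
      CodeUri: infer/
      Handler: app.execute
      Timeout: 30
      Runtime: nodejs10.x
      MemorySize: 256
      Environment:
        Variables:
          LOG_LEVEL: debug          # used for debugging in later tutorials
      Events:
        InferPost:
          Type: Api
          Properties:
            RestApiId: !Ref InferenceApiGateway
            Path: /infer
            Method: post
        InferOptions:
          Type: Api
          Properties:
            RestApiId: !Ref InferenceApiGateway
            Path: /infer
            Method: options
      Tags:
        AppName: !Ref AppName       # tag for resource group management
```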

Code Release

I’ve created a bash script that lets you run guided deployments, which can be found here. If you’d rather release it using your own set of bash scripts, run the following commands from the project root:
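
The exact commands are part of the repository’s deployment scripts; if you are rolling your own, a typical SAM package-and-deploy sequence looks roughly like the following (the bucket, stack name, and capabilities are placeholders you would adapt):

```bash
# Typical SAM release sequence (bucket and stack names are placeholders)
sam build

sam package \
  --output-template-file packaged.yaml \
  --s3-bucket <your-deployment-artifact-bucket>

sam deploy \
  --template-file packaged.yaml \
  --stack-name dev-ninja-hate-inference \
  --capabilities CAPABILITY_IAM
```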

If you’ve provided valid values to our guided CLI, you should get a CloudFormation stack similar to the one below:

[Figure: sample CloudFormation stack]

Testing our API

To test our model’s integration with our API Gateway & Lambda function:

  • Open API Gateway in your AWS console.

  • Select the API gateway named dev-ninja-hate-inference-APIs.

  • Provide the text you want to classify in the expected payload format (a hedged example of the payload and the result follows this list).

  • Click the test button and you should see the classification result for the provided input.
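
The screenshots of the sample payload and result are not reproduced here; assuming the handler sketched earlier, the request body would look something like this (the field name and the label string are assumptions):

```json
{ "text": "this restaurant was surprisingly pleasant" }
```

and the response would carry the predicted label:

```json
{ "label": "Neither" }
```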

Decommissioning

After you’ve finished fiddling with and testing the solution, I would highly recommend decommissioning the CloudFormation stack if you want to keep your AWS environments clean.

It is also fine to keep the CloudFormation stack in your ecosystem for future testing or demo purposes. Keeping the stack doesn’t cost money unless you start hitting a million requests a month.

If you decide to decommission the API stack sample, you can use the decommissioning bash script that I’ve built, available from this link.

The ability to leave the CloudFormation stack running inside our AWS ecosystem showcases serverless’ cost efficiency, which cannot easily be matched by Docker-based inference models, since those require you to pay for the K8s control plane (around 75 bucks a month) and worker nodes (depending on whether AWS Fargate or EC2 machines are used).

Conclusion

In this article, we’ve learned that:

  • Inference models are the final product of training machine learning models.

  • Inference models can be embedded either inside client-side apps or inside APIs.

  • Encapsulating inference models inside serverless APIs brings several benefits.

  • We’ve seen an actual implementation of an AI-infused API protected by API Gateway & API keys.

  • We’ve compared the experimentation cost of embedding ML models inside serverless APIs versus Docker-based APIs.

What’s Next?

In the previous article, we trained our inference model on our development machines. However, we can take this to the next level by running the model training inside Lambda functions triggered by dropping a new dataset into an S3 bucket.

JavaScript In Plain English

Enjoyed this article? If so, get more similar content by subscribing to Decoded, our YouTube channel!

Translated from: https://medium.com/javascript-in-plain-english/deploying-natural-js-inference-model-to-aws-lambda-ff2f9719b5d0
