Designing a Python Interface for Machine Learning Engineering

To do machine learning engineering, a model must first be deployed, in most cases as a prediction API. And to make that API work in production, model serving infrastructure must first be built: load balancing, scaling, monitoring, updating, and much more.

At first glance, all of this work seems familiar. Web developers and DevOps engineers have been automating microservice infrastructure for years now. Surely, we can repurpose their tools?

Unfortunately, we can’t—at least, not precisely.

While ML infrastructure is similar to traditional DevOps, it is just ML-specific enough to make standard DevOps tools suboptimal. This is why we built Cortex, our open source platform for machine learning engineering.

At a very high level, Cortex is designed to make it easy to deploy a model locally or to the cloud, automating all the underlying infrastructure. A central component of the platform is the Predictor interface—a programmable Python interface through which developers can write prediction APIs.

Designing a Python interface specifically for serving predictions as web requests was a challenge we spent months on (and are still improving). Here, I want to share some of the design principles we’ve developed so far:

1. A Predictor is just a Python class

At the core of Cortex’s architecture is our concept of a Predictor, which is essentially a prediction API, including all of the request handling code and dependencies. The Predictor interface enforces some simple requirements for those prediction APIs.

Because Cortex takes a microservices approach to model serving, the Predictor interface is strictly concerned with two things:

  • Initializing models
  • Serving predictions

In that spirit, Cortex’s Predictor interface requires two functions, __init__() and predict(), that do more or less what you’d expect:
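
A minimal sketch of what such a class might look like (the pickle-based loading, the file path, and the scikit-learn-style model are assumptions for illustration, not requirements of the interface):

```python
import pickle


class PythonPredictor:
    def __init__(self, config):
        # Runs once at startup: load the model into memory.
        # The path and serialization format here are illustrative.
        with open("/mnt/model.pkl", "rb") as f:
            self.model = pickle.load(f)

    def predict(self, payload):
        # Runs on every request; payload is the parsed request body.
        return self.model.predict([payload["input"]]).tolist()
```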

After initialization, you can think of a Predictor as just a Python object, whose single predict() function gets called when users query an endpoint.

One of the big benefits of this approach is that it is intuitive to anyone with software engineering experience. There’s no need to touch your data pipeline or model training code. A model is just a file, and a Predictor is just an object that imports it and runs a predict() method.

Beyond its syntactic appeal, however, the approach offers some key benefits in how it complements Cortex’s broader approach.

2. A prediction is just an HTTP request

One of the complexities of building an interface for serving predictions in production is that inputs will almost certainly differ, at least in format, from the model’s training data.

This works on two levels:

  • The body of a POST request is not a NumPy array or whatever data structure your model was trained to process.
  • Machine learning engineering is all about using models to build software, which often means using models on data they were not trained to process, e.g. using GPT-2 to write folk music.

The Predictor interface, therefore, cannot be opinionated about the inputs and outputs of a prediction API. A prediction is just an HTTP request, and the developer is free to process it however they want. If, for example, they want to deploy a multi-model endpoint and query different models based on request params, they can do that:
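
For example, a multi-model Predictor might look something like this sketch (the stand-in models and the query-parameter convention are made up for illustration, and the exact predict() signature that receives query parameters depends on the Cortex version):

```python
class PythonPredictor:
    def __init__(self, config):
        # Stand-ins for real models; in practice these would be loaded
        # from disk or cloud storage during initialization.
        self.models = {
            "reverse": lambda text: text[::-1],
            "shout": lambda text: text.upper(),
        }

    def predict(self, payload, query_params):
        # Route each request to a model chosen by a query parameter.
        model = self.models[query_params.get("model", "shout")]
        return {"prediction": model(payload["text"])}
```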

And while this interface gives developers freedom in terms of what they can do with their API, it also provides some natural scoping that enables Cortex to be more opinionated on the infrastructure side.

For example, under the hood Cortex uses FastAPI to set up request routing. There are a number of processes Cortex sets up at this layer relating to autoscaling, monitoring, and other infrastructure features, all of which would become much more complex if developers were required to implement routing themselves.

But, because each API has a single predict() method, every API has the same number of routes—one. Being able to assume this allows Cortex to do a lot more at the infrastructure level without limiting engineers.
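
A toy illustration of that single-route assumption (this is not Cortex's actual serving code; the route name and the stand-in Predictor are made up):

```python
from fastapi import FastAPI, Request


class PythonPredictor:
    # A trivial stand-in for the Predictor sketched earlier.
    def __init__(self, config):
        self.config = config

    def predict(self, payload, query_params):
        return {"prediction": str(payload)}


app = FastAPI()
predictor = PythonPredictor(config={})


@app.post("/predict")
async def handle(request: Request):
    # The entire API surface is this one route, which delegates to predict().
    payload = await request.json()
    return predictor.predict(payload, dict(request.query_params))
```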

3. A served model is just a microservice

Scale is a chief concern for anyone using machine learning in production. Models can get huge (GPT-2 is roughly 6 GB), are computationally expensive, and can have high latency. Especially for realtime inference, scaling up to handle traffic is a challenge—even more so if you’re budget-constrained.

To solve for this, Cortex treats Predictors as microservices, which can be horizontally scaled. More specifically, when a developer hits $ cortex deploy, Cortex containerizes the API, spins up a cluster provisioned for inference, and deploys. Then, it exposes the API as a web service behind a load balancer, and configures autoscaling, updating, and monitoring:

[Figure: Cortex's deployment flow (source: Cortex Docs)]
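
Once deployed, clients interact with a Predictor like any other web service. A hypothetical client call (the endpoint URL and payload are made up):

```python
import requests

response = requests.post(
    "http://my-loadbalancer.example.com/text-generator",
    json={"text": "machine learning engineering is"},
)
print(response.json())
```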

The Predictor interface is fundamental to this process, even though it is “just” a Python interface.

The Predictor interface enforces a packaging of code that turns each API into a single, atomic unit of inference. All of the request handling code needed for a single API is contained within its Predictor, which makes it easy for Cortex to scale Predictors:

[Figure: Predictor autoscaling (source: Cortex GitHub)]

In this way, engineers don’t have to do any extra work—unless they want to tweak things, of course—to prepare an API for production. A Cortex deployment is production-ready by default.

An interface for machine learning engineering—emphasis on the “engineering”

A constant theme in all of the above points is the balance of flexibility and ease-of-use. If the goal is to allow machine learning engineers to build whatever they want, Cortex needs to be:

  • Flexible enough for engineers to implement any idea.
  • Opinionated enough to automate the infrastructure work that obstructs engineers.

Striking this balance is a constant challenge, particularly as the world of machine learning engineering is young and constantly changing.

However, by focusing solely on what it takes to build software out of models—the “engineering” in machine learning engineering—we believe we can walk that tightrope.

If contributing to the machine learning engineering ecosystem sounds interesting to you, we’re always happy to meet new contributors at Cortex.

Translated from: https://towardsdatascience.com/designing-a-python-interface-for-machine-learning-engineering-ae308adc4412
