Where Should You Deploy Your Model?

You’ve gone through the effort of cleaning your data. You’ve engineered features, dealt with missing values, and built a model well-suited to predicting your target. You are finally ready to put this model into production, but the number of deployment options available can be overwhelming. Although we are fortunate to live in an era when cloud computing has claimed a large share of the machine-learning market, this also has a drawback: the sheer variety of options can make it difficult to narrow your decision down to the one that best suits your model and your personal situation.


While Amazon Web Services (AWS) is, of course, a great option for most applications, it is also rather expensive and simply unnecessary for many fairly basic endpoints. On the other hand, managed services like Heroku make it incredibly easy, and even free, to host models, but they also leave the endpoint’s usability relatively limited in scope. With all of these options competing in the domain of endpoint deployment, which one is best suited to which situation?


Option №1: AWS

Make no mistake: in terms of performance for cost, AWS blows nearly every other service out of the water. This is why many jobs list AWS experience as a requirement for employment. For many enterprise applications, AWS is a nearly perfect tool because it allows grouped instances to scale compute, and cost, across a large fleet of different servers.


While AWS is a great solution for models that need a lot of computational power, it can certainly be argued that a line needs to be drawn somewhere in the ratio of price to performance. AWS can get expensive rather quickly, and it might not be optimal if you plan to use a lot of computation for small returns. If you cannot make your money back on the computation, then the model likely isn’t worth hosting in the first place. For applications that may be intensive over the long term but do not necessarily need an immediate delivery of power, a more frugal option might be necessary. On top of that, AWS is notoriously hard to use. In my experience, I have run into situations where I was unable to find options in the AWS dashboard, or was actually blocked by AWS from doing certain things via SSH, which has certainly been frustrating.


Option №2: Linode

If you’re like me and you use Linux every day, know how to use NGINX and Apache, and understand how to work a Unix command line, Linode could quite possibly be the best option for you. Linode has the advantage of not only being cheap but also, like AWS, being extremely scalable. Need more memory for a specific operation on your server? Linode allows you to pick the individual parts used for your server, so your memory issue would be solved. Here is an article I wrote which goes into more depth on the benefits of Linode:


The great thing about Linode is that it’s not built to do one thing or another. While this means that a Linode server is incredibly dynamic and free, it also means that it can be a bit more complicated than some of the other options you might have at your disposal. With a Linode server, you’re going to need to do the “Docker stuff” and set up web servers and things of that nature. If you’re not prepared to do so, then Linode might not be the best option for you personally, regardless of whether it is the best option for the model you are looking to deploy. Luckily, I have published quite a few articles on how to set up an NGINX server and deploy Gunicorn3 into production, like this one:

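To give a rough idea of what ends up behind that NGINX/Gunicorn stack, here is a minimal Flask prediction endpoint. This is only a sketch: the route name, input fields, and the stand-in model are all illustrative, and in practice you would unpickle your own trained model at startup.

```python
# app.py -- a minimal sketch of a model endpoint that Gunicorn could
# serve behind NGINX. The model below is a stand-in; in a real app you
# would load your trained model once at startup (e.g. with pickle).
from flask import Flask, jsonify, request

app = Flask(__name__)


class StubModel:
    """Placeholder for a real trained model."""

    def predict(self, rows):
        # Pretend every input row predicts class 1.
        return [1 for _ in rows]


model = StubModel()


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"features": [5.1, 3.5, 1.4]}.
    features = request.get_json()["features"]
    return jsonify({"prediction": model.predict([features])})
```

With this saved as app.py, something like `gunicorn3 app:app` starts the application server, and NGINX is configured to proxy requests through to it.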

Linode is a great middle ground between something powerful yet complex like AWS and something heavily managed but incredibly simple like Heroku. While it might not provide the same raw performance or polished integrations that AWS can, it runs at a simple flat rate that you can expand as you desire.


Option №3: Heroku

Where you deploy your model is a question whose answer depends not only on what you seek to accomplish with your model, but also on what you yourself are capable of. If you are a beginner who has just started managing virtual environments, Heroku is a great option for getting familiar with the deployment process. Heroku is a service that will set up a unique virtual environment for your services and deploy them automatically, with your dependencies loaded from that environment.


Although it is rather convenient, Heroku only gives a free-tier user about three deployments before begging them for money. Another problem with a Heroku app is that it is embedded in the Heroku system by default, which gives you very little flexibility to add anything or expand your services in the future. Models that retrain themselves automatically, for example, would be very difficult to implement on the Heroku platform.

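For context, a Heroku deployment is typically driven by a couple of small files in the repository root that describe the environment Heroku builds for you. The contents below are illustrative, assuming a Flask app served by Gunicorn, not a prescription:

```
# Procfile -- tells Heroku what process to run for web traffic
web: gunicorn app:app

# requirements.txt -- dependencies Heroku installs into its environment
flask
gunicorn
scikit-learn
```

Heroku reads these on each push, rebuilds the environment, and restarts the process, which is exactly why it is so convenient for beginners and so restrictive for anything that needs to step outside that lifecycle.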

Option №4: Deploy Yourself!

If you are capable, or just really willing to apply yourself, and maybe have an old computer lying around, there is absolutely nothing stopping you from hosting your endpoints all by yourself! While self-hosting has disadvantages, such as reduced internet speeds and electricity costs, there are also some very real and convincing advantages you might want to consider before throwing the idea away entirely. The biggest advantage of hosting your own server is that you control the hardware involved. While this is somewhat of a negative, since it can mean a rather high start-up cost, it also means you can upgrade all of your components freely as you desire.


Having the freedom to do whatever you need with your server, access it locally, or even run it non-headless can be a serious advantage. Sometimes it might be more convenient to mount a flash drive to transfer a large file rather than securely copying it over the network. The advantage of having your own physical server is like the advantage of having more storage on your phone itself rather than in the cloud: there is no middleman, and you are the sole proprietor of your server. If running a server is something you might wish to pursue, I have written a tutorial on how to do exactly that here:


While there are, of course, many more deployment options than these four, I think this is a good outline of the general server hosts you will find capable of deploying endpoints. Personally, I enjoy working with as few restrictions as possible, and I am pretty good with Linux. As a result, I would lean most toward the Linode and self-deployment options. The option you select is going to depend entirely on what your model is for, how it works, and what it needs, as well as your dev-ops skills as a developer. I would also like to thank you for reading this article. If you happen to be reading it on September 1st or 2nd, I would like to demonstrate a self-hosted HTTP server that will only be up for those two days, so I hope you enjoy my surprise!


Here is a link for those of you on desktop:


http://172.223.154.77:8000/desktop.html


And here is a link for the mobile users out there!:


http://172.223.154.77:8000/mobile/mobile.html


Translated from: https://towardsdatascience.com/where-should-you-deploy-your-model-8b67328b37c3
