You're living in 1985 if you don't use Docker for your data science projects

One of the hardest problems new programmers face is understanding the concept of an 'environment'. An environment is, roughly speaking, the system you code within. In principle it sounds easy, but later in your career you begin to understand just how difficult an environment is to maintain.

The reason is that libraries, IDEs, and even Python itself go through updates and version changes. Sometimes you update one library and a separate piece of code fails, so you need to go back and fix it.

Moreover, if multiple projects are being developed at the same time, there can be dependency conflicts. This is when things really get ugly, because code fails directly because of another piece of code.

Also, say you want to share a project with a teammate working on a different OS, or ship a project you built on your Mac to a production server running a different OS. Would you have to reconfigure your code? Yes, you probably would.

To mitigate these issues, containers were proposed as a way to separate projects from the environments they exist within. A container is basically a place where an environment can run, separate from everything else on the system. Once you define what's in your container, it becomes much easier to recreate the environment and even share the project with teammates.

Requirements

To get started, we need to install a few things to get set up:

Containerise a Python service

Let's imagine we're creating a Flask service called server.py, and let's say the contents of the file are as follows:

from flask import Flask

server = Flask(__name__)

@server.route("/")
def hello():
    return "Hello World!"

if __name__ == "__main__":
    server.run(host='0.0.0.0')

As mentioned above, we need to keep a record of our code's dependencies. For this we can create a requirements.txt file containing the following requirement:

Flask==1.1.1
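A pinned requirement such as Flask==1.1.1 can be compared with what is actually installed in the current environment. The following is a minimal sketch and not part of the original article: the helper names parse_pins and find_mismatches are hypothetical, the parser only understands simple name==version lines, and the lookup relies on the standard-library importlib.metadata module (Python 3.8+).

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for simple 'name==version' lines."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blanks and anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed)} where the two differ.

    installed is None when the package is not installed at all.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


pins = parse_pins("Flask==1.1.1\n")
print(pins)
print(find_mismatches(pins))
```

The second line of output depends on the machine it runs on, which is exactly the kind of drift between projects that a container is meant to remove.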

So our package has the following structure:

app
├─── requirements.txt
└─── src
     └─── server.py

The structure is pretty logical (source code is kept in a separate directory). To execute our Python program, all that is left to do is install a Python interpreter and run it.

We could run the program locally, but suppose we have 15 projects in progress. It makes sense to run it in a container to avoid conflicts with any of the other projects.

Let's move on to containerisation.

Dockerfile

To run Python code in a container, we package the application as a Docker image and then run a container based on it. The steps are as follows:

  1. Create a Dockerfile that contains the instructions needed to build the image
  2. Build an image from it with the Docker builder
  3. Run docker run <image>, which creates a container that runs the app
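The build and run steps can also be scripted. This is a minimal sketch under assumptions that are not in the original: the helper names are hypothetical, myimage is used as the image tag, and the -p 5000:5000 port mapping assumes the service listens on Flask's default port 5000. The functions only assemble docker build and docker run argument lists; actually executing them requires Docker to be installed with its daemon running.

```python
import subprocess


def build_command(tag, context="."):
    """Arguments for building an image from the Dockerfile in `context`."""
    return ["docker", "build", "-t", tag, context]


def run_command(tag, host_port=None, container_port=5000):
    """Arguments for starting a container from the image `tag`.

    If host_port is given, publish the container port on the host
    (5000 is Flask's default port, assumed here).
    """
    cmd = ["docker", "run", "--rm"]
    if host_port is not None:
        cmd += ["-p", f"{host_port}:{container_port}"]
    cmd.append(tag)
    return cmd


def execute(cmd):
    """Run a command and raise CalledProcessError if it fails."""
    subprocess.run(cmd, check=True)


print(" ".join(build_command("myimage")))
print(" ".join(run_command("myimage", host_port=5000)))
```

Calling execute(build_command("myimage")) followed by execute(run_command("myimage", host_port=5000)) would build the image and start the service, which should then be reachable on port 5000 of the host under these assumptions.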

Analysis of a Dockerfile

A Dockerfile is a file that contains instructions for assembling a Docker image (saved here as myimage):

# set base image (host OS)
FROM python:3.8
# set the working directory in the container
WORKDIR /code
# copy the dependencies file to the working directory
COPY requirements.txt .
# install dependencies
RUN pip install -r requirements.txt
# copy the content of the local src directory to the working directory
COPY src/ .
# command to run on container start
CMD [ "python", "./server.py" ]

The builder processes a Dockerfile instruction by instruction: for each one it generates an image layer and stacks it on top of the previous layers.

We can also see the Dockerfile instructions being executed as steps in the output of the build command.

$ docker build -t myimage .
Sending build context to Docker daemon 6.144kB
Step 1/6 : FROM python:3.8
3.8.3-alpine: Pulling from library/python

Status: Downloaded newer image for python:3.8.3-alpine
---> 8ecf5a48c789
Step 2/6 : WORKDIR /code
---> Running in 9313cd5d834d
Removing intermediate container 9313cd5d834d
---> c852f099c2f9
Step 3/6 : COPY requirements.txt .
---> 2c375052ccd6
Step 4/6 : RUN pip install -r requirements.txt
---> Running in 3ee13f767d05

Removing intermediate container 3ee13f767d05
---> 8dd7f46dddf0
Step 5/6 : COPY ./src .
---> 6ab2d97e4aa1
Step 6/6 : CMD python server.py
---> Running in fbbbb21349be
Removing intermediate container fbbbb21349be
---> 27084556702b
Successfully built 70a92e92f3b5
Successfully tagged myimage:latest
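
The "Step n/6" numbering in this output corresponds to the six instructions in the Dockerfile; comment lines and blank lines are not steps. A rough sketch of that correspondence, using a hypothetical helper list_steps that ignores multi-line instructions and other Dockerfile syntax:

```python
DOCKERFILE = """\
# set base image (host OS)
FROM python:3.8
# set the working directory in the container
WORKDIR /code
# copy the dependencies file to the working directory
COPY requirements.txt .
# install dependencies
RUN pip install -r requirements.txt
# copy the content of the local src directory to the working directory
COPY src/ .
# command to run on container start
CMD [ "python", "./server.py" ]
"""


def list_steps(dockerfile_text):
    """Return the instruction lines, skipping comments and blank lines."""
    steps = []
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            steps.append(stripped)
    return steps


steps = list_steps(DOCKERFILE)
for number, instruction in enumerate(steps, start=1):
    print(f"Step {number}/{len(steps)} : {instruction}")
```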

Then we can see that the image is in the local image store:

$ docker images
REPOSITORY   TAG      IMAGE ID       CREATED         SIZE
myimage      latest   70a92e92f3b5   8 seconds ago   991MB

During development, we may need to rebuild the image for our Python service many times, and we want each rebuild to take as little time as possible.
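
Rebuilds are fast when Docker can reuse cached layers. The sketch below is a toy model of that idea, not Docker's actual implementation: each step's cache key is assumed to depend on the previous step's key, the instruction text, and (for COPY) the copied content, so a change invalidates that step and every step after it. The function names and the file contents are made up for illustration.

```python
import hashlib


def step_keys(steps):
    """Compute a chained cache key for each (instruction, copied_content) step.

    copied_content is the data a COPY step brings in, or None otherwise.
    """
    keys = []
    parent = ""
    for instruction, copied_content in steps:
        digest = hashlib.sha256()
        digest.update(parent.encode())
        digest.update(instruction.encode())
        if copied_content is not None:
            digest.update(copied_content.encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(old_steps, new_steps):
    """Return the 1-based numbers of steps whose cache key changed."""
    old_keys = step_keys(old_steps)
    new_keys = step_keys(new_steps)
    return [
        number
        for number, (old, new) in enumerate(zip(old_keys, new_keys), start=1)
        if old != new
    ]


def dockerfile_steps(requirements, source):
    """The six Dockerfile steps, with hypothetical file contents."""
    return [
        ("FROM python:3.8", None),
        ("WORKDIR /code", None),
        ("COPY requirements.txt .", requirements),
        ("RUN pip install -r requirements.txt", None),
        ("COPY src/ .", source),
        ('CMD [ "python", "./server.py" ]', None),
    ]


before = dockerfile_steps("Flask==1.1.1", "print('v1')")
edited_source = dockerfile_steps("Flask==1.1.1", "print('v2')")
edited_requirements = dockerfile_steps("Flask==1.1.2", "print('v1')")

print(rebuilt_steps(before, edited_source))        # [5, 6]
print(rebuilt_steps(before, edited_requirements))  # [3, 4, 5, 6]
```

In this model, editing only the source leaves the dependency-install step cached, which is a common reason for copying requirements.txt and installing dependencies before copying src/.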

Note: Docker and virtualenv solve similar problems but differ in scope. Virtualenv only lets you switch between sets of Python dependencies; you're stuck with your host OS. With Docker, you can swap out the entire OS and install and run Python on any of them (think Ubuntu, Debian, Alpine, even Windows Server Core). So if you work in a team and want to future-proof your technology, use Docker. If that doesn't matter to you, venv is fine, but remember it's not future-proof. Please reference this if you still want more information.
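
One way to see the difference is to ask Python what it is running on. In this small sketch (describe_environment is a hypothetical helper name), activating a virtualenv or venv changes only the interpreter prefix, while the reported operating system is still the host's. Run inside a container built from a Linux image, the same code would typically report Linux even when the host is a Mac.

```python
import platform
import sys


def describe_environment():
    """Report the interpreter location and the operating system it runs on."""
    # Older virtualenv releases may not set base_prefix; fall back to prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return {
        "python_version": platform.python_version(),
        "in_virtual_env": sys.prefix != base_prefix,
        "interpreter_prefix": sys.prefix,
        "operating_system": platform.system(),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```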

There you have it! We've shown how to containerise a Python service. Hopefully this process makes things a lot easier and gives your project a longer shelf life, since it will be less likely to break as dependencies change.

Thanks for reading, and please let me know if you have any questions!

Keep up to date with my latest articles here!

Translated from: https://towardsdatascience.com/youre-living-in-1985-if-you-don-t-use-docker-for-your-data-science-projects-858264db0082
