How to Use Docker for Your Computer Science Research Project: A Minimal Guide

Everyone is talking about Docker and how good it is to dockerize your project: the image is an isolated environment, so other people can run your Docker image without setting up any dependencies themselves.

I am currently working on deep learning research, and my boss asked me to put my research project into Docker. It took me some time to understand what Docker is and how to use it with only a minimal understanding.

Because my understanding is minimal, this might not be the correct way to use Docker, but I found that it suits my usage as a researcher, so I decided to share it!

What is Docker?

I am explaining this from my own experience, not from the definition you would find on Google. If you know what a virtual machine is, it is like an emulator of another computer (e.g. Ubuntu) running inside your current computer. Docker is similar, except that it is faster and lighter than a virtual machine (under the hood it is quite different and more complicated: a container shares the host's kernel instead of emulating a whole machine). It is an isolated environment, like a new computer!

How to install?

This website has very clear instructions for installing Docker on Ubuntu: https://phoenixnap.com/kb/how-to-install-docker-on-ubuntu-18-04. After installing it, try running “docker --version”; if it prints a version string, the installation succeeded!
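
The check above can also be scripted. This is only a small sketch; the variable name docker_status is made up for illustration and is not part of Docker.

```shell
# Report whether the docker command-line client is available on this machine.
# "docker_status" is a hypothetical variable name used only in this sketch.
if command -v docker >/dev/null 2>&1; then
  docker_status="installed"
  docker --version
else
  docker_status="missing"
  echo "docker not found; follow the installation guide first" >&2
fi
```

Note that this only checks the client program; it does not tell you whether your user has permission to talk to the Docker daemon.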

I want to use Docker without installing any package!

The reason I use Docker for all my projects is that I don’t need to set up CUDA on every new computer (as you know, setting up CUDA is very complicated; I have messed up my Ubuntu a few times). And of course, sometimes I am too lazy to install dependencies, so Docker is best for me!

If you are working on deep learning now, you will most probably use either PyTorch or TensorFlow.

Without Docker, you may need to install Anaconda3 and Python 3 on a fresh Ubuntu, set up CUDA, and then run pip install and conda install. With Docker, all you need to do is run two commands.

docker pull pytorch/pytorch:latest   # downloads the "image"; you can think of it as installing the environment
docker run -it --rm --gpus all pytorch/pytorch bash

If you get a “permission error”, run the commands with sudo. I don’t need sudo here because I have set up my machine to run Docker without it.

Now you have entered the “virtual machine”. It is called a “container”; whenever I say “container”, I mean this “virtual machine”. Notice that you are “root” in this container; later I will talk about how to get rid of “root”.

In this “virtual machine”, you can do anything you would do in a normal Ubuntu environment.

You can refer to the PyTorch repository on Docker Hub for other versions of PyTorch: https://hub.docker.com/r/pytorch/pytorch

If you want TensorFlow (or a specific library version), just Google “dockerhub tensorflow” or “docker pull tensorflow”, or search for it on Docker Hub.

If you wonder what -it and --rm mean: “-i” is interactive (it keeps standard input open), “-t” allocates a TTY (a terminal, so you see output in real time; check the Docker documentation for the precise meaning), and “--rm” removes the container after you exit it (like deleting your virtual machine from storage; you can run without this flag once you are familiar with Docker).

Recent Docker versions (19.03 and later) are quite convenient: GPU support is built into the “--gpus” flag, so you no longer need the separate nvidia-docker wrapper (the host still needs the NVIDIA driver and the NVIDIA Container Toolkit). “--gpus all” tells Docker to map ALL GPU devices into this container. If you only want to map GPU 0 and GPU 1, you may run with “--gpus \"device=0,1\"”. Notice the escaped quotes \" \": they make sure Docker receives the quotes around the device list, so that the comma-separated list stays together as one value.
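
To avoid typing the escaped quotes by hand, the value for “--gpus” can be built by a small shell function. This is a sketch only: gpus_value is a hypothetical helper name, not a Docker command.

```shell
# Hypothetical helper: build the value for --gpus from a list of GPU indices.
# With no arguments it prints "all"; otherwise it prints "device=..." wrapped
# in literal double quotes, which is the form used in the text above.
gpus_value() {
  if [ "$#" -eq 0 ]; then
    printf 'all\n'
    return
  fi
  ids=$1
  shift
  for id in "$@"; do
    ids="$ids,$id"   # join the remaining indices with commas
  done
  printf '"device=%s"\n' "$ids"
}

# Show the values without starting any container.
printf 'value passed to --gpus: %s\n' "$(gpus_value)"
printf 'value passed to --gpus: %s\n' "$(gpus_value 0 1)"
```

With this helper, a run restricted to GPU 0 and GPU 1 would look like: docker run -it --rm --gpus "$(gpus_value 0 1)" pytorch/pytorch bash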

If you wonder why 7GB of GPU memory is in use but you can’t see any other processes, it is because those processes are not in this container. The container is isolated from your original computer and from other containers.

Mounting your folder into a Docker container

docker run -it --rm --gpus all -v /path/to/my/code:/path/in/docker -v /datasets:/data mypytorch bash

For each folder that you want to “mount” into the container, add a “-v” followed by “/pathA:/pathB” to the command, as above. Your folder “/pathA” will then be mounted (or mapped) to “/pathB” in the container.
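
Docker generally expects the host side of “-v” to be an absolute path (a bare name is treated as a named volume instead of a folder). The sketch below turns a relative folder into the “/pathA:/pathB” form; mount_spec is a hypothetical helper name and /workspace/code is just an example container path.

```shell
# Hypothetical helper: print "HOST_DIR:CONTAINER_DIR" with HOST_DIR made absolute.
# Fails with a message if the host folder does not exist.
mount_spec() {
  host_dir=$(cd "$1" 2>/dev/null && pwd) || {
    echo "no such folder: $1" >&2
    return 1
  }
  printf '%s:%s\n' "$host_dir" "$2"
}

# Print (do not run) a command that mounts the current folder.
echo docker run -it --rm -v "$(mount_spec . /workspace/code)" pytorch/pytorch bash
```

Checking that the folder exists first is a deliberate choice in this sketch: if the host folder is missing, Docker may silently create an empty one owned by root.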

https://kamwoh.github.io/DeepIPR/

As you can see above, the original folder path is “/data/Projects…”, and I mounted it into the container with the “-v” option. In the container, my mounted folder is at “/workspace/DeepIPR”. Note: all files you create in the mounted folder will also appear in the original folder.

Remember to run with -u $(id -u):$(id -g)

Before I knew about this option, I usually just ran Docker with my folder mapped into the container’s folder and nothing else. Now I add “-u $(id -u):$(id -g)”, like below:

docker run -it --rm -v /path/to/my/code:/path/in/docker -u $(id -u):$(id -g) mypytorch bash

It is okay if you don’t run with “-u”, but any new file or folder that the container creates in /path/to/my/code will then be owned by “root”. With “-u”, new files and folders created in the mounted folder are owned by your own user again. You will see “I have no name!” in the prompt, but that is okay and won’t affect any usage. It appears because your user ID has no entry inside the container; we didn’t set up the image with a matching user (a more advanced topic), so just ignore it.
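
To see what “$(id -u):$(id -g)” actually expands to on your machine, you can print it before using it. A small sketch; user_spec is just a variable name chosen here.

```shell
# id -u prints your numeric user ID and id -g your numeric group ID.
# Joined with a colon, they form the UID:GID value that -u expects.
user_spec="$(id -u):$(id -g)"
echo "host user and group: $user_spec"

# Print (do not run) the full command from this section with the value filled in.
echo docker run -it --rm -v /path/to/my/code:/path/in/docker -u "$user_spec" mypytorch bash
```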

Notice that 6001 and 6000 are the user and group IDs from your original computer environment.

Directly run your code

My DevOps friend told me that my way of using Docker is not right: starting a “bash” shell and working inside it is a bit ugly (all programmers love an elegant way of handling their code).

Everything after the image name is the command that will be run in the container. So you can run your script like this:

docker run -it --rm -v /path/to/my/code:/path/in/docker --gpus all mypytorch python /path/in/docker/xxxx.py
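
If you run scripts this way often, the long command can be wrapped in a shell function. The following is only a sketch under several assumptions: drun and DRY_RUN are names invented here, the image name mypytorch and the container path /path/in/docker are the placeholders used in this post, and the current folder is assumed to hold your code. By default it only prints the command instead of running it.

```shell
# Hypothetical wrapper: run a Python script from the current folder inside the
# container. Usage: drun SCRIPT [ARGS...]
# DRY_RUN defaults to 1, which prints the command; set DRY_RUN=0 to execute it.
drun() {
  script=$1
  shift
  # Rebuild the argument list as the full docker command line.
  set -- docker run -it --rm --gpus all \
    -u "$(id -u):$(id -g)" \
    -v "$PWD:/path/in/docker" \
    mypytorch python "/path/in/docker/$script" "$@"
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Example: show the command for a hypothetical script name.
drun xxxx.py
```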

Simple Dockerfile

You may need to install an extra package that your original Docker image doesn’t have.

For example, the PyTorch image that you pulled doesn’t contain Jupyter Notebook. In that case you can create your own Dockerfile. (You will notice that some “dockerized” GitHub projects have a file named “Dockerfile”.)

A Dockerfile is basically a setup script for your image, much like how you would set up a new Ubuntu environment. (An image is like an ISO file; a container is like the system you run from that ISO file.)

Remember to save your file as “Dockerfile”!!

(Image: the example Dockerfile.)

FROM xxxxx: the base image that you want to build on top of.

RUN xxxxx: runs a command while the image is being built. Be careful with apt-get: if you want to run apt-get update before installing other packages, chain the two commands with “&&” in a single RUN (for example “RUN apt-get update && apt-get install -y …”). Otherwise Docker will reuse the cached result of the update step and skip it the next time you build with this Dockerfile.

ENV xxxxx: sets up environment variables.

WORKDIR xxxxx: you may have noticed that whenever I enter the container, I am in “/workspace”. That is what WORKDIR is for: it sets the default working directory.

You may find it weird that I set chmod 777 for Jupyter. It works around a permission bug I was facing, and I hope this trick helps if you run into the same problem (it is not an elegant way, of course).
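
As an illustration of the four instructions described above, here is a minimal sketch that writes a small Dockerfile from the shell. It is not the exact Dockerfile used in this post: the folder name docker-demo, the git package, and the PYTHONUNBUFFERED variable are assumptions chosen only for the example, and the chmod 777 workaround is left out.

```shell
# Write an example Dockerfile into a separate folder so that no existing
# Dockerfile is overwritten. "docker-demo" is a hypothetical folder name.
mkdir -p docker-demo
cat > docker-demo/Dockerfile <<'EOF'
# Base image to build on top of (the image pulled earlier in this post).
FROM pytorch/pytorch:latest

# Update and install in ONE RUN, so a cached "apt-get update" is never reused
# on its own. "git" is only a placeholder package for this sketch.
RUN apt-get update && apt-get install -y --no-install-recommends git \
    && rm -rf /var/lib/apt/lists/*

# The extra Python package this example is about.
RUN pip install --no-cache-dir jupyter

# Example environment variable: make Python print its output immediately.
ENV PYTHONUNBUFFERED=1

# Default folder when the container starts.
WORKDIR /workspace
EOF
echo "wrote docker-demo/Dockerfile"
```

With the file in its own folder, the build command would point at that folder instead of the current directory, for example: DOCKER_BUILDKIT=1 docker build --tag mypytorch docker-demo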

Then you can build your own Docker image with a name of your choice.

DOCKER_BUILDKIT=1 docker build --tag imagename .

“DOCKER_BUILDKIT=1” makes Docker build the image faster (probably thanks to better caching). The argument after “--tag” is the image name. And don’t forget the dot “.” at the end: it tells Docker to build from the current directory, which is where it looks for the “Dockerfile”.

After building your own image, you can check that it exists with “docker images”.

Conclusion

If you want to know more about Docker than this post covers

In this post I am skipping a lot of useful Docker shortcuts, because I think they are too complicated for this post (at least they were complicated for me the first time I used Docker). You might find setting up Docker troublesome, but trust me, it is troublesome only once. Docker is quite convenient because you don’t need to set everything up again on a new computer. If you need to know more about Docker, or about the really elegant ways of using it, you should Google a Docker cheat sheet. This post is only about a minimal understanding of Docker usage. Thanks for reading!!

Bonus: cheat sheets that you can refer to

I think that after this post you should have an intuitive understanding, so reading the cheat sheets below shouldn’t be a problem for you anymore!!

  1. https://www.docker.com/sites/default/files/d8/2019-09/docker-cheat-sheet.pdf

  2. https://github.com/wsargent/docker-cheat-sheet

  3. http://dockerlabs.collabnix.com/docker/cheatsheet/

  4. https://phoenixnap.com/kb/list-of-docker-commands-cheat-sheet

Original article: https://towardsdatascience.com/how-to-docker-for-your-computer-science-research-project-the-minimal-guide-for-using-docker-2ecd3e9280ac
