Docker: deploying a deep learning environment manually

Since the server is currently shared among many lab members, I usually have to run my code as a non-root user. All too often I need to install new packages or tweak the running environment, which is extremely inconvenient without root privileges. I therefore plan to use Docker, which isolates each user's environment from the others. Here I record how I build a deep learning Docker image from a basic Ubuntu image.

First, I check which CUDA and driver versions are already installed on the server (a stable, known-good combination is preferred, and checking first avoids unnecessary pitfalls):

cat /proc/driver/nvidia/version
cat /usr/local/cuda/version.txt

The command nvcc --version reports the CUDA compiler version, which matches the toolkit version.
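As a quick illustration, the driver version can also be pulled out of the first file programmatically. The sample line below is a hypothetical example in the usual NVRM format, not output from my server:

```shell
# Hypothetical sample line in the format of /proc/driver/nvidia/version
sample='NVRM version: NVIDIA UNIX x86_64 Kernel Module  384.130  Wed Mar 21 03:37:26 PDT 2018'

# Extract the dotted version number that follows "Kernel Module"
driver_version=$(echo "$sample" | grep -oP 'Kernel Module\s+\K[0-9]+\.[0-9]+')
echo "$driver_version"   # 384.130
```

On the real server you would pipe `cat /proc/driver/nvidia/version` into the same `grep` instead of using a sample string.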

Then, we install nvidia-docker so that Docker containers can use the server's GPUs (following the QuickStart):


# If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker

# Add the package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

# Test nvidia-smi with the latest official CUDA image
docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
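Besides running nvidia-smi, you can sanity-check that the nvidia runtime was registered in /etc/docker/daemon.json. The check below runs against a sample of the file's expected contents after installing nvidia-docker2; on a real host, grep the actual file instead:

```shell
# Expected shape of /etc/docker/daemon.json after installing nvidia-docker2
daemon_json='{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}'

# On a real host: grep -q '"nvidia"' /etc/docker/daemon.json
echo "$daemon_json" | grep -q '"nvidia"' && echo "nvidia runtime registered"
```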

Moreover, I imitate a Dockerfile previously published by a machine learning server provider (Dockerfile.gpu1), and I build on the images provided by NVIDIA (find the tag that suits you on their Docker Hub page); I use 9.0-cudnn7-devel-ubuntu16.04 (Dockerfile).

  1. Enter the image just downloaded and install the corresponding packages:
apt-get update
apt-get install -y bc \
	build-essential \
	cmake \
	curl \
	g++ \
	gfortran \
	git \
	libopenblas-dev \
	software-properties-common \
	vim \
	wget
  2. Clean the installation cache to keep the image size down:
apt-get clean
apt-get autoremove
rm -rf /var/lib/apt/lists/*
  3. Link the BLAS library to OpenBLAS using the alternatives mechanism (https://www.scipy.org/scipylib/building/linux.html#debian-ubuntu):
update-alternatives --set libblas.so.3 /usr/lib/openblas-base/libblas.so.3
  4. Install pip:
curl -O https://bootstrap.pypa.io/get-pip.py && \
python3 get-pip.py && \
rm get-pip.py
  5. Install the GPU version of TensorFlow:
pip --no-cache-dir install tensorflow-gpu
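The five steps above can be consolidated into a Dockerfile so the image is reproducible. This is a sketch assuming the nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04 base mentioned earlier, not the exact file I use; I also add python3 and python3-dev here, since the base image may not ship them and step 4 needs python3:

```dockerfile
FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04

# 1. Install build tools and libraries
RUN apt-get update && apt-get install -y bc \
        build-essential cmake curl g++ gfortran git \
        libopenblas-dev python3 python3-dev \
        software-properties-common vim wget && \
    # 2. Clean the installation cache to keep the image small
    apt-get clean && apt-get autoremove && \
    rm -rf /var/lib/apt/lists/*

# 3. Point the BLAS alternative at OpenBLAS
RUN update-alternatives --set libblas.so.3 /usr/lib/openblas-base/libblas.so.3

# 4. Install pip
RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
    python3 get-pip.py && \
    rm get-pip.py

# 5. Install the GPU build of TensorFlow
RUN pip --no-cache-dir install tensorflow-gpu
```

You could then build and start it with something like `docker build -t dl-env .` followed by `docker run --runtime=nvidia -it dl-env bash` (dl-env is just a name I made up here).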