Docker: deploying a deep learning environment manually

Since the server is currently shared among many lab members, I usually have to run my code as a non-root user. All too often I need to install new packages or tweak the running environment, which is extremely inconvenient without root privileges. I therefore plan to use Docker, which isolates each user's environment from the others. Here I record how I build a deep learning Docker image from a basic Ubuntu image.

First, I check which CUDA and driver versions are already installed on the server (a stable, known-good combination is preferred, and checking first avoids unnecessary pitfalls):

cat /proc/driver/nvidia/version
cat /usr/local/cuda/version.txt

The command nvcc --version reports the CUDA compiler version, which matches the toolkit version.
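As a quick illustration, the driver version can also be pulled out of the first file programmatically. The sample line below is a hypothetical example in the usual NVRM format, not output from my server:

```shell
# Hypothetical sample line in the format of /proc/driver/nvidia/version
sample='NVRM version: NVIDIA UNIX x86_64 Kernel Module  384.130  Wed Mar 21 03:37:26 PDT 2018'

# Extract the dotted version number that follows "Kernel Module"
driver_version=$(echo "$sample" | grep -oP 'Kernel Module\s+\K[0-9]+\.[0-9]+')
echo "$driver_version"   # 384.130
```

On the real server you would pipe `cat /proc/driver/nvidia/version` into the same `grep` instead of using a sample string.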

Then, we install nvidia-docker so that Docker containers can use the server's GPUs (following the QuickStart):


# If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker

# Add the package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

# Test nvidia-smi with the latest official CUDA image
docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
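Besides running nvidia-smi, you can sanity-check that the nvidia runtime was registered in /etc/docker/daemon.json. The check below runs against a sample of the file's expected contents after installing nvidia-docker2; on a real host, grep the actual file instead:

```shell
# Expected shape of /etc/docker/daemon.json after installing nvidia-docker2
daemon_json='{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}'

# On a real host: grep -q '"nvidia"' /etc/docker/daemon.json
echo "$daemon_json" | grep -q '"nvidia"' && echo "nvidia runtime registered"
```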

Moreover, I imitate a Dockerfile previously published by a machine learning server provider (Dockerfile.gpu1), and I build on the images provided by NVIDIA (find the tag that suits you on their Docker Hub page); I use 9.0-cudnn7-devel-ubuntu16.04 (Dockerfile).

  1. Enter the image just downloaded and install the corresponding packages:
apt-get update
apt-get install -y bc \
	build-essential \
	cmake \
	curl \
	g++ \
	gfortran \
	git \
	libopenblas-dev \
	software-properties-common \
	vim \
	wget
  2. Clean the installation cache to keep the image size down:
apt-get clean
apt-get autoremove
rm -rf /var/lib/apt/lists/*
  3. Link the BLAS library to OpenBLAS using the alternatives mechanism (https://www.scipy.org/scipylib/building/linux.html#debian-ubuntu):
update-alternatives --set libblas.so.3 /usr/lib/openblas-base/libblas.so.3
  4. Install pip:
curl -O https://bootstrap.pypa.io/get-pip.py && \
python3 get-pip.py && \
rm get-pip.py
  5. Install the GPU version of TensorFlow:
pip --no-cache-dir install tensorflow-gpu
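The five steps above can be consolidated into a Dockerfile so the image is reproducible. This is a sketch assuming the nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04 base mentioned earlier, not the exact file I use; I also add python3 and python3-dev here, since the base image may not ship them and step 4 needs python3:

```dockerfile
FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04

# 1. Install build tools and libraries
RUN apt-get update && apt-get install -y bc \
        build-essential cmake curl g++ gfortran git \
        libopenblas-dev python3 python3-dev \
        software-properties-common vim wget && \
    # 2. Clean the installation cache to keep the image small
    apt-get clean && apt-get autoremove && \
    rm -rf /var/lib/apt/lists/*

# 3. Point the BLAS alternative at OpenBLAS
RUN update-alternatives --set libblas.so.3 /usr/lib/openblas-base/libblas.so.3

# 4. Install pip
RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
    python3 get-pip.py && \
    rm get-pip.py

# 5. Install the GPU build of TensorFlow
RUN pip --no-cache-dir install tensorflow-gpu
```

You could then build and start it with something like `docker build -t dl-env .` followed by `docker run --runtime=nvidia -it dl-env bash` (dl-env is just a name I made up here).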