1. First, install Docker CE on the GPU server
- Run the following on the command line
curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun
- After the installation finishes, verify that it succeeded
sudo docker run hello-world
- Output like the following means the installation succeeded
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
0e03bdcc26d7: Pull complete
Digest: sha256:d58e752213a51785838f9eed2b7a498ffa1cb3aa7f946dda11af39286c3db9a9
Status: Downloaded newer image for hello-world:latest
Hello from Docker!
This message shows that your installation appears to be working correctly.
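Step 2 relies on Docker 19.03 or later, so it can help to check the installed version first. The following is a minimal sketch, assuming GNU `sort -V` is available (it is on typical Linux servers); the helper name `version_ge` and the `REQUIRED` variable are illustrative and not part of Docker.

```shell
# version_ge A B: succeed if version A >= version B (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Minimum Docker version that ships the built-in --gpus flag.
REQUIRED="19.03"

if command -v docker >/dev/null 2>&1; then
    # Ask the daemon for its version, e.g. 19.03.12; empty if unreachable.
    current="$(docker version --format '{{.Server.Version}}' 2>/dev/null || true)"
    if [ -z "$current" ]; then
        echo "could not query the Docker daemon (is it running? is sudo needed?)"
    elif version_ge "$current" "$REQUIRED"; then
        echo "Docker $current supports --gpus"
    else
        echo "Docker $current is older than $REQUIRED; upgrade before step 2"
    fi
else
    echo "docker CLI not found"
fi
```

If the script reports an older version, rerunning the install command from this step should bring Docker up to date.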
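2. Install the NVIDIA components
- Note: Docker 19.03 and later support GPUs natively through the --gpus flag, so you only need to install the nvidia-container-toolkit package directly (the commands below are for apt-based distributions such as Ubuntu)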
# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
- After the installation finishes, verify it
# Test nvidia-smi with the latest official CUDA image
docker run --gpus all nvidia/cuda:10.0-base nvidia-smi
- Seeing output like the following means it worked
Wed Jul 15 08:43:29 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:09.0 Off |                    0 |
| N/A   63C    P0   234W / 300W |  17185MiB / 32480MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
If anything is unclear, see the official documentation: https://github.com/NVIDIA/nvidia-docker
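The repository setup in this step depends on the `distribution` string built from /etc/os-release (for example `ubuntu18.04`), and a wrong value there leads to a missing nvidia-docker.list. A minimal sketch for inspecting that value before adding the repository follows; the helper name `distribution_id` and the `OS_RELEASE_FILE` override are illustrative assumptions, not part of the NVIDIA instructions.

```shell
# distribution_id FILE: print $ID$VERSION_ID from an os-release style file.
distribution_id() {
    # Source the file in a subshell so its variables do not leak to the caller.
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Default to the real file; OS_RELEASE_FILE is a hypothetical override for testing.
os_release="${OS_RELEASE_FILE:-/etc/os-release}"

if [ -r "$os_release" ]; then
    distribution="$(distribution_id "$os_release")"
    echo "distribution: $distribution"
    echo "list URL: https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list"
else
    echo "cannot read $os_release"
fi
```

If the printed list URL does not exist for your distribution, the official documentation linked above lists the supported ones.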
3. Pull a GPU-enabled image from Docker Hub
docker pull tensorflow/serving:latest-gpu
4. Call the port served by the GPU container:
- Clone the TensorFlow Serving demo repo
git clone https://github.com/tensorflow/serving
- Run the container
# Location of demo models
TESTDATA="$(pwd)/serving/tensorflow_serving/servables/tensorflow/testdata"
# Start the TensorFlow Serving GPU container and open the REST API port
# (--gpus all exposes the host GPUs to the container; the image name must come last)
docker run --gpus all -t --rm -p 8501:8501 \
-v "$TESTDATA/saved_model_half_plus_two_cpu:/models/half_plus_two" \
-e MODEL_NAME=half_plus_two \
tensorflow/serving:latest-gpu &
- Call port 8501 of the container
curl -d '{"instances": [1.0, 2.0, 5.0]}' \
-X POST http://localhost:8501/v1/models/half_plus_two:predict
- You should see the following response (the demo model computes x/2 + 2)
{
    "predictions": [2.5, 3.0, 4.5
    ]
}
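Because the container is started in the background, a predict call sent immediately can fail while the model is still loading. The sketch below polls the TensorFlow Serving model status endpoint (GET on the model URL) until it reports AVAILABLE. The helper names `model_url` and `wait_for_model` and the `TFS_*` variables are illustrative assumptions; the host, port, and model name come from the commands above.

```shell
# Defaults taken from the docker run command in this step.
TFS_HOST="${TFS_HOST:-localhost}"
TFS_PORT="${TFS_PORT:-8501}"
TFS_MODEL="${TFS_MODEL:-half_plus_two}"

# model_url HOST PORT MODEL: print the base REST URL of a served model.
model_url() {
    printf 'http://%s:%s/v1/models/%s\n' "$1" "$2" "$3"
}

# wait_for_model URL [TRIES] [DELAY]: poll the status endpoint until the
# response contains "AVAILABLE"; return 1 if that never happens.
wait_for_model() {
    wfm_url="$1"; wfm_tries="${2:-30}"; wfm_delay="${3:-1}"
    wfm_i=0
    while [ "$wfm_i" -lt "$wfm_tries" ]; do
        if curl -fsS --max-time 5 "$wfm_url" 2>/dev/null | grep -q '"AVAILABLE"'; then
            return 0
        fi
        wfm_i=$((wfm_i + 1))
        sleep "$wfm_delay"
    done
    return 1
}

# Print the endpoints; no network access happens at this point.
status_url="$(model_url "$TFS_HOST" "$TFS_PORT" "$TFS_MODEL")"
echo "status:  $status_url"
echo "predict: $status_url:predict"
```

A typical use would be `wait_for_model "$status_url" && curl -d '{"instances": [1.0, 2.0, 5.0]}' -X POST "$status_url:predict"`, which sends the request only once the model has loaded.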
OK, that's all for now. Next time I'll show you how to deploy your own model.