配置基于WSL2的Docker环境并支持CUDA

导言

Content

正如前文windows 10 开启WSL2介绍的,我们可以在windows10中使用linux子系统。今天本文介绍如何在此基础上安装Docker并支持在wsl中使用GPU。

准备工作

  1. 加入windows insider preview。建议选Dev通道,不要选Beta。
  2. 安装Nvidia WSL2-compatibile 驱动

打开这个链接-> Get CUDA Driver-> log in -> download

  1. 管理员身份运行powershell:
dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart

dism.exe /online /enable-feature /featurename:VirtualMachinePlatform /all /norestart

wsl --set-default-version 2
  1. 更新 wsl
wsl.exe --update

如果update参数无效,没有更新wsl,则说明你没有使用预览版的windows系统,wsl的版本低。也许你从NVIDIA、Docker、Microsoft看到的文档中告诉你大于某个版本号就可以,但我建议你使用当前最新版本

If you find wsl cannot be updated, please update your windows os to the latest preview version.

安装Docker

下载

去Docker官网下载,请不要使用下面这个脚本。
Don't use the following commend. Please visit Docker offical website.

curl https://get.docker.com | sh

设置

Use the WSL2 based engine

开启你需要使用docker的wsl发行版

安装CUDA Toolkit

在wsl里,这里举例用到微软store下载的Ubuntu-18.04

sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo apt-get update
sudo apt-get install -y cuda-toolkit-11-0

测试CUDA

cd /usr/local/cuda/samples/4_Finance/BlackScholes
make
./BlackScholes

如果结果看起来如同下列所示,说明是OK的。

GPU Device 0: "Turing" with compute capability 7.5

Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
...generating input data in CPU mem.
...copying input data to GPU mem.
Data init done.

Executing Black-Scholes GPU kernel (512 iterations)...
Options count             : 8000000
BlackScholesGPU() time    : 0.723174 msec
Effective memory bandwidth: 110.623468 GB/s
Gigaoptions per second    : 11.062347

BlackScholes, Throughput = 11.0623 GOptions/s, Time = 0.00072 s, Size = 8000000 options, NumDevsUsed = 1, Workgroup = 128

Reading back GPU results...
Checking the results...
...running CPU calculations.

Comparing the results...
L1 norm: 1.741792E-07
Max absolute error: 1.192093E-05

Shutting down...
...releasing GPU memory.
...releasing CPU memory.
Shutdown done.

[BlackScholes] - Test Summary

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

请注意,使用 nvidia-smi命令不起作用是正常的

安装 NVIDIA Container Toolkit

distribution=(. /etc/os-release;echoIDVERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2

启动docker service

sudo service docker restart

测试下docker

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark 

如果结果形如下列所示,则说明是OK的

Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
        -fullscreen       (run n-body simulation in fullscreen mode)
        -fp64             (use double precision floating point values for simulation)
        -hostmem          (stores simulation data in host memory)
        -benchmark        (run benchmark to measure performance)
        -numbodies=<N>    (number of bodies (>= 1) to run in simulation)
        -device=<d>       (where d=0,1,2.... for the CUDA device to use)
        -numdevices=<i>   (where i=(number of CUDA devices > 0) to use for simulation)
        -compare          (compares simulation results running once on the default GPU and once on the CPU)
        -cpu              (run n-body simulation on the CPU)
        -tipsy=<file.bin> (load a tipsy model file for simulation)

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
MapSMtoCores for SM 7.5 is undefined.  Default to use 64 Cores/SM
GPU Device 0: "GeForce GTX 1650" with compute capability 7.5

> Compute 7.5 CUDA device: [GeForce GTX 1650]
16384 bodies, total time for 10 iterations: 25.868 ms
= 103.772 billion interactions per second
= 2075.440 single-precision GFLOP/s at 20 flops per interaction

如果你发现使用Nvidia Driver >=465.42 时,有这样的错误信息:

docker: Error response from daemon: OCI runtime create failed: 
container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: 
cuda>=11.2, please update your driver to a newer version, or use an earlier cuda container: unknown.

请更新驱动 Nvidia Driver >=470.76

Ref: Issue

一些问题QA

  • Error: only 0 Devices available, 1 requested. Exiting.
    • reboot
  • IP address of windows host
    • cat /etc/resolv.conf | grep nameserver | awk '{ print $2 }'

reference

1.https://ocdevel.com/blog/20201207-wsl2-gpu-docker
2.https://docs.nvidia.com/cuda/wsl-user-guide/index.html#installing-nvidia-docker
3.https://docs.microsoft.com/zh-cn/windows/wsl/install-win10
4.https://developer.nvidia.com/blog/announcing-cuda-on-windows-subsystem-for-linux-2/
5.https://docs.docker.com/docker-for-windows/wsl/

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

DLANDML

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值