Installing CUDA on Ubuntu

Installing the NVIDIA driver

lspci | grep -i nvidia # check that an NVIDIA GPU is present
gcc --version # check that gcc is installed
sudo apt-get install linux-headers-$(uname -r) # install the kernel headers for the running kernel
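
The three checks above can be bundled into one small script. This is only a sketch: check_cmd is a made-up helper name, and the headers path assumes the usual Ubuntu layout under /usr/src.

```shell
# check_cmd is a hypothetical helper: report whether a command is on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1 found"
    else
        echo "missing: $1"
    fi
}

check_cmd lspci
check_cmd gcc

# Assumption: Ubuntu installs headers to /usr/src/linux-headers-<release>.
headers_dir="/usr/src/linux-headers-$(uname -r)"
if [ -d "$headers_dir" ]; then
    echo "ok: kernel headers present in $headers_dir"
else
    echo "missing: kernel headers for $(uname -r)"
fi
```

Anything reported as missing should be installed before continuing.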

Downloading the CUDA Toolkit

sudo dpkg -i cuda-repo-<distro>_<version>_<architecture>.deb
sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

Environment setup

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}

In addition, when using the runfile installation method, the LD_LIBRARY_PATH variable needs to contain /usr/local/cuda-10.0/lib64 on a 64-bit system, or /usr/local/cuda-10.0/lib on a 32-bit system.

To change the environment variables for 64-bit operating systems:

export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64\
                         ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

To change the environment variables for 32-bit operating systems:

export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib\
                         ${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Note that the above paths change when using a custom install path with the runfile installation method.
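
If these exports end up in a shell startup file, sourcing it repeatedly keeps adding the same entries. A possible way around that, as a sketch: prepend_dir and cuda_prefix are names invented here, and cuda_prefix should be changed for a custom install path.

```shell
# prepend_dir is a hypothetical helper: print LIST with DIR prepended,
# unless DIR is already one of its colon-separated entries.
prepend_dir() {
    dir=$1
    list=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;
        *) printf '%s\n' "$dir${list:+:$list}" ;;
    esac
}

cuda_prefix=/usr/local/cuda-10.0   # assumed default install prefix
PATH=$(prepend_dir "$cuda_prefix/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_dir "$cuda_prefix/lib64" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

On a 32-bit system the second call would use $cuda_prefix/lib instead of lib64.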

Because of the addition of new features specific to the NVIDIA POWER9 CUDA driver, there are some additional setup requirements in order for the driver to function properly. These additional steps are not handled by the installation of CUDA packages, and failure to ensure these extra requirements are met will result in a non-functional CUDA driver installation.

There are two changes that need to be made manually after installing the NVIDIA CUDA driver to ensure proper operation:
The NVIDIA Persistence Daemon should be automatically started for POWER9 installations. Check that it is running with the following command:

systemctl status nvidia-persistenced

If it is not active, enable it so that it starts at the next boot:

sudo systemctl enable nvidia-persistenced

Disable a udev rule installed by default in some Linux distributions that causes hot-pluggable memory to be automatically onlined when it is physically probed. This behavior prevents NVIDIA software from bringing NVIDIA device memory online with non-default settings. This udev rule must be disabled in order for the NVIDIA CUDA driver to function properly on POWER9 systems.

On Red Hat Enterprise Linux 7, this rule can be found in:

/lib/udev/rules.d/40-redhat.rules

On Ubuntu 17.04, this rule can be found in:

/lib/udev/rules.d/40-vm-hotadd.rules

The rule generally takes a form where it detects the addition of a memory block and changes the ‘state’ attribute to online. For example, in RHEL7, the rule looks like this:

SUBSYSTEM=="memory", ACTION=="add", PROGRAM="/bin/uname -p", RESULT!="s390*", ATTR{state}=="offline", ATTR{state}="online"

This rule must be disabled by copying the file to /etc/udev/rules.d and commenting out, removing, or changing the hot-pluggable memory rule in the /etc copy so that it does not apply to POWER9 NVIDIA systems. For example, on RHEL:

sudo cp /lib/udev/rules.d/40-redhat.rules /etc/udev/rules.d
sudo sed -i '/SUBSYSTEM=="memory", ACTION=="add"/d' /etc/udev/rules.d/40-redhat.rules

You will need to reboot the system for the above changes to take effect.
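
Before rebooting, the sed expression can be tried on a scratch file to see what it deletes. A rough dry run: the second rule in the sample is made up for illustration, and sed -i is used in its GNU form.

```shell
# Dry run on a scratch copy: nothing under /lib or /etc is touched.
tmp_rules=$(mktemp)
cat > "$tmp_rules" <<'EOF'
SUBSYSTEM=="memory", ACTION=="add", PROGRAM="/bin/uname -p", RESULT!="s390*", ATTR{state}=="offline", ATTR{state}="online"
SUBSYSTEM=="cpu", ACTION=="add", ATTR{online}="1"
EOF

# Same expression as in the RHEL example above.
sed -i '/SUBSYSTEM=="memory", ACTION=="add"/d' "$tmp_rules"

# Count the memory hot-add rules that survived the edit.
remaining=$(grep -c 'SUBSYSTEM=="memory"' "$tmp_rules" || true)
echo "memory hot-add rules left: $remaining"
```

Running the same grep against the real copy in /etc/udev/rules.d shows whether the edit took effect there.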

Uninstalling

Use the following command to uninstall a Toolkit runfile installation:

sudo /usr/local/cuda-X.Y/bin/uninstall_cuda_X.Y.pl

Use the following command to uninstall a Driver runfile installation:

sudo /usr/bin/nvidia-uninstall

Use the following commands to uninstall an RPM/Deb installation:

sudo yum remove <package_name>                      # Red Hat/CentOS
sudo dnf remove <package_name>                      # Fedora
sudo zypper remove <package_name>                   # OpenSUSE/SLES
sudo apt-get --purge remove <package_name>          # Ubuntu
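
To find values for <package_name> on Ubuntu, the installed package list can be filtered. A sketch only: list_cuda_pkgs is a hypothetical helper, and the name prefixes (cuda, nvidia) are an assumption about how the packages are named.

```shell
# list_cuda_pkgs is a hypothetical helper: read `dpkg -l` output on stdin
# and print installed ("ii") packages whose name starts with cuda or nvidia.
list_cuda_pkgs() {
    awk '$1 == "ii" && $2 ~ /^(cuda|nvidia)/ { print $2 }'
}

# Only query dpkg where it exists (Debian/Ubuntu).
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | list_cuda_pkgs
fi
```

Each printed name can then be passed to apt-get --purge remove.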

NVIDIA-Docker

Make sure you have installed the NVIDIA driver and a supported version of Docker for your distribution (see prerequisites).

Installing Docker

wget -qO- https://get.docker.com/ | sudo sh

If you have a custom /etc/docker/daemon.json, the nvidia-docker2 package might override it.
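
A simple precaution is to keep a copy of the file before installing the package. A minimal sketch, where backup_file is a name made up here:

```shell
# backup_file is a hypothetical helper: copy FILE to FILE.bak if FILE exists.
backup_file() {
    if [ -f "$1" ]; then
        cp -p "$1" "$1.bak" && echo "backed up $1 to $1.bak"
    else
        echo "nothing to back up: $1 not found"
    fi
}

# Writing under /etc/docker normally requires root.
backup_file /etc/docker/daemon.json || echo "backup failed (try running as root)"
```

After the install, the .bak copy can be compared with the new file and any custom settings merged back in.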

Ubuntu 14.04/16.04/18.04, Debian Jessie/Stretch

# If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker

# Add the package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

# Test nvidia-smi with the latest official CUDA image
docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
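
If the test container fails to start, it can help to check whether Docker has the nvidia runtime registered at all. A sketch, assuming docker info prints a "Runtimes:" line listing the runtime names; has_nvidia_runtime is a hypothetical helper.

```shell
# has_nvidia_runtime is a hypothetical helper: read `docker info` output on
# stdin and succeed if the Runtimes line lists nvidia.
has_nvidia_runtime() {
    grep -Eq '^[[:space:]]*Runtimes:.*[[:space:]]nvidia([[:space:]]|$)'
}

# Skip the check entirely when docker is not installed.
if command -v docker >/dev/null 2>&1; then
    if docker info 2>/dev/null | has_nvidia_runtime; then
        echo "nvidia runtime is registered"
    else
        echo "nvidia runtime not found in docker info"
    fi
fi
```

If the runtime is not found, /etc/docker/daemon.json and the daemon reload step are the first things to re-check.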
