Installing CUDA on Ubuntu


fungaren


Article link: https://blog.csdn.net/kencaber/article/details/84668965


Install the NVIDIA driver

lspci | grep -i nvidia # Check that an NVIDIA GPU is present
gcc --version # Check that gcc is installed
sudo apt-get install linux-headers-$(uname -r) # Install the headers for the running kernel
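The three checks above can be bundled into one read-only preflight script. This is a minimal sketch, not part of the NVIDIA guide: the function names `have` and `preflight` are made up for illustration, and it assumes the Ubuntu convention that kernel headers are unpacked under /usr/src/linux-headers-<release>.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named command is available on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

# Report the state of each prerequisite without changing anything.
preflight() {
    if have lspci && lspci | grep -qi nvidia; then
        echo "GPU: NVIDIA device found"
    else
        echo "GPU: no NVIDIA device reported (or lspci is missing)"
    fi

    if have gcc; then
        echo "gcc: $(gcc --version | head -n 1)"
    else
        echo "gcc: not installed"
    fi

    # Assumption: Ubuntu installs headers under /usr/src/linux-headers-<release>.
    if [ -d "/usr/src/linux-headers-$(uname -r)" ]; then
        echo "headers: present for $(uname -r)"
    else
        echo "headers: missing for $(uname -r)"
    fi
}

preflight
```

Anything reported as missing can then be installed with the commands listed above before continuing.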

Download the CUDA Toolkit

sudo dpkg -i cuda-repo-<distro>_<version>_<architecture>.deb
sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

Environment setup

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
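The `${PATH:+:${PATH}}` part of this line is easy to misread. It expands to a colon followed by the old value only when the variable is already set and non-empty, so no stray trailing colon is left behind. A small sketch of the same expansion, using a hypothetical helper named `prepend_path`:

```shell
#!/bin/sh
# Hypothetical helper: build a colon-separated value with a new first entry.
#   $1 = directory to put first
#   $2 = current value of the variable (may be empty)
prepend_path() {
    # "${2:+:$2}" yields ":<old value>" if $2 is non-empty, otherwise nothing.
    printf '%s%s\n' "$1" "${2:+:$2}"
}

prepend_path /usr/local/cuda-10.0/bin ""                   # old value empty
prepend_path /usr/local/cuda-10.0/bin "/usr/bin:/bin"      # old value set
```

With an empty old value the result is just the new directory; otherwise the old value follows after a single colon.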

In addition, when using the runfile installation method, the LD_LIBRARY_PATH variable needs to contain /usr/local/cuda-10.0/lib64 on a 64-bit system, or /usr/local/cuda-10.0/lib on a 32-bit system.

To change the environment variables for 64-bit operating systems:

export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

To change the environment variables for 32-bit operating systems:

export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Note that the above paths change when using a custom install path with the runfile installation method.
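For a custom install path, one way to avoid editing each line by hand is to generate the export lines from the prefix. The sketch below assumes a 64-bit system; the function name `cuda_env` and the prefix /opt/cuda-10.0 are hypothetical and only serve as an illustration.

```shell
#!/bin/sh
# Hypothetical helper: print export lines for a given CUDA install prefix.
#   $1 = install prefix (defaults to the standard /usr/local/cuda-10.0)
cuda_env() {
    prefix=${1:-/usr/local/cuda-10.0}
    # The ${VAR:+:${VAR}} parts are printed literally and expanded later,
    # when the generated lines are evaluated by the shell.
    printf 'export PATH=%s/bin${PATH:+:${PATH}}\n' "$prefix"
    printf 'export LD_LIBRARY_PATH=%s/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}\n' "$prefix"
}

# Show the lines for an example custom prefix (assumed path).
cuda_env /opt/cuda-10.0
```

The output can be applied to the current shell with `eval "$(cuda_env /opt/cuda-10.0)"`, or appended to a shell startup file.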

Because of the addition of new features specific to the NVIDIA POWER9 CUDA driver, there are some additional setup requirements in order for the driver to function properly. These additional steps are not handled by the installation of CUDA packages, and failure to ensure these extra requirements are met will result in a non-functional CUDA driver installation.

There are two changes that need to be made manually after installing the NVIDIA CUDA driver to ensure proper operation:
The NVIDIA Persistence Daemon should be automatically started for POWER9 installations. Check that it is running with the following command:

systemctl status nvidia-persistenced

If it is not active, run the following command:

sudo systemctl enable nvidia-persistenced
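The check and the fix can be combined into one step. Note that `systemctl enable` on its own only arranges for the service to start at boot; the sketch below adds `--now`, which is an addition to the quoted instructions, so that the daemon is also started immediately. The function name `ensure_persistenced` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: start and enable the persistence daemon if it is not running.
ensure_persistenced() {
    if systemctl is-active --quiet nvidia-persistenced; then
        echo "nvidia-persistenced is already running"
    else
        echo "enabling and starting nvidia-persistenced"
        # --now starts the unit in addition to enabling it at boot.
        sudo systemctl enable --now nvidia-persistenced
    fi
}
```

Calling `ensure_persistenced` after the driver installation leaves the service untouched when it is already active.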

Disable a udev rule installed by default in some Linux distributions that causes hot-pluggable memory to be automatically onlined when it is physically probed. This behavior prevents NVIDIA software from bringing NVIDIA device memory online with non-default settings. This udev rule must be disabled in order for the NVIDIA CUDA driver to function properly on POWER9 systems.

On Red Hat Enterprise Linux 7, this rule can be found in:

/lib/udev/rules.d/40-redhat.rules

On Ubuntu 17.04, this rule can be found in:

/lib/udev/rules.d/40-vm-hotadd.rules

The rule generally takes a form where it detects the addition of a memory block and changes the ‘state’ attribute to online. For example, in RHEL7, the rule looks like this:

SUBSYSTEM=="memory", ACTION=="add", PROGRAM="/bin/uname -p", RESULT!="s390*", ATTR{state}=="offline", ATTR{state}="online"

This rule must be disabled by copying the file to /etc/udev/rules.d and commenting out, removing, or changing the hot-pluggable memory rule in the /etc copy so that it does not apply to POWER9 NVIDIA systems. For example, on RHEL:

sudo cp /lib/udev/rules.d/40-redhat.rules /etc/udev/rules.d
sudo sed -i '/SUBSYSTEM=="memory", ACTION=="add"/d' /etc/udev/rules.d/40-redhat.rules

You will need to reboot the system for the above changes to take effect.
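Before touching the real rules file, the sed expression can be tried on a throwaway copy. The sketch below writes a small sample file (its contents are made up for illustration, apart from the memory rule quoted above) and shows that only the hot-pluggable memory rule is removed.

```shell
#!/bin/sh
# Create a temporary sample rules file; the cpu rule is a hypothetical neighbour.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
# sample rules file for testing the sed expression
SUBSYSTEM=="cpu", ACTION=="add", ATTR{online}=="0", ATTR{online}="1"
SUBSYSTEM=="memory", ACTION=="add", PROGRAM="/bin/uname -p", RESULT!="s390*", ATTR{state}=="offline", ATTR{state}="online"
EOF

# Same expression as in the article, but without -i so the file is not modified.
filtered=$(sed '/SUBSYSTEM=="memory", ACTION=="add"/d' "$tmp")

printf '%s\n' "$filtered"
rm -f "$tmp"
```

The printed result should still contain the comment and the cpu rule, with the memory rule gone.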

Uninstall

Use the following command to uninstall a Toolkit runfile installation:

sudo /usr/local/cuda-X.Y/bin/uninstall_cuda_X.Y.pl

Use the following command to uninstall a Driver runfile installation:

sudo /usr/bin/nvidia-uninstall

Use the following commands to uninstall an RPM/Deb installation:

sudo yum remove <package_name>                      # Redhat/CentOS
sudo dnf remove <package_name>                      # Fedora
sudo zypper remove <package_name>                   # OpenSUSE/SLES
sudo apt-get --purge remove <package_name>          # Ubuntu
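For a runfile installation, the X.Y in the uninstall script path stands for the toolkit version. A small sketch, assuming version 10.0 as used in the environment section, builds the path from the version and checks whether the script is actually present before it is run. The function name `cuda_uninstaller` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: build the runfile uninstall script path for a toolkit version.
#   $1 = toolkit version in X.Y form, e.g. 10.0
cuda_uninstaller() {
    printf '/usr/local/cuda-%s/bin/uninstall_cuda_%s.pl\n' "$1" "$1"
}

script=$(cuda_uninstaller 10.0)
if [ -x "$script" ]; then
    echo "found: $script (run it with sudo to uninstall)"
else
    echo "not found: $script"
fi
```

If the script is not found, the toolkit was probably installed from packages, and the package manager commands above apply instead.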

NVIDIA-Docker

Make sure you have installed the NVIDIA driver and a supported version of Docker for your distribution (see prerequisites).

Docker install

wget -qO- https://get.docker.com/ | sudo sh

If you have a custom /etc/docker/daemon.json, the nvidia-docker2 package might override it.
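Given that risk, it may be worth keeping a copy of the file before installing the package. The sketch below uses a hypothetical helper named `backup_file` and demonstrates it on a temporary file; for the real /etc/docker/daemon.json it would have to be run as root.

```shell
#!/bin/sh
# Hypothetical helper: copy a file to "<name>.bak" if it exists.
#   $1 = path of the file to back up
backup_file() {
    if [ -f "$1" ]; then
        # -p keeps the original mode and timestamps on the copy.
        cp -p "$1" "$1.bak" && echo "backed up $1 to $1.bak"
    else
        echo "nothing to back up: $1 does not exist"
    fi
}

# Demonstration on a throwaway file rather than the real configuration.
demo=$(mktemp)
echo '{}' > "$demo"
backup_file "$demo"
rm -f "$demo" "$demo.bak"
```

After installing nvidia-docker2, any custom settings can be merged back from the .bak copy by hand.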

Ubuntu 14.04/16.04/18.04, Debian Jessie/Stretch

# If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker

# Add the package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

# Test nvidia-smi with the latest official CUDA image
docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
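The `distribution=$(. /etc/os-release;echo $ID$VERSION_ID)` line above sources the os-release file in a subshell and concatenates two of its variables, which gives a string such as ubuntu18.04 for the repository URL. A sketch of the same idea as a function that accepts an alternative file, so it can be tried without depending on the host system (the function name `distribution_id` is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: derive the distribution string from an os-release file.
#   $1 = path to the file (defaults to /etc/os-release)
distribution_id() {
    # The parentheses run this in a subshell, so ID and VERSION_ID
    # do not leak into the calling shell.
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

if [ -r /etc/os-release ]; then
    echo "detected distribution: $(distribution_id)"
else
    echo "no /etc/os-release on this system"
fi
```

The detected value can be compared against the supported distributions listed above before adding the repository.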

References

NVIDIA CUDA Installation Guide for Linux
