CUDA and cuDNN install
Version
Ubuntu 16.04 x86_64
TF: 1.15.0-gpu
CUDA: 10.0
cuDNN: 7.4.2
TF 1.15 + CUDA 10.2 + cuDNN 7.6 may also work.
0. NVIDIA driver install (optional)
Press Ctrl+Alt+F1 to switch to a text console.
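* Disable nouveau
Add these two lines to /etc/modprobe.d/blacklist-nouveau.conf:
blacklist nouveau
options nouveau modeset=0
Then rebuild the initramfs and reboot:
$ sudo update-initramfs -u
$ sudo reboot
$ lsmod | grep nouveau    # no output means nouveau is disabled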
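The two nouveau settings are configuration lines rather than commands, so they have to end up in a modprobe configuration file. Below is a minimal sketch of that step, assuming the conventional /etc/modprobe.d/blacklist-nouveau.conf location. The function names write_nouveau_blacklist and nouveau_loaded are hypothetical helpers made up for illustration.

```shell
# Hypothetical helper: write the nouveau blacklist to the given file.
# The default path is the conventional modprobe.d location (needs root).
write_nouveau_blacklist() {
    target="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$target"
}

# Hypothetical helper: exit 0 if nouveau appears in a module list.
# Reads /proc/modules by default, which is the data lsmod formats.
nouveau_loaded() {
    modules_file="${1:-/proc/modules}"
    grep -q '^nouveau ' "$modules_file"
}

# Usage as root (update-initramfs -u and a reboot are still needed afterwards):
#   write_nouveau_blacklist
#   nouveau_loaded && echo "nouveau is still loaded" || echo "nouveau is not loaded"
```

Nothing runs when the block is sourced; the functions only act when called, so it is safe to paste into a root shell first and call them afterwards.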
* Shut down the X server.
$ su root                         # install for all users
$ /etc/init.d/lightdm stop        # stop the display manager
$ sudo /etc/init.d/lightdm status # only to check
- To restart it later: sudo /etc/init.d/lightdm restart
* Uninstall any existing NVIDIA driver and CUDA.
$ sudo apt-get --purge remove "nvidia*"
$ sudo apt autoremove
To remove the CUDA Toolkit:
$ sudo apt-get --purge remove "*cublas*" "cuda*"
To remove the NVIDIA drivers:
$ sudo apt-get --purge remove "*nvidia*"
$ /etc/init.d/lightdm status    (make sure it's started)
$ reboot
$ su root
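To confirm the purge actually removed everything, the installed-package list can be filtered for GPU-related names. This is only a sketch: leftover_gpu_packages is a hypothetical helper, and the name pattern (nvidia, cuda, cublas) is an assumption based on the packages purged above.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# packages that are still installed (status "ii") and look GPU-related.
leftover_gpu_packages() {
    awk '$1 == "ii" && $2 ~ /nvidia|cuda|cublas/ { print $2 }'
}

# Usage: an empty result suggests the purge removed everything.
#   dpkg -l | leftover_gpu_packages
```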
1 Download and install CUDA 10.0
1.0 Install dependencies
root@$ apt-get install freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev
Only needed if you hit kernel module problems:
root@$ apt install dkms
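1.1 Download and run the installer
root@$ wget https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda_10.0.130_410.48_linux
root@$ sh cuda_10.0.130_410.48_linux --no-opengl-files
# Installing the OpenGL files tends to cause more problems, so skip them.
# To install them anyway, run the installer without the flag instead:
root@$ sh cuda_10.0.130_410.48_linux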
1.2 If the install succeeds, you get a summary like this:
Summary
Driver: Installed
Toolkit: Installed
Samples: Installed in /root, but missing recommended libraries
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.
Logfile is /tmp/cuda_install_31840.log
Please make sure that
- PATH includes /usr/local/cuda-10.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root.
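2 Install cuDNN
The cuDNN download requires an NVIDIA Developer login, so wget may fail; if it does, download the archive in a browser instead.
root@$ wget https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.4.2/prod/10.0_20181213/cudnn-10.0-linux-x64-v7.4.2.24.tgz
root@$ tar -xzvf cudnn-10.0-linux-x64-v7.4.2.24.tgz
root@$ cp cuda/include/cudnn.h /usr/local/cuda/include
root@$ cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
root@$ chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*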
3. Install tensorflow-gpu
root@$ conda install tensorflow-gpu=1.15
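Once the package is installed, it is worth checking that TensorFlow can actually see the GPU. The sketch below assumes a TF 1.x install, where tf.test.is_gpu_available() is the available call. tf_gpu_check is a hypothetical wrapper name, and the interpreter defaults to whatever "python" resolves to in the active conda environment.

```shell
# Hypothetical helper: ask TensorFlow 1.x whether it can see a GPU.
# $1 is the Python interpreter to use (defaults to "python").
tf_gpu_check() {
    py="${1:-python}"
    if ! command -v "$py" >/dev/null 2>&1; then
        echo "interpreter not found: $py" >&2
        return 2
    fi
    # Prints the TensorFlow version, then True or False for GPU visibility.
    "$py" -c 'import tensorflow as tf; print(tf.__version__); print(tf.test.is_gpu_available())'
}

# Usage:
#   tf_gpu_check            # uses "python" from the current environment
#   tf_gpu_check python3    # or name an interpreter explicitly
```

If the last line printed is False, the usual suspects are a CUDA or cuDNN version mismatch or a missing LD_LIBRARY_PATH entry.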
4. Check nvidia-smi, CUDA, cuDNN
- Driver
root@$ nvidia-smi
If this gives an error:
$ apt install nvidia-cuda-toolkit
$ nvidia-smi
- CUDA version
$ echo "CUDA Version"
$ cat /usr/local/cuda/version.txt
$ cat /proc/driver/nvidia/version
- cuDNN version
$ echo "cuDNN Version"
$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
- NVCC
$ echo "NVCC INFO"
$ which nvcc
$ nvcc --version
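The grep on cudnn.h prints three separate #define lines. As a small convenience, they can be folded into one dotted version string. This is a sketch, assuming the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as cuDNN 7.x does; cudnn_version is a hypothetical helper name.

```shell
# Hypothetical helper: print the cuDNN version as MAJOR.MINOR.PATCH.
# $1 is the header to read (defaults to the path used in this guide).
cudnn_version() {
    header="${1:-/usr/local/cuda/include/cudnn.h}"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        # Fail if the header did not define a major version at all.
        END { if (major == "") exit 1; print major "." minor "." patch }
    ' "$header"
}

# Usage:
#   cudnn_version
#   cudnn_version /path/to/other/cudnn.h
```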
- If which nvcc prints nothing, add these lines to ~/.bashrc:
$ vi ~/.bashrc
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH
# Fallback: if there is no lib64 directory, look for cuda/lib instead
export LD_LIBRARY_PATH=/usr/local/cuda/lib:$LD_LIBRARY_PATH
export PATH=$PATH:/usr/local/cuda/bin
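Because ~/.bashrc is sourced by every new shell, plain export lines can keep appending the same directory. A minimal sketch of a duplicate-safe variant follows. prepend_once is a hypothetical helper, and CUDA_HOME is an assumed variable name that defaults to the /usr/local/cuda-10.0 prefix used in this guide.

```shell
# Hypothetical helper: print a colon-separated path list with DIR in front,
# unless DIR is already one of the entries.
# $1 = directory to add, $2 = current value of the variable (may be empty)
prepend_once() {
    dir="$1"
    current="$2"
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;
        *) printf '%s\n' "$dir${current:+:$current}" ;;
    esac
}

# Assumed install prefix; adjust if CUDA lives elsewhere.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda-10.0}"
PATH="$(prepend_once "$CUDA_HOME/bin" "$PATH")"
LD_LIBRARY_PATH="$(prepend_once "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")"
export CUDA_HOME PATH LD_LIBRARY_PATH
```

Sourcing this repeatedly leaves PATH and LD_LIBRARY_PATH unchanged after the first time, which keeps nested shells from growing the variables.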
5. Test the samples (the path may differ)
- Check CUDA.
root@$ cd /usr/local/cuda-10.0/samples/1_Utilities/deviceQuery/
root@$ make
root@$ ./deviceQuery
- If it prints the CUDA device information and ends with "Result = PASS", the check passed.
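The deviceQuery sample ends its output with a "Result = PASS" line when the device is usable, so the check can be scripted instead of read by eye. This is a small sketch; device_query_passed is a hypothetical helper name.

```shell
# Hypothetical helper: read deviceQuery output on stdin and
# exit 0 only if it contains the final "Result = PASS" line.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Usage (from the deviceQuery sample directory, after make):
#   ./deviceQuery | device_query_passed && echo "CUDA OK" || echo "CUDA check failed"
```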