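Check the system kernel version (for example with uname -r):
3.0.0-12 is the kernel version number
The official Nvidia driver conflicts with the nouveau driver that ships with the system, so nouveau has to be disabled first.
Installing the graphics driver
- Disable Ubuntu's bundled nouveau driver by adding these two lines to /etc/modprobe.d/blacklist.conf:
blacklist nouveau
options nouveau modeset=0
- Add the PPA sources: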
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo add-apt-repository ppa:xorg-edgers/ppa
- Refresh the package lists and rebuild the initramfs
sudo apt-get update
sudo update-initramfs -u
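- Reboot
Alternatively, open Software & Updates in System Settings, pick a driver on the Additional Drivers tab, and reboot to finish.
Either way, do not skip the reboot:
reboot
After the reboot the fonts look larger, which indicates that nouveau is no longer in use.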
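Larger fonts are only an indirect sign. A more direct check is to look for nouveau in the list of loaded kernel modules. The sketch below wraps that check in a helper; the name nouveau_loaded is made up for illustration, and it reads lsmod-style text from standard input so it can also be tried on sample text.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any Nvidia tooling): succeeds when the
# lsmod-style listing on stdin contains a module named exactly "nouveau".
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Typical use after the reboot:
#   if lsmod | nouveau_loaded; then echo "nouveau is still loaded"; fi
```

If the helper reports that nouveau is still loaded, the blacklist lines or the update-initramfs step most likely did not take effect.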
Press Ctrl+Alt+F1 to switch to a text console
- Stop the graphical session
sudo service lightdm stop
- Remove any existing Nvidia driver packages
sudo apt-get remove 'nvidia-*'
- Make the driver installer executable
sudo chmod a+x NVIDIA-Linux-x86_64-450.80.02.run
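- Install the driver (skipping the OpenGL files is what avoids the login loop)
sudo ./NVIDIA-Linux-x86_64-450.80.02.run --no-x-check --no-nouveau-check --no-opengl-files
--no-x-check: do not check whether an X server is running
--no-nouveau-check: do not check whether nouveau is loaded
--no-opengl-files: install only the driver files, not the OpenGL files
- Check that the installation succeeded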
nvidia-smi
nvidia-settings
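For a one-line summary instead of the full nvidia-smi table, the query options of nvidia-smi can be used. This is a minimal sketch; the function name driver_summary is hypothetical, and it only assumes that nvidia-smi is on PATH once the driver is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "GPU name, driver version" for each GPU,
# or a short message when nvidia-smi is not available.
driver_summary() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found on PATH"
        return 1
    fi
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}
```

Calling driver_summary right after the installation should list the GPU together with driver version 450.80.02 if everything went as described above.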
Installing CUDA
First make cuda_10.0.130_410.48_linux.run executable
sudo chmod a+x ./cuda_10.0.130_410.48_linux.run
sudo ./cuda_10.0.130_410.48_linux.run
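Answer the installer prompts as follows (the driver is already installed, so decline the bundled 410.48 driver):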
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: no
Install the CUDA 10.0 Toolkit?
(y)es/(n)o/(q)uit: yes
Enter Toolkit Location
[ default is /usr/local/cuda-10.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: no
Install the CUDA 10.0 Samples?
(y)es/(n)o/(q)uit: no
After the installation, add the environment variables
Open .bashrc in your home directory, append the three lines below at the end, then save and exit
cd ~
vim .bashrc
Add the following
# added by cuda 10.0 installer
export PATH="/usr/local/cuda-10.0/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH"
Reload the file so the new variables take effect
source ~/.bashrc
Check the result
nvcc --version
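Editing .bashrc by hand makes it easy to add the same lines twice. As a sketch of an alternative, the helper below appends the three lines only when the marker comment is not already present. The function name add_cuda_paths is invented for this example, and the paths assume the default /usr/local/cuda-10.0 location chosen above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append the CUDA 10.0 paths to a shell startup file
# (default ~/.bashrc), skipping the write when the marker is already there.
add_cuda_paths() {
    local rc_file="${1:-$HOME/.bashrc}"
    local marker="# added by cuda 10.0 installer"
    if grep -qF -- "$marker" "$rc_file" 2>/dev/null; then
        return 0
    fi
    cat >> "$rc_file" <<'EOF'
# added by cuda 10.0 installer
export PATH="/usr/local/cuda-10.0/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH"
EOF
}
```

Running add_cuda_paths and then source ~/.bashrc has the same effect as the manual edit, and running it a second time changes nothing.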
Installing cuDNN
# Set permissions on the archive
sudo chmod a+x cudnn-10.0-linux-x64-v7.4.2.24.tgz
# Extract the archive
tar -xzvf cudnn-10.0-linux-x64-v7.4.2.24.tgz
# Create the target directories (-p skips ones that already exist)
sudo mkdir -p /usr/local/cuda/lib64 /usr/local/cuda/include
# Copy the extracted files into place (-P keeps the libcudnn symlinks as symlinks)
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
# Make the files readable by all users
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
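# Verify the installation
grep -A 2 CUDNN_MAJOR /usr/local/cuda/include/cudnn.h
Output like the following means the installation succeeded
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 4
#define CUDNN_PATCHLEVEL 2
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
When cuDNN comes as .deb packages instead (the file names below are cuDNN 8.0.5 built for CUDA 11.1, not the CUDA 10.0 tarball above), install the runtime, development, and sample packages, then build and run the MNIST sample
sudo dpkg -i libcudnn8_8.0.5.39-1+cuda11.1_amd64.deb
sudo dpkg -i libcudnn8-dev_8.0.5.39-1+cuda11.1_amd64.deb
sudo dpkg -i libcudnn8-samples_8.0.5.39-1+cuda11.1_amd64.deb
cp -r /usr/src/cudnn_samples_v8/ ./
cd cudnn_samples_v8/mnistCUDNN
make clean && make
./mnistCUDNN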
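Reading three #define lines by eye works, but a script may want the version as a single string. The following sketch extracts it with awk. The function name cudnn_version is hypothetical. Note that cuDNN 7 keeps these macros in cudnn.h, while cuDNN 8 moved them to cudnn_version.h, so pass whichever header matches the installed release.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print MAJOR.MINOR.PATCH read from a cuDNN header.
# Fails (non-zero exit) when any of the three macros is missing.
cudnn_version() {
    local header="$1"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END {
            if (major == "" || minor == "" || patch == "") exit 1
            print major "." minor "." patch
        }
    ' "$header"
}
```

For the installation above, cudnn_version /usr/local/cuda/include/cudnn.h would be expected to print the same 7.4.2 that the grep output shows.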
Installing PyTorch 1.9.0 (the +cu111 wheels are built for CUDA 11.1)
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
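Installing PyCharm
Copy the PyCharm archive to the folder where it should live
sudo cp ./pycharm-community-2020.2.1.tar.gz /home/sprite/software/
Extract it from the command line inside that folder
tar -xzvf pycharm-community-2020.2.1.tar.gz
or right-click the archive in the file manager and choose Extract
Change into the bin directory under the PyCharm installation directory
and run the following command to start PyCharm
sh ./pycharm.sh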
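Switching Ubuntu 16.04 to a mirror in China
1. Back up the original sources.list
Open a terminal and run: sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak
2. Edit sources.list
(1) Run sudo gedit /etc/apt/sources.list to open the file; no chmod is needed because the editor runs under sudo;
(2) Delete the existing contents, paste in one of the lists below, and save (the Aliyun and Tsinghua mirrors are the common choices; Aliyun is recommended);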
Aliyun mirror:
deb http://mirrors.aliyun.com/ubuntu/ xenial main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-backports main restricted universe multiverse
Tsinghua mirror (the codename must match the installed release: xenial is Ubuntu 16.04, bionic would be 18.04):
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-updates main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-backports main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-security main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-security main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-proposed main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-proposed main restricted universe multiverse
3. Update the package lists
In a terminal, run sudo apt update to refresh the package lists; the mirror switch is then complete.
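Since every entry in these lists follows the same pattern, the file can also be generated rather than pasted, which keeps the codename consistent with the installed release. The sketch below is one way to do that. The function name make_sources is made up, and it prints to standard output rather than touching /etc/apt/sources.list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print deb and deb-src entries for every pocket of one
# release. Arguments: mirror base URL, release codename (e.g. xenial).
make_sources() {
    local mirror="${1%/}" codename="$2" pocket suite
    for pocket in "" -security -updates -proposed -backports; do
        suite="${codename}${pocket}"
        echo "deb ${mirror}/ ${suite} main restricted universe multiverse"
        echo "deb-src ${mirror}/ ${suite} main restricted universe multiverse"
    done
}

# Example (writes a scratch file, not /etc/apt/sources.list):
#   make_sources http://mirrors.aliyun.com/ubuntu "$(lsb_release -cs)" > sources.list.new
```

The entries come out in a different order than in the lists above, which should not matter to apt. Review the scratch file before copying it over the real one with sudo.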
Changing the Anaconda channels
1. From the command line
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/msys2/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --set show_channel_urls yes
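2. By editing the configuration file
In your home directory, open .condarc with vim and delete the "- defaults" entry so that only the mirror channels are used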
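The same edit can be scripted. As a rough sketch, the helper below deletes the defaults entry from a .condarc file with GNU sed. The function name drop_defaults_channel is invented here, and it assumes the entry sits on its own line in the usual "- defaults" list form. Running conda config --remove channels defaults should have the same effect.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete the "- defaults" channel entry from a .condarc
# file in place (GNU sed), leaving every other line untouched.
drop_defaults_channel() {
    local condarc="${1:-$HOME/.condarc}"
    sed -i '/^[[:space:]]*-[[:space:]]*defaults[[:space:]]*$/d' "$condarc"
}
```

After running it, conda config --show channels should list only the mirror URLs added in step 1.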
Installing TensorFlow
- Create a Python 3.6 virtual environment named py36
conda create -n py36 python=3.6
- Once it is created, activate the environment
conda activate py36
- python --version now reports the newest 3.6.x release
- Install TensorFlow from the downloaded wheel
pip install tensorflow_gpu-1.13.1-cp36-cp36m-manylinux1_x86_64.whl
Importing tensorflow in PyCharm may fail with the following error
libcudnn.so.7: cannot open shared object file: No such file or directory
or with this one
libcudnn.so.10.0: cannot open shared object file: No such file or directory
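If the CUDA and TensorFlow versions are known to match, the usual cause is a missing symbolic link for the cuDNN library
Run the following command to check whether that is the case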
sudo ldconfig /usr/local/cuda-10.0/lib64
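If ldconfig reports the following, the link is the problem
/sbin/ldconfig.real: /usr/local/cuda-10.0/lib64/libcudnn.so.7 is not a symbolic link
Fix: create the symbolic link. 7.4.2 is the cuDNN version; adjust it to yours, which you can find by listing /usr/local/cuda/lib64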
sudo ln -sf /usr/local/cuda-10.0/lib64/libcudnn.so.7.4.2 /usr/local/cuda-10.0/lib64/libcudnn.so.7
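Pay attention to where CUDA is installed on your machine. Tutorials differ, so the directory may be /usr/local/cuda or /usr/local/cuda-10.0. If it is cuda-10.0, create the link under /usr/local/cuda-10.0/lib64 and first check that the library files are there
ls -l /usr/local/cuda-10.0/lib64/libcudnn*  # list the files in that directory whose names start with libcudnn
If there are none, copy them over from /usr/local/cuda/lib64/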
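The link fix can be wrapped so that the version and the directory are given once and checked before anything is changed. This is only a sketch: link_cudnn is a hypothetical name, and the function needs write access to the library directory, so on a system path it would have to run in a root shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make <dir>/libcudnn.so.<major> a symlink to the full
# versioned library, e.g. libcudnn.so.7 -> libcudnn.so.7.4.2.
# Refuses to act when the versioned library is not in <dir>.
link_cudnn() {
    local lib_dir="$1" version="$2"
    local major="${version%%.*}"
    local target="libcudnn.so.${version}"
    if [ ! -f "${lib_dir}/${target}" ]; then
        echo "missing ${lib_dir}/${target}" >&2
        return 1
    fi
    # -f replaces a regular-file copy that ldconfig complains about
    ln -sf "$target" "${lib_dir}/libcudnn.so.${major}"
}
```

For the setup described here the call would be link_cudnn /usr/local/cuda-10.0/lib64 7.4.2, followed by sudo ldconfig /usr/local/cuda-10.0/lib64 to confirm the complaint is gone.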
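If tensorflow then imports and produces output but prints the following warning:
FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
then NumPy is too new for this TensorFlow release. NumPy started emitting this warning in 1.17, so installing an older release such as 1.16.x removes it.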
Driver missing after a reboot
The error message is:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.
Run the following commands, then reboot
sudo apt install dkms
sudo dkms install -m nvidia -v 410.78
410.78 is the driver version; use the one installed on your machine, which the following command shows
ls /usr/src | grep nvidia
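To avoid retyping the version by hand, the directory name can be turned directly into the value that dkms expects. A minimal sketch, assuming the driver sources live in a directory named nvidia-<version> as in the listing above; the function name nvidia_src_versions is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the version suffix of each nvidia-<version>
# directory under a source root (default /usr/src), one per line.
nvidia_src_versions() {
    local src_root="${1:-/usr/src}"
    ls "$src_root" | sed -n 's/^nvidia-\([0-9][0-9.]*\)$/\1/p'
}

# Example:
#   sudo dkms install -m nvidia -v "$(nvidia_src_versions | tail -n 1)"
```

If more than one version is listed, pick the one that matches the driver you installed rather than relying on the last line.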