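Linux Deep Learning Environment Setup Tutorial: Ubuntu 16.04 + CUDA 10.0 + PyTorch + TensorFlow

Overview

After configuring a Linux deep learning environment for the Nth time, I decided to write the process down so that I would not have to search for every step again. This tutorial covers the following:

  • Installing Ubuntu 16.04
  • Installing the NVIDIA driver
  • Configuring CUDA 10.0
  • Configuring cuDNN 7.4
  • Installing and testing PyTorch (GPU)
  • Installing and testing TensorFlow (GPU)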

Installing Ubuntu 16.04

Note: this part draws on a blog post by -牧野-, available at the address below.
https://blog.csdn.net/dcrmg/article/details/79600421?tdsourcetag=s_pctim_aiomsg

Creating an Ubuntu 16.04 bootable USB drive

1. Prepare the software

Ubuntu 16.04 LTS: https://www.ubuntu.com/download/desktop
UltraISO: http://cn.ultraiso.net/xiazai.html

2. Create the bootable USB drive

  1. Open UltraISO, choose File -> Open, and select the downloaded Ubuntu ISO image.

  2. Choose Bootable -> Write Disk Image.

  3. Write the image using the default settings.

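Changing the BIOS settings to boot from USB

The key for entering the BIOS setup differs between motherboards; try F2, F12, or Esc. My machine is a Dell desktop, which uses F2. The steps are as follows:

  1. Insert the prepared USB drive and press the key to enter the BIOS. In the list on the left, select "Secure Boot", expand it, and select "Secure Boot Enabled". Change the option on the right to "Disable", then click "Apply" at the bottom to save.

  2. In the list on the left, find "Advanced Boot Options", tick "Enable Legacy Option ROMs" on the right, and again click "Apply" to save.

  3. In the list on the left, find "Boot Sequence" and change "Boot List Option" on the right to "Legacy". In the upper right, move the entry for the USB drive to the top (select it and click the up arrow), then click "Apply" to save.

  4. Exit the BIOS and reboot the computer.
    This completes the BIOS setup.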

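Installing the Ubuntu system

  1. When the Ubuntu installer appears, choose a language as needed and select "Install Ubuntu".
  2. Click "New Partition Table", then click "+" to create the four basic partitions (if there is no free space, click "-" to delete an existing partition first). Use the following parameters:

Size        Type      Location                  Format                        Mount point
200G        Primary   Beginning of this space   Ext4 journaling file system   /
4G          Logical   Beginning of this space   swap area                     (none, used as swap)
200M        Logical   Beginning of this space   Ext4 journaling file system   /boot
Remaining   Logical   Beginning of this space   Ext4 journaling file system   /home

Notes:

  • /home does not need its own partition; in that case, give all the remaining space to the root partition (/). If the machine has several disks, one can hold the system and the others can be formatted as NTFS.
  • If the disk has space to spare, the sizes above can be increased proportionally.
  3. For "Device for boot loader installation", choose the device with the same number as the /boot partition. Then start the installation.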
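The partition table above fixes three sizes and gives /home whatever is left. As a rough sketch of that arithmetic, the snippet below computes the /home size for a hypothetical 512 GiB disk. The function name `plan_partitions` and the use of binary units (MiB) are my own assumptions for illustration; the installer itself asks for sizes in MB.

```python
# Sizes in MiB; the values mirror the partition table (binary units assumed).
FIXED = [("/", 200 * 1024), ("swap", 4 * 1024), ("/boot", 200)]


def plan_partitions(disk_mib):
    """Return (name, size_mib) pairs; /home receives what the fixed partitions leave."""
    used = sum(size for _, size in FIXED)
    if disk_mib <= used:
        raise ValueError("disk too small for the fixed partitions")
    return FIXED + [("/home", disk_mib - used)]


if __name__ == "__main__":
    # Hypothetical 512 GiB disk, used only as an example.
    for name, size in plan_partitions(512 * 1024):
        print("{:<6} {:>8} MiB".format(name, size))
```

If the result for /home looks too small, the notes above apply: either skip /home and enlarge the root partition, or scale the fixed sizes.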

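Initial Ubuntu setup (personal preference)

Changing the apt sources

1. Back up the original sources file

Open a terminal from the desktop and run

sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak

2. Edit the sources file

  1. Open the file for editing. The editor is started with sudo, so the file's permissions do not need to be changed first.
sudo gedit /etc/apt/sources.list
  2. Delete the original contents and paste in one of the lists below. The Aliyun mirror is recommended.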

Aliyun mirror

deb http://mirrors.aliyun.com/ubuntu/ xenial main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-proposed main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ xenial-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ xenial-backports main restricted universe multiverse

Tsinghua mirror

deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-updates main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-backports main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-security main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-security main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-proposed main restricted universe multiverse
deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-proposed main restricted universe multiverse

Both lists must use the release codename xenial, which corresponds to Ubuntu 16.04.

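3. Update the package lists

Run the following command in the terminal to refresh the package lists. The source change is then complete.

sudo apt update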
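Every entry in a mirror list follows one pattern: an archive type (deb or deb-src), the mirror URL, the release codename with an optional suffix, and the component names. The sketch below generates such a list for any mirror, which makes it harder to paste entries for the wrong release by accident. The function name `sources_list` and its parameters are hypothetical helpers of mine, not part of apt.

```python
# Suffixes appended to the release codename, matching the Aliyun list.
SUITE_SUFFIXES = ["", "-security", "-updates", "-proposed", "-backports"]
COMPONENTS = "main restricted universe multiverse"


def sources_list(mirror, codename="xenial", with_src=True):
    """Build sources.list lines for one mirror and one Ubuntu release."""
    kinds = ["deb", "deb-src"] if with_src else ["deb"]
    lines = []
    for kind in kinds:
        for suffix in SUITE_SUFFIXES:
            lines.append("{} {} {}{} {}".format(
                kind, mirror, codename, suffix, COMPONENTS))
    return lines


if __name__ == "__main__":
    # Print the list for review; it is not written to /etc/apt automatically.
    print("\n".join(sources_list("http://mirrors.aliyun.com/ubuntu/")))
```

Printing the result and pasting it by hand keeps the backup step meaningful; the sketch deliberately never writes to /etc/apt/sources.list.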

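Changing the default python

On Ubuntu 16.04, typing python in a terminal starts Python 2.7, and only python3 starts Python 3.5. I prefer to run programs by typing python, so python and pip need to default to python3 and pip3.

1. Install pip3

Run the following command in the terminal

sudo apt-get install python3-pip

then run

pip3 -V

If the pip3 version is displayed, the installation succeeded.

2. Make python default to python3

Run the following commands in the terminal. python3 is registered with the higher priority (150 versus 100), so it becomes the default.

sudo update-alternatives --install /usr/bin/python python /usr/bin/python2 100
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 150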
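The last number in each update-alternatives command is a priority. In automatic mode the alternative with the highest priority is the one that the /usr/bin/python link points to. The snippet below is only a toy model of that selection rule; `pick_default` is a hypothetical name, and the real tool tracks more state than this.

```python
def pick_default(alternatives):
    """Return the path with the highest priority, as automatic mode would."""
    return max(alternatives, key=lambda item: item[1])[0]


# The two alternatives registered by the commands above.
registered = [("/usr/bin/python2", 100), ("/usr/bin/python3", 150)]

if __name__ == "__main__":
    print(pick_default(registered))
```

To switch back to Python 2 later, `sudo update-alternatives --config python` lets you choose an entry by hand.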

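3. Make pip default to pip3

Run the following command in the terminal. ln -s takes the existing file first and the name of the new link second, so the new link named pip points at the installed pip3.

sudo ln -s "$(which pip3)" /usr/local/bin/pip

then run

pip -V

If it displays the same version as pip3 and mentions python 3, the change succeeded.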
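To confirm both changes at once, a short script can report which executables the names python and pip resolve to, and which major version python reports. This is a minimal sketch under my own assumptions; the helper names `major_version` and `default_python_major` are hypothetical.

```python
import re
import shutil
import subprocess


def major_version(version_text):
    """Extract the major version from text such as 'Python 3.5.2'."""
    match = re.search(r"Python (\d+)\.", version_text)
    return int(match.group(1)) if match else None


def default_python_major(command="python"):
    """Return the major version of `command`, or None if it is not on PATH."""
    path = shutil.which(command)
    if path is None:
        return None
    # Python 2 prints its version to stderr and Python 3 to stdout, so merge them.
    result = subprocess.run([path, "--version"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return major_version(result.stdout)


if __name__ == "__main__":
    print("python ->", shutil.which("python"),
          "| major version:", default_python_major())
    print("pip    ->", shutil.which("pip"))
```

Run it with python3 so the check does not depend on the very link it is testing.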

4. Change the pip source

Run the following commands in the terminal

pip install pqi
pqi use aliyun

pip now uses the Aliyun mirror.

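Installing the NVIDIA driver

This chapter draws on the blog post below.
Address: https://blog.csdn.net/elegantoo/article/details/103886407

Finding the driver version that suits your GPU

It is important to check which driver versions NVIDIA lists for your GPU. Two ways to look this up are given here.

1. Official website

Address: https://www.geforce.cn/drivers

Enter your GPU model and system details, then click "开始搜索" (Start Search). The driver versions that match your hardware are listed underneath.

2. Ubuntu's built-in tool

Click the search button at the top left of the Ubuntu launcher, search for "软件与更新" (Software & Updates), and open "附加驱动" (Additional Drivers).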

Installing the driver

Run the following commands in the terminal, in order

sudo add-apt-repository ppa:graphics-drivers/ppa

sudo apt-get update

sudo apt-get install nvidia-410  # my GPU is a GTX 1080

sudo apt-get install mesa-common-dev

sudo apt-get install freeglut3-dev

Note: CUDA 10.0 requires driver version 410.48 or later. The nvidia-410 package installed above provided version 410.78.

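Rebooting

After completing the steps above, reboot the computer and run the following command in the terminal

nvidia-smi

If it prints a table showing the driver version and your GPU, the driver was installed successfully.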
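The check against the 410.48 minimum can also be done in a script. The sketch below asks nvidia-smi for the driver version through its `--query-gpu` option and compares the result numerically. The helper names (`version_tuple`, `driver_ok`, `installed_driver`) are hypothetical, and the comparison assumes plain dotted version numbers such as 410.78.

```python
import shutil
import subprocess

MIN_DRIVER = "410.48"  # minimum driver version for CUDA 10.0, per the note above


def version_tuple(text):
    """Turn '410.78' into (410, 78) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_ok(installed, required=MIN_DRIVER):
    """True if the installed driver is at least the required version."""
    return version_tuple(installed) >= version_tuple(required)


def installed_driver():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        stdout=subprocess.PIPE, universal_newlines=True)
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    version = installed_driver()
    if version is None:
        print("nvidia-smi not found or no driver loaded")
    else:
        print("driver", version, "OK" if driver_ok(version) else "too old for CUDA 10.0")
```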

Installing CUDA 10.0

This chapter draws on the blog posts below.
Address 1: https://blog.csdn.net/u012328159/article/details/80959454?tdsourcetag=s_pctim_aiomsg
Address 2: https://blog.csdn.net/qq_30091945/article/details/89406457?tdsourcetag=s_pctim_aiomsg

1. Download CUDA 10.0

Go to the CUDA website and download the CUDA 10.0 runfile installer.

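2. Install CUDA 10.0

After the download finishes, open a terminal and use cd to enter the folder containing the file (usually Downloads), then run the following commands:

chmod +x cuda_10.0.130_410.48_linux.run

sudo sh ./cuda_10.0.130_410.48_linux.run

Hold the space bar until the license agreement has scrolled to 100%, then answer the prompts as follows:

Do you accept the previously read EULA?  Enter: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48? Enter: no

Install the CUDA 10.0 Toolkit? Enter: yes

Enter Toolkit Location [ default is /usr/local/cuda-10.0 ]: press Enter to accept the default

Do you want to install a symbolic link at /usr/local/cuda?  Enter: yes

Install the CUDA 10.0 Samples?  Enter: yes, then press Enter again to accept the default path

At the end of the installation a warning about an incomplete installation appears. It is shown because the NVIDIA driver was not installed by this installer, and it can be ignored, since the driver was installed earlier.

cuDNN is not part of this installer. Its files are copied into the CUDA directory in a later section.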

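3. Add environment variables

Open the .bashrc file to configure the environment variables. The file belongs to your own user, so sudo is not needed.

gedit ~/.bashrc

Add the following two lines to the file:

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Apply the new settings to the current shell. source is a shell builtin, so it is run without sudo.

source ~/.bashrc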
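The `${PATH:+:${PATH}}` part of the export lines is a shell parameter expansion: it produces a colon followed by the old value only when the variable is already set and non-empty, so no stray colon is left behind when the variable is empty. The Python function below mimics that behavior for illustration; `prepend_path` is a hypothetical name, not something the shell or CUDA provides.

```python
def prepend_path(new_dir, current):
    """Mimic NEW${VAR:+:${VAR}}: append ':current' only if current is non-empty."""
    if current:
        return new_dir + ":" + current
    return new_dir


if __name__ == "__main__":
    # PATH is normally set, so the old value follows the new directory.
    print(prepend_path("/usr/local/cuda-10.0/bin", "/usr/local/bin:/usr/bin"))
    # LD_LIBRARY_PATH is often unset, in which case no colon is added.
    print(prepend_path("/usr/local/cuda-10.0/lib64", ""))
```

This matters mostly for LD_LIBRARY_PATH, where an empty entry created by a trailing colon is treated as the current directory.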

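4. Test the CUDA installation

Open a terminal and run

cd /usr/local/cuda-10.0/samples/1_Utilities/deviceQuery

sudo make

sudo ./deviceQuery

If the output lists your GPU and ends with Result = PASS, CUDA is working.
Finally, check the installed CUDA toolkit version by running

nvcc -V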
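The release number can be pulled out of the `nvcc -V` output with a small script, which is handy when several CUDA versions live side by side. This is a sketch under assumptions: `cuda_release` is a hypothetical helper, and `SAMPLE_OUTPUT` is an illustrative line in the format nvcc prints, not output captured from this machine.

```python
import re
import shutil
import subprocess

# Illustrative line in the format nvcc prints; not captured output.
SAMPLE_OUTPUT = "Cuda compilation tools, release 10.0, V10.0.130"


def cuda_release(nvcc_output):
    """Return (major, minor) from `nvcc -V` text, or None if no release is found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


if __name__ == "__main__":
    if shutil.which("nvcc") is None:
        print("nvcc is not on PATH; check the export lines in ~/.bashrc")
    else:
        result = subprocess.run(["nvcc", "-V"], stdout=subprocess.PIPE,
                                universal_newlines=True)
        print("CUDA release:", cuda_release(result.stdout))
```

If nvcc is reported as missing even though the toolkit is installed, the PATH line from the previous step has usually not been sourced in the current terminal.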

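Installing cuDNN 7.4

1. Download cuDNN

Download cuDNN from the official website.
Address: https://developer.nvidia.com/rdp/cudnn-download

You may need to register an account and fill in a questionnaire before downloading. (I have also uploaded this version of cuDNN to CSDN; it can be found among my uploads.)
Address: https://download.csdn.net/download/firehuiplane/12636084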

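2. Install cuDNN

After the package has downloaded, use cd to enter the folder containing it and extract it

sudo tar -zxvf ./cudnn-10.0-linux-x64-v7.4.2.24.tgz

Then copy the cuDNN header and libraries into the CUDA 10.0 installation directory and make them readable. /usr/local/cuda is the symbolic link to /usr/local/cuda-10.0 that the CUDA installer created.

sudo cp cuda/include/cudnn.h /usr/local/cuda/include

sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64

sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*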
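One way to confirm that the copy worked is to read the version macros from the installed header. In cuDNN 7.x, cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL. The sketch below parses them; `cudnn_version` is a hypothetical helper, and the header path assumes the default /usr/local/cuda link used in the copy commands.

```python
import re
from pathlib import Path

# Default location used by the copy commands above.
HEADER = Path("/usr/local/cuda/include/cudnn.h")


def cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if a macro is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    if HEADER.exists():
        print("cuDNN version:", cudnn_version(HEADER.read_text(errors="ignore")))
    else:
        print("cudnn.h not found at", HEADER)
```

For the archive used in this tutorial the expected result is version 7.4.2, matching the file name.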

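Installing the GPU version of PyTorch

1. Install PyTorch

Use pip to install the GPU version of PyTorch

pip install torch==1.2.0 torchvision==0.2.2

2. Test PyTorch

Open a terminal and type python to start the interpreter,
then enter the following code

import torch
torch.cuda.is_available()

If it returns True, the installation succeeded.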
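A slightly fuller check also reports which CUDA version the PyTorch build targets and runs one small computation on the GPU. This is a minimal sketch rather than an official test; `torch_gpu_report` is a hypothetical helper, and it returns None when PyTorch is not installed so that the script never crashes on import.

```python
def torch_gpu_report():
    """Collect basic facts about the PyTorch install; None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    report = {
        "torch": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "gpu_available": torch.cuda.is_available(),
    }
    if report["gpu_available"]:
        report["device"] = torch.cuda.get_device_name(0)
        # Run a tiny addition on the GPU to confirm kernels actually launch.
        x = torch.ones(2, 2, device="cuda")
        report["sum_on_gpu"] = float((x + x).sum())
    return report


if __name__ == "__main__":
    print(torch_gpu_report())
```

If gpu_available is False while built_with_cuda shows a version, the driver or the library path is the usual suspect rather than PyTorch itself.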

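Installing the GPU version of TensorFlow

1. Install TensorFlow

Use pip to install the GPU version of TensorFlow

pip install tensorflow-gpu==1.13.2

2. Test TensorFlow

Open a terminal and type python to start the interpreter,
then enter the following code

import tensorflow as tf

a = tf.constant(1)
b = tf.constant(2)
c = tf.add(a, b)
with tf.Session() as sess:
    print(sess.run(c))

If it prints 3, the installation succeeded. The log messages printed when the session starts should also name the GPU that TensorFlow found.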
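The same test can be wrapped in a function that also asks TensorFlow directly whether it sees a GPU. The sketch below assumes the TensorFlow 1.x API installed above (`tf.Session`, `tf.test.is_gpu_available`), so the GPU part only runs on a 1.x version. `tensorflow_gpu_report` is a hypothetical helper name.

```python
def tensorflow_gpu_report():
    """Collect basic facts about the TensorFlow install; None if it is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    report = {"tensorflow": tf.__version__}
    # The session-based test below is written for the 1.x API used in this tutorial.
    if tf.__version__.startswith("1."):
        report["gpu_available"] = tf.test.is_gpu_available()
        with tf.Session() as sess:
            total = sess.run(tf.add(tf.constant(1), tf.constant(2)))
        report["one_plus_two"] = int(total)
    return report


if __name__ == "__main__":
    print(tensorflow_gpu_report())
```

A result of 3 with gpu_available set to False means TensorFlow runs but has fallen back to the CPU, which usually points at a CUDA or cuDNN version mismatch.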