Ubuntu 18.04 + RTX 2080 + CUDA 10 + TensorFlow

After two days of darkness, I finally got the RTX 2080 working under Ubuntu. Noting it all down here.

Starting from the OS install: my motherboard is an ASUS ROG STRIX B360-I GAMING, a fairly new board, which caused all sorts of incompatibilities. Under Ubuntu 16.04 the wireless card didn't work, the sound card worked under neither Windows nor Ubuntu, and the machine wouldn't power off on shutdown. Switching to Ubuntu 18.04 fixed all of these, so with a recent motherboard I'd recommend installing the newest release.

Part 1: Installing Ubuntu 18.04

First, download the Ubuntu 18.04 image:

Ubuntu 18.04.1 LTS, desktop, 64-bit
Download link:
http://releases.ubuntu.com/18.04/ubuntu-18.04.1-desktop-amd64.iso
BitTorrent:
http://releases.ubuntu.com/18.04/ubuntu-18.04.1-desktop-amd64.iso.torrent

Then download Rufus to create the bootable USB stick. The process is simple: an 8 GB drive is enough, and the default settings all work.
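Before writing the stick, it's worth verifying the ISO's integrity. A minimal sketch, assuming the image sits in the current directory (the official checksums live in the SHA256SUMS file next to the ISO on releases.ubuntu.com):

```shell
# Checksum the downloaded image; compare the printed hash with the
# matching line in http://releases.ubuntu.com/18.04/SHA256SUMS
ISO=ubuntu-18.04.1-desktop-amd64.iso
[ -f "$ISO" ] && sha256sum "$ISO" || echo "download $ISO first"
```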

Booting into the Ubuntu USB menu, the first option is "Try Ubuntu" and the second is "Install Ubuntu". Whichever one I picked, I got the same error (the original screenshot is lost):

This message is actually quite common; normally the screen flickers once and jumps straight to the installer anyway. My machine, however, stayed stuck there. I searched around for fixes; the common ones are:

1. Boot the installer in nomodeset mode: highlight "Try Ubuntu" or "Install Ubuntu", press E to edit the boot entry, add nomodeset before splash on the linux line, then press F10 to boot.

2. Disable ACPI: highlight the entry, press E to edit it, delete the "---" after splash on the linux line and add acpi=off noapic, then press F10.

3. At power-on press F8 and choose the boot entry that is just the USB drive's name, not the one prefixed with UEFI. Once in, press F6, pick the language, and at the boot menu press F7 for advanced options; tick acpi (the first item; I forget the second one) and nomodeset with Enter, press F10 to save, then Enter to install Ubuntu.
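For orientation, after editing with E the linux line in the GRUB editor ends up looking roughly like this (the kernel path and other parameters are illustrative placeholders, not copied from my machine):

```
# workaround 1: nomodeset inserted before splash
linux /casper/vmlinuz ... quiet nomodeset splash ---
# workaround 2: trailing "---" removed, ACPI disabled
linux /casper/vmlinuz ... quiet splash acpi=off noapic
```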

Option 1 did nothing for me; option 2 got further but then hit another error I couldn't resolve, and option 3 ended the same way, so I gave up on all three.

To figure out whether this was GPU incompatibility or a motherboard problem, I unplugged the 2080's power cables and moved the monitor cable from the discrete card to the motherboard's integrated graphics. The error message still appeared, but now the machine continued on into the installer! Big discovery. I then reconnected the 2080's power and tried a different DisplayPort output on the card, and that got past the error too. So after half a day of struggling, all it took was switching ports... I honestly don't know why; surely the three DP ports aren't powered differently?

Now that the installer was reachable, installing 18.04 is much like installing any other release:

1. Choose the language

2. Connect to the network

3. Normal installation; install third-party software (optional)

4. Partition manually ("Something else"). I created three partitions: a swap area of about 15 GB, a 500 MB BIOS boot partition, and the rest for the root filesystem mounted at /.

5. Set your username and password
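As a sketch, the layout from step 4 looks like this (the device names are hypothetical; yours depend on the disk):

```
/dev/sda1   500 MB   BIOS boot partition
/dev/sda2    15 GB   swap area
/dev/sda3    rest    ext4, mounted at /
```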

It installed without any trouble; finally, on to the next step.

Part 2: Installing the RTX 2080 driver

Normally, after sudo apt-get update, the right GPU driver shows up under Additional Drivers in Software & Updates, or sudo ubuntu-drivers autoinstall installs it automatically. But the 2080 was far, far too new: Additional Drivers simply couldn't find it. I even switched the package mirror to Aliyun; still nothing. Despair.

Thanks to the all-powerful Taobao: a helpful seller there gave me this recipe:

1. Update the apt package lists
sudo apt-get update
sudo apt-get upgrade
2. Add the driver PPA
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update

After that, the 2080's driver finally appears under Additional Drivers in Software & Updates!

That's the one: nvidia-driver-410. Select it, click Apply Changes, wait roughly five minutes for the install to finish, then reboot. The display will look normal again.
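If you prefer the terminal to the Software & Updates GUI, the same driver can be installed directly once the PPA has been added; nvidia-driver-410 is the package name this PPA provides:

```shell
# Installs the 410-series driver from ppa:graphics-drivers/ppa;
# reboot afterwards for it to take effect
sudo apt-get install nvidia-driver-410
```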

After rebooting, run this in a terminal:

nvidia-settings

and you can finally see the long-awaited GPU information.
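nvidia-smi is another quick sanity check that the driver actually loaded:

```shell
# Should print the driver version (410.xx) and list the RTX 2080;
# an error here means the kernel module is not loaded
nvidia-smi
```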

Part 3: CUDA + cuDNN

Download CUDA 10 from the official page https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal — it is the only version that supports 18.04. The page also lists the install commands, which is considerate. Note that you need to cd into the directory containing the downloaded .deb first.

sudo dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb

sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub

sudo apt-get update

sudo apt-get install cuda
But the first command errored out and told me to run the apt-key command first, so the order became the following (I may be misremembering the exact .pub path; just paste whatever command the error message suggests into the terminal and you'll be fine):

sudo apt-key add /var/cuda-repo-ubuntu1804-10-0-local-10.0.130/7fa2af80.pub

sudo dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb

sudo apt-get update

sudo apt-get install cuda

 

With CUDA installed, cuDNN is next. Download it from https://developer.nvidia.com/rdp/cudnn-download (you'll need to register an account) and pick the build for CUDA 10.0 — the first entry, cuDNN v7.3.1 Library for Linux.

After downloading, extract the archive and run the following to copy the files into the corresponding CUDA directories (cd into the directory containing the extracted cuda folder first):

sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
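To double-check which cuDNN version ended up in place, the version macros live in the header we just copied; a small sketch:

```shell
# Print the cuDNN version macros; for this install they should read
# CUDNN_MAJOR 7, CUDNN_MINOR 3, CUDNN_PATCHLEVEL 1
HDR=/usr/local/cuda/include/cudnn.h
grep -A2 '#define CUDNN_MAJOR' "$HDR" || echo "cudnn.h not found at $HDR"
```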

Note: since installing cuDNN just means placing library files into the CUDA directory, don't panic if the version turns out to be wrong — you can delete the files and copy in a different version.

Next, set up the environment variables. Open ~/.bashrc:

sudo gedit ~/.bashrc

and append the CUDA variables at the end of the file:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
export PATH="$CUDA_HOME/bin:$PATH"

Then reload the file:

source ~/.bashrc

Restart the terminal, and this big step is done.
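To confirm the variables took effect, open a new terminal and check (nvcc only resolves once PATH includes $CUDA_HOME/bin):

```shell
# CUDA_HOME should print /usr/local/cuda, and nvcc should report
# "release 10.0" once the new PATH is active
echo "CUDA_HOME=$CUDA_HOME"
nvcc --version || echo "nvcc not on PATH yet; open a fresh terminal"
```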

Part 4: Anaconda + TensorFlow

Anaconda installation and usage guide (Chinese): https://blog.csdn.net/qq_31610789/article/details/80646276

Tsinghua mirror download link: https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/
First create a Python environment with Anaconda:

conda create -n <env-name> python=3.6

where <env-name> is whatever name you choose for the environment.

Once the environment is created, activate it with source activate <env-name>. Here comes another pitfall: I first installed TensorFlow with pip, straight from the Tsinghua mirror.


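The pip attempt (the original screenshot is gone) was along these lines; the -i flag and mirror URL are my reconstruction, not the exact command from the screenshot:

```shell
# Install TensorFlow from the Tsinghua PyPI mirror; this is the
# route that later failed with the CUDA 9.0 error
pip install tensorflow-gpu -i https://pypi.tuna.tsinghua.edu.cn/simple
```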
Importing tensorflow then failed, complaining that it wanted CUDA 9.0... Stunned.
So back to pestering the Taobao seller, who quickly gave the right answer: install it with conda instead and the error goes away. Another lesson learned. Taobao really can do anything.

Note: switch conda to the Tsinghua mirror first, or the download will be slow enough to make you question life. Specifically:

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/

conda config --set show_channel_urls yes

Then run:

conda install tensorflow-gpu=1.5.0

After a short wait the install completes. Spyder users take note: you need to reinstall Spyder inside the environment before it can import tensorflow:

conda install spyder

And with that, GPU TensorFlow on the RTX 2080 is finally done.
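As a final check that TensorFlow can actually see the card (tf.test.is_gpu_available is the TF 1.x API; run this inside the activated conda environment):

```shell
# Prints True if TensorFlow finds a usable CUDA GPU
python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"
```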

If you're missing other packages, go look for them on the Tsinghua PyPI mirror: https://mirror.tuna.tsinghua.edu.cn/help/pypi/

Now for a long, well-earned rest...

