Installing TensorFlow (GPU) on Ubuntu 16.04


Author: 冯拓

The machine configuration is as follows:

Workstation: HP Z820
CPU: Intel Xeon E5-2620, 2.0 GHz, 24 threads
Memory: 64 GB
Disk: 2 TB
GPU: NVIDIA TITAN X, 12 GB

Packages used during installation:

NVIDIA driver: NVIDIA-Linux-x86_64-396.18.run
CUDA: cuda_9.1.85_387.26_linux.run
cuDNN: cudnn-9.1-linux-x64-v7.1.tgz
Anaconda: Anaconda3-5.2.0-Linux-x86_64.sh
Bazel: bazel-0.14.1-installer-linux-x86_64.sh
TensorFlow source: tensorflow-r1.8

CUDA download link:

https://developer.nvidia.com/cuda-downloads

Anaconda download link:

https://www.continuum.io/downloads 

Bazel download link:

https://github.com/bazelbuild/bazel/releases

TensorFlow source download link:
https://github.com/tensorflow/tensorflow


This article has two parts: the first covers installing CUDA and cuDNN, and the second covers building and installing TensorFlow from source.

 

Part 1: Installing CUDA and cuDNN

1. Installing the NVIDIA driver from the .run file

After downloading the file NVIDIA-Linux-x86_64-396.18.run, open the blacklist configuration with the following command:

sudo gedit /etc/modprobe.d/blacklist.conf

Add the following lines to the file to blacklist the nouveau driver:

blacklist nouveau 
blacklist lbm-nouveau 
options nouveau modeset=0 
alias nouveau off 
alias lbm-nouveau off

Run this command to disable nouveau kernel mode setting:

echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf

Rebuild the initramfs and reboot:

sudo update-initramfs -u
sudo reboot
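
After the reboot, it is worth confirming that nouveau is really disabled before running the NVIDIA installer; this quick check is my addition, not part of the original steps:

lsmod | grep nouveau    # no output means the nouveau module is no longer loaded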

Press Ctrl + Alt + F1 to switch to a text console, then stop the X server:

sudo service lightdm stop 
sudo init 3

Change to the directory containing the NVIDIA installer, make it executable, and run the installation:

chmod +x NVIDIA-Linux-x86_64-396.18.run 
sudo sh NVIDIA-Linux-x86_64-396.18.run --no-opengl-files

Return to the graphical interface:

sudo service lightdm start

Check whether the driver was installed successfully; if the command below prints a table listing the GPU and the driver version, the installation is complete.

nvidia-smi

 

2. Installing CUDA

CUDA can be downloaded from the official site; here I install cuda_9.1.85_387.26_linux.run. Run the following commands:

sudo chmod +x cuda_9.1.85_387.26_linux.run
sudo ./cuda_9.1.85_387.26_linux.run

Press q to skip through the EULA text, type accept, and answer n when asked whether to install the NVIDIA driver (it is already installed). For the remaining prompts, accept the defaults or answer y, as shown below:

Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 387.26?
(y)es/(n)o/(q)uit: n

Install the CUDA 9.1 Toolkit?
(y)es/(n)o/(q)uit: y

Enter Toolkit Location
[ default is /usr/local/cuda-9.1 ]:

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Install the CUDA 9.1 Samples?
(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location
[ default is /home/xxxxx]:

Then open the system profile:

sudo gedit /etc/profile

and add the following environment variables at the end of the file:

export PATH=$PATH:/usr/local/cuda-9.1/bin
export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64:$LD_LIBRARY_PATH
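
To apply the new variables in the current shell and confirm the toolkit is on the PATH, a quick check such as the following should work (a verification step I add here, not part of the original write-up):

source /etc/profile
nvcc --version    # should report "release 9.1"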

 

3. Installing cuDNN

cuDNN also has to be downloaded from NVIDIA's website, as a .deb or .tgz file matching your CUDA version; I use cudnn-9.1-linux-x64-v7.1.tgz as the example. In the directory containing the archive, run the following commands:

tar -xzvf cudnn-9.1-linux-x64-v7.1.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
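
Optionally, you can confirm the copied cuDNN version by reading the version macros in the header (a convenience check added here):

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2    # expect major version 7, minor version 1 for this package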

 

4. Installing Anaconda

Download the installer from the official site; I use the Python 3.6 build, Anaconda3-5.2.0-Linux-x86_64.sh. In a terminal, change to the directory containing the installer and run:

bash Anaconda3-5.2.0-Linux-x86_64.sh

During the installation, answer yes when asked whether to add Anaconda to your PATH.

After installation, open a new terminal and create a virtual environment for the TensorFlow installation later:

conda create -n tensorflow
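
You can then confirm the environment was created and can be activated (a quick check, not part of the original write-up; this Anaconda release still uses the "source activate" style):

conda env list                 # the "tensorflow" environment should appear in the list
source activate tensorflow     # the prompt should change to (tensorflow)
source deactivate              # leave the environment again for now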

 

Part 2: Installing TensorFlow

This part describes building TensorFlow from source with Bazel and installing it into the virtual environment.

 

5. Installing Bazel

Download bazel-0.14.1-installer-linux-x86_64.sh from GitHub and install the build dependencies:

sudo apt-get install pkg-config zip g++ zlib1g-dev unzip python

Change to the directory containing the installer and run:

chmod +x bazel-0.14.1-installer-linux-x86_64.sh
./bazel-0.14.1-installer-linux-x86_64.sh --user

Then add export PATH="$PATH:$HOME/bin" to ~/.bashrc.
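
After reloading ~/.bashrc, a version check confirms Bazel is on the PATH (a sanity check added here):

source ~/.bashrc
bazel version    # should print "Build label: 0.14.1"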

 

6. Building and installing TensorFlow

Download the source from GitHub, selecting the r1.8 branch. Extract it, change into the source directory in a terminal, and run:

./configure

Answer the prompts with y/n or the requested paths. For this setup, specify CUDA 9.1 and cuDNN 7.1. My configure session looked like this:

xxx@xxx:~/Downloads/tensorflow-r1.8$ ./configure
You have bazel 0.14.1 installed.
Please specify the location of python. [Default is /home/yangyuting/anaconda3/bin/python]:

Found possible Python library paths:
  /home/yangyuting/anaconda3/lib/python3.6/site-packages
Please input the desired Python library path to use.  Default is [/home/yangyuting/anaconda3/lib/python3.6/site-packages]

Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: n
No jemalloc as malloc support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.

Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: n
No Apache Kafka Platform support will be enabled for TensorFlow.

Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.

Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.

Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.

Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.1

Please specify the location where CUDA 9.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1

Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:

Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.

Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]:

Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 5.2]

Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.

Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:

Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
    --config=mkl             # Build with MKL support.
    --config=monolithic      # Config for mostly static monolithic build.
Configuration finished

After configuration completes, run the following commands in order to build and install TensorFlow:

bazel build --config=opt --config=cuda --config=monolithic //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
source activate tensorflow
pip install /tmp/tensorflow_pkg/tensorflow-1.8.0-cp36-cp36m-linux_x86_64.whl
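
If the pip install step above fails because the filename differs (the wheel name encodes the TensorFlow and Python versions, so it may not match exactly on your machine), list the output directory and substitute the filename you actually see:

ls /tmp/tensorflow_pkg/    # use the .whl filename printed here in the pip install command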

 

7. Running an example

After installing TensorFlow with the steps above, I ran an example and hit the following error:

ImportError: libcublas.so.9.1: cannot open shared object file: No such file or directory

This problem is unrelated to the build itself; it occurs because some library paths were not set up when CUDA 9.1 was installed, so the shared libraries cannot be found. The files do exist, they are just not on the loader's search path. Run the following commands to create symbolic links:

sudo ln -s /usr/local/cuda-9.1/lib64/libcublas.so.9.1 /usr/lib/libcublas.so.9.1
sudo ln -s /usr/local/cuda-9.1/lib64/libcusolver.so.9.1 /usr/lib/libcusolver.so.9.1
sudo ln -s /usr/local/cuda-9.1/lib64/libcudart.so.9.1 /usr/lib/libcudart.so.9.1
sudo ln -s /usr/local/cuda-9.1/lib64/libcudnn.so.7 /usr/lib/libcudnn.so.7
sudo ln -s /usr/local/cuda-9.1/lib64/libcufft.so.9.1 /usr/lib/libcufft.so.9.1
sudo ln -s /usr/local/cuda-9.1/lib64/libcurand.so.9.1 /usr/lib/libcurand.so.9.1

These links assume the default CUDA install path; if yours is different, adjust the paths above accordingly. Once the links are in place, TensorFlow imports normally and CUDA and cuDNN work as expected.
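
With the links in place, a short import test is enough to confirm the GPU build works end to end (a verification sketch I add here, not part of the original write-up):

source activate tensorflow
python -c "import tensorflow as tf; print(tf.__version__); tf.Session()"
# a successful import prints 1.8.0; creating a Session logs the detected GPU (the TITAN X) to the console

An alternative to the per-library symlinks is to append /usr/local/cuda-9.1/lib64 to a file under /etc/ld.so.conf.d/ and run sudo ldconfig, which lets the loader find the same libraries system-wide.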

 

References:

https://blog.csdn.net/stories_untold/article/details/78521925

https://blog.csdn.net/lhx_998/article/details/76135936

https://blog.csdn.net/caojunwei0324/article/details/78962223

https://blog.csdn.net/shuzfan/article/details/78516542

https://cloud.tencent.com/developer/article/1150020

https://docs.nvidia.com/deeplearning/sdk/cudnn-install/

 

 
