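I will skip the CUDA installation itself. What follows is how to build and install TensorFlow from source on a machine that already has CUDA installed.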
1. Install Bazel
First install JDK 8 by running:
sudo apt-get install openjdk-8-jdk
Because TensorFlow needs Bazel 0.10.0 here, download version 0.10.0 from the following releases page:
https://github.com/bazelbuild/bazel/releases?after=0.10.1
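On Ubuntu, download the .deb file and double-click it to install. Afterwards, run the following command in a terminal to check that the installation succeeded: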
$ bazel version
Extracting Bazel installation...
Build label: 0.10.0
Build target: bazel-out/k8-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Thu Nov 9 04:43:10 +50056 (1517474666590)
Build timestamp: 1517474666590
Build timestamp as int: 1517474666590
Output like the above means the installation succeeded.
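The version check can also be scripted. The following is only a sketch: bazel_label and check_bazel are hypothetical helper names, and the parsing assumes the "Build label:" line looks like the output shown above.

```shell
# Extract the version from the "Build label:" line of `bazel version` output.
bazel_label() {
    sed -n 's/^Build label: *//p' | head -n 1
}

# Hypothetical helper: succeed only if the installed Bazel matches the wanted version.
check_bazel() {
    local want="$1" have
    if ! command -v bazel >/dev/null 2>&1; then
        echo "bazel not found on PATH"
        return 1
    fi
    have="$(bazel version 2>/dev/null | bazel_label)"
    if [ "$have" = "$want" ]; then
        echo "bazel $have OK"
    else
        echo "bazel $have installed, expected $want"
        return 1
    fi
}

# Usage:
# check_bazel 0.10.0
```

Running check_bazel 0.10.0 before the build surfaces a version mismatch early, before ./configure or the compile step gets a chance to fail on it.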
2. Configure the CUDA environment variables
I usually edit ~/.bashrc and add the following lines:
export PATH=$PATH:/usr/local/cuda/bin/
export LD_LIBRARY_PATH=/usr/local/cuda/lib64/:$LD_LIBRARY_PATH
Then run in a terminal:
source ~/.bashrc
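If you rerun this setup, the same export lines can end up in ~/.bashrc several times. A minimal sketch of an idempotent variant follows; append_once is a hypothetical helper, not part of any tool used here.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (commented out so that running this snippet changes nothing):
# append_once 'export PATH=$PATH:/usr/local/cuda/bin/' ~/.bashrc
# append_once 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64/:$LD_LIBRARY_PATH' ~/.bashrc
# source ~/.bashrc
# nvcc --version   # should print the CUDA compiler version if PATH is set correctly
```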
3. Download the TensorFlow source
Here I pin version 1.7.0. The download command is:
git clone -b r1.7 https://github.com/tensorflow/tensorflow
Then cd into the source root:
cd tensorflow/
Then run the following command to configure the build:
$ ./configure
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
/home/wilf/tools/tensorflow/tools/bazel.rc
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.23.1 installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3.5
Found possible Python library paths:
/usr/local/lib/python3.5/dist-packages
/usr/lib/python3/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python3.5/dist-packages]
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: n
No jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
No Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
No Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
No Amazon S3 File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Apache Kafka Platform support? [y/N]: n
No Apache Kafka Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]: n
No GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]: n
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]:
Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]:
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Do you wish to build TensorFlow with TensorRT support? [y/N]: n
No TensorRT support will be enabled for TensorFlow.
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 6.1]
Do you want to use clang as CUDA compiler? [y/N]: n
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Do you wish to build TensorFlow with MPI support? [y/N]: n
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See tools/bazel.rc for more details.
--config=mkl # Build with MKL support.
--config=monolithic # Config for mostly static monolithic build.
Configuration finished
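The answers given above can also be supplied ahead of time, so that ./configure does not have to prompt for them. This is a sketch under a loud assumption: the variable names below are recalled from TensorFlow's configure.py and are not confirmed by this post, so check them against the configure.py in your own checkout. tf_configure_env is a hypothetical wrapper name, and the values simply mirror the session above.

```shell
# Sketch: pre-answer the ./configure prompts through environment variables.
# ASSUMPTION: these variable names match configure.py in your checkout.
tf_configure_env() {
    export PYTHON_BIN_PATH=/usr/bin/python3.5
    export PYTHON_LIB_PATH=/usr/local/lib/python3.5/dist-packages
    export TF_NEED_JEMALLOC=0
    export TF_NEED_GCP=0
    export TF_NEED_HDFS=0
    export TF_NEED_S3=0
    export TF_NEED_KAFKA=0
    export TF_ENABLE_XLA=0
    export TF_NEED_GDR=0
    export TF_NEED_VERBS=0
    export TF_NEED_OPENCL_SYCL=0
    export TF_NEED_CUDA=1
    export TF_CUDA_VERSION=9.0
    export CUDA_TOOLKIT_PATH=/usr/local/cuda
    export TF_CUDNN_VERSION=7
    export CUDNN_INSTALL_PATH=/usr/local/cuda
    export TF_NEED_TENSORRT=0
    export TF_CUDA_COMPUTE_CAPABILITIES=6.1
    export TF_CUDA_CLANG=0
    export GCC_HOST_COMPILER_PATH=/usr/bin/gcc
    export TF_NEED_MPI=0
    export CC_OPT_FLAGS="-march=native"
    export TF_SET_ANDROID_WORKSPACE=0
}

# Usage (from the source root):
# tf_configure_env && ./configure
```

Any prompt whose variable is not recognized by your version of configure.py will still be asked interactively, so this is safe to try.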
Finally, build:
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
This failed with:
Python Configuration Error: Problem getting numpy include path.
Traceback (most recent call last):
File "<string>", line 1, in <module>
ImportError: No module named 'numpy'
Is numpy installed?
and referenced by '//third_party/py/numpy:headers'
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted: Loading failed
INFO: Elapsed time: 26.148s
FAILED: Build did NOT complete successfully (79 packages loaded)
currently loading: tensorflow/core ... (2 packages)
Fix:
numpy is missing. I had just reinstalled the system, so a lot of things were absent. First install pip3 with:
sudo apt-get install python3-pip
Then run:
sudo pip3 install numpy
Rerunning the build hit another error:
ERROR: /home/wilf/tools/tensorflow/tensorflow/python/BUILD:4855:1: C++ compilation of rule '//tensorflow/python:framework/fast_tensor_util.so' failed (Exit 1)
bazel-out/k8-py3-opt/genfiles/tensorflow/python/framework/fast_tensor_util.cpp:4:20: fatal error: Python.h: No such file or directory
compilation terminated.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 784.712s, Critical Path: 25.37s
FAILED: Build did NOT complete successfully
Fix (Python.h is provided by the Python development headers):
sudo apt-get install python3-dev
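Both failures above come from missing Python prerequisites, and the second one only appeared after about 13 minutes of compiling. A rough pre-flight check, run before bazel build, could catch them up front. has_python_header and preflight are hypothetical helper names; the check assumes the interpreter you pass in is the same one given to ./configure.

```shell
# Succeed if the given include directory contains Python.h.
has_python_header() {
    [ -f "$1/Python.h" ]
}

# Hypothetical pre-flight check; the interpreter defaults to python3.
preflight() {
    local py="${1:-python3}" inc status=0
    if "$py" -c 'import numpy' 2>/dev/null; then
        echo "numpy: found"
    else
        echo "numpy: missing (sudo pip3 install numpy)"
        status=1
    fi
    # Ask the interpreter where its C headers are installed.
    inc="$("$py" -c 'import sysconfig; print(sysconfig.get_paths()["include"])' 2>/dev/null)"
    if has_python_header "$inc"; then
        echo "Python.h: found in $inc"
    else
        echo "Python.h: missing (sudo apt-get install python3-dev)"
        status=1
    fi
    return "$status"
}

# Usage:
# preflight /usr/bin/python3.5
```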
Once the build succeeds, package TensorFlow as a .whl file:
bazel-bin/tensorflow/tools/pip_package/build_pip_package ./
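Install it (the wheel filename carries the version that was actually built, so adjust it to match the file produced by the previous step):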
sudo pip3 install tensorflow-1.9.0-cp35-cp35m-linux_x86_64.whl
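Since the wheel name depends on the version and Python ABI, it can be easier to locate the file than to type its name. A small sketch follows, where find_tf_wheel is a hypothetical helper; the final check only assumes that the installed package exposes tf.__version__.

```shell
# Print the newest TensorFlow wheel in a directory (default: current directory).
find_tf_wheel() {
    ls -t "${1:-.}"/tensorflow-*.whl 2>/dev/null | head -n 1
}

# Usage:
# sudo pip3 install "$(find_tf_wheel .)"
# Verify from outside the source tree, otherwise Python picks up the
# local tensorflow/ directory instead of the installed package:
# cd ~ && python3 -c 'import tensorflow as tf; print(tf.__version__)'
```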