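Preface
My machine already has a working setup of tensorflow1.2+python2.7+CUDA8.0+CUDNN5.0, but a new experiment requires tensorflow 1.5 or later with python3. I did not want to break the existing setup, so I decided to create a new environment with anaconda. In addition, the prebuilt whl packages for tensorflow 1.5 and later all require CUDA 9, so to avoid changing the CUDA version I chose to build tensorflow from source.
In short, this post starts from an ubuntu14.04 system that already has tensorflow1.2+python2.7+CUDA8.0+CUDNN5.0 and uses anaconda to add a tensorflow1.7+python3.6+CUDA8.0+CUDNN5.0 environment, so that the two environments coexist.
step 1: Install anaconda
Download:
Download the anaconda release for python3.6 from the official site: https://www.anaconda.com/download/#linux. I downloaded Anaconda3-5.1.0-Linux-x86_64.sh.
Install: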
bash Anaconda3-5.1.0-Linux-x86_64.sh
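Press Enter when prompted, read the license and type yes to install. When the installer later asks whether to add anaconda to the PATH environment variable, answer yes; the installation then completes.
Open a new terminal and run "conda -V" to check the anaconda version.
Once anaconda is installed, you can create environments.
step 2: Create the tensorflow1.7 environment
Create an environment named tf1-7: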
conda create -n tf1-7 python=3.6
Once it has been created, activate the environment:
source activate tf1-7
The environment name now appears at the front of the command prompt.
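Since every later step depends on tf1-7 being the active environment, a small guard can be run before continuing. This is only a sketch: it relies on the CONDA_DEFAULT_ENV variable that conda sets on activation, and require_env is a hypothetical helper name, not part of anaconda.

```shell
# Succeed only when the named conda environment is the active one.
# CONDA_DEFAULT_ENV is set by conda on activation;
# require_env is a hypothetical helper name used for illustration.
require_env() {
    if [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]; then
        return 0
    fi
    echo "expected conda env '$1', active: '${CONDA_DEFAULT_ENV:-none}'" >&2
    return 1
}

# Only inspect python when tf1-7 is really active.
if require_env tf1-7 2>/dev/null; then
    python --version     # should report Python 3.6.x
    command -v python    # should point inside the anaconda envs directory
fi
```

If the guard fails, run source activate tf1-7 again before going on to step 3.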
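Because I do not want to change the CUDA setup, tensorflow1.7 is built from source below.
step 3: Install tensorflow
All of the following steps are run inside the activated tf1-7 anaconda environment.
Building tensorflow from source requires bazel. A warning up front: I first installed bazel 0.12 and then hit an error which, after some searching, turned out to be caused by the bazel version being too new, so I uninstalled bazel 0.12 and installed bazel 0.11 instead. Here is the error: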
`ERROR: /home/UserHome/.cache/bazel/_bazel_UserHome/ab33c8274551e1ea3125872a4c4e7db9/external/jpeg/BUILD:126:12: Illegal ambiguous match on configurable attribute "deps" in @jpeg//:jpeg:
@jpeg//:k8
@jpeg//:armeabi-v7a
Multiple matches are not allowed unless one is unambiguously more specialized.
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted:
/home/UserHome/.cache/bazel/_bazel_UserHome/ab33c8274551e1ea3125872a4c4e7db9/external/jpeg/BUILD:126:12: Illegal ambiguous match on configurable attribute "deps" in @jpeg//:jpeg:
@jpeg//:k8
@jpeg//:armeabi-v7a
Multiple matches are not allowed unless one is unambiguously more specialized.
INFO: Elapsed time: 1.086s
FAILED: Build did NOT complete successfully (3 packages loaded)`
So if you run into this error, check whether the bazel version is the cause. The full tensorflow installation process follows.
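One way to catch this before starting a long build is to compare the installed bazel version against 0.12. The sketch below assumes GNU sort (for the -V option) and that bazel version prints a "Build label:" line; version_lt and bazel_label are hypothetical helper names, and the 0.12.0 threshold simply reflects the experience described above, not an official compatibility table.

```shell
# Succeed when version $1 is strictly lower than version $2.
# Relies on GNU sort -V for version ordering.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Read "bazel version" output on stdin and print the version number,
# e.g. "0.11.1" from a line such as "Build label: 0.11.1".
bazel_label() {
    sed -n 's/^Build label: *//p' | head -n 1
}

# Only run the check when bazel is actually installed.
if command -v bazel >/dev/null 2>&1; then
    installed="$(bazel version 2>/dev/null | bazel_label)"
    if [ -n "$installed" ] && version_lt "$installed" "0.12.0"; then
        echo "bazel $installed: below 0.12, as used in this post"
    else
        echo "bazel '${installed:-unknown}': the jpeg error above may appear" >&2
    fi
fi
```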
Download tensorflow: https://codeload.github.com/tensorflow/tensorflow/tar.gz/v1.7.0-rc0
Install bazel0.11:
First download bazel0.11 from: https://github.com/bazelbuild/bazel/releases
bazel needs some dependencies installed beforehand; for details see: https://docs.bazel.build/versions/master/install-ubuntu.html
Install bazel:
chmod +x bazel-<version>-installer-linux-x86_64.sh
./bazel-<version>-installer-linux-x86_64.sh --user
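Note the --user flag: it installs bazel into the $HOME/bin directory.
After installation, set the environment variable by adding the following to ~/.bashrc: export PATH="$HOME/bin:$PATH", then save the file and source it.
bazel is now installed; next, use it to build tensorflow.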
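If this setup is repeated, the PATH line can end up in ~/.bashrc more than once. A minimal sketch of adding it only when it is missing is shown below; append_once is a hypothetical helper name, and the real call is left commented out so that nothing is modified by accident.

```shell
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper name used for illustration.
append_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example usage for the bazel install directory:
# append_once 'export PATH="$HOME/bin:$PATH"' "$HOME/.bashrc"
# . "$HOME/.bashrc"
# bazel version
```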
Go into the tensorflow source directory and configure tensorflow:
./configure
You will be asked to choose a number of configuration options.
$ ./configure
Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc nvcc should use as the host compiler. [Default is /usr/bin/gcc]:
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
Please specify the location where CUDA 7.5 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the cuDNN version you want to use. [Leave empty to use system default]: 5.0
Please specify the location where cuDNN 5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 6.1
Do you wish to build TensorFlow with MPI support? [y/N] n
MPI support will not be enabled for TensorFlow
Configuration finished
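Answer no to the options that are not relevant. You need to specify the cuda and cudnn versions and their install locations; the compute capability of your GPU can be looked up at the URL given in the prompt. Since this build is meant for the tf1-7 environment, make sure the python location is the interpreter of that environment (the path printed by "which python" while the environment is active) rather than the system python.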
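The same answers can, as far as I know, be supplied through environment variables that tensorflow's configure script reads before it prompts. The sketch below mirrors the answers typed above; the variable names are my understanding of what the 1.x configure script checks, so treat them as assumptions and verify them against configure.py in your source tree. Anything left unset should still be asked interactively.

```shell
# Assumed configure variables (verify against configure.py in the source tree).
# Values mirror the interactive answers shown above.
export PYTHON_BIN_PATH="$(command -v python || true)"  # python of the active env
export TF_NEED_GCP=0                       # no Google Cloud Platform support
export TF_NEED_CUDA=1                      # build with GPU support
export GCC_HOST_COMPILER_PATH=/usr/bin/gcc
export TF_CUDA_VERSION=8.0
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export TF_CUDNN_VERSION=5.0
export CUDNN_INSTALL_PATH=/usr/local/cuda
export TF_CUDA_COMPUTE_CAPABILITIES=6.1    # look up your own GPU's value
export TF_NEED_MPI=0                       # no MPI support

# Then, from the tensorflow source directory:
# ./configure
```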
Once configuration is done, build the target.
Build tensorflow with bazel:
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
This step takes quite a while, so be patient.
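Create the pip package and install it in development mode. Keep --config=cuda in this build command as well, so that the installed package retains GPU support.
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package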
mkdir _python_build
cd _python_build
ln -s ../bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow/* .
ln -s ../tensorflow/tools/pip_package/* .
python setup.py develop
Installation is complete; running import tensorflow in python now loads tensorflow1.7.
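A quick way to confirm the result is to print the version and check whether a GPU is visible. The sketch below assumes the tf1-7 environment is active so that python resolves to its interpreter; verify_tf is a hypothetical helper name, while tf.__version__ and tf.test.is_gpu_available() are part of the tensorflow 1.x API.

```shell
# Print the tensorflow version and whether a GPU is visible, using the
# python found on PATH (expected to be the one from the tf1-7 environment).
# verify_tf is a hypothetical helper name used for illustration.
verify_tf() {
    python - <<'EOF'
import tensorflow as tf
print("TensorFlow", tf.__version__)
print("GPU available:", tf.test.is_gpu_available())
EOF
}

# Usage (inside the activated environment):
# verify_tf
```

If the version is not 1.7 or the GPU is reported as unavailable, check which environment is active and whether the build used --config=cuda.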
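Errors encountered:
One problem I remember running into:
When building the GPU version of tensorflow, the build failed with: Cannot find libdevice.10.bc under /usr/local/cuda-8.0
The fix is to rename /usr/local/cuda-8.0/nvvm/libdevice/libdevice.compute_50.10.bc to libdevice.10.bc and put a copy of it in /usr/local/cuda-8.0/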
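The fix above can be sketched as a single copy under the new name, which leaves the original file in place. fix_libdevice is a hypothetical helper name; the default path is the one from the error message, and writing under /usr/local normally requires root.

```shell
# Copy libdevice.compute_50.10.bc to <cuda_root>/libdevice.10.bc.
# fix_libdevice is a hypothetical helper name; cuda_root defaults to the
# path from the error message above.
fix_libdevice() {
    cuda_root="${1:-/usr/local/cuda-8.0}"
    src="$cuda_root/nvvm/libdevice/libdevice.compute_50.10.bc"
    dst="$cuda_root/libdevice.10.bc"
    if [ ! -f "$src" ]; then
        echo "missing $src" >&2
        return 1
    fi
    cp "$src" "$dst"
}

# On the real installation the equivalent command would be:
# sudo cp /usr/local/cuda-8.0/nvvm/libdevice/libdevice.compute_50.10.bc /usr/local/cuda-8.0/libdevice.10.bc
```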