TensorFlow 1.0 and later expect CUDA 8.0 by default, so if you plan to install TensorFlow directly with pip, first check that your CUDA version matches. If it does, you can install with:
CPU version: sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.1.0-cp27-none-linux_x86_64.whl
GPU version: sudo pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.1.0-cp27-none-linux_x86_64.whl
This guide assumes a machine with CUDA 7.5 already configured and the NVIDIA graphics driver already installed. To avoid the hassle of switching to CUDA 8.0, we build TensorFlow 1.1.0 from source against CUDA 7.5. For installing CUDA itself, see the following articles:
Installing tensorflow-gpu + keras on Ubuntu
Installing TensorFlow (GPU-accelerated) on Ubuntu 16.04: a detailed illustrated tutorial
Both of the above install CUDA from deb packages, but the runfile install went more smoothly for me; see:
Deep learning (TensorFlow) environment setup, part 2: Ubuntu 16.04 + 1080 Ti graphics driver
Deep learning (TensorFlow) environment setup, part 3: Ubuntu 16.04 + CUDA 8.0 + cuDNN 7 + Anaconda 4.4 + Python 3.6 + TensorFlow 1.3
1. Downloading cuDNN 5.1 and configuring the environment
Following the articles above, download the cuDNN 5.1 build matching your CUDA version and Linux, unpack it, and copy the headers and libraries into the CUDA directory, here /usr/local/cuda:
sudo cp cuda/include/cudnn.h /usr/local/cuda/include # copy the header into include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64 # copy the libraries into lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn* # make the copied files readable by all users
Next, configure the CUDA environment variables, either per user or system-wide, by editing ~/.bash_profile or /etc/profile respectively; after editing, source the file so the changes take effect:
#set cuda environment
export CUDA_HOME=/usr/local/cuda
export PATH=/usr/local/cuda/bin:/usr/local/bin:$PATH
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$CUDA_HOME/lib64:$CUDA_HOME/extras/CUPTI/lib64"
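The whole step can be sketched as a single shell snippet. This is a minimal sketch, assuming the paths above; /tmp/cuda_env.sh is just an illustrative scratch file, not something the build requires:

```shell
# Minimal sketch: put the CUDA variables in a snippet file and load it into
# the current shell. On a real machine, append the same lines to
# ~/.bash_profile (per user) or /etc/profile (system-wide) instead.
cat > /tmp/cuda_env.sh <<'EOF'
export CUDA_HOME=/usr/local/cuda
export PATH=/usr/local/cuda/bin:/usr/local/bin:$PATH
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$CUDA_HOME/lib64:$CUDA_HOME/extras/CUPTI/lib64"
EOF
. /tmp/cuda_env.sh
echo "$CUDA_HOME"   # /usr/local/cuda
```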
Run the following commands to check that the CUDA setup took effect:
nvidia-settings # opens the NVIDIA settings GUI
nvidia-smi
2. Building TensorFlow 1.1.0 from source
2.1 Installing the Bazel build tool
First download the matching TensorFlow source release from GitHub, then install the Bazel build tool (https://bazel.build/versions/master/docs/install.html):
Install JDK 8:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
Add Bazel to the apt sources:
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
Install Bazel:
sudo apt-get update && sudo apt-get install bazel
Type bazel in a terminal to check that the installation succeeded.
2.2 Configuring TensorFlow
Unpack the source, enter the resulting directory, and run ./configure. The configuration prompts look like this:
Please specify the location of python. [Default is /usr/bin/python]:
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n]
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] N
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N]
No XLA support will be enabled for TensorFlow
Found possible Python library paths:
/usr/local/lib/python2.7/dist-packages
/usr/lib/python2.7/dist-packages
Please input the desired Python library path to use. Default is [/usr/local/lib/python2.7/dist-packages]
Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] N
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 7.5
Please specify the location where CUDA 7.5 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 5.1.10
Please specify the location where cuDNN 5.1.10 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.0
Note in particular that 7.5 is sufficient for the CUDA SDK version, but the cuDNN version you enter must correspond to an actual file such as libcudnn.so.5.1.10 in /usr/local/cuda/lib64, hence 5.1.10 here; everything else can be left at the defaults.
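For the compute-capability prompt, the right value depends on your GPU. As a hedged sketch, one could map a GPU model to the value ./configure expects like this; COMPUTE_CAPABILITY and capabilities_flag are hypothetical names, only a few common cards are listed, and the authoritative table is at https://developer.nvidia.com/cuda-gpus:

```python
# Hypothetical lookup table: a few common GPUs and their CUDA compute
# capabilities (always confirm against https://developer.nvidia.com/cuda-gpus).
COMPUTE_CAPABILITY = {
    "GeForce GTX 680": "3.0",
    "Tesla K40": "3.5",
    "GeForce GTX 980": "5.2",
    "GeForce GTX 1080": "6.1",
}

def capabilities_flag(*gpus):
    """Build the comma-separated list that ./configure asks for."""
    return ",".join(sorted({COMPUTE_CAPABILITY[g] for g in gpus}))

print(capabilities_flag("GeForce GTX 680"))  # 3.0
```

Each extra capability in the list lengthens the build and enlarges the binary, so listing only your own card, as done above with 3.0, keeps the build lean.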
2.3 Building the pip package and installing it
# To build with GPU support:
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# Look for the .whl file in /tmp/tensorflow_pkg/ and install it with pip; the name will be something like tensorflow-1.1.0-cp27-cp27mu-linux_x86_64.whl
sudo pip install /tmp/tensorflow_pkg/tensorflow-1.1.0-cp27-cp27mu-linux_x86_64.whl
If you hit an error like:
tensorflow-1.1.0-cp27-cp27mu-linux_x86_64.whl is not a supported wheel on this platform.
Storing debug log for failure in /home/jiaqi/.pip/pip.log
then go into /tmp/tensorflow_pkg, rename tensorflow-1.1.0-cp27-cp27mu-linux_x86_64.whl to tensorflow-1.1.0-cp27-none-linux_x86_64.whl, and run pip install again.
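Why the rename helps: pip matches a wheel by the compatibility tags encoded in its filename (Python tag, ABI tag, platform tag), and an ABI tag of none skips the ABI check that cp27mu was likely failing on this pip build. A stdlib-only sketch of how the filename decomposes, where wheel_tags is a hypothetical helper rather than a pip API:

```python
def wheel_tags(filename):
    """Split a wheel filename into its compatibility tags (hypothetical helper)."""
    stem = filename[:-len(".whl")]
    distribution, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {"python": python_tag, "abi": abi_tag, "platform": platform_tag}

print(wheel_tags("tensorflow-1.1.0-cp27-cp27mu-linux_x86_64.whl"))
# {'python': 'cp27', 'abi': 'cp27mu', 'platform': 'linux_x86_64'}
```

Only the abi field changes in the rename; the Python and platform tags still have to match your interpreter and OS.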
2.4 Setting up TensorFlow and installing it into Python
# To build with GPU support:
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
mkdir _python_build
cd _python_build
ln -s ../bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow/* .
ln -s ../tensorflow/tools/pip_package/* .
sudo python setup.py develop
Once installed, run python -c "import tensorflow; print(tensorflow.__version__)" to check that the import works.
Then run the following test script to check that the GPU is usable:
import tensorflow as tf

hello = tf.constant("hello TensorFlow!")
sess = tf.Session()
print(sess.run(hello))

# Multiply two constant matrices; log_device_placement=True makes TensorFlow
# print which device (CPU or GPU) each operation is placed on.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))
3. Installing Keras 2.0.4
A few dependencies are needed first, such as ATLAS and gfortran, otherwise installing scipy will fail. For other commonly needed deep-learning dependencies, see the caffe dependency setup in: 2015.08.17 Ubuntu 14.04 + CUDA 7.5 + caffe installation and configuration.
# dependencies needed by scipy
sudo apt-get install libatlas-base-dev
sudo apt-get install gfortran
# install Keras
sudo pip install keras==2.0.4
Finally, check that Keras installed correctly with:
python -c "import keras"