Installing TensorFlow 1.4 (GPU version)
System environment
- Ubuntu 16.04
- NVIDIA GeForce GTX 660
Installation steps
- Install CUDA 8
- Install cuDNN 6
- Install the GPU version of TensorFlow
Install CUDA 8
Do NOT install CUDA 9.
TensorFlow 1.4 does not support CUDA 9 yet; support is expected starting with version 1.5.
The official installation guide covers CUDA 9, but its steps also apply to CUDA 8.
1. Check whether your GPU supports CUDA
$ lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation GK106 [GeForce GTX 660] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK106 HDMI Audio Controller (rev a1)
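Check NVIDIA's list of CUDA-capable GPUs to confirm that your card is supported; as a rule of thumb, GeForce 5xx and later cards are. Note that the prebuilt TensorFlow GPU packages additionally require CUDA compute capability 3.0 or higher.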
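The lspci check above can also be scripted. The following is a minimal Python sketch, not part of the original steps: the helper names find_nvidia_devices and run_lspci are hypothetical, and the sketch only filters text in the format shown above.

```python
import subprocess


def find_nvidia_devices(lspci_output):
    """Return the lspci lines that mention NVIDIA (case-insensitive)."""
    return [line.strip() for line in lspci_output.splitlines()
            if "nvidia" in line.lower()]


def run_lspci():
    """Run lspci and return its output, or "" if it cannot be run."""
    try:
        return subprocess.run(["lspci"], stdout=subprocess.PIPE,
                              universal_newlines=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError):
        return ""


if __name__ == "__main__":
    devices = find_nvidia_devices(run_lspci())
    if devices:
        for device in devices:
            print(device)
    else:
        print("no NVIDIA device reported by lspci")
```

An empty result only means that lspci reported no NVIDIA device; whether a listed card is CUDA-capable still has to be checked against NVIDIA's list.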
2. Download CUDA 8
3. Install CUDA 8
$ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
$ sudo apt-get update
$ sudo apt-get install cuda
$ sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64.deb
$ sudo apt-get update
$ sudo apt-get upgrade
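After the packages are installed, it can be useful to confirm that the toolkit on disk really is 8.0 and not 9. The sketch below assumes the toolkit ships a version.txt file under /usr/local/cuda-8.0 containing a line such as "CUDA Version 8.0.61"; both the path and the file format are assumptions, and the function names are hypothetical.

```python
import re


def parse_cuda_version(text):
    """Return (major, minor) from a 'CUDA Version X.Y...' line, or None."""
    match = re.search(r"CUDA Version (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_version(path="/usr/local/cuda-8.0/version.txt"):
    """Read the assumed version file; return None if it is missing."""
    try:
        with open(path) as handle:
            return parse_cuda_version(handle.read())
    except OSError:
        return None


if __name__ == "__main__":
    version = installed_cuda_version()
    if version is None:
        print("CUDA version file not found or not recognized")
    elif version == (8, 0):
        print("CUDA 8.0 is installed")
    else:
        print("unexpected CUDA version: %d.%d" % version)
```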
4. Verify that CUDA 8 was installed successfully
$ cd /usr/local/cuda-8.0/samples
$ sudo make
$ cd bin/x86_64/linux/release/
$ ./deviceQuery
The output should look like this:
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 660"
CUDA Driver Version / Runtime Version 9.0 / 8.0
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 1998 MBytes (2094923776 bytes)
( 5) Multiprocessors, (192) CUDA Cores/MP: 960 CUDA Cores
GPU Max Clock rate: 1032 MHz (1.03 GHz)
Memory Clock rate: 3004 Mhz
Memory Bus Width: 192-bit
L2 Cache Size: 393216 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 660
Result = PASS
Check two things in particular: that the line Device 0: "GeForce GTX 660" names your actual GPU, and that the last line reads Result = PASS.
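The two checks can be automated by parsing the deviceQuery output. This is a minimal sketch that assumes the output format shown above; summarize_device_query is a hypothetical helper, not part of the CUDA samples.

```python
import re


def summarize_device_query(output):
    """Extract the fields worth checking from deviceQuery output."""
    # Lines such as: Device 0: "GeForce GTX 660"
    devices = re.findall(r'^\s*Device \d+: "([^"]+)"', output,
                         flags=re.MULTILINE)
    # Line such as: CUDA Driver Version / Runtime Version 9.0 / 8.0
    versions = re.search(
        r"CUDA Driver Version / Runtime Version\s+(\d+\.\d+) / (\d+\.\d+)",
        output)
    return {
        "devices": devices,
        "driver_version": versions.group(1) if versions else None,
        "runtime_version": versions.group(2) if versions else None,
        "passed": "Result = PASS" in output,
    }


if __name__ == "__main__":
    import sys
    # Usage (assumed): ./deviceQuery | python3 this_script.py
    print(summarize_device_query(sys.stdin.read()))
```

For the output above, the runtime version should come back as 8.0. The driver version is reported separately from the runtime version, as in the sample, where they are 9.0 and 8.0.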
5. Set the environment variables
$ gedit ~/.bashrc
- Add
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
On a 64-bit system, add
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
On a 32-bit system, add
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
$ source ~/.bashrc
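The ${VAR:+:${VAR}} form in these exports expands to a colon followed by the old value only when the variable is already set and non-empty. That avoids a trailing colon, which would add an empty entry that is treated as the current directory. The Python sketch below mimics the expansion for illustration; prepend_path is a hypothetical helper, not something the shell or CUDA provides.

```python
import os


def prepend_path(new_dir, existing):
    """Mimic new_dir${VAR:+:${VAR}}: append ':' + existing only if non-empty."""
    if existing:
        return new_dir + ":" + existing
    return new_dir


if __name__ == "__main__":
    # Show what the two exports would produce in the current environment.
    print(prepend_path("/usr/local/cuda-8.0/bin",
                       os.environ.get("PATH", "")))
    print(prepend_path("/usr/local/cuda-8.0/lib64",
                       os.environ.get("LD_LIBRARY_PATH", "")))
```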
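Install cuDNN 6
Site: https://developer.nvidia.com/deep-learning
1. Register a (free) account and log in
2. Go to the download page: https://developer.nvidia.com/rdp/cudnn-download
3. Accept the license agreement
4. Select the cuDNN 6 runtime and developer packages for CUDA 8.0, then install them: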
$ sudo dpkg -i libcudnn6_6.0.21-1+cuda8.0_amd64.deb
$ sudo dpkg -i libcudnn6-dev_6.0.21-1+cuda8.0_amd64.deb
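To confirm which cuDNN version the developer package installed, one option is to read the version macros from the cudnn.h header. The sketch below is hedged: the header location /usr/include/cudnn.h is an assumption that may differ on your system, and the function names are hypothetical. The macros CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are the ones cuDNN defines in that header.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the text of cudnn.h, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+%s\s+(\d+)" % name,
                          header_text, flags=re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def installed_cudnn_version(path="/usr/include/cudnn.h"):
    """Read the assumed header path; return None if it is missing."""
    try:
        with open(path) as handle:
            return parse_cudnn_version(handle.read())
    except OSError:
        return None


if __name__ == "__main__":
    print(installed_cudnn_version())
```

With the packages above, the expected result would be version 6.0.21, matching the package file names.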
Install TensorFlow
$ pip3 install tensorflow-gpu==1.4.0   # pinned to 1.4; newer releases expect CUDA 9 or later
Verify that TensorFlow was installed successfully
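# Python 3, TensorFlow 1.x API
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')  # constant string tensor
sess = tf.Session()                        # TF 1.x session that runs the graph
print(sess.run(hello))                     # prints b'Hello, TensorFlow!'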
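The hello-world check shows that TensorFlow imports and runs, but not that it actually sees the GPU. A further check, sketched below, lists the devices TensorFlow reports through device_lib.list_local_devices() from tensorflow.python.client. The helper gpu_device_names is hypothetical, and the exact device name strings vary between TensorFlow versions.

```python
def gpu_device_names(devices):
    """Given (name, device_type) pairs, return the names of the GPU entries."""
    return [name for name, device_type in devices if device_type == "GPU"]


if __name__ == "__main__":
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        # Each reported device has .name and .device_type attributes.
        pairs = [(d.name, d.device_type)
                 for d in device_lib.list_local_devices()]
        names = gpu_device_names(pairs)
        print(names if names else "no GPU visible to TensorFlow")
```

If the list is empty even though deviceQuery passed, the CUDA and cuDNN library paths set earlier are a reasonable first thing to recheck.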