1. Install Bazel
Install dependencies
yum install -y epel-release
yum install -y git java-1.8.0-openjdk-headless java-1.8.0-openjdk-devel gcc gcc-c++ make automake autoconf zip unzip
Build
Bazel does not publish release packages for CentOS, so build and install it from source first.
// Download the source archive:
https://docs.bazel.build/versions/master/install-compile-source.html
// This walkthrough uses bazel-0.16.0-dist.zip
unzip bazel-0.16.0-dist.zip -d bazel-0.16.0
cd bazel-0.16.0/
./compile.sh
After the build finishes, the binary is placed in the output directory; copy it to /usr/local/bin:
cp output/bazel /usr/local/bin/
chmod +x /usr/local/bin/bazel
Note: building TensorFlow v1.12.0 requires Bazel 0.16.0 or later. An initial attempt with the latest Bazel (0.21.0) failed with build errors; switching to 0.16.0 built successfully.
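The version constraint above can be captured in a small helper. This is only an illustrative sketch: the lower/upper bounds encode this walkthrough's experience (0.16.0 worked, 0.21.0 did not), not an official compatibility table.

```python
def parse_version(v):
    """Parse a dotted version string like "0.16.0" into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))


def bazel_ok(version, minimum="0.16.0", too_new="0.21.0"):
    """Return True if `version` is >= minimum and < too_new.

    The bounds reflect the note above: 0.16.0 built TensorFlow v1.12.0
    successfully, while 0.21.0 failed.
    """
    return parse_version(minimum) <= parse_version(version) < parse_version(too_new)


print(bazel_ok("0.16.0"))  # True
print(bazel_ok("0.21.0"))  # False
```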
2. Build TensorFlow
Install dependencies
yum install -y python-devel gcc gcc-c++ patch zip python-setuptools
easy_install pip
pip install numpy wheel six mock enum34
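Before starting a long build it can help to confirm the Python prerequisites are importable. A minimal sketch (the helper name is mine; note that the pip package enum34 installs as module enum):

```python
import importlib

# Module names to probe; the pip package "enum34" provides the "enum" module.
REQUIRED = ["numpy", "wheel", "six", "mock", "enum"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


print(missing_modules(REQUIRED))  # [] means all prerequisites are present
```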
Download the TensorFlow source and check out the target release
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout v1.12.0
git checkout -b v1.12.0
Configure
./configure
The script asks a series of questions; most can be answered by simply pressing Enter to accept the defaults. If the machine has an NVIDIA GPU, the NVIDIA driver, CUDA, and cuDNN must be installed first.
Build
// Build the CPU version with MKL
bazel build --config=opt --config=mkl //tensorflow/tools/pip_package:build_pip_package
// For the GPU version (CUDA must already be installed on the server)
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
// After a successful build, generate the wheel package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
// The wheel is saved under /tmp/tensorflow_pkg
Mon Jan 21 17:50:59 CST 2019 : === Output wheel file is in: /tmp/tensorflow_pkg
[root@k8s-node1 tensorflow]# cd /tmp/tensorflow_pkg
[root@k8s-node1 tensorflow_pkg]# ll
total 120692
-rw-r--r-- 1 root root 123586552 Jan 21 17:50 tensorflow-1.12.0-cp27-cp27mu-linux_x86_64.whl
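The wheel filename itself documents the build: per the wheel naming convention (PEP 427) it is distribution-version-pythontag-abitag-platform. An illustrative sketch pulling those fields apart (the helper name is mine):

```python
def parse_wheel_name(filename):
    """Split a wheel filename into its PEP 427 naming fields."""
    stem = filename[:-len(".whl")]
    distribution, version, python_tag, abi_tag, platform = stem.split("-")
    return {
        "distribution": distribution,
        "version": version,
        "python_tag": python_tag,  # cp27 -> CPython 2.7
        "abi_tag": abi_tag,        # cp27mu -> wide-unicode (UCS-4) CPython build
        "platform": platform,      # linux_x86_64
    }


info = parse_wheel_name("tensorflow-1.12.0-cp27-cp27mu-linux_x86_64.whl")
print(info["version"], info["python_tag"])  # 1.12.0 cp27
```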
Test
A simple benchmark script (save it as benchmark.py) can be used for a quick sanity check:
import sys
import numpy as np
import tensorflow as tf
from datetime import datetime

device_name = sys.argv[1]  # Choose device from cmd line. Options: gpu or cpu
shape = (int(sys.argv[2]), int(sys.argv[2]))
if device_name == "gpu":
    device_name = "/gpu:0"
else:
    device_name = "/cpu:0"

with tf.device(device_name):
    random_matrix = tf.random_uniform(shape=shape, minval=0, maxval=1)
    dot_operation = tf.matmul(random_matrix, tf.transpose(random_matrix))
    sum_operation = tf.reduce_sum(dot_operation)

startTime = datetime.now()
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as session:
    result = session.run(sum_operation)
    print(result)

# It can be hard to see the results on the terminal with lots of output -- add some newlines to improve readability.
print("\n" * 5)
print("Shape:", shape, "Device:", device_name)
print("Time taken:", datetime.now() - startTime)
print("\n" * 5)
Usage
python benchmark.py cpu 20000
This fills a 20000 x 20000 matrix with uniform random numbers in [0, 1), transposes it, multiplies it by the original matrix, and sums all elements.
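The same computation in plain NumPy (with a much smaller matrix here, purely for illustration) shows exactly what the graph evaluates and can serve as a baseline:

```python
import numpy as np

n = 200  # the TensorFlow run above used 20000; kept small here for illustration
m = np.random.uniform(0, 1, size=(n, n))

# Multiply the matrix by its transpose, then sum every element --
# the same operations the TensorFlow graph performs.
total = np.dot(m, m.T).sum()
print(total)
```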
Output
('Shape:', (20000, 20000), 'Device:', '/cpu:0')
('Time taken:', datetime.timedelta(0, 23, 390198))
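The repr datetime.timedelta(0, 23, 390198) reads as (days, seconds, microseconds), so this run took about 23.39 seconds; total_seconds() does the conversion:

```python
from datetime import timedelta

# Reconstruct the timing printed above: 0 days, 23 seconds, 390198 microseconds.
elapsed = timedelta(0, 23, 390198)
print(elapsed.total_seconds())  # 23.390198
```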