hackintosh + nvidia: getting started with machine learning (2) (compiling tensorflow-gpu on Mac)

种花家的小白

Article link: https://blog.csdn.net/H_haow/article/details/104856500

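Compiling the GPU version of tensorflow on macOS

Background

Starting with version 1.2, tensorflow-gpu dropped Mac support, so the newest version that pip can install is 1.1. That version requires cuda 8.0, which rules out using a recent cuda release.

Environment

OS macOS High Sierra 10.13.6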
cuda 10.1
cudnn 7.0
bazel 0.14.0
xcode 10.1
python 3.6
tensorflow 1.8

Hardware

CPU i5-7500
MEM 32 GB
GPU GTX 1050 Ti

Setting up the build environment

# python: create the environment with conda
conda create -n tensorflow python=3.6
conda activate tensorflow
pip install six numpy wheel
# Base dependencies
brew install coreutils llvm cliutils/apple/libomp
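# Install bazel (look up the instructions yourself)
# Environment variables: add the export and source lines below to ~/.bash_profile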
vim ~/.bash_profile
export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH="$CUDA_HOME/lib:$CUDA_HOME/extras/CUPTI/lib"
export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
export PATH=$DYLD_LIBRARY_PATH:$PATH
export flags="--config=cuda --config=opt"
source /usr/local/lib/bazel/bin/bazel-complete.bash
# Apply the changes
source ~/.bash_profile
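
Before going further, it can help to confirm that the library directories referenced by these exports actually exist. The following is a minimal sketch, not part of the original procedure. It assumes the layout implied by the exports above (`$CUDA_HOME/lib` and `$CUDA_HOME/extras/CUPTI/lib`), and the helper names `cuda_lib_dirs` and `missing_dirs` are hypothetical.

```python
import os


def cuda_lib_dirs(cuda_home):
    """Return the two library directories the exports above derive from CUDA_HOME."""
    return [
        os.path.join(cuda_home, "lib"),
        os.path.join(cuda_home, "extras", "CUPTI", "lib"),
    ]


def missing_dirs(dirs):
    """Return the entries of dirs that are not existing directories."""
    return [d for d in dirs if not os.path.isdir(d)]


if __name__ == "__main__":
    # Falls back to the path used in this article when CUDA_HOME is unset.
    cuda_home = os.environ.get("CUDA_HOME", "/usr/local/cuda")
    for d in missing_dirs(cuda_lib_dirs(cuda_home)):
        print("missing:", d)
```

If the script prints nothing, both directories were found.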

Fetching and patching the source

# Check out the r1.8 branch
git clone https://github.com/tensorflow/tensorflow -b r1.8
# Download the macOS patch
cd tensorflow
curl -O https://raw.githubusercontent.com/SixQuant/tensorflow-macos-gpu/master/patch/tensorflow-macos-gpu-r1.8.patch
# Modify the following section (the protobuf_archive URLs)
       name = "protobuf_archive",
       urls = [
-          "https://mirror.bazel.build/github.com/google/protobuf/archive/396336eb961b75f03b25824fe86cf6490fb75e3a.tar.gz",
-          "https://github.com/google/protobuf/archive/396336eb961b75f03b25824fe86cf6490fb75e3a.tar.gz",
+          "https://mirror.bazel.build/github.com/dinever/protobuf/archive/188578878eff18c2148baba0e116d87ce8f49410.tar.gz",
+          "https://github.com/dinever/protobuf/archive/188578878eff18c2148baba0e116d87ce8f49410.tar.gz",
       ]
# Apply the patch
git apply tensorflow-macos-gpu-r1.8.patch
curl -o third_party/nccl/nccl.h https://raw.githubusercontent.com/SixQuant/tensorflow-macos-gpu/master/patch/nccl.h
# Copy the cuda libraries so they are also available under their 10.1 names
sudo cp /usr/local/cuda/lib/libcublas.10.dylib /usr/local/cuda/lib/libcublas.10.1.dylib
sudo cp /usr/local/cuda/lib/libcusolver.10.dylib /usr/local/cuda/lib/libcusolver.10.1.dylib
sudo cp /usr/local/cuda/lib/libcurand.10.dylib /usr/local/cuda/lib/libcurand.10.1.dylib
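
The three copies above follow one pattern: each `.10.dylib` file is duplicated under a `.10.1.dylib` name, presumably because the build looks for libraries named after the full cuda version. The sketch below only generates the same commands as strings and does not run them. The function name `copy_commands` and its parameters are hypothetical.

```python
def copy_commands(lib_dir, names, major="10", full="10.1"):
    """Build one 'sudo cp' command per library, from the major to the full version name."""
    cmds = []
    for name in names:
        src = f"{lib_dir}/lib{name}.{major}.dylib"
        dst = f"{lib_dir}/lib{name}.{full}.dylib"
        cmds.append(f"sudo cp {src} {dst}")
    return cmds


if __name__ == "__main__":
    # The three libraries copied in this article.
    for cmd in copy_commands("/usr/local/cuda/lib", ["cublas", "cusolver", "curand"]):
        print(cmd)
```

Printing the commands instead of executing them leaves the decision to run anything under sudo with the reader.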

Building

cd tensorflow 
./configure # generates the settings for the bazel build, written to the file .tf_configure.bazelrc
# Its contents are as follows
build --action_env PYTHON_BIN_PATH="/Users/haohaiwei/miniconda3/envs/tensorflow/bin/python"
build --action_env PYTHON_LIB_PATH="/Users/haohaiwei/miniconda3/envs/tensorflow/lib/python3.6/site-packages"
build --force_python=py3
build --host_force_python=py3
build --python_path="/Users/haohaiwei/miniconda3/envs/tensorflow/bin/python"
build:gcp --define with_gcp_support=true
build --define with_hdfs_support=true
build --define with_s3_support=true
build --define with_kafka_support=true
build:xla --define with_xla_support=true
build:gdr --define with_gdr_support=true
build:verbs --define with_verbs_support=true
build --action_env TF_NEED_OPENCL_SYCL="0"
build --action_env TF_NEED_CUDA="1"
build --action_env CUDA_TOOLKIT_PATH="/usr/local/cuda"
build --action_env TF_CUDA_VERSION="10.1"
build --action_env CUDNN_INSTALL_PATH="/usr/local/cuda"
build --action_env TF_CUDNN_VERSION="7"
build --action_env TF_CUDA_COMPUTE_CAPABILITIES="5.2,6.0,6.1"
build --action_env LD_LIBRARY_PATH="/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib"
build --action_env TF_CUDA_CLANG="0"
build --action_env GCC_HOST_COMPILER_PATH="/usr/bin/gcc"
build --config=cuda
test --config=cuda
build --define grpc_no_ares=true
build:opt --copt=-march=native
build:opt --host_copt=-march=native
build:opt --define with_default_optimizations=true
build --copt=-DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK
build --host_copt=-DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK
# Build; with the CPU listed above this takes a little over 2 hours
bazel build --config=cuda --config=opt --action_env PATH --action_env LD_LIBRARY_PATH --action_env DYLD_LIBRARY_PATH  tensorflow/tools/pip_package:build_pip_package
gcc -march=native -c -fPIC tensorflow/contrib/nccl/kernels/nccl_ops.cc -o _nccl_ops.o
gcc _nccl_ops.o -shared -o _nccl_ops.so
mv _nccl_ops.so bazel-out/darwin-py3-opt/bin/tensorflow/contrib/nccl/python/ops
# Generate the pip package
bazel-bin/tensorflow/tools/pip_package/build_pip_package ~/Downloads/
# Install. Do not switch pip to a mirror inside China, or some dependencies will be reported as not found
pip install ~/Downloads/tensorflow-1.8.0-cp36-cp36m-macosx_10_13_x86_64.whl
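
Because the bazel build takes a little over two hours, it may be worth checking the generated `.tf_configure.bazelrc` before starting it. The sketch below reads the `build --action_env NAME="value"` lines shown in the listing above into a dictionary. It assumes the file uses exactly that line format; `parse_action_env` is a hypothetical helper, not part of tensorflow or bazel.

```python
import re

# Matches lines such as: build --action_env TF_CUDA_VERSION="10.1"
ACTION_ENV = re.compile(r'^build --action_env (\w+)="(.*)"$')


def parse_action_env(text):
    """Collect NAME -> value pairs from 'build --action_env' lines."""
    env = {}
    for line in text.splitlines():
        match = ACTION_ENV.match(line.strip())
        if match:
            env[match.group(1)] = match.group(2)
    return env


if __name__ == "__main__":
    try:
        with open(".tf_configure.bazelrc") as f:
            env = parse_action_env(f.read())
    except FileNotFoundError:
        print("no .tf_configure.bazelrc here; run ./configure first")
    else:
        for key in ("TF_NEED_CUDA", "TF_CUDA_VERSION", "TF_CUDNN_VERSION",
                    "TF_CUDA_COMPUTE_CAPABILITIES"):
            print(key, "=", env.get(key))
```

Run it from the tensorflow source directory. For the configuration in this article, `TF_CUDA_COMPUTE_CAPABILITIES` should include 6.1, the compute capability that the test log reports for the GTX 1050 Ti.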

Testing

# Contents of test.py. Put it in ~/, not inside the source directory
#!/usr/bin/env python
import tensorflow as tf
config = tf.ConfigProto()
config.log_device_placement = True
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
with tf.Session(config=config) as sess:
    # Runs the op.
    print(sess.run(c))
# Run it
python test.py

The expected output is:

/Users/haohaiwei/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/Users/haohaiwei/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/Users/haohaiwei/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:521: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/Users/haohaiwei/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:522: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/Users/haohaiwei/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/Users/haohaiwei/miniconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
2020-03-13 18:43:13.368564: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 3.15GiB
2020-03-13 18:43:13.368585: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2020-03-13 18:43:13.660669: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2867 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
2020-03-13 18:43:13.715111: I tensorflow/core/common_runtime/direct_session.cc:284] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1

MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2020-03-13 18:43:13.715678: I tensorflow/core/common_runtime/placer.cc:886] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2020-03-13 18:43:13.715688: I tensorflow/core/common_runtime/placer.cc:886] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2020-03-13 18:43:13.715709: I tensorflow/core/common_runtime/placer.cc:886] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
[[22. 28.]
 [49. 64.]]
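
The final matrix in this output can be checked without tensorflow. The sketch below multiplies the same two constants from test.py in plain Python. The `matmul` helper is written here for illustration only and is unrelated to `tf.matmul` internals.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    # zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# The same values as constants a (2x3) and b (3x2) in test.py.
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

print(matmul(a, b))  # [[22.0, 28.0], [49.0, 64.0]]
```

The values match the result printed by the GPU run, so the MatMul op placed on GPU:0 produced the correct product.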

Possible problems

Xcode version must be specified to use an Apple CROSSTOOL. ...
Fix:

sudo xcode-select -s /Applications/Xcode.app/Contents/Developer
sudo xcodebuild -license
bazel clean --expunge
