1. First set up the CUDA and cuDNN development/runtime environment; see https://blog.csdn.net/m0_37605642/article/details/98854753
2. Set up the build environment as described in my previous post on the CPU build: https://blog.csdn.net/andrew57/article/details/103396426
3. If you have already built the CPU version, run bazel clean first. Then run python configure.py. The configuration is shown below; the only difference from the CPU build is enabling CUDA support.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Found CUDA 10.0 in:
D:/NVIDIA GPU Computing Toolkit/CUDA/v10.0/lib/x64
D:/NVIDIA GPU Computing Toolkit/CUDA/v10.0/include
Found cuDNN 7 in:
D:/NVIDIA GPU Computing Toolkit/CUDA/v10.0/lib/x64
D:/NVIDIA GPU Computing Toolkit/CUDA/v10.0/include
Please specify a list of comma-separated CUDA compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]:
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is /arch:AVX]:
Would you like to override eigen strong inline for some C++ compilation to reduce the compilation time? [Y/n]: Y
Eigen strong inline overridden.
4. Build the DLL and the import library separately; for generating the header files, refer to the previous post on building the CPU version of TensorFlow.
bazel build --config=cuda //tensorflow:tensorflow_cc
bazel build --config=cuda //tensorflow:tensorflow_cc.lib
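Once the two builds above succeed, the artifacts should appear under bazel-bin. The paths below are a sketch assuming a default Windows checkout; exact names can vary between TensorFlow versions:

```shell
REM Hypothetical output locations; adjust to your source tree and TF version.
dir bazel-bin\tensorflow\tensorflow_cc.dll
dir bazel-bin\tensorflow\tensorflow_cc.lib
```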
5. If you hit the error "Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED" at runtime, work around it by setting the GPU options before creating the session:
SessionOptions options;
// Grow GPU memory usage on demand instead of reserving it all at startup.
options.config.mutable_gpu_options()->set_allow_growth(true);
// Upper bound on the fraction of GPU memory this process may use (1.0 = all).
options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(1.0);
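A minimal sketch of wiring these options into session creation with the TensorFlow C++ API (this requires linking against the tensorflow_cc library built above; everything beyond the two gpu_options lines is illustrative boilerplate, not part of the original fix):

```cpp
#include "tensorflow/core/public/session.h"
#include "tensorflow/core/public/session_options.h"

int main() {
  tensorflow::SessionOptions options;
  // Grow GPU memory on demand rather than grabbing it all up front;
  // this is the usual remedy for CUDNN_STATUS_ALLOC_FAILED.
  options.config.mutable_gpu_options()->set_allow_growth(true);
  // Cap on the fraction of GPU memory this process may claim.
  options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(1.0);

  tensorflow::Session* session = nullptr;
  tensorflow::Status status = tensorflow::NewSession(options, &session);
  if (!status.ok()) {
    // Inspect status.ToString() to diagnose the failure.
    return 1;
  }
  // ... load a GraphDef and run it ...
  session->Close();
  delete session;
  return 0;
}
```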
6. Large differences in environment can produce different build errors; this build used the same environment as the previous post.