1. First set up the CUDA and cuDNN development/runtime environment; see https://blog.csdn.net/m0_37605642/article/details/98854753
2. Set up the build environment as described in my previous post on the CPU build: https://blog.csdn.net/andrew57/article/details/103396426
3. If you have already built the CPU version, run bazel clean first. Then run python configure.py. The configuration is shown below; the only difference from the CPU build is enabling CUDA support.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Found CUDA 10.0 in:
D:/NVIDIA GPU Computing Toolkit/CUDA/v10.0/lib/x64
D:/NVIDIA GPU Computing Toolkit/CUDA/v10.0/include
Found cuDNN 7 in:
D:/NVIDIA GPU Computing Toolkit/CUDA/v10.0/lib/x64
D:/NVIDIA GPU Computing Toolkit/CUDA/v10.0/include
Please specify a list of comma-separated CUDA compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]:
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is /arch:AVX]:
Would you like to override eigen strong inline for some C++ compilation to reduce the compilation time? [Y/n]: Y
Eigen strong inline overridden.
4. Build the DLL and the import library separately; for generating the header files, refer to the previous post on building the CPU version of TensorFlow.
bazel build --config=cuda //tensorflow:tensorflow_cc
bazel build --config=cuda //tensorflow:tensorflow_cc.lib
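Once the two builds above succeed, the artifacts should appear under bazel-bin. The paths below are a sketch assuming a default Windows checkout; exact names can vary between TensorFlow versions:

```shell
REM Hypothetical output locations; adjust to your source tree and TF version.
dir bazel-bin\tensorflow\tensorflow_cc.dll
dir bazel-bin\tensorflow\tensorflow_cc.lib
```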
5. If you hit the error "Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED" at runtime, work around it by setting the GPU options before creating the session:
SessionOptions options;
// Grow GPU memory usage on demand instead of reserving it all at startup.
options.config.mutable_gpu_options()->set_allow_growth(true);
// Upper bound on the fraction of GPU memory this process may use (1.0 = all).
options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(1.0);
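A minimal sketch of wiring these options into session creation with the TensorFlow C++ API (this requires linking against the tensorflow_cc library built above; everything beyond the two gpu_options lines is illustrative boilerplate, not part of the original fix):

```cpp
#include "tensorflow/core/public/session.h"
#include "tensorflow/core/public/session_options.h"

int main() {
  tensorflow::SessionOptions options;
  // Grow GPU memory on demand rather than grabbing it all up front;
  // this is the usual remedy for CUDNN_STATUS_ALLOC_FAILED.
  options.config.mutable_gpu_options()->set_allow_growth(true);
  // Cap on the fraction of GPU memory this process may claim.
  options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(1.0);

  tensorflow::Session* session = nullptr;
  tensorflow::Status status = tensorflow::NewSession(options, &session);
  if (!status.ok()) {
    // Inspect status.ToString() to diagnose the failure.
    return 1;
  }
  // ... load a GraphDef and run it ...
  session->Close();
  delete session;
  return 0;
}
```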
6. Large differences in environment can produce different build errors; this build used the same environment as the previous post.