Building TensorFlow 2.4.1 with CUDA 11.2 on Windows (Python 3.9.1, cuDNN 8.1.0, compute capability 3.5 - 8.6, build output available for download)


瑞凤玉子烧



Article link: https://blog.csdn.net/u012440550/article/details/113361176



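This largely follows my earlier article, "Building TensorFlow 2.4 (RC0) with CUDA 11.1 (Update 1) on Windows (Python 3.9.0, cuDNN 8.0.4, compute capability 3.5 - 8.6, build output available for download)". A few things have changed, so the steps are reorganized here.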

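Environment preparation

1. Memory requirements

With 8 parallel jobs (the default job count equals the number of CPU threads), at least 10 GB of memory should be available; otherwise the compiler fails with an out-of-heap-space error.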
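As a rough way to pick a job count for a machine with less memory, the figure above can be turned into a small estimate. This is only a sketch: it assumes memory use grows about linearly with the number of jobs (the article gives just the one data point of 10 GB for 8 jobs), and the names GB_PER_JOB and suggest_jobs are made up for illustration.

```python
import math
import os

# Assumption: memory use scales roughly linearly with the number of parallel
# jobs, extrapolated from the article's single data point (10 GB for 8 jobs).
GB_PER_JOB = 10 / 8


def suggest_jobs(available_gb, cpu_threads=None):
    """Return a job count that should fit in memory, capped at the CPU thread count."""
    if cpu_threads is None:
        cpu_threads = os.cpu_count() or 1
    by_memory = math.floor(available_gb / GB_PER_JOB)
    # Always allow at least one job, never more than the CPU thread count.
    return max(1, min(cpu_threads, by_memory))


if __name__ == "__main__":
    # Example: a 16-thread CPU with 8 GB of free memory.
    print(suggest_jobs(8, cpu_threads=16))
```

The resulting number could then be passed to Bazel through its --jobs option when starting the build.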

2. Python & Pip

Python first needs a few packages: six, numpy, wheel, setuptools, keras_applications and keras_preprocessing. Open a command prompt with administrator privileges and run:

pip install six numpy wheel setuptools
pip install keras_applications --no-deps
pip install keras_preprocessing --no-deps

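Note that the Python path must not contain spaces: the default Windows install path C:\Program Files\Python39 causes errors during the build. If Python is installed there, create a link to it (a real link, not a shortcut) in a directory whose path has no spaces, using the mklink command.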
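A quick check for this pitfall can be sketched in Python. The helper name path_has_space is made up for illustration, and the link location C:\Python39 in the printed hint is only an example of a space-free path.

```python
import sys


def path_has_space(path):
    """Return True if the given path contains a space character."""
    return " " in path


if __name__ == "__main__":
    exe = sys.executable
    if path_has_space(exe):
        # Example hint only: C:\Python39 stands for any directory without spaces.
        print("Python path contains a space:", exe)
        print("Consider creating a directory link with mklink /D, e.g. at C:\\Python39")
    else:
        print("Python path is fine:", exe)
```

Run it with the same interpreter that will later be given to configure.py, since that is the path the build will see.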

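3. CUDA

CUDA 11.2 is used here. Download it from the CUDA website and install it; there is nothing special to note.

4. Bazel

Next is Bazel, which is simple: it is a single exe that needs to be reachable through the Path environment variable. I took a shortcut and dropped it into CUDA's bin directory. The version I used is 3.7.2.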

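5. MSYS2

Then install MSYS2; the msys64\usr\bin directory likewise needs to be added to the Path environment variable.

Once it is installed, add a few packages with pacman. The default repositories are extremely slow, so users in mainland China are advised to switch to the Tsinghua (TUNA) mirror.

Go to msys64\etc\pacman.d and edit the three mirrorlist files, adding one line ahead of all the existing Server lines in each:

mirrorlist.msys: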

Server = https://mirrors.tuna.tsinghua.edu.cn/msys2/msys/$arch

mirrorlist.mingw32:

Server = https://mirrors.tuna.tsinghua.edu.cn/msys2/mingw/i686

mirrorlist.mingw64:

Server = https://mirrors.tuna.tsinghua.edu.cn/msys2/mingw/x86_64
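If you would rather not edit the three files by hand, the same change can be sketched as a small text transformation. This is an illustration only: prepend_server is a hypothetical helper, it works on the file contents as a string, and reading and writing the actual files under msys64\etc\pacman.d is left to the caller.

```python
def prepend_server(mirrorlist_text, server_url):
    """Insert 'Server = <url>' before the first existing Server line.

    If the exact line is already present, the text is returned unchanged.
    """
    new_line = "Server = " + server_url
    lines = mirrorlist_text.splitlines()
    if new_line in lines:
        return mirrorlist_text
    for i, line in enumerate(lines):
        if line.strip().startswith("Server"):
            lines.insert(i, new_line)
            break
    else:
        # No Server line found: append the new one at the end.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder contents; a real mirrorlist will list other servers.
    sample = "## Example list\nServer = https://example.invalid/msys/$arch\n"
    url = "https://mirrors.tuna.tsinghua.edu.cn/msys2/msys/$arch"
    print(prepend_server(sample, url))
```

Because pacman tries servers in the order they are listed, putting the new line first is what makes the mirror take priority.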

The official guide leaves out the zip package, so the install command is:

pacman -S git patch unzip zip
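Before moving on, it can help to confirm that the tools installed so far are actually reachable through Path. The following is a minimal sketch using the standard library's shutil.which; the list REQUIRED_TOOLS and the helper missing_tools are names chosen for this example, based on the tools named in this article.

```python
import shutil

# Tools mentioned in this article that must be found on Path.
REQUIRED_TOOLS = ["bazel", "git", "patch", "unzip", "zip"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from the list that cannot be found on Path."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on Path:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

The which parameter exists only so the lookup can be replaced in a test; in normal use the default is enough.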

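6. Visual Studio 2019

Then Visual Studio. Download the VS installer and, to avoid trouble, install to the default path on drive C (I did not try a non-C path this time, so I do not know whether the bug where the compiler cannot be found is still there). If you do not otherwise use VS, then beyond the mandatory components you only need MSVC v142 - VS 2019 C++ x64/x86 build tools (any version; I picked the latest) and the Windows 10 SDK (again any version; I picked the latest).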

Build

Configuring the build

Download the TensorFlow 2.4.1 source, go to the root of the extracted directory, and run:

D:\tensorflow-2.4.1>python configure.py
You have bazel 3.7.2 installed.
Please specify the location of python. [Default is C:\Python39\python.exe]:


Found possible Python library paths:
  C:\Python39\lib\site-packages
Please input the desired Python library path to use.  Default is [C:\Python39\lib\site-packages]

Do you wish to build TensorFlow with ROCm support? [y/N]: N
No ROCm support will be enabled for TensorFlow.

Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.

Found CUDA 11.1 in:
    D:/CUDA/lib/x64
    D:/CUDA/include
Found cuDNN 8 in:
    D:/CUDA/lib/x64
    D:/CUDA/include


Please specify a list of comma-separated CUDA compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. Each capability can be specified as "x.y" or "compute_xy" to include both virtual and binary GPU code, or as "sm_xy" to only include the binary code.
Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]: 3.5,3.7,5.0,5.2,6.0,6.1,7.0,7.5,8.0,8.6


Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is /arch:AVX]: /arch:AVX2


Would you like to override eigen strong inline for some C++ compilation to reduce the compilation time? [Y/n]: Y
Eigen strong inline overridden.

Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
Not configuring the WORKSPACE for Android builds.

Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
        --config=mkl            # Build with MKL support.
        --config=mkl_aarch64    # Build with oneDNN support for Aarch64.
        --config=monolithic     # Config for mostly static monolithic build.
        --config=ngraph         # Build with Intel nGraph support.
        --config=numa           # Build with NUMA support.
        --config=dynamic_kernels        # (Experimental) Build kernels into separate shared objects.
        --config=v2             # Build TensorFlow 2.x instead of 1.x.
Preconfigured Bazel build configs to DISABLE default on features:
        --config=noaws          # Disable AWS S3 filesystem support.
        --config=nogcp          # Disable GCP support.
        --config=nohdfs         # Disable HDFS support.
        --config=nonccl         # Disable NVIDIA NCCL support.
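The answers in the session above can probably also be supplied through environment variables so that configure.py does not prompt. The sketch below only builds such an environment. The variable names are, to the best of my knowledge, the ones TensorFlow's configure.py reads, but treat them as an assumption and check them against the configure.py in your source tree. The helper configure_env is made up for this example.

```python
import os


def configure_env(python_bin, capabilities, opt_flags="/arch:AVX2"):
    """Build an environment mirroring the answers given in the session above.

    Assumption: these variable names match what configure.py reads;
    verify them in your copy of the source before relying on this.
    """
    env = dict(os.environ)
    env.update({
        "PYTHON_BIN_PATH": python_bin,
        "TF_NEED_ROCM": "0",
        "TF_NEED_CUDA": "1",
        "TF_CUDA_COMPUTE_CAPABILITIES": ",".join(capabilities),
        "CC_OPT_FLAGS": opt_flags,
        "TF_OVERRIDE_EIGEN_STRONG_INLINE": "1",
        "TF_SET_ANDROID_WORKSPACE": "0",
    })
    return env


if __name__ == "__main__":
    env = configure_env(r"C:\Python39\python.exe", ["7.5", "8.6"])
    for name in ("TF_NEED_CUDA", "TF_CUDA_COMPUTE_CAPABILITIES", "CC_OPT_FLAGS"):
        print(name, "=", env[name])
```

The resulting dictionary could be passed as the env argument of subprocess.run when launching configure.py from the source root.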

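This version of TensorFlow fails when compiling for SM 3.5; from what I could find, it seems TensorRT does not support a compute capability that low.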

Code changes

When dependency downloads fail

If warnings like these appear later during the build:

WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/www.sqlite.org/2020/sqlite-amalgamation-3340000.zip failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
WARNING: Download from https://mirror.bazel.build/github.com/aws/aws-sdk-cpp/archive/1.7.336.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/f402e682d0ef5598eeffc9a21a691b03e602ff58.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found

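it means the corresponding Google source mirror is no longer valid. In theory there is a fallback original URL, but for some reason it does not get used. So edit tensorflow\workspace.bzl and swap the mirror URL and the original URL (the mirror URL cannot simply be deleted or commented out, because at least two URLs are required). For example, for the SQLite entry, the original is: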

tf_http_archive(
        name = "org_sqlite",
        build_file = clean_dep("//third_party:sqlite.BUILD"),
        sha256 = "8ff0b79fd9118af7a760f1f6a98cac3e69daed325c8f9f0a581ecb62f797fd64",
        strip_prefix = "sqlite-amalgamation-3340000",
        system_build_file = clean_dep("//third_party/systemlibs:sqlite.BUILD"),
        urls = [
            "https://storage.googleapis.com/mirror.tensorflow.org/www.sqlite.org/2020/sqlite-amalgamation-3340000.zip",
            "https://www.sqlite.org/2020/sqlite-amalgamation-3340000.zip",
        ],
    )

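Change it by swapping the two entries in urls, so that the original www.sqlite.org address comes first and the Google mirror address second; everything else in the block stays the same.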
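The swap itself is mechanical, so it can be sketched in a few lines of Python. This is only an illustration of the reordering, not a tool that edits workspace.bzl; swap_first_two_urls is a hypothetical helper, and the two URLs are the ones from the SQLite entry above.

```python
def swap_first_two_urls(urls):
    """Return a new list with the first two entries exchanged."""
    if len(urls) < 2:
        raise ValueError("need at least two URLs (mirror and original)")
    return [urls[1], urls[0]] + list(urls[2:])


if __name__ == "__main__":
    sqlite_urls = [
        "https://storage.googleapis.com/mirror.tensorflow.org/www.sqlite.org/2020/sqlite-amalgamation-3340000.zip",
        "https://www.sqlite.org/2020/sqlite-amalgamation-3340000.zip",
    ]
    # Print the entries in the order they should appear inside urls = [...].
    for url in swap_first_two_urls(sqlite_urls):
        print('"%s",' % url)
```

The same reordering applies to any other entry whose mirror address returns 404, such as the aws-sdk-cpp and llvm-project archives in the warnings above.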
