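This largely follows my earlier article, "Building TensorFlow 2.4 (RC0) with CUDA 11.1 (Update 1) on Windows (Python 3.9.0, cuDNN 8.0.4, compute capability 3.5 - 8.6, with build results for download)". A few things have changed, so I am reorganizing the steps here.
Environment setup
1. Memory requirements
With 8 parallel jobs (the default job count equals the number of CPU threads), you need at least 10 GB of RAM; otherwise the compiler fails with an out-of-heap-space error.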
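As a rough aid for picking a job count, here is a minimal sketch. It assumes memory use scales linearly from the single data point above (8 jobs need about 10 GB, so roughly 1.25 GB per job); that scaling is my assumption, not a measured figure, and `suggest_jobs` is a hypothetical helper name.

```python
import os

# Assumption: linear scaling from the article's data point
# (8 parallel jobs need at least 10 GB), i.e. about 1.25 GB per job.
GB_PER_JOB = 10 / 8


def suggest_jobs(total_ram_gb, cpu_threads=None):
    """Return a job count that fits in RAM, capped at the CPU thread count."""
    if cpu_threads is None:
        cpu_threads = os.cpu_count() or 1
    by_memory = int(total_ram_gb // GB_PER_JOB)
    # Always allow at least one job, even on very small machines.
    return max(1, min(cpu_threads, by_memory))


if __name__ == "__main__":
    # Example: a machine with 16 GB of RAM and 16 hardware threads.
    print(suggest_jobs(16, cpu_threads=16))
```

The result could then be passed to Bazel through its `--jobs` option when starting the build.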
2. Python & Pip
First, Python needs a few packages: six, numpy, wheel, setuptools, keras_applications and keras_preprocessing. Open a command prompt with administrator privileges and run:
pip install six numpy wheel setuptools
pip install keras_applications --no-deps
pip install keras_preprocessing --no-deps
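To confirm the packages landed in the interpreter that will be used for the build, a small check like the following may help. It is only a sketch: `missing_modules` is a hypothetical helper, and it only tests that each module can be located, not that its version is suitable.

```python
import importlib.util

# Module names taken from the pip commands above.
REQUIRED = ["six", "numpy", "wheel", "setuptools",
            "keras_applications", "keras_preprocessing"]


def missing_modules(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```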
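Note that the Python path must not contain spaces: the default Windows install path C:\Program Files\Python39 causes errors during the build. If Python is installed there, use the mklink command to create a link (not a shortcut) in a directory whose path has no spaces.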
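The sketch below builds the mklink command for this case. The link location `C:\Python39` is only an example I picked, `mklink_command` is a hypothetical helper, and using a directory symbolic link (`/D`) rather than a junction is my choice; the command itself still has to be run from an elevated command prompt.

```python
def mklink_command(python_dir, link_dir):
    """Return the mklink command to run, or None if no link is needed."""
    if " " not in python_dir:
        # The install path is already free of spaces.
        return None
    if " " in link_dir:
        raise ValueError("link directory must not contain spaces")
    # mklink /D <link> <target> creates a directory symbolic link.
    return 'mklink /D "{}" "{}"'.format(link_dir, python_dir)


if __name__ == "__main__":
    # Example paths; adjust to the actual install location.
    print(mklink_command(r"C:\Program Files\Python39", r"C:\Python39"))
```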
3. CUDA
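I chose CUDA 11.2 here. Download it from the CUDA website and install it; there is nothing more to say.
4. Bazel
Next is Bazel. It is simple: a single exe that must be reachable through the Path environment variable. I took a shortcut and dropped it into CUDA's bin directory. The version I chose is 3.7.2.
5. MSYS2
Then install MSYS2; likewise, the msys64\usr\bin directory must be added to Path.
Once it is installed, install a few packages with pacman. The default mirrors are extremely slow, so users in mainland China are advised to switch to the Tsinghua mirror.
Go to the msys64\etc\pacman.d directory and edit the three mirrorlist files, adding one line before all the existing Server lines in each: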
mirrorlist.msys:
Server = https://mirrors.tuna.tsinghua.edu.cn/msys2/msys/$arch
mirrorlist.mingw32:
Server = https://mirrors.tuna.tsinghua.edu.cn/msys2/mingw/i686
mirrorlist.mingw64:
Server = https://mirrors.tuna.tsinghua.edu.cn/msys2/mingw/x86_64
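If you would rather script the mirrorlist change than edit by hand, the following is one possible sketch. `prepend_mirror` is a hypothetical helper that works on the file's text; reading and writing the three files under msys64\etc\pacman.d is left to the caller.

```python
def prepend_mirror(mirrorlist_text, server_url):
    """Insert 'Server = <url>' before the first existing Server line."""
    new_line = "Server = " + server_url
    lines = mirrorlist_text.splitlines()
    # Do nothing if the mirror is already listed (safe to run twice).
    if new_line in (line.strip() for line in lines):
        return mirrorlist_text
    for index, line in enumerate(lines):
        if line.strip().startswith("Server"):
            lines.insert(index, new_line)
            break
    else:
        # No Server line found: append the mirror at the end.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "## Primary\nServer = https://example.org/msys/$arch\n"
    url = "https://mirrors.tuna.tsinghua.edu.cn/msys2/msys/$arch"
    print(prepend_mirror(sample, url), end="")
```

Because pacman tries servers in the order they are listed, putting the new line first makes it the preferred mirror.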
The official guide leaves out the zip package, so the install command is:
pacman -S git patch unzip zip
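Since both Bazel and the MSYS2 tools are found through Path, a quick check from a fresh command prompt can catch a missing entry before the build starts. This is a minimal sketch; `missing_tools` is a hypothetical helper and the tool list simply mirrors the steps above.

```python
import shutil

# Executables the steps above are expected to put on Path.
TOOLS = ["bazel", "git", "patch", "unzip", "zip"]


def missing_tools(names):
    """Return the tool names that cannot be resolved through Path."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(TOOLS)
    if missing:
        print("Not found on Path:", ", ".join(missing))
    else:
        print("All tools found on Path.")
```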
6. Visual Studio 2019
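Then comes VS. Download the VS installer and, to avoid trouble, install to the default path on drive C (this time I did not try a non-C path, so I do not know whether the compiler-not-found bug is still there). If you do not otherwise use VS, then besides the mandatory components you only need MSVC v142 - VS 2019 C++ x64/x86 build tools (any version; I chose the latest) and the Windows 10 SDK (again any version; I chose the latest).
Build
Configuring the build
Download the TensorFlow 2.4.1 source, go to the root of the extracted directory, and run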
D:\tensorflow-2.4.1>python configure.py
You have bazel 3.7.2 installed.
Please specify the location of python. [Default is C:\Python39\python.exe]:
Found possible Python library paths:
C:\Python39\lib\site-packages
Please input the desired Python library path to use. Default is [C:\Python39\lib\site-packages]
Do you wish to build TensorFlow with ROCm support? [y/N]: N
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Found CUDA 11.1 in:
D:/CUDA/lib/x64
D:/CUDA/include
Found cuDNN 8 in:
D:/CUDA/lib/x64
D:/CUDA/include
Please specify a list of comma-separated CUDA compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. Each capability can be specified as "x.y" or "compute_xy" to include both virtual and binary GPU code, or as "sm_xy" to only include the binary code.
Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 3.5,7.0]: 3.5,3.7,5.0,5.2,6.0,6.1,7.0,7.5,8.0,8.6
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is /arch:AVX]: /arch:AVX2
Would you like to override eigen strong inline for some C++ compilation to reduce the compilation time? [Y/n]: Y
Eigen strong inline overridden.
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
--config=mkl # Build with MKL support.
--config=mkl_aarch64 # Build with oneDNN support for Aarch64.
--config=monolithic # Config for mostly static monolithic build.
--config=ngraph # Build with Intel nGraph support.
--config=numa # Build with NUMA support.
--config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
--config=v2 # Build TensorFlow 2.x instead of 1.x.
Preconfigured Bazel build configs to DISABLE default on features:
--config=noaws # Disable AWS S3 filesystem support.
--config=nogcp # Disable GCP support.
--config=nohdfs # Disable HDFS support.
--config=nonccl # Disable NVIDIA NCCL support.
Building for SM 3.5 fails with this version of TensorFlow; from a quick search, it seems TensorRT does not support such a low version.
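If SM 3.5 has to be left out, the capability list for the prompt can be derived from the full one. The sketch below is illustrative only; `drop_capabilities` is a hypothetical helper, and which capabilities to drop is something to decide from your own build errors.

```python
def drop_capabilities(capabilities, unsupported):
    """Remove unsupported entries from a comma-separated capability list."""
    kept = []
    for item in capabilities.split(","):
        item = item.strip()
        if item and item not in unsupported:
            kept.append(item)
    return ",".join(kept)


if __name__ == "__main__":
    full = "3.5,3.7,5.0,5.2,6.0,6.1,7.0,7.5,8.0,8.6"
    # The result can be typed at the compute capabilities prompt.
    print(drop_capabilities(full, {"3.5"}))
```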
Code changes
When dependency downloads fail
If warnings like the following appear later during the build:
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/www.sqlite.org/2020/sqlite-amalgamation-3340000.zip failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
WARNING: Download from https://mirror.bazel.build/github.com/aws/aws-sdk-cpp/archive/1.7.336.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/f402e682d0ef5598eeffc9a21a691b03e602ff58.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
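this means the corresponding Google source mirror is no longer valid. In theory there is a fallback original URL, but for some reason it is not used. The fix is to edit tensorflow\workspace.bzl and swap the mirror URL and the original URL (you cannot simply delete or comment out the mirror URL, because at least two URLs are required). For example, for the SQLite URL, the original entry is: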
tf_http_archive(
name = "org_sqlite",
build_file = clean_dep("//third_party:sqlite.BUILD"),
sha256 = "8ff0b79fd9118af7a760f1f6a98cac3e69daed325c8f9f0a581ecb62f797fd64",
strip_prefix = "sqlite-amalgamation-3340000",
system_build_file = clean_dep("//third_party/systemlibs:sqlite.BUILD"),
urls = [
"https://storage.googleapis.com/mirror.tensorflow.org/www.sqlite.org/2020/sqlite-amalgamation-3340000.zip",
"https://www.sqlite.org/2020/sqlite-amalgamation-3340000.zip",
],
)
Change it to: