Building TensorFlow 2.1.0 from source

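I tried building from source once before because my CPU does not support AVX instructions, but that attempt failed.
Yesterday I picked it up again, and after two days of work the build finally succeeded.
These notes record the errors I hit and the lessons learned, in the hope that others can avoid the same pitfalls.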

System configuration

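Windows 10, CUDA 10.0, cuDNN 7.4, Anaconda with Python 3.7.
I followed the official TensorFlow guide for building from source on Windows.
The build tool is Google's Bazel.
The compiler is Microsoft Visual Studio 2017.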

Downloading the TensorFlow source

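Upstream repository: https://github.com/tensorflow/tensorflow
Yesterday I happened to find that Gitee hosts an official mirror of the TensorFlow repository, and cloning from it is very fast.
Gitee mirrors many other projects as well, so if you need a project's source, check there for a mirror first.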

Downloading Bazel

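Bazel releases are published in its GitHub repository, but downloads from there were too slow for me. Fortunately, the Huawei open-source mirror site carries a mirror.

The official documentation lists which Bazel and Visual Studio versions work together.
Even so, I recommend using the version pinned in the .bazelversion file in the TensorFlow source root.

  • Rename the downloaded Bazel executable to bazel.exe and add its directory to the PATH environment variable.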
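The version check can be sketched in a few lines of Python. This is only an illustration: the helper names `required_bazel_version` and `bazel_version_matches` are made up for this post, and the installed version string is assumed to be something you read off your own Bazel installation.

```python
from pathlib import Path


def required_bazel_version(source_root):
    """Return the version string pinned in <source_root>/.bazelversion."""
    return Path(source_root, ".bazelversion").read_text(encoding="utf-8").strip()


def bazel_version_matches(source_root, installed_version):
    """True if the installed Bazel version equals the pinned one (hypothetical helper)."""
    return required_bazel_version(source_root) == installed_version.strip()
```

Call `bazel_version_matches(r"path\to\tensorflow", "x.y.z")` with the version of the bazel.exe you downloaded; a False result means you should fetch the pinned release instead.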

Installing Visual Studio

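  • For the Visual Studio version, refer to the same compatibility information as above.
    I hit a pitfall here: I first used the 2019 edition, and the build failed with an error saying that a Visual Studio version between 2013 and 2017 is required.
  • Set the BAZEL_VC environment variable to Visual Studio's VC directory. Mine is
    D:\visualStudio2017\Community\VC\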
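Since a wrong or missing BAZEL_VC caused one of the errors listed later, a quick sanity check before building may save time. The sketch below only verifies that the variable is set and names an existing directory; `check_bazel_vc` is a hypothetical helper, not part of Bazel or TensorFlow.

```python
import os


def check_bazel_vc(env):
    """Return a problem description, or None if BAZEL_VC looks usable.

    `env` is any mapping of environment variables, e.g. os.environ.
    """
    value = env.get("BAZEL_VC")
    if not value:
        return "BAZEL_VC is not set"
    if not os.path.isdir(value):
        return "BAZEL_VC points to a missing directory: " + value
    return None


if __name__ == "__main__":
    problem = check_bazel_vc(os.environ)
    print(problem or "BAZEL_VC looks fine")
```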

Layout of the source tree

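The root directory contains .bazelrc, .bazelversion, configure.py, .tf_configure.bazelrc, WORKSPACE and other files.
This root directory is the working directory from which Bazel is run.

  • .bazelrc holds predefined Bazel configuration options.
  • .bazelversion, as the name suggests, specifies the required Bazel version.
  • configure.py must be run to set your Python location, library location, CUDA version and so on.
    I used the Python interpreter and libraries from Anaconda.
  • .tf_configure.bazelrc holds the machine-specific settings written by configure.py.
  • WORKSPACE marks this directory as the root of a Bazel workspace, loosely comparable to how __init__.py marks a Python package.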

One thing to watch for: when you run configure.py, it asks

Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified

The default value is "/arch:AVX".
If your CPU does not support AVX, do not accept the default. Enter any placeholder value instead (I used -Wextra),
then delete the line build:opt --copt=-Wextra from .tf_configure.bazelrc.
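Deleting that line by hand is easy, but if you reconfigure often it can be scripted. A minimal sketch, assuming the line appears exactly as `build:opt --copt=<flag>`; the function name `drop_opt_copt_line` is invented for this example.

```python
def drop_opt_copt_line(bazelrc_text, flag):
    """Remove lines that read exactly 'build:opt --copt=<flag>' from the given text."""
    unwanted = "build:opt --copt=" + flag
    kept = [line for line in bazelrc_text.splitlines() if line.strip() != unwanted]
    return "\n".join(kept) + "\n" if kept else ""


if __name__ == "__main__":
    path = ".tf_configure.bazelrc"  # run from the TensorFlow source root
    with open(path, encoding="utf-8") as f:
        cleaned = drop_opt_copt_line(f.read(), "-Wextra")
    with open(path, "w", encoding="utf-8") as f:
        f.write(cleaned)
```

Other build:opt lines are left untouched, so only the placeholder flag disappears.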

Installing the required pip packages

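Install the pip packages inside Anaconda. The official guide installs only a few of them; I recommend installing all of them.
The packages required by the build are listed in the REQUIRED_PACKAGES variable in "<root>\tensorflow\tools\pip_package\setup.py":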

REQUIRED_PACKAGES = [
    'absl-py >= 0.7.0',
    'astor >= 0.6.0',
    'backports.weakref >= 1.0rc1;python_version<"3.4"',
    'enum34 >= 1.1.6;python_version<"3.4"',
    'gast == 0.2.2',
    'google_pasta >= 0.1.6',
    'keras_applications >= 1.0.8',
    'keras_preprocessing >= 1.1.0',
    'numpy >= 1.16.0, < 2.0',
    'opt_einsum >= 2.3.2',
    'protobuf >= 3.8.0',
    'tensorboard >= 2.1.0, < 2.2.0',
    'tensorflow_estimator >= 2.1.0rc0, < 2.2.0',
    'termcolor >= 1.1.0',
    'wrapt >= 1.11.1',
    # python3 requires wheel 0.26
    'wheel >= 0.26;python_version>="3"',
    'wheel;python_version<"3"',
    # mock comes with unittest.mock for python3, need to install for python2
    'mock >= 2.0.0;python_version<"3"',
    # functools comes with python3, need to install the backport for python2
    'functools32 >= 3.2.3;python_version<"3"',
    'six >= 1.12.0',
    # scipy < 1.4.1 causes segfaults due to pybind11
    # Latest scipy pip for py2 is scipy==1.2.2
    'scipy == 1.4.1;python_version>="3"',
    'scipy == 1.2.2;python_version<"3"',
]

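To install these packages, I suggest using pip install -r request.txt.
First copy the package names and version constraints above into request.txt, removing the leading single quote and the trailing single quote and comma from each line (comment lines can stay). Then run pip once to install everything.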
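Stripping the quotes and commas by hand is tedious, so here is a rough Python sketch that pulls the list out of setup.py and writes it as a requirements file. It assumes REQUIRED_PACKAGES is assigned a plain list of string literals, as in the listing above; `required_packages` and `to_requirements_text` are names invented for this example.

```python
import ast


def required_packages(setup_source):
    """Return the REQUIRED_PACKAGES list literal found in setup.py source text."""
    for node in ast.parse(setup_source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "REQUIRED_PACKAGES":
                    return ast.literal_eval(node.value)
    raise ValueError("REQUIRED_PACKAGES not found")


def to_requirements_text(packages):
    """Join requirement specifiers one per line, as pip install -r expects."""
    return "\n".join(spec.strip() for spec in packages) + "\n"


if __name__ == "__main__":
    with open("tensorflow/tools/pip_package/setup.py", encoding="utf-8") as f:
        specs = required_packages(f.read())
    with open("request.txt", "w", encoding="utf-8") as f:
        f.write(to_requirements_text(specs))
```

The script reads setup.py as text and never executes it, so it does not need TensorFlow's build dependencies to be present.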

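Configuring the build

Run python ./configure.py

Building the pip package

Run the command below from the root directory. As mentioned earlier, the root directory is the one that contains .bazelrc.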

bazel build --config=opt --config=cuda --define=no_tensorflow_py_deps=true //tensorflow/tools/pip_package:build_pip_package

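Generating the wheel

  • After bazel build finishes, it produces an executable at
    <root>\bazel-bin\tensorflow\tools\pip_package\build_pip_package
  • Run it with a target directory as its argument, and it writes the .whl file needed for pip installation into that directory.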
bazel-bin\tensorflow\tools\pip_package\build_pip_package C:/tmp/tensorflow_pkg

The target directory here is C:/tmp/tensorflow_pkg.

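Installing the wheel

Install the wheel generated in the previous step with Anaconda's pip. Substitute the actual file name found in the output directory; under Python 3.7 the tag in the name is cp37-cp37m rather than the cp36-cp36m shown in the sample command.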

pip3 install C:/tmp/tensorflow_pkg/tensorflow-version-cp36-cp36m-win_amd64.whl
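Because the wheel's file name depends on the TensorFlow and Python versions, it can be convenient to look it up rather than type it. A small sketch, assuming the file name starts with "tensorflow-"; `find_wheels` is a hypothetical helper.

```python
from pathlib import Path


def find_wheels(output_dir):
    """Return the TensorFlow wheel files in output_dir, sorted by name."""
    return sorted(Path(output_dir).glob("tensorflow-*.whl"))


if __name__ == "__main__":
    for wheel in find_wheels("C:/tmp/tensorflow_pkg"):
        print("pip install", wheel)
```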

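Errors encountered

  • Error: Caused by: java.nio.file.InvalidPathException: Illegal char <:> at index 72:
    This looked Java-related, so I first installed Java, following Liao Xuefeng's Java tutorial.
    After installing, I set JAVA_HOME to the JDK directory
    and added the JDK's bin directory to PATH.
    The error persisted. I eventually found the fix in a TensorFlow GitHub issue:
    set the BAZEL_VC environment variable to Visual Studio's VC directory. Mine is:
    D:\visualStudio2017\Community\VC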
  • Error: ...\bazel-out\x64_windows-opt\bin\external\local_config_cuda\cuda\cuda\include\crt/host_config.h(143): fatal error C1189: #error: -- unsupported Microsoft Visual Studio version! Only the versions between 2013 and 2017 (inclusive) are supported!
    The Visual Studio version is too new. Do not use 2019; 2017 works.
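  • Error: nvcc error : 'cudafe++' died with status 0xC0000005 (ACCESS_VIOLATION)
    According to a GitHub issue, this is a bug in CUDA 10.0, and the workaround is to replace the 10.0 cudafe++ with the one from 10.1.
    References: a blog post and the GitHub issue on this error.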
  • Error: Executing genrule //tensorflow/python/keras/api:keras_python_api_gen_compat_v1 failed (Exit 1) .....ModuleNotFoundError: No module named 'keras_preprocessing'
    Some required packages are missing, in this case keras_preprocessing.
    As described above, install all the required Python packages in one go.

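Errors after installing TensorFlow

  • Import error: ImportError: cannot import name 'build_info' from 'tensorflow.python.platform'
    The cause was that I started ipython from the root of the TensorFlow source tree, so Python picked up the source directory instead of the installed package. Starting it from any other directory fixed the problem.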
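  • Running tf commands fails with: tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: failed to get device attribute 13 for device 0: CUDA_ERROR_UNKNOWN: unknown error
    Possible causes:
    (1) cudafe++.exe is still the 10.1 build; try switching back to the 10.0 cudafe++.
    (2) The environment variables point to CUDA 10.1; change them to 10.0.
    Check the CUDA-related environment variables and make sure they point to 10.0:
    CUDA_PATH / CUDA_PATH_V10_0 / CUDA_PATH_V10_1 and so on.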
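For the build_info import error in the first item of this list, a quick way to tell whether you are standing in the source tree is to look for its telltale files. This is a heuristic sketch only: `looks_like_tf_source_root` is a made-up name, and the check assumes the source root contains both a tensorflow directory and a WORKSPACE file, as described in the layout section.

```python
import os
from pathlib import Path


def looks_like_tf_source_root(directory):
    """True if `directory` contains a tensorflow/ folder and a WORKSPACE file."""
    base = Path(directory)
    return (base / "tensorflow").is_dir() and (base / "WORKSPACE").is_file()


if __name__ == "__main__":
    if looks_like_tf_source_root(os.getcwd()):
        print("You are in the TensorFlow source root; change directory before importing tensorflow.")
```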