A Failed Attempt to Install the TensorFlow 2.1.0 GPU Build on Windows 10

newstrongers

Categories: System Operations, Programming Basics. Tags: tensorflow, windows

Article link: https://blog.csdn.net/newstrongers/article/details/104516470

Local machine configuration

System: Windows 10 laptop
GPU: GeForce 940MX, computeCapability: 5.0
Python: 3.6
CUDA version: 10.1
cuDNN version: 7.6.5

Official installation instructions

Official hardware and software requirements (GPU build)

The following NVIDIA® software must be installed on your system:

NVIDIA® GPU drivers —CUDA 10.1 requires 418.x or higher.
CUDA® Toolkit —TensorFlow supports CUDA 10.1 (TensorFlow >= 2.1.0)
CUPTI ships with the CUDA Toolkit.
cuDNN SDK (>= 7.6)
(Optional) TensorRT 6.0 to improve latency and throughput for inference on some models.
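
The version constraints in the list above can be checked mechanically. The following is a minimal sketch, assuming the version strings are collected by hand (for example from the installer names). The helper names `parse_version` and `check_requirements` are hypothetical and not part of TensorFlow, and the driver version in the sample is a made-up placeholder. CUDA is compared for an exact 10.1 match on the assumption that the prebuilt binary loads `cudart64_101.dll` specifically, as the log in this post suggests.

```python
# Minimum versions taken from the requirement list above (TensorFlow 2.1.0, GPU).
REQUIREMENTS = {
    "driver": (418,),  # CUDA 10.1 requires 418.x or higher
    "cuda": (10, 1),   # TensorFlow >= 2.1.0 supports CUDA 10.1
    "cudnn": (7, 6),   # cuDNN SDK >= 7.6
}


def parse_version(text):
    """Turn a dotted version string such as '7.6.5' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_requirements(installed):
    """Return a dict mapping each component name to True (ok) or False."""
    results = {}
    for name, minimum in REQUIREMENTS.items():
        version = parse_version(installed[name])
        if name == "cuda":
            # Assumption: the prebuilt binary needs exactly CUDA 10.1.
            results[name] = version[:2] == minimum
        else:
            results[name] = version >= minimum
    return results


# CUDA and cuDNN values are from the machine described in this post;
# the driver version is a hypothetical placeholder.
installed = {"driver": "441.22", "cuda": "10.1", "cudnn": "7.6.5"}
print(check_requirements(installed))
```

Note that every check passes for the configuration in this post, so a version mismatch does not appear to explain the failure described later.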

Mapping between GPU models and CUDA versions
For cuDNN installation instructions, see here; downloading cuDNN requires registering and logging in.

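Installing TensorFlow

Install inside the corresponding conda environment:
pip install tensorflow  # install in $HOME; at the time of writing this installed the GPU-enabled 2.1.0 build by default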
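
Before running any GPU test, it can help to confirm what was actually installed. The sketch below is one possible check, not an official procedure: `describe_tensorflow` is a hypothetical helper name, while `tf.__version__` and `tf.test.is_built_with_cuda()` are existing TensorFlow attributes.

```python
def describe_tensorflow():
    """Return basic facts about the installed TensorFlow, or None if it cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        # Covers both a missing package and a failed DLL load on Windows.
        return None
    return {
        "version": tf.__version__,
        # True when the installed binary was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
    }


print(describe_tensorflow())
```

A result with `built_with_cuda` set to `False` would mean that a CPU-only build is installed, regardless of which CUDA libraries are present on the machine.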

Testing

The official instructions are here.

from __future__ import absolute_import, division, print_function, unicode_literals

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

output:
2020-02-26 13:52:57.152740: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-02-26 13:53:13.930344: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-02-26 13:53:14.241621: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce 940MX computeCapability: 5.0
coreClock: 1.189GHz coreCount: 3 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 37.33GiB/s
2020-02-26 13:53:14.251583: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-02-26 13:53:14.267335: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-02-26 13:53:14.281962: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-02-26 13:53:14.292789: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-02-26 13:53:14.308111: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-02-26 13:53:14.322738: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-02-26 13:53:14.341372: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-02-26 13:53:14.349375: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
Num GPUs Available:  1
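
The log above reports each CUDA library as it is opened, so a quick way to spot a missing one is to compare a log against that list. The following is a rough sketch under that assumption; `missing_libraries` and `EXPECTED` are hypothetical names, and the expected list is simply the set of libraries that appear in this post's log.

```python
import re

# Libraries that the run in this post reported as successfully opened.
EXPECTED = [
    "cudart64_101.dll", "nvcuda.dll", "cublas64_10.dll", "cufft64_10.dll",
    "curand64_10.dll", "cusolver64_10.dll", "cusparse64_10.dll", "cudnn64_7.dll",
]

# Matches the dso_loader messages shown in the output above.
LOADED = re.compile(r"Successfully opened dynamic library (\S+)")


def missing_libraries(log_text, expected=EXPECTED):
    """Return the expected libraries that the log never reports as opened."""
    loaded = set(LOADED.findall(log_text))
    return [name for name in expected if name not in loaded]


# Two lines copied from the log above, used as a deliberately incomplete sample.
sample = "\n".join([
    "2020-02-26 13:52:57.152740: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll",
    "2020-02-26 13:53:13.930344: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll",
])
print(missing_libraries(sample))
```

Applied to the full log in this post, the function would return an empty list, since all eight libraries were opened.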

At this point I thought I was done! After a moment of quiet glee, I went on testing:

a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)

print(c)

output:
2020-02-26 14:00:13.049637: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-02-26 14:00:13.056240: F tensorflow/stream_executor/lib/statusor.cc:34] Attempting to fetch value instead of handling error Internal: failed to get device attribute 13 for device 0: CUDA_ERROR_UNKNOWN: unknown error
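
Each TensorFlow log line carries a one-letter severity after the timestamp: `I` (info), `W` (warning), `E` (error) or `F` (fatal). The AVX2 message above is only informational, while the `F` line is the one that terminates the process. The sketch below separates the two; `parse_line` is a hypothetical helper, and the regular expression is an assumption based on the log lines shown in this post.

```python
import re

# timestamp: LEVEL source_file:line] message
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) (?P<source>\S+)\] (?P<message>.*)$"
)
LEVELS = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}


def parse_line(line):
    """Split one log line into its fields, or return None if it is not a log line."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["level"] = LEVELS[record["level"]]
    return record


# The fatal line from the output above, joined back into a single line.
fatal = (
    "2020-02-26 14:00:13.056240: F tensorflow/stream_executor/lib/statusor.cc:34] "
    "Attempting to fetch value instead of handling error Internal: "
    "failed to get device attribute 13 for device 0: CUDA_ERROR_UNKNOWN: unknown error"
)
record = parse_line(fatal)
print(record["level"], record["source"])
```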

The program crashed...

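After a round of searching online, I still had not solved it. Some people asked how these few lines of code are known to run on the GPU at all. The answer is in the official documentation: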

If a TensorFlow operation has both CPU and GPU implementations, by default the GPU devices will be given priority when the operation is assigned to a device. For example, tf.matmul has both CPU and GPU kernels. On a system with devices CPU:0 and GPU:0, the GPU:0 device will be selected to run tf.matmul unless you explicitly request running it on another device.

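In other words, if an operation has both a CPU and a GPU implementation, the GPU is preferred, and the framework makes this placement automatically.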
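
The quoted passage also mentions explicitly requesting another device. As an illustration of that idea only, the sketch below wraps the matrix product from this post in `tf.device`, which is an existing TensorFlow context manager; `matmul_on` is a hypothetical helper name. This is not claimed to work around the error above, because the GPU may still be initialized when TensorFlow starts up.

```python
def matmul_on(device_name):
    """Run the 2x3 by 3x2 product from this post on the named device."""
    # Imported lazily so the helper can be defined without TensorFlow present.
    import tensorflow as tf

    # Ops created inside this block are placed on the requested device
    # instead of the default choice (GPU first, when one is available).
    with tf.device(device_name):
        a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
        b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
        return tf.matmul(a, b)


# Example usage (requires a working TensorFlow installation):
# print(matmul_on("/CPU:0"))
```

Calling `tf.debugging.set_log_device_placement(True)` before running an operation makes TensorFlow log which device each operation was assigned to, which is one way to confirm where `tf.matmul` actually ran.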

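As for this error, some people say the GPU model is simply not good enough and its performance is too low. If that is really the case, it is a hard limitation and there is little hope, so I switched back to the CPU build.
After installing the CPU build, the code ran. I plan to try again later.
If anyone has solved this, I would like to learn from it. Thanks!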
2020-02-26

Update [2021-01-16]
Based on the explanation of the cause given by exthe in the comments,