Installing and configuring tensorflow-gpu 2.6, CUDA, and cuDNN on Windows: some problems and how to fix them

Preface

The versions of tensorflow-gpu, CUDA, and cuDNN must match one another!

For the tested version combinations, see: Build from source on Windows  |  TensorFlow (google.cn)
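The version-matching rule can be sketched as a small lookup. The rows below are a partial copy of the tested GPU configurations table on the TensorFlow page referenced above; the names TESTED_CONFIGS, major_minor, and check_versions are made up for this illustration and are not part of any library.

```python
# Partial copy of the tested Windows GPU configurations from the
# TensorFlow "Build from source on Windows" page (illustrative subset).
TESTED_CONFIGS = {
    "2.6": {"cuda": "11.2", "cudnn": "8.1"},
    "2.5": {"cuda": "11.2", "cudnn": "8.1"},
    "2.4": {"cuda": "11.0", "cudnn": "8.0"},
}


def major_minor(version):
    """Reduce a version such as '8.1.0.77' to its 'major.minor' prefix."""
    return ".".join(version.split(".")[:2])


def check_versions(tf_version, cuda_version, cudnn_version):
    """Return a list of mismatch messages; an empty list means the trio matches."""
    expected = TESTED_CONFIGS.get(major_minor(tf_version))
    if expected is None:
        return [f"no table entry for TensorFlow {tf_version}"]
    problems = []
    if major_minor(cuda_version) != expected["cuda"]:
        problems.append(f"CUDA {cuda_version} != expected {expected['cuda']}")
    if major_minor(cudnn_version) != expected["cudnn"]:
        problems.append(f"cuDNN {cudnn_version} != expected {expected['cudnn']}")
    return problems


# The combination used in this post
print(check_versions("2.6.0", "11.2.0", "8.1.0.77"))
```

Only the major.minor part is compared, since the table lists versions at that granularity.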

1. Environment

My setup: Windows 11 (Home, x64) + tensorflow-gpu 2.6.0 + cudatoolkit 11.2.0 + cudnn 8.1.0.77 + Anaconda3 (2024.06_1) + PyCharm 2023.3.2 + RTX 4050 GPU

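2. Installation steps

1. Installation order

  1. Install PyCharm
  2. Install Anaconda3
  3. Open Anaconda Prompt as administrator; the packages in the following steps are installed from it with conda or pip
  4. Install cudatoolkit with conda
  5. Install cudnn with conda
  6. Install tensorflow-gpu with pip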
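As a rough sketch of steps 4 to 6, the snippet below only assembles and prints the commands (a dry run); it does not execute them. The conda-forge channel and the -y flag are my assumptions, since this post does not spell out the exact commands, and build_install_commands is a hypothetical helper name. The printed commands would be run inside the activated virtual environment.

```python
def build_install_commands(cuda="11.2.0", cudnn="8.1.0.77", tf="2.6.0",
                           channel="conda-forge"):
    """Return the install commands for steps 4-6, in order, as argument lists.

    The channel is an assumption; use whichever channel provides these
    package versions on your machine.
    """
    return [
        ["conda", "install", "-y", "-c", channel, f"cudatoolkit={cuda}"],
        ["conda", "install", "-y", "-c", channel, f"cudnn={cudnn}"],
        ["pip", "install", f"tensorflow-gpu=={tf}"],
    ]


# Dry run: print the commands instead of executing them
for command in build_install_commands():
    print(" ".join(command))
```

Keeping the versions as parameters makes it easy to swap in a different matching trio.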

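2. Reference links

Installing in the order above completes the environment setup. The blog posts I followed are:

Anaconda3 installation: 1. Detailed Anaconda installation tutorial for Windows (CSDN blog)

                        2. Still don't understand what Anaconda is? This one article is enough (CSDN blog) (supplementary reference)

Steps 3 to 6: Install Tensorflow-gpu 2.6.0 in ten minutes with local CUDA 12, plus reconciling numpy and matplotlib package versions (CSDN blog)

Notes:

1. For "changing the default conda virtual environment path" in the first Anaconda3 reference, method 1 is what took effect for me (method 2 also works) (screenshot omitted).

2. When configuring tensorflow-gpu, CUDA, and cuDNN by following the referenced post, the package versions shown by the final conda list command will differ slightly from the ones the author shows. There is no need to rush to change them. My conda list output after installation was shown in a screenshot (omitted).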

3. Testing and verification

1. TensorFlow-gpu test

The test code is as follows:

import tensorflow as tf

# tensorflow-gpu-2.6.0 test

# Installed TensorFlow version
print(tf.__version__)
# Name of the first GPU device, or an empty string if none is found
print(tf.test.gpu_device_name())
# Query the physical GPU and CPU devices
print('GPU:', tf.config.list_physical_devices('GPU'))
print('CPU:', tf.config.list_physical_devices(device_type='CPU'))
# Deprecated but still present in 2.6; True if a GPU can be used
print(tf.test.is_gpu_available())
# Print the number of available GPUs
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

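Running it produced errors such as "Could not load dynamic library 'cudart64_110.dll'; dlerror: cudart64_110.dll", that is, failures to load dynamic libraries.

Most of what I found online said the same thing: whichever DLL is missing, download it and drop it into C:\Windows\System. That treats the symptom, not the cause. My tensorflow-gpu, CUDA, and cuDNN versions all follow the recommendations on the TensorFlow site, so the versions should not be the problem. I therefore first checked whether the missing DLL files existed inside my conda virtual environment, which I created at D:\ProgramData\Data\Anaconda_envs\envs\tf-gpu-2.6\Library\bin (screenshot omitted).

Every DLL named in the errors was there, so I suspected the system simply could not find that directory. I opened the system environment variables and looked at Path: sure enough, the directory had not been added. After adding it and rerunning the test code, the errors were gone (result screenshot omitted).

With that result the test passes, which means the TensorFlow-gpu environment is, as a first step, configured successfully. Congratulations!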
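The PATH check described above can also be done from Python rather than by reading the environment-variable dialog. This is a minimal sketch using only the standard library; dirs_containing is a hypothetical helper name, and the DLL name is the one from the error message.

```python
import os


def dirs_containing(filename, path_value):
    """Return the PATH entries, in order, that contain a file named `filename`."""
    hits = []
    for entry in path_value.split(os.pathsep):
        entry = entry.strip('"')  # PATH entries are sometimes quoted on Windows
        if entry and os.path.isfile(os.path.join(entry, filename)):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    found = dirs_containing("cudart64_110.dll", os.environ.get("PATH", ""))
    if found:
        print("found in:", found)
    else:
        print("cudart64_110.dll is not in any PATH directory")
```

If the list is empty while the file does exist under the environment's Library\bin folder, the directory is missing from PATH, which is the situation described in this section.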

2. CUDA acceleration test

The test code is as follows:

import cv2
import tensorflow as tf
from mtcnn import MTCNN

config = tf.compat.v1.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.6
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)

detector = MTCNN()

img = cv2.imread("Lena3.png")

output = detector.detect_faces(img)

print(output)

This produced the output below. The main problems are:

Error 1. Call to CreateProcess failed. Error code: 2

Warning 2. Couldn't get ptxas version string: Internal: Couldn't invoke ptxas.exe --version

Warning 3. Internal: Failed to launch ptxas

D:\ProgramData\Data\Anaconda_envs\envs\tf-gpu-2.6\python.exe D:\Codes\Python\tf_gpu_test\cuda_test.py 
2024-08-29 17:45:39.732544: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-29 17:45:40.595161: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3684 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
2024-08-29 17:45:40.618800: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3684 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
2024-08-29 17:45:40.827761: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2024-08-29 17:45:41.754506: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
2024-08-29 17:45:42.639977: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2024-08-29 17:45:42.641084: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2024-08-29 17:45:42.641227: W tensorflow/stream_executor/gpu/asm_compiler.cc:77] Couldn't get ptxas version string: Internal: Couldn't invoke ptxas.exe --version
2024-08-29 17:45:42.648187: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2024-08-29 17:45:42.648554: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Failed to launch ptxas
Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.
2024-08-29 17:45:45.448664: I tensorflow/stream_executor/cuda/cuda_blas.cc:1760] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
[{'box': [202, 182, 158, 210], 'confidence': 0.9971387386322021, 'keypoints': {'left_eye': (267, 265), 'right_eye': (336, 267), 'nose': (316, 317), 'mouth_left': (265, 348), 'mouth_right': (322, 351)}}]

Process finished with exit code 0
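To pick the errors and warnings out of a long log like the one above, a small filter helps. This is a sketch that assumes the line layout seen in this log (timestamp, a one-letter level I/W/E/F, source location, then the message); log_problems is a hypothetical name, and the sample lines are copied from the output above.

```python
import re

# Matches lines such as:
# 2024-08-29 17:45:42.639977: E tensorflow/.../subprocess.cc:287] message
LOG_LINE = re.compile(
    r"^\S+ \S+: (?P<level>[IWEF]) (?P<source>\S+)\] (?P<message>.*)$"
)


def log_problems(log_text):
    """Return unique (level, message) pairs for W and E lines, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group("level") in ("W", "E"):
            item = (match.group("level"), match.group("message"))
            if item not in seen:
                seen.append(item)
    return seen


sample = """\
2024-08-29 17:45:41.754506: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
2024-08-29 17:45:42.639977: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2024-08-29 17:45:42.641084: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2024-08-29 17:45:42.641227: W tensorflow/stream_executor/gpu/asm_compiler.cc:77] Couldn't get ptxas version string: Internal: Couldn't invoke ptxas.exe --version
"""

for level, message in log_problems(sample):
    print(level, message)
```

Repeated messages are collapsed, so the three identical CreateProcess errors in the full log show up once.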

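For error 1, Call to CreateProcess failed, method 1 from the blogger I consulted is to run conda install -c nvidia cuda-nvcc so that ptxas is present in the conda environment. (My cudatoolkit was installed inside the conda virtual environment rather than from the installer on the official site, so method 2 does not apply to me.) The full command is: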

conda install -c nvidia cuda-nvcc

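I tried it and still got errors, identical to the ones above, with no improvement at all. (Don't rush to uninstall cuda-nvcc.)

Searching the computer for ptxas.exe showed that it was sitting right inside my conda virtual environment (screenshot omitted).

So, the same fix as before: add that directory to Path in the system environment variables. After closing PyCharm, reopening it, and running the test code again, all of the errors and warnings above were gone. Nice! The correct output is as follows: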

D:\ProgramData\Data\Anaconda_envs\envs\tf-gpu-2.6\python.exe D:\Codes\Python\tf_gpu_test\cuda_test.py 
2024-08-29 19:11:11.700032: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-29 19:11:14.105552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3684 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
2024-08-29 19:11:14.129957: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3684 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9
2024-08-29 19:11:14.347828: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2024-08-29 19:11:15.332231: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
2024-08-29 19:11:21.279191: I tensorflow/stream_executor/cuda/cuda_blas.cc:1760] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
[{'box': [202, 182, 158, 210], 'confidence': 0.9971387386322021, 'keypoints': {'left_eye': (267, 265), 'right_eye': (336, 267), 'nose': (316, 317), 'mouth_left': (265, 348), 'mouth_right': (322, 351)}}]

Process finished with exit code 0
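As an alternative to editing the system-wide Path, the directory can be prepended for the current process only, before TensorFlow is imported. I have not verified this on the setup in this post, so treat it as a sketch: prepend_to_path is a hypothetical helper, and ENV_BIN is the Library\bin folder mentioned earlier, which may not be where ptxas.exe lives on your machine.

```python
import os


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH in `env` unless it is already listed."""
    env = os.environ if env is None else env
    entries = [e for e in env.get("PATH", "").split(os.pathsep) if e]
    normalized = {os.path.normcase(os.path.normpath(e)) for e in entries}
    if os.path.normcase(os.path.normpath(directory)) not in normalized:
        entries.insert(0, directory)
    env["PATH"] = os.pathsep.join(entries)
    return env["PATH"]


# Assumed location; replace with the folder where ptxas.exe or the DLLs were found.
ENV_BIN = r"D:\ProgramData\Data\Anaconda_envs\envs\tf-gpu-2.6\Library\bin"

if __name__ == "__main__":
    # Demonstrated on a copy so the real environment is left untouched.
    demo_env = dict(os.environ)
    prepend_to_path(ENV_BIN, demo_env)
    print(demo_env["PATH"].startswith(ENV_BIN))
```

In a real script the call would be prepend_to_path(ENV_BIN) with no second argument, placed above import tensorflow. The system-wide change used in this post has the advantage of also covering tools started outside Python.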

3. Face detection code test

Next, a test that runs a piece of face detection code:

import cv2
import tensorflow as tf
from mtcnn import MTCNN

config = tf.compat.v1.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.6
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)

detector = MTCNN()
img = cv2.imread('Lena3.png')
output = detector.detect_faces(img)
print(output)

x,y,width,height = output[0]['box']
left_eye_X,left_eye_Y = output[0]['keypoints']['left_eye']
right_eye_X,right_eye_Y = output[0]['keypoints']['right_eye']
nose_X,nose_Y = output[0]['keypoints']['nose']
mouth_left_X,mouth_left_Y = output[0]['keypoints']['mouth_left']
mouth_right_X,mouth_right_Y = output[0]['keypoints']['mouth_right']

# OpenCV orders the three color channels as BGR (Blue, Green, Red)
cv2.rectangle(img,pt1=(x,y),pt2=(x+width,y+height),color=(0,255,0),thickness=2)
cv2.circle(img,center=(left_eye_X,left_eye_Y),color=(0,0,255),thickness=2,radius=1)
cv2.circle(img,center=(right_eye_X,right_eye_Y),color=(0,0,255),thickness=2,radius=1)
cv2.circle(img,center=(nose_X,nose_Y),color=(255,0,0),thickness=2,radius=1)
cv2.circle(img,center=(mouth_left_X,mouth_left_Y),color=(255,0,0),thickness=2,radius=1)
cv2.circle(img,center=(mouth_right_X,mouth_right_Y),color=(255,0,0),thickness=2,radius=1)
cv2.imshow('result',img)

cv2.waitKey(0)

The result is the input image with the face box and the five keypoints drawn on it (screenshot omitted).
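The script above indexes output[0] directly, which fails if no face is detected. The sketch below shows a guard for that case and the box-to-corner arithmetic used by cv2.rectangle. It assumes detect_faces returns an empty list when nothing is found; first_face and box_to_corners are hypothetical helper names, and the sample data is the detection printed in the log earlier.

```python
def first_face(detections):
    """Return the first detection, or None when the list is empty."""
    return detections[0] if detections else None


def box_to_corners(box):
    """Convert an [x, y, width, height] box to (top-left, bottom-right) points."""
    x, y, width, height = box
    return (x, y), (x + width, y + height)


# Detection copied from the log output earlier in this post
sample = [{
    'box': [202, 182, 158, 210],
    'confidence': 0.9971387386322021,
    'keypoints': {'left_eye': (267, 265), 'right_eye': (336, 267),
                  'nose': (316, 317), 'mouth_left': (265, 348),
                  'mouth_right': (322, 351)},
}]

face = first_face(sample)
if face is None:
    print("no face detected")
else:
    pt1, pt2 = box_to_corners(face['box'])
    print(pt1, pt2)  # (202, 182) (360, 392)
```

The two points can be passed straight to cv2.rectangle as pt1 and pt2.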
