Upgrading to tensorflow-gpu 2.0

DLANDML

Category: Environment Setup

Article link: https://blog.csdn.net/l641208111/article/details/108529071

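Motivation:

I recently downloaded a deep learning example whose code targets TF 2.0. Since I had only used TF 1.x until now, I needed to set up a separate 2.0 environment. The code structure of 2.0 differs considerably from 1.x: many steps have been streamlined, which makes model training more convenient and efficient.

Here are the small pitfalls I ran into during the upgrade:

1. First, create a virtual environment with conda create -n tf20 and activate it with conda activate tf20. Without a package argument conda creates an empty environment, so appending python=3.6 (the interpreter version shown in the logs below) gives the environment its own Python. The session logs later in this post were recorded in an environment named tensorflow; use whichever name you chose.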
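
To confirm that a Python session is really running inside the new environment, a small check like the one below may help. It is only a sketch: it relies on the CONDA_DEFAULT_ENV variable that conda sets on activation, and the helper name active_conda_env is made up for this example.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if there is none."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


# Show which environment and which interpreter this session is using.
print("conda env  :", active_conda_env())
print("interpreter:", sys.executable)
```

If the printed interpreter path does not point into the environment's directory, packages installed afterwards will end up somewhere else.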

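2. Installing the local GPU stack

Graphics driver:

A machine with an NVIDIA GPU normally already has a graphics driver installed, so the driver does not need to be installed again as long as it is recent enough for the CUDA version you plan to use.

CUDA:

NVIDIA driver versions and CUDA versions are not tied one-to-one.

CUDA is just a toolkit,

and several different CUDA versions can be installed on top of the same driver.

cuDNN:

cuDNN is an SDK, a library dedicated to accelerating neural networks.

cuDNN and CUDA are not tied one-to-one either:

a single CUDA version can be paired with several cuDNN versions, although each cuDNN download is built for a specific CUDA version.

This machine has a GTX 1050 Ti and runs Windows 10; the versions used here are CUDA 10.0 and cuDNN 7.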
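
Which CUDA and cuDNN versions to pick depends on the TensorFlow release. The sketch below encodes a few rows of the "tested build configurations" table from the TensorFlow install guide as a lookup. The table contents are copied from that guide as I recall them, only three rows are included, and the function name required_cuda_cudnn is hypothetical; consult the guide for anything not listed.

```python
# A few GPU build configurations from the TensorFlow install guide:
# TensorFlow major.minor -> (CUDA version, cuDNN version the build was tested with).
TESTED_CONFIGS = {
    "1.15": ("10.0", "7.4"),
    "2.0": ("10.0", "7.4"),
    "2.1": ("10.1", "7.6"),
}


def required_cuda_cudnn(tf_version):
    """Map a version string such as '2.0.0rc1' to (CUDA, cuDNN), or None if unknown."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return TESTED_CONFIGS.get(major_minor)


print(required_cuda_cudnn("2.0.0rc1"))
```

The cuDNN entry is the version the official build was tested with; this post uses the later 7.6.5 build for CUDA 10.0, which worked on this machine.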

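Method 1: install from the command line (not tried):

conda install cudatoolkit=10.0 cudnn=7

Method 2: local install

Installing CUDA:

Download CUDA 10.0.

I recommend the local installer (it downloads and installs faster), and downloading before noon, when the download was faster for me.

Run the exe; the default extraction path is c:\users\xx\AppData\Local\Temp\CUDA

Choose the Express installation.

If the installer asks for Visual Studio, you can skip it (I was not prompted on this machine, possibly because VS2015 was already installed).

To check that the installation succeeded, run in cmd:

nvcc -V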

C:\Users\wym>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:04_Central_Daylight_Time_2018
Cuda compilation tools, release 10.0, V10.0.130
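
The version check above can also be scripted. The following is a minimal sketch that pulls the release number out of the nvcc -V output with a regular expression; the function names are invented for this example, and the pattern assumes the "release X.Y" wording shown in the output above.

```python
import re
import subprocess


def parse_nvcc_release(text):
    """Extract the toolkit release (for example '10.0') from `nvcc -V` output."""
    match = re.search(r"release\s+(\d+\.\d+)", text)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run `nvcc -V`; return the release string, or None if nvcc is not on PATH."""
    try:
        result = subprocess.run(["nvcc", "-V"], stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, universal_newlines=True)
    except FileNotFoundError:
        return None
    return parse_nvcc_release(result.stdout)


print("CUDA toolkit release:", installed_cuda_release())
```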

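nvidia-smi (the "CUDA Version" field it reports is the highest CUDA version the driver supports, not the version of the installed toolkit)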

C:\Users\wym>nvidia-smi
Fri Sep 11 11:08:25 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 451.82       Driver Version: 451.82       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105... WDDM  | 00000000:01:00.0  On |                  N/A |
| 29%   33C    P8    N/A /  75W |    548MiB /  4096MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1360    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      9704    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A     10148    C+G   ...es.TextInput.InputApp.exe    N/A      |
|    0   N/A  N/A     12432    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     12852    C+G   ...w5n1h2txyewy\SearchUI.exe    N/A      |
|    0   N/A  N/A     14360    C+G   ...ekyb3d8bbwe\YourPhone.exe    N/A      |
|    0   N/A  N/A     14444    C+G   ...zf8qxf38zg5c\SkypeApp.exe    N/A      |
|    0   N/A  N/A     14860    C+G   ...cw5n1h2txyewy\LockApp.exe    N/A      |
|    0   N/A  N/A     18736    C+G   ...se6\Application\360se.exe    N/A      |
+-----------------------------------------------------------------------------+
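
The header of the nvidia-smi output carries the driver version and the highest CUDA version that driver supports (11.0 here, even though the installed toolkit is 10.0). As a rough sketch, the header can be parsed and compared against the toolkit version. The helper names are hypothetical, and the "toolkit must not be newer than the driver's CUDA version" rule is a rule of thumb rather than a full statement of NVIDIA's compatibility policy.

```python
import re


def parse_smi_header(text):
    """Return the driver version and maximum supported CUDA version from nvidia-smi output."""
    match = re.search(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text)
    if match is None:
        return None
    return {"driver": match.group(1), "max_cuda": match.group(2)}


def version_tuple(version):
    """Turn '10.0' into (10, 0) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def driver_supports(toolkit, max_cuda):
    """Rule of thumb: the toolkit should not be newer than what the driver reports."""
    return version_tuple(toolkit) <= version_tuple(max_cuda)


header = "| NVIDIA-SMI 451.82       Driver Version: 451.82       CUDA Version: 11.0     |"
info = parse_smi_header(header)
print(info, driver_supports("10.0", info["max_cuda"]))
```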

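Installing cuDNN:

Download cuDNN v7.6.5.

Unzip it; this produces a cuda directory.

Create c:\tools and copy the cuda directory into it, giving c:\tools\cuda

Under cuda\bin there is a dynamic link library named cudnn64_7.dll. This DLL is the core of cuDNN, so its directory has to be added to the PATH environment variable so that it can be found.

Add the environment variable: System Properties > Environment Variables > Path: c:\tools\cuda\bin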
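
After editing PATH (and opening a new cmd window so that the change is picked up), it can be worth checking that the DLL is actually reachable. The sketch below walks the PATH entries and reports every directory containing a given file; find_on_path is a name made up for this example.

```python
import os


def find_on_path(filename, search_path=None):
    """Return every directory on the search path that contains `filename`."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries, which appear when PATH has stray separators.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


print(find_on_path("cudnn64_7.dll"))
```

An empty list means the directory was not added to PATH for this session. More than one hit means several copies are installed, and the first one listed is the one that will normally be found first.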

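3. Installing the required libraries:

Installing tensorflow-gpu:

Method 1: pip install tensorflow-gpu==2.0.0rc1 -i https://pypi.tuna.tsinghua.edu.cn/simple

Method 2: if the project ships a requirements.txt file, you can run: pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple --user
The --user flag is added to avoid permission errors.

pip install --default-timeout=100 --ignore-installed --upgrade tensorflow-gpu==2.0.0rc1 -i https://pypi.tuna.tsinghua.edu.cn/simple (enter it as is; the Tsinghua mirror makes the download faster, --ignore-installed works around errors caused by an already installed version, and --default-timeout works around timeout errors)

Installing Keras: if the examples in the project use Keras, the Keras version has to match the TF version. For TF 2.0:

pip install keras==2.3.1 -i https://pypi.tuna.tsinghua.edu.cn/simple --user

Installing OpenCV: no version needs to be specified; pip installs a suitable build by default.

pip install opencv-python -i https://pypi.tuna.tsinghua.edu.cn/simple --user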
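
Once the installs have finished, the pinned versions can be compared with what pip actually reports. This is a minimal sketch with invented helper names: it handles only plain name==version lines and lowercases package names, without applying pip's full name normalisation.

```python
import subprocess
import sys


def parse_pins(text):
    """Parse 'name==version' lines (requirements or pip freeze output) into a dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(wanted, installed):
    """Return {name: (wanted, installed_or_None)} for every pin that is not met."""
    return {name: (version, installed.get(name))
            for name, version in wanted.items()
            if installed.get(name) != version}


def pip_freeze():
    """Return the output of `pip freeze` for the interpreter running this script."""
    return subprocess.run([sys.executable, "-m", "pip", "freeze"],
                          stdout=subprocess.PIPE, universal_newlines=True).stdout
```

For example, find_mismatches(parse_pins("tensorflow-gpu==2.0.0rc1\nkeras==2.3.1"), parse_pins(pip_freeze())) returns an empty dict when both pins are satisfied.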

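Verifying the installation:

Activate the environment (activate tensorflow in the session below), start python, and import tensorflow.

If the following message appears: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll

then the CUDA runtime was found and its version is the one this TensorFlow build expects.

Test commands: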

import tensorflow as tf
a=tf.constant([10])
print(a)

Output of a successful installation:

C:\Users\wym>activate tensorflow

(tensorflow) C:\Users\wym>python
Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-09-11 11:11:24.792867: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
>>> a=tf.constant([10])
2020-09-11 11:13:01.846228: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-09-11 11:13:01.940688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
2020-09-11 11:13:01.958516: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-09-11 11:13:01.980137: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-11 11:13:02.178945: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-09-11 11:13:02.243931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
2020-09-11 11:13:02.261308: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-09-11 11:13:02.267009: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-11 11:13:11.202419: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-11 11:13:11.213008: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2020-09-11 11:13:11.215213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2020-09-11 11:13:11.308757: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2996 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
>>> print(a)
tf.Tensor([10], shape=(1,), dtype=int32)
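
Instead of reading through the log lines, TensorFlow can be asked directly which GPUs it sees. The sketch below uses tf.config.experimental.list_physical_devices, which is the spelling available in TF 2.0; the wrapper function visible_gpus is hypothetical and returns None when TensorFlow cannot be imported.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    devices = tf.config.experimental.list_physical_devices("GPU")
    return [device.name for device in devices]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow could not be imported in this environment")
elif not gpus:
    print("TensorFlow is installed but sees no GPU")
else:
    print("GPUs visible to TensorFlow:", gpus)
```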

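4. Common problems:

Running the code fails with: Non-OK-status: CudaLaunchKernel(FillPhiloxRandomKernelLaunch, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: out of memory

Solution: in my case this error did not mean that memory had run out; the CUDA version was wrong. Installing the correct CUDA version resolved it.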
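
The fix described above was to install the matching CUDA version. Separately, if you want to rule out genuine memory pressure on a small card, one option is to let TensorFlow allocate GPU memory on demand instead of reserving most of it at start-up. The following is a sketch of that idea, not something the original setup used; enable_memory_growth is a made-up wrapper, and it has to run before any operation touches the GPU.

```python
def enable_memory_growth():
    """Enable on-demand GPU memory allocation; return the number of GPUs configured.

    Returns None when TensorFlow cannot be imported.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # Must be set before the GPU is initialised, otherwise TensorFlow raises an error.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


print("GPUs configured for memory growth:", enable_memory_growth())
```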

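5. Other problems:

The code uses the pycocotools module, which does not support a direct local install on Windows:

Method 1:

Get the source: git clone https://github.com/pdollar/coco.git

git is readily available on Linux but not on Windows. You could install Git for Windows, but I do not recommend it, because it led to path problems later that were hard to resolve.

Instead, download the source directly from https://github.com/pdollar/coco.git

Unzip it,

activate the virtual environment,

change into coco/PythonAPI,

and run: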

pip install -U cython -i https://pypi.tuna.tsinghua.edu.cn/simple
# install pycocotools locally
python setup.py build_ext --inplace
python setup.py build_ext install
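
After the build finishes, a quick way to see whether the module landed in the active environment is to ask the interpreter whether it can locate it. This is a small sketch using the standard library; is_importable is a name chosen for this example.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


print("pycocotools importable:", is_importable("pycocotools"))
```

Run it from a directory other than coco/PythonAPI, otherwise the local source folder is found instead of the installed package.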

Output of the code after the upgrade:

D:\anaconda\envs\tensorflow\python.exe C:/Users/wym/Desktop/cjr/Centernet-Tensorflow2.0/TF2-CenterNet/ctdet_image.py
2020-09-11 11:32:24.971183: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-09-11 11:32:37.660610: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-09-11 11:32:37.696192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
2020-09-11 11:32:37.696390: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-09-11 11:32:37.696559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-11 11:32:37.696905: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-09-11 11:32:37.699315: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
2020-09-11 11:32:37.699564: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-09-11 11:32:37.700104: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-11 11:32:38.330225: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-11 11:32:38.330366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-11 11:32:38.330446: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-11 11:32:38.330655: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2996 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
  0%|          | 0/1 [00:00<?, ?it/s]2020-09-11 11:33:01.627843: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-09-11 11:33:02.992058: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
2020-09-11 11:33:05.944135: W tensorflow/core/common_runtime/bfc_allocator.cc:239] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.12GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-11 11:33:06.362537: W tensorflow/core/common_runtime/bfc_allocator.cc:239] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.59GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
Image saved to: output\ctdet.demo.jpg
100%|██████████| 1/1 [00:12<00:00, 12.83s/it]

Original image: