Upgrading to tensorflow-gpu 2.0

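Why:

I recently downloaded a deep learning example whose code is based on TF 2.0. Since I had been using TF 1.x until now, I needed to set up a new 2.0 environment. The 2.0 code structure differs substantially from 1.x: many steps have been streamlined, which makes model training more convenient and efficient.

Notes on a few small pitfalls I ran into during the upgrade: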

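1. First, create a virtual environment with conda create -n tf20 and activate it with conda activate tf20 (adding a Python version to the create command, e.g. python=3.6 as in the session shown later, makes sure the environment gets its own interpreter and pip)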

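2. Installing the GPU driver stack locally

Graphics driver:

A machine with an NVIDIA GPU usually already has a graphics driver installed, so there is normally no need to install it again.

CUDA:

NVIDIA driver versions and CUDA versions do not map one-to-one.

CUDA is just a toolkit.

Several different CUDA versions can be installed on top of the same driver, as long as the driver is new enough for each of them.

cuDNN:

cuDNN is an SDK, an acceleration library specifically for neural networks.

cuDNN and CUDA do not map one-to-one either: a given CUDA version can be used with several cuDNN versions, although each cuDNN download is built for a particular CUDA version.

This machine is a GTX 1050 Ti on Windows 10. The versions used here, which are the ones tensorflow-gpu 2.0 expects, are CUDA 10.0 and cuDNN 7.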
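The version pairing above can be written down as a small lookup. This is only a sketch: the table reproduces a few rows from TensorFlow's published tested build configurations, the helper names are made up for illustration, and pre-release strings such as 2.0.0rc1 are not handled.

```python
# A few rows from TensorFlow's "tested build configurations" for GPU builds.
# The table is illustrative, not exhaustive; extend it as needed.
TESTED_CONFIGS = {
    "1.15.0": {"cuda": "10.0", "cudnn": "7.4"},
    "2.0.0": {"cuda": "10.0", "cudnn": "7.4"},
    "2.1.0": {"cuda": "10.1", "cudnn": "7.6"},
}


def _parse(version):
    """Turn '10.0.130' into (10, 0, 130)."""
    return tuple(int(part) for part in version.split("."))


def required_versions(tf_version):
    """Return the CUDA/cuDNN pair a release was tested with, or None if unknown."""
    return TESTED_CONFIGS.get(tf_version)


def is_compatible(tf_version, cuda_version, cudnn_version):
    """Check an installed CUDA/cuDNN pair against the table (hypothetical helper)."""
    cfg = TESTED_CONFIGS.get(tf_version)
    if cfg is None:
        return False
    # CUDA must match on major.minor; the patch level is ignored.
    cuda_ok = _parse(cuda_version)[:2] == _parse(cfg["cuda"])
    have, need = _parse(cudnn_version), _parse(cfg["cudnn"])
    # cuDNN must share the major version and be at least the tested minor version.
    # If only the major version is given (e.g. "7"), the minor check is skipped.
    cudnn_ok = have[0] == need[0] and (len(have) < 2 or have[1] >= need[1])
    return cuda_ok and cudnn_ok


print(is_compatible("2.0.0", "10.0", "7.6.5"))  # True
print(is_compatible("2.0.0", "10.1", "7.6.5"))  # False
```

The second call shows why a newer toolkit is not automatically fine: the prebuilt tensorflow-gpu 2.0 wheel looks for the CUDA 10.0 libraries specifically.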

Method 1: install from the command line (not tried):

conda install cudatoolkit=10.0 cudnn=7

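Method 2: local installation

Installing CUDA:

Download CUDA 10.0.

The local installer is recommended (it downloads and installs faster), and downloading before noon tends to be faster.

Run the exe. The default extraction path is c:\users\xx\AppData\Local\Temp\CUDA

Choose the Express installation mode.

If the installer asks you to install Visual Studio, you can skip it (I was not prompted on this machine, possibly because VS2015 was already installed).

To check whether the installation succeeded, run in cmd: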

nvcc -V

C:\Users\wym>nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:04_Central_Daylight_Time_2018
Cuda compilation tools, release 10.0, V10.0.130
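The same check can be scripted. The sketch below pulls the release number out of the nvcc -V output with a regular expression; the function names are illustrative, and the pattern assumes the "release X.Y" wording shown in the output above.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract the toolkit release (e.g. '10.0') from `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", text)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run `nvcc -V` if nvcc is on PATH and return its release string, else None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "-V"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_nvcc_release(result.stdout)


# Parsing the last line of the output shown above.
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(parse_nvcc_release(sample))  # 10.0

# On a machine without the toolkit on PATH this prints None.
print(installed_cuda_release())
```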

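nvidia-smi (the CUDA Version in its header is the highest version the installed driver supports, not the version of the installed toolkit)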

C:\Users\wym>nvidia-smi
Fri Sep 11 11:08:25 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 451.82       Driver Version: 451.82       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 105... WDDM  | 00000000:01:00.0  On |                  N/A |
| 29%   33C    P8    N/A /  75W |    548MiB /  4096MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1360    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      9704    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A     10148    C+G   ...es.TextInput.InputApp.exe    N/A      |
|    0   N/A  N/A     12432    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     12852    C+G   ...w5n1h2txyewy\SearchUI.exe    N/A      |
|    0   N/A  N/A     14360    C+G   ...ekyb3d8bbwe\YourPhone.exe    N/A      |
|    0   N/A  N/A     14444    C+G   ...zf8qxf38zg5c\SkypeApp.exe    N/A      |
|    0   N/A  N/A     14860    C+G   ...cw5n1h2txyewy\LockApp.exe    N/A      |
|    0   N/A  N/A     18736    C+G   ...se6\Application\360se.exe    N/A      |
+-----------------------------------------------------------------------------+
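To see how the header relates to the installed toolkit, the driver version and the driver's maximum CUDA version can be read out of the header line and compared with the toolkit release. This is a rough sketch with hypothetical helper names, and it assumes the header layout shown above.

```python
import re


def parse_smi_header(text):
    """Return (driver_version, max_cuda_version) from nvidia-smi's header, or None."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text
    )
    return (match.group(1), match.group(2)) if match else None


def toolkit_supported(toolkit, max_cuda):
    """True if the toolkit release is not newer than what the driver supports."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split("."))
    return as_tuple(toolkit) <= as_tuple(max_cuda)


# Header line copied from the output above.
header = "| NVIDIA-SMI 451.82       Driver Version: 451.82       CUDA Version: 11.0     |"
driver, max_cuda = parse_smi_header(header)
print(driver, max_cuda, toolkit_supported("10.0", max_cuda))  # 451.82 11.0 True
```

So a header reporting 11.0 next to a 10.0 toolkit is not a conflict: the driver is simply newer than the toolkit needs.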

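Installing cuDNN:

Download cuDNN v7.6.5 (the build for CUDA 10.0).

Unzip it, which produces a cuda directory, and copy that directory to the location below.

Create the path c:\tools\cuda

cuda\bin contains the dynamic link library cudnn64_7.dll. This DLL is the core of cuDNN, so its directory has to be added to the environment variables so that it can be found at run time.

Add the environment variable: System Properties > Environment Variables > Path: c:\tools\cuda\bin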
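Whether the Path change took effect can be checked from a newly opened terminal. The sketch below looks for the DLL in every directory listed in PATH; find_on_path is an illustrative helper, not part of any library.

```python
import os


def find_on_path(filename, path_value=None):
    """Return the directories listed in PATH that contain `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


# An empty list means the DLL cannot be found through PATH.
print(find_on_path("cudnn64_7.dll"))
```

More than one hit is also worth noticing, since an older cuDNN copy earlier in PATH would be loaded first.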

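3. Installing the required libraries:

Installing tensorflow-gpu:

Method 1: pip install tensorflow-gpu==2.0.0rc1 -i https://pypi.tuna.tsinghua.edu.cn/simple

Method 2: if the project has a requirements.txt file, you can run: pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple --user
The --user flag is added to avoid permission errors.

Alternatively, enter the following command as is. The Tsinghua mirror speeds up the download, --ignore-installed works around errors where the requested version cannot be installed over an existing one, and --default-timeout avoids timeout errors:

pip install --default-timeout=100 --ignore-installed --upgrade tensorflow-gpu==2.0.0rc1 -i https://pypi.tuna.tsinghua.edu.cn/simple

Installing Keras: if the examples in the project use Keras, the Keras version has to match the TF version. For TF 2.0, install it as follows: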

pip install keras==2.3.1 -i https://pypi.tuna.tsinghua.edu.cn/simple --user

Installing OpenCV: no version needs to be specified; by default pip installs a release suited to the local platform.

pip install opencv-python -i https://pypi.tuna.tsinghua.edu.cn/simple --user
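After the installs, a quick way to confirm that the packages landed in the active environment is to ask Python whether it can locate them. This is a minimal sketch; missing_modules is a made-up helper. Note that opencv-python is imported under the name cv2.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means all three packages are visible to this interpreter.
print(missing_modules(["tensorflow", "keras", "cv2"]))
```

If a package shows up as missing even though pip reported success, pip and python most likely belong to different environments.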

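Verifying the installation:

Activate the environment (activate tensorflow in the session below; substitute the name of your own environment, such as tf20), start python, and import tensorflow.

If the following appears: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll

then the CUDA runtime was found and its version (10.0) matches the one this TensorFlow build expects.

Test commands: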

import tensorflow as tf
a=tf.constant([10])
print(a)

Output of a successful installation:

C:\Users\wym>activate tensorflow

(tensorflow) C:\Users\wym>python
Python 3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-09-11 11:11:24.792867: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
>>> a=tf.constant([10])
2020-09-11 11:13:01.846228: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-09-11 11:13:01.940688: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
2020-09-11 11:13:01.958516: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-09-11 11:13:01.980137: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-11 11:13:02.178945: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-09-11 11:13:02.243931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
2020-09-11 11:13:02.261308: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-09-11 11:13:02.267009: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-11 11:13:11.202419: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-11 11:13:11.213008: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2020-09-11 11:13:11.215213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2020-09-11 11:13:11.308757: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2996 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
>>> print(a)
tf.Tensor([10], shape=(1,), dtype=int32)
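As a more direct check than reading the log, the GPUs that TensorFlow can see may be listed from Python. This is a minimal sketch assuming TensorFlow 2.0 or later; visible_gpus is a hypothetical helper name, and tf.config.experimental.list_physical_devices is the call as it is spelled in TF 2.0.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or None if TensorFlow is absent."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.experimental.list_physical_devices("GPU")
    return [gpu.name for gpu in gpus]


# An empty list means TensorFlow imported fine but found no usable GPU,
# which usually points at the driver, CUDA, or cuDNN setup.
print(visible_gpus())
```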

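4. Common problems:

Running the code fails with: Non-OK-status: CudaLaunchKernel(FillPhiloxRandomKernelLaunch, num_blocks, block_size, 0, d.stream(), gen, data, size, dist) status: Internal: out of memory

Solution: in my case this error was not caused by insufficient memory but by an incorrect CUDA version. Installing the CUDA version that matches the TensorFlow build resolved it.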
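If the CUDA version is already correct and the message persists, the GPU may really be short of memory. A separate, commonly used mitigation (not the fix described above) is to let TensorFlow allocate GPU memory on demand instead of reserving most of it at startup. The sketch below assumes TensorFlow 2.0 or later; enable_memory_growth is an illustrative helper name.

```python
def enable_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand; returns how many GPUs were configured."""
    try:
        import tensorflow as tf
    except ImportError:
        return 0
    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # This has to run before any op initializes the GPU;
        # afterwards TensorFlow raises a RuntimeError.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


print(enable_memory_growth())
```

The call belongs at the very top of the script, before any model or tensor is created.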

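5. Other problems:

The code uses the pycocotools module, which does not support installation on Windows out of the box:

Method 1:

Get the source: git clone https://github.com/pdollar/coco.git

On Linux you can use git directly. Windows does not ship with git; you can install Git for Windows, but I ran into path problems later that were hard to resolve, so I do not recommend it.

Instead, download the source directly from https://github.com/pdollar/coco.git

Unzip it,

activate the virtual environment,

change into coco/PythonAPI

and run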

pip install -U cython -i https://pypi.tuna.tsinghua.edu.cn/simple
# install pycocotools locally
python setup.py build_ext --inplace
python setup.py build_ext install


Output of the code after the upgrade:

D:\anaconda\envs\tensorflow\python.exe C:/Users/wym/Desktop/cjr/Centernet-Tensorflow2.0/TF2-CenterNet/ctdet_image.py
2020-09-11 11:32:24.971183: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-09-11 11:32:37.660610: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-09-11 11:32:37.696192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
2020-09-11 11:32:37.696390: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-09-11 11:32:37.696559: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-11 11:32:37.696905: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-09-11 11:32:37.699315: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 1050 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.392
pciBusID: 0000:01:00.0
2020-09-11 11:32:37.699564: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-09-11 11:32:37.700104: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-09-11 11:32:38.330225: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-11 11:32:38.330366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-09-11 11:32:38.330446: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-09-11 11:32:38.330655: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2996 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
  0%|          | 0/1 [00:00<?, ?it/s]2020-09-11 11:33:01.627843: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-09-11 11:33:02.992058: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Internal: Invoking ptxas not supported on Windows
Relying on driver to perform ptx compilation. This message will be only logged once.
2020-09-11 11:33:05.944135: W tensorflow/core/common_runtime/bfc_allocator.cc:239] Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.12GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-11 11:33:06.362537: W tensorflow/core/common_runtime/bfc_allocator.cc:239] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.59GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
Image saved to: output\ctdet.demo.jpg
100%|██████████| 1/1 [00:12<00:00, 12.83s/it]

[Figures: original image and detection result]
