[Latest] Installing and uninstalling cuDNN with CUDA 11.7 on Ubuntu 20.04

CUDA 11.7 is already installed; the next step is to install cuDNN.

1. Uninstall

(1) Query

sudo dpkg -l | grep cudnn

$ sudo dpkg -l | grep cudnn
ii  cudnn-local-repo-ubuntu2004-8.6.0.163      1.0-1                               amd64        cudnn-local repository configuration files
ii  libcudnn8                                  8.6.0.163-1+cuda11.8                amd64        cuDNN runtime libraries
ii  libcudnn8-dev                              8.6.0.163-1+cuda11.8                amd64        cuDNN development libraries and headers
ii  libcudnn8-samples                          8.6.0.163-1+cuda11.8                amd64        cuDNN samples

(2) Uninstall

$ sudo dpkg -r libcudnn8-samples
(Reading database ... 227274 files and directories currently installed.)
Removing libcudnn8-samples (8.6.0.163-1+cuda11.8) ...
$ sudo dpkg -r libcudnn8-dev
(Reading database ... 227208 files and directories currently installed.)
Removing libcudnn8-dev (8.6.0.163-1+cuda11.8) ...
update-alternatives: removing manually selected alternative - switching libcudnn to auto mode
$ sudo dpkg -r libcudnn8
(Reading database ... 227175 files and directories currently installed.)
Removing libcudnn8 (8.6.0.163-1+cuda11.8) ...

Note that cudnn-local-repo-ubuntu2004-8.6.0.163 must be removed with purge.

Otherwise dpkg reports: ignoring request to remove cudnn-local-repo-ubuntu2004-8.6.0.163, only the config files of which are on the system; use --purge to remove them too

$ sudo apt-get purge cudnn-local-repo-ubuntu2004-8.6.0.163
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages will be REMOVED:
  cudnn-local-repo-ubuntu2004-8.6.0.163*
0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
(Reading database ... 227143 files and directories currently installed.)
Purging configuration files for cudnn-local-repo-ubuntu2004-8.6.0.163 (1.0-1) ...

(3) Query again; if there is no output, the uninstall is complete.
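
The "no output" check can also be scripted. The following is only a sketch: the function name list_cudnn_pkgs is hypothetical, and it simply filters text in the column layout that dpkg -l prints in the listing above.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# packages whose name contains "cudnn". Header lines are skipped because their
# first field is not a two-letter status such as "ii" or "rc".
list_cudnn_pkgs() {
    awk '$1 ~ /^[a-zA-Z][a-zA-Z]$/ && $2 ~ /cudnn/ { print $2 }'
}

# Demo on two lines in the format of the listing above plus an unrelated package.
printf '%s\n' \
    'ii  libcudnn8      8.6.0.163-1+cuda11.8  amd64  cuDNN runtime libraries' \
    'ii  libc6          2.31-0ubuntu9.9       amd64  GNU C Library' \
    'ii  libcudnn8-dev  8.6.0.163-1+cuda11.8  amd64  cuDNN development libraries and headers' \
    | list_cudnn_pkgs
```

On a real system the helper would be fed with dpkg -l | list_cudnn_pkgs; an empty result corresponds to the "no output" condition.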

Next, download and install following the official documentation, using section 2.3.2. Debian Local Installation.

Official documentation link: Installation Guide :: NVIDIA Deep Learning cuDNN Documentation

2. Download

Download link: https://developer.nvidia.com/rdp/cudnn-download

Here I chose Local Installer for Ubuntu20.04 x86_64 (Deb).

The download produces the file cudnn-local-repo-ubuntu2004-8.5.0.96_1.0-1_amd64.deb

3. Install

r****@r********:~/Downloads$ sudo dpkg -i cudnn-local-repo-ubuntu2004-8.5.0.96_1.0-1_amd64.deb
[sudo] password for r**: 
Selecting previously unselected package cudnn-local-repo-ubuntu2004-8.5.0.96.
(Reading database ... 191620 files and directories currently installed.)
Preparing to unpack cudnn-local-repo-ubuntu2004-8.5.0.96_1.0-1_amd64.deb ...
Unpacking cudnn-local-repo-ubuntu2004-8.5.0.96 (1.0-1) ...
Setting up cudnn-local-repo-ubuntu2004-8.5.0.96 (1.0-1) ...

The public CUDA GPG key does not appear to be installed.
To install the key, run this command:
sudo cp /var/cudnn-local-repo-ubuntu2004-8.5.0.96/cudnn-local-0579404E-keyring.gpg /usr/share/keyrings/

r****@r********:~/Downloads$ sudo cp /var/cudnn-local-repo-ubuntu2004-8.5.0.96/cudnn-local-0579404E-keyring.gpg /usr/share/keyrings/
r****@r********:~/Downloads$ sudo apt-get update
Get:1 file:/var/cudnn-local-repo-ubuntu2004-8.5.0.96  InRelease [1,575 B]
Get:1 file:/var/cudnn-local-repo-ubuntu2004-8.5.0.96  InRelease [1,575 B]
Get:2 file:/var/cudnn-local-repo-ubuntu2004-8.5.0.96  Packages [945 B]                                    
Get:3 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]                                                           
Hit:4 http://cn.archive.ubuntu.com/ubuntu focal InRelease                                    
Fetched 114 kB in 2s (52.7 kB/s)                                                             
Reading package lists... Done

r***@r******:~/Downloads$ sudo apt-get install libcudnn8=8.5.0.96-1+cuda11.7
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  libcudnn8
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/347 MB of archives.
After this operation, 892 MB of additional disk space will be used.
Get:1 file:/var/cudnn-local-repo-ubuntu2004-8.5.0.96  libcudnn8 8.5.0.96-1+cuda11.7 [347 MB]
Selecting previously unselected package libcudnn8.
(Reading database ... 191636 files and directories currently installed.)
Preparing to unpack .../libcudnn8_8.5.0.96-1+cuda11.7_amd64.deb ...
Unpacking libcudnn8 (8.5.0.96-1+cuda11.7) ...
Setting up libcudnn8 (8.5.0.96-1+cuda11.7) ...

r***@r******:~/Downloads$ sudo apt-get install libcudnn8-dev=8.5.0.96-1+cuda11.7
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  libcudnn8-dev
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/356 MB of archives.
After this operation, 1,050 MB of additional disk space will be used.
Get:1 file:/var/cudnn-local-repo-ubuntu2004-8.5.0.96  libcudnn8-dev 8.5.0.96-1+cuda11.7 [356 MB]
Selecting previously unselected package libcudnn8-dev.
(Reading database ... 191653 files and directories currently installed.)
Preparing to unpack .../libcudnn8-dev_8.5.0.96-1+cuda11.7_amd64.deb ...
Unpacking libcudnn8-dev (8.5.0.96-1+cuda11.7) ...
Setting up libcudnn8-dev (8.5.0.96-1+cuda11.7) ...
update-alternatives: using /usr/include/x86_64-linux-gnu/cudnn_v8.h to provide /usr/include/cudnn.h (libcudnn) in auto mode

r***@r******:~/Downloads$ sudo apt-get install libcudnn8-samples=8.5.0.96-1+cuda11.7
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  libcudnn8-samples
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/1,665 kB of archives.
After this operation, 2,169 kB of additional disk space will be used.
Get:1 file:/var/cudnn-local-repo-ubuntu2004-8.5.0.96  libcudnn8-samples 8.5.0.96-1+cuda11.7 [1,665 kB]
Selecting previously unselected package libcudnn8-samples.
(Reading database ... 191686 files and directories currently installed.)
Preparing to unpack .../libcudnn8-samples_8.5.0.96-1+cuda11.7_amd64.deb ...
Unpacking libcudnn8-samples (8.5.0.96-1+cuda11.7) ...
Setting up libcudnn8-samples (8.5.0.96-1+cuda11.7) ...
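
The three install commands above differ only in the package name; the pinned version string is built from the cuDNN version and the CUDA version. A minimal sketch of that pattern follows. The function name cudnn_install_cmds is hypothetical, and the helper only prints the commands rather than running them.

```shell
# Hypothetical helper: print the three pinned install commands for a given
# cuDNN version (e.g. 8.5.0.96) and CUDA version (e.g. 11.7).
# It only prints the commands; nothing is installed.
cudnn_install_cmds() {
    cudnn_ver=$1
    cuda_ver=$2
    for pkg in libcudnn8 libcudnn8-dev libcudnn8-samples; do
        printf 'sudo apt-get install %s=%s-1+cuda%s\n' "$pkg" "$cudnn_ver" "$cuda_ver"
    done
}

# Reproduces the commands used in the transcripts above.
cudnn_install_cmds 8.5.0.96 11.7
```

The version strings actually offered by the local repository can be listed with apt-cache policy libcudnn8 after apt-get update.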

Run the following to check the installed version:

cat /usr/include/x86_64-linux-gnu/cudnn_version_v8.h | grep CUDNN_MAJOR -A 2

$ cat /usr/include/x86_64-linux-gnu/cudnn_version_v8.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 0
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#endif /* CUDNN_VERSION_H */
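
The header defines CUDNN_VERSION as CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL. As a small sketch, the helper below (the name cudnn_version_from_header is hypothetical) reads the header text and prints both the dotted version and that computed number.

```shell
# Hypothetical helper: read a cudnn_version header on stdin and print
# "MAJOR.MINOR.PATCHLEVEL" followed by the numeric CUDNN_VERSION value,
# computed with the formula from the header itself.
cudnn_version_from_header() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { printf "%d.%d.%d %d\n", major, minor, patch, major * 1000 + minor * 100 + patch }
    '
}

# Demo on the three defines shown above; prints "8.5.0 8500".
printf '%s\n' \
    '#define CUDNN_MAJOR 8' \
    '#define CUDNN_MINOR 5' \
    '#define CUDNN_PATCHLEVEL 0' \
    | cudnn_version_from_header
```

Against the installed header it would be called as cudnn_version_from_header < /usr/include/x86_64-linux-gnu/cudnn_version_v8.h. The value 8500 is the same number that mnistCUDNN prints for cudnnGetVersion() in the test step.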

4. Test

r*****@r**********:~/Downloads$ sudo cp -r /usr/src/cudnn_samples_v8/ $HOME
r*****@r**********:~/Downloads$ cd  $HOME/cudnn_samples_v8/mnistCUDNN
r*****@r**********:~/cudnn_samples_v8/mnistCUDNN$ make clean
rm -rf *o
rm -rf mnistCUDNN

Running make reports errors:

ra**@r*****:~/cudnn_samples_v8/mnistCUDNN$ make
CUDA_VERSION is 11070
Linking agains cublasLt = true
CUDA VERSION: 11070
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 35 50 53 60 61 62 70 72 75 80 86 87
/bin/sh: 1: cannot create test.c: Permission denied
/bin/sh: 1: cannot create test.c: Permission denied
g++: error: test.c: No such file or directory
g++: warning: ‘-x c’ after last input file has no effect
g++: fatal error: no input files
compilation terminated.
>>> WARNING - FreeImage is not set up correctly. Please ensure FreeImage is set up correctly. <<<
[@] /usr/local/cuda/bin/nvcc -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_87,code=compute_87 -o fp16_dev.o -c fp16_dev.cu
[@] g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -o fp16_emu.o -c fp16_emu.cpp
[@] g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -o mnistCUDNN.o -c mnistCUDNN.cpp
[@] /usr/local/cuda/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_87,code=compute_87 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -lcublasLt -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm

(1) Because of the warning:

WARNING - FreeImage is not set up correctly. Please ensure FreeImage is set up correctly.

first install libfreeimage: sudo apt-get install libfreeimage3 libfreeimage-dev

$ sudo apt-get install libfreeimage3 libfreeimage-dev
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  libilmbase24 libjxr0 libopenexr24
The following NEW packages will be installed:
  libfreeimage-dev libfreeimage3 libilmbase24 libjxr0 libopenexr24
0 upgraded, 5 newly installed, 0 to remove and 0 not upgraded.
Need to get 1,113 kB of archives.
After this operation, 5,116 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://cn.archive.ubuntu.com/ubuntu focal/universe amd64 libilmbase24 amd64 2.3.0-6build1 [75.1 kB]
Get:2 http://security.ubuntu.com/ubuntu focal-security/universe amd64 libopenexr24 amd64 2.3.0-6ubuntu0.5 [592 kB]
Get:3 http://cn.archive.ubuntu.com/ubuntu focal/universe amd64 libjxr0 amd64 1.1-6build1 [158 kB]
Get:4 http://cn.archive.ubuntu.com/ubuntu focal/universe amd64 libfreeimage3 amd64 3.18.0+ds2-1ubuntu3 [269 kB]
Get:5 http://cn.archive.ubuntu.com/ubuntu focal/universe amd64 libfreeimage-dev amd64 3.18.0+ds2-1ubuntu3 [18.8 kB]
Fetched 1,113 kB in 3s (383 kB/s)                                              
Selecting previously unselected package libilmbase24:amd64.
(Reading database ... 191752 files and directories currently installed.)
Preparing to unpack .../libilmbase24_2.3.0-6build1_amd64.deb ...
Unpacking libilmbase24:amd64 (2.3.0-6build1) ...
Selecting previously unselected package libjxr0:amd64.
Preparing to unpack .../libjxr0_1.1-6build1_amd64.deb ...
Unpacking libjxr0:amd64 (1.1-6build1) ...
Selecting previously unselected package libopenexr24:amd64.
Preparing to unpack .../libopenexr24_2.3.0-6ubuntu0.5_amd64.deb ...
Unpacking libopenexr24:amd64 (2.3.0-6ubuntu0.5) ...
Selecting previously unselected package libfreeimage3:amd64.
Preparing to unpack .../libfreeimage3_3.18.0+ds2-1ubuntu3_amd64.deb ...
Unpacking libfreeimage3:amd64 (3.18.0+ds2-1ubuntu3) ...
Selecting previously unselected package libfreeimage-dev.
Preparing to unpack .../libfreeimage-dev_3.18.0+ds2-1ubuntu3_amd64.deb ...
Unpacking libfreeimage-dev (3.18.0+ds2-1ubuntu3) ...
Setting up libjxr0:amd64 (1.1-6build1) ...
Setting up libilmbase24:amd64 (2.3.0-6build1) ...
Setting up libopenexr24:amd64 (2.3.0-6ubuntu0.5) ...
Setting up libfreeimage3:amd64 (3.18.0+ds2-1ubuntu3) ...
Setting up libfreeimage-dev (3.18.0+ds2-1ubuntu3) ...
Processing triggers for libc-bin (2.31-0ubuntu9.9) ...

(2) Error

/bin/sh: 1: cannot create test.c: Permission denied
/bin/sh: 1: cannot create test.c: Permission denied
g++: error: test.c: No such file or directory
g++: warning: ‘-x c’ after last input file has no effect
g++: fatal error: no input files

This is a Permission denied error. Prefixing the command with sudo, i.e. sudo make, makes the build succeed:

$ sudo make
CUDA_VERSION is 11070
Linking agains cublasLt = true
CUDA VERSION: 11070
TARGET ARCH: x86_64
HOST_ARCH: x86_64
TARGET OS: linux
SMS: 35 50 53 60 61 62 70 72 75 80 86 87
/usr/local/cuda/bin/nvcc -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include  -ccbin g++ -m64    -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_87,code=compute_87 -o fp16_dev.o -c fp16_dev.cu
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include   -o fp16_emu.o -c fp16_emu.cpp
g++ -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include   -o mnistCUDNN.o -c mnistCUDNN.cpp
/usr/local/cuda/bin/nvcc   -ccbin g++ -m64      -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_87,code=sm_87 -gencode arch=compute_87,code=compute_87 -o mnistCUDNN fp16_dev.o fp16_emu.o mnistCUDNN.o -I/usr/local/cuda/include -I/usr/local/cuda/include -IFreeImage/include -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -lcublasLt -LFreeImage/lib/linux/x86_64 -LFreeImage/lib/linux -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
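
A likely cause of the Permission denied error is that the samples were copied with sudo cp, which leaves $HOME/cudnn_samples_v8 owned by root, so a plain make cannot create test.c there. Under that assumption, an alternative to sudo make is to hand the directory back to the current user. The sketch below only checks writability and prints a suggested command; the function name check_build_dir is hypothetical.

```shell
# Hypothetical helper: report whether the current user can write to a build
# directory, and if not, print a chown command that would hand it over.
check_build_dir() {
    dir=$1
    if [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        printf 'try: sudo chown -R "$USER": "%s"\n' "$dir"
        return 1
    fi
}

# Demo on a throwaway directory created by the current user.
demo_dir=$(mktemp -d)
check_build_dir "$demo_dir"
rm -rf "$demo_dir"
```

For the samples it would be called as check_build_dir "$HOME/cudnn_samples_v8/mnistCUDNN". Copying the samples without sudo in the first place should avoid the problem as well.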

(3) A warning remains:

nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release

Reference: "Warning when compiling OpenCV: nvcc warning: The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50'" (CSDN blog post by 古典部程序员)

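The Makefile builds for a list of compute capabilities that includes 3.5 (see the SMS line in the make output), and CUDA 11.7 flags compute capabilities 3.5 and 3.7 as deprecated. The author of the referenced post used the CUDA_ARCH_BIN option, whereas nvcc itself suggests -Wno-deprecated-gpu-targets (Use -Wno-deprecated-gpu-targets to suppress warning). I chose to ignore the warning (doge), since the run still printed Test passed in the end.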
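
If the warning is unwanted, one option is to build only for the GPU that is actually present. This is a sketch under an assumption I have not verified against the sample Makefile: that the architecture list comes from a make variable named SMS (the name printed in the "SMS: 35 50 ..." line), which can then be overridden on the make command line. cc_to_sm is a hypothetical helper.

```shell
# Hypothetical helper: turn a compute capability such as "7.5" into the
# two-digit form used in the SMS list, e.g. "75".
cc_to_sm() {
    printf '%s\n' "$1" | tr -d '.'
}

# The mnistCUDNN run reports "Capabilities 7.5" for this machine's GPU.
# Only print the candidate command; SMS is an assumed variable name.
sm=$(cc_to_sm 7.5)
echo "make SMS=\"$sm\""
```

With compute capability 3.5 dropped from the list, nvcc would have no deprecated target to warn about.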

$ ./mnistCUDNN
Executing: mnistCUDNN
cudnnGetVersion() : 8500 , CUDNN_VERSION from cudnn.h : 8500 (8.5.0)
Host compiler version : GCC 9.4.0

There are 1 CUDA capable devices on your machine :
device 0 : sms 34  Capabilities 7.5, SmClock 1650.0 Mhz, MemSize (Mb) 12007, MemClock 7001.0 Mhz, Ecc=0, boardGroupID=0
Using device 0

Testing single precision
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.014976 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.016064 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.025440 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.037280 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.235328 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 1.416096 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.034656 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.034848 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.059392 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.072960 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.198272 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.714656 time requiring 2450080 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.010688 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.011232 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.017952 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.034848 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.036608 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.040576 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.032160 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.033760 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.035264 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.056064 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.059520 time requiring 128000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.073248 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!

Testing half precision (math in single precision)
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.010912 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.011424 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.014688 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.034112 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.039360 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.041280 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 51584 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.045632 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.047008 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.059616 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.073056 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.081472 time requiring 51584 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.115200 time requiring 64000 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.012288 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.013504 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.014016 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.033408 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.036928 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.040928 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 51584 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 64000 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.034784 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.045024 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.059648 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.070400 time requiring 4656640 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.077696 time requiring 51584 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.114624 time requiring 64000 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!
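
The output is long, and the only lines that matter for a quick check are the two "Test passed!" lines, one for single precision and one for half precision. A minimal sketch for checking that automatically follows; the function name mnist_passed is hypothetical.

```shell
# Hypothetical helper: read mnistCUDNN output on stdin and succeed only if
# "Test passed!" appears the expected number of times (default 2: one for
# single precision, one for half precision).
mnist_passed() {
    expected=${1:-2}
    count=$(grep -c 'Test passed!' || true)
    [ "$count" -eq "$expected" ]
}

# Demo on a shortened copy of the output above.
if printf '%s\n' 'Testing single precision' 'Test passed!' \
                 'Testing half precision (math in single precision)' 'Test passed!' \
        | mnist_passed; then
    echo "both precision modes passed"
fi
```

With the real binary this would be ./mnistCUDNN | mnist_passed && echo OK.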
