Configuring YOLO v3 on the Jetson Nano (CUDA + cuDNN + OpenCV + TensorRT)

I. Running YOLO v3



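Before installing YOLO v3, check the system components that are already present. The Jetson Nano OS image ships with JetPack, so CUDA, cuDNN, OpenCV, and related libraries are preinstalled; we verify each of them in turn.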



vim ~/.bashrc


export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda/bin:$PATH


source ~/.bashrc


nvidia@nvidia-desktop:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_21:14:42_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89


bash: nvcc: command not found

First switch to the home directory: cd ~
Then open the .bashrc file: vim .bashrc

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_ROOT=/usr/local/cuda

Press Esc and type :wq! to force-write the file and quit, then run:

source ~/.bashrc

Run nvcc -V again and the version information should now appear.
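The same check can be scripted. The following is a minimal sketch, not part of the original setup: it looks for nvcc on PATH and extracts the release number from the `nvcc -V` output. The helper names `parse_nvcc_release` and `installed_cuda_release` are made up for illustration, and the sample string is copied from the nvcc output shown above.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Return the CUDA release string (e.g. '10.2') from `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run `nvcc -V` if nvcc is on PATH; return its release string or None."""
    if shutil.which("nvcc") is None:
        # Same situation as "bash: nvcc: command not found".
        return None
    result = subprocess.run(["nvcc", "-V"], stdout=subprocess.PIPE,
                            universal_newlines=True)
    return parse_nvcc_release(result.stdout)


# Sample line copied from the nvcc output shown in this post.
SAMPLE = "Cuda compilation tools, release 10.2, V10.2.89"

if __name__ == "__main__":
    print("parsed from sample:", parse_nvcc_release(SAMPLE))
    print("installed release:", installed_cuda_release())
```

If `installed_cuda_release()` returns None, the PATH entry for the CUDA bin directory is most likely missing from the current shell.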



pkg-config opencv --modversion

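If OpenCV is installed correctly, this prints the version number. On my system it kept failing with a message that OpenCV had to be added to the pkg-config search path. OpenCV 4 registers its pkg-config module under the name opencv4, so in that case run: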

pkg-config opencv4 --modversion
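Trying both module names can be automated. This is a small sketch under the assumption that pkg-config is the tool being queried, as in the commands above; the function name `opencv_pkgconfig_version` is hypothetical.

```python
import subprocess


def opencv_pkgconfig_version(names=("opencv4", "opencv")):
    """Return (module_name, version) for the first pkg-config module that resolves, else None."""
    for name in names:
        try:
            result = subprocess.run(
                ["pkg-config", "--modversion", name],
                stdout=subprocess.PIPE, stderr=subprocess.DEVNULL,
                universal_newlines=True)
        except FileNotFoundError:
            return None  # pkg-config itself is not installed
        if result.returncode == 0:
            return name, result.stdout.strip()
    return None


if __name__ == "__main__":
    print(opencv_pkgconfig_version())
```

The module name that resolves is also the one to pass to pkg-config when compiling against OpenCV.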


nvidia@nvidia-desktop:~$ pkg-config opencv --modversion



cd /usr/src/cudnn_samples_v8/mnistCUDNN   # enter the sample directory
sudo make     # build the sample
sudo chmod a+x mnistCUDNN # make the binary executable
./mnistCUDNN # run it


nvidia@nvidia-desktop:/usr/src/cudnn_samples_v8/mnistCUDNN$ ./mnistCUDNN 
Executing: mnistCUDNN
cudnnGetVersion() : 8000 , CUDNN_VERSION from cudnn.h : 8000 (8.0.0)
Host compiler version : GCC 7.5.0

There are 1 CUDA capable devices on your machine :
device 0 : sms  1  Capabilities 5.3, SmClock 921.6 Mhz, MemSize (Mb) 3962, MemClock 12.8 Mhz, Ecc=0, boardGroupID=0
Using device 0

Testing single precision
Loading binary file data/conv1.bin
Loading binary file data/conv1.bias.bin
Loading binary file data/conv2.bin
Loading binary file data/conv2.bias.bin
Loading binary file data/ip1.bin
Loading binary file data/ip1.bias.bin
Loading binary file data/ip2.bin
Loading binary file data/ip2.bias.bin
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.379532 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.407083 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 1.107760 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 15.830000 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 28.929480 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 160.134064 time requiring 184784 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 2.277968 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 2.286458 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 2.821927 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 3.238802 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 8.895677 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 9.272969 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000 
Loading image data/three_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 2057744 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 178432 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.197396 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.205937 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.399844 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 2.132292 time requiring 184784 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 4.073125 time requiring 178432 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 7.371354 time requiring 2057744 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnGetConvolutionForwardAlgorithm_v7 ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: -1.000000 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: -1.000000 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: -1.000000 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 1.505469 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 1.527083 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 1.528802 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 2.178020 time requiring 2450080 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 6.266927 time requiring 1433120 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 10.913073 time requiring 4656640 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 6: -1.000000 time requiring 0 memory
^^^^ CUDNN_STATUS_NOT_SUPPORTED for Algo 3: -1.000000 time requiring 0 memory
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000 
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006 

Result of classification: 1 3 5

Test passed!
root@nvidia-desktop:/usr/src/cudnn_samples_v8/mnistCUDNN# cd
root@nvidia-desktop:~# cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 0

#endif /* CUDNN_VERSION_H */
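To read the version programmatically, the header can be parsed directly. The sketch below assumes the cuDNN 8 header layout with CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL defines. The grep output above shows only the first two, so the patch level of 0 in the sample is inferred from the "8.0.0" printed by mnistCUDNN. The combined number follows the major*1000 + minor*100 + patch scheme, which matches the 8000 reported by cudnnGetVersion() above.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from cudnn_version.h-style text."""
    parts = []
    for key in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(r"#define\s+CUDNN_%s\s+(\d+)" % key, header_text)
        # Assumption: a missing define is treated as 0.
        parts.append(int(match.group(1)) if match else 0)
    return tuple(parts)


def cudnn_version_number(major, minor, patch):
    """Combine the parts as major*1000 + minor*100 + patch."""
    return major * 1000 + minor * 100 + patch


# MAJOR and MINOR are from the grep output above; PATCHLEVEL 0 is inferred.
HEADER = """#define CUDNN_MAJOR 8
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 0
"""

if __name__ == "__main__":
    version = parse_cudnn_version(HEADER)
    print(version, cudnn_version_number(*version))
```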

5. Installing YOLO v3


git clone git://


cd darknet
sudo vim Makefile   # edit the Makefile




make -j4


error: 'CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT' undeclared (first use in this function); did you mean 'CUDNN_CONVOLUTION_FWD_ALGO_DIRECT'?


git clone git://


cd darknet/AlexeyAB
sudo vim Makefile   # edit the Makefile


OPENCV=1 # set to 1 if OpenCV is installed
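Instead of editing the Makefile by hand, the flags can be flipped with a short script. This is only a sketch: the post itself shows just OPENCV=1, and the GPU, CUDNN and CUDNN_HALF flags used here are assumed from the darknet startup log later in this post, which reports CUDA, cuDNN, CUDNN_HALF=1 and OpenCV. The function name `set_makefile_flags` and the sample text are illustrative.

```python
import re


def set_makefile_flags(text, flags):
    """Return Makefile text with each `NAME=value` assignment at line start updated."""
    for name, value in flags.items():
        pattern = re.compile(r"^%s\s*=.*$" % re.escape(name), re.MULTILINE)
        text, count = pattern.subn("%s=%s" % (name, value), text, count=1)
        if count == 0:
            raise KeyError("flag %s not found in Makefile" % name)
    return text


# Illustrative excerpt of the flag section at the top of a darknet Makefile.
SAMPLE = "GPU=0\nCUDNN=0\nCUDNN_HALF=0\nOPENCV=0\n"

updated = set_makefile_flags(
    SAMPLE, {"GPU": 1, "CUDNN": 1, "CUDNN_HALF": 1, "OPENCV": 1})

if __name__ == "__main__":
    print(updated)
```

To apply it to a real file, read the Makefile into a string, pass it through `set_makefile_flags`, and write the result back before running make.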


make install








./darknet detector test data/ data/yolov3.cfg data/yolov3.weights

9. Running YOLO v3-tiny:

nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detect cfg/yolov3-tiny.cfg yolov3-tiny.weights data/dog.jpg
 CUDA-version: 10020 (10020), cuDNN: 8.0.0, CUDNN_HALF=1, GPU count: 1  
 OpenCV version: 4.1.1
 0 : compute_capability = 530, cudnn_half = 0, GPU: NVIDIA Tegra X1 
net.optimized_memory = 0 
mini_batch = 1, batch = 1, time_steps = 1, train = 0 
   layer   filters  size/strd(dil)      input                output
   0 conv     16       3 x 3/ 1    416 x 416 x   3 ->  416 x 416 x  16 0.150 BF
   1 max                2x 2/ 2    416 x 416 x  16 ->  208 x 208 x  16 0.003 BF
   2 conv     32       3 x 3/ 1    208 x 208 x  16 ->  208 x 208 x  32 0.399 BF
   3 max                2x 2/ 2    208 x 208 x  32 ->  104 x 104 x  32 0.001 BF
   4 conv     64       3 x 3/ 1    104 x 104 x  32 ->  104 x 104 x  64 0.399 BF
   5 max                2x 2/ 2    104 x 104 x  64 ->   52 x  52 x  64 0.001 BF
   6 conv    128       3 x 3/ 1     52 x  52 x  64 ->   52 x  52 x 128 0.399 BF
   7 max                2x 2/ 2     52 x  52 x 128 ->   26 x  26 x 128 0.000 BF
   8 conv    256       3 x 3/ 1     26 x  26 x 128 ->   26 x  26 x 256 0.399 BF
   9 max                2x 2/ 2     26 x  26 x 256 ->   13 x  13 x 256 0.000 BF
  10 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  11 max                2x 2/ 1     13 x  13 x 512 ->   13 x  13 x 512 0.000 BF
  12 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  13 conv    256       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 256 0.089 BF
  14 conv    512       3 x 3/ 1     13 x  13 x 256 ->   13 x  13 x 512 0.399 BF
  15 conv    255       1 x 1/ 1     13 x  13 x 512 ->   13 x  13 x 255 0.044 BF
  16 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
  17 route  13 		                           ->   13 x  13 x 256 
  18 conv    128       1 x 1/ 1     13 x  13 x 256 ->   13 x  13 x 128 0.011 BF
  19 upsample                 2x    13 x  13 x 128 ->   26 x  26 x 128
  20 route  19 8 	                           ->   26 x  26 x 384 
  21 conv    256       3 x 3/ 1     26 x  26 x 384 ->   26 x  26 x 256 1.196 BF
  22 conv    255       1 x 1/ 1     26 x  26 x 256 ->   26 x  26 x 255 0.088 BF
  23 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 5.571 
avg_outputs = 341534 
 Allocate additional workspace_size = 0.00 MB 
Loading weights from yolov3-tiny.weights...
 seen 64, trained: 32013 K-images (500 Kilo-batches_64) 
Done! Loaded 24 layers from weights-file 
 Detection layer: 16 - type = 27 
 Detection layer: 23 - type = 27 
data/dog.jpg: Predicted in 2663.842000 milli-seconds.
dog: 81%
bicycle: 38%
car: 71%
truck: 41%
truck: 62%
car: 39%

These are the detection results produced by YOLO v3-tiny.

Running YOLO v3:

nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg
 CUDA-version: 10020 (10020), cuDNN: 8.0.0, CUDNN_HALF=1, GPU count: 1  
 OpenCV version: 4.1.1
 0 : compute_capability = 530, cudnn_half = 0, GPU: NVIDIA Tegra X1 
net.optimized_memory = 0 
mini_batch = 1, batch = 1, time_steps = 1, train = 0 
   layer   filters  size/strd(dil)      input                output
   0 conv     32       3 x 3/ 1    416 x 416 x   3 ->  416 x 416 x  32 0.299 BF
   1 conv     64       3 x 3/ 2    416 x 416 x  32 ->  208 x 208 x  64 1.595 BF
   2 conv     32       1 x 1/ 1    208 x 208 x  64 ->  208 x 208 x  32 0.177 BF
   3 conv     64       3 x 3/ 1    208 x 208 x  32 ->  208 x 208 x  64 1.595 BF
   4 Shortcut Layer: 1,  wt = 0, wn = 0, outputs: 208 x 208 x  64 0.003 BF
   5 conv    128       3 x 3/ 2    208 x 208 x  64 ->  104 x 104 x 128 1.595 BF
   6 conv     64       1 x 1/ 1    104 x 104 x 128 ->  104 x 104 x  64 0.177 BF
   7 conv    128       3 x 3/ 1    104 x 104 x  64 ->  104 x 104 x 128 1.595 BF
   8 Shortcut Layer: 5,  wt = 0, wn = 0, outputs: 104 x 104 x 128 0.001 BF
   9 conv     64       1 x 1/ 1    104 x 104 x 128 ->  104 x 104 x  64 0.177 BF
  10 conv    128       3 x 3/ 1    104 x 104 x  64 ->  104 x 104 x 128 1.595 BF
  11 Shortcut Layer: 8,  wt = 0, wn = 0, outputs: 104 x 104 x 128 0.001 BF
  12 conv    256       3 x 3/ 2    104 x 104 x 128 ->   52 x  52 x 256 1.595 BF
  13 conv    128       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 128 0.177 BF
  14 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
  15 Shortcut Layer: 12,  wt = 0, wn = 0, outputs:  52 x  52 x 256 0.001 BF
  16 conv    128       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 128 0.177 BF
  17 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
  18 Shortcut Layer: 15,  wt = 0, wn = 0, outputs:  52 x  52 x 256 0.001 BF
  19 conv    128       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 128 0.177 BF
  20 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
  21 Shortcut Layer: 18,  wt = 0, wn = 0, outputs:  52 x  52 x 256 0.001 BF
  22 conv    128       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 128 0.177 BF
  23 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
  24 Shortcut Layer: 21,  wt = 0, wn = 0, outputs:  52 x  52 x 256 0.001 BF
  25 conv    128       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 128 0.177 BF
  26 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
  27 Shortcut Layer: 24,  wt = 0, wn = 0, outputs:  52 x  52 x 256 0.001 BF
  28 conv    128       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 128 0.177 BF
  29 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
  30 Shortcut Layer: 27,  wt = 0, wn = 0, outputs:  52 x  52 x 256 0.001 BF
  31 conv    128       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 128 0.177 BF
  32 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
  33 Shortcut Layer: 30,  wt = 0, wn = 0, outputs:  52 x  52 x 256 0.001 BF
  34 conv    128       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 128 0.177 BF
  35 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
  36 Shortcut Layer: 33,  wt = 0, wn = 0, outputs:  52 x  52 x 256 0.001 BF
  37 conv    512       3 x 3/ 2     52 x  52 x 256 ->   26 x  26 x 512 1.595 BF
  38 conv    256       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 256 0.177 BF
  39 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  40 Shortcut Layer: 37,  wt = 0, wn = 0, outputs:  26 x  26 x 512 0.000 BF
  41 conv    256       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 256 0.177 BF
  42 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  43 Shortcut Layer: 40,  wt = 0, wn = 0, outputs:  26 x  26 x 512 0.000 BF
  44 conv    256       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 256 0.177 BF
  45 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  46 Shortcut Layer: 43,  wt = 0, wn = 0, outputs:  26 x  26 x 512 0.000 BF
  47 conv    256       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 256 0.177 BF
  48 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  49 Shortcut Layer: 46,  wt = 0, wn = 0, outputs:  26 x  26 x 512 0.000 BF
  50 conv    256       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 256 0.177 BF
  51 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  52 Shortcut Layer: 49,  wt = 0, wn = 0, outputs:  26 x  26 x 512 0.000 BF
  53 conv    256       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 256 0.177 BF
  54 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  55 Shortcut Layer: 52,  wt = 0, wn = 0, outputs:  26 x  26 x 512 0.000 BF
  56 conv    256       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 256 0.177 BF
  57 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  58 Shortcut Layer: 55,  wt = 0, wn = 0, outputs:  26 x  26 x 512 0.000 BF
  59 conv    256       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 256 0.177 BF
  60 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  61 Shortcut Layer: 58,  wt = 0, wn = 0, outputs:  26 x  26 x 512 0.000 BF
  62 conv   1024       3 x 3/ 2     26 x  26 x 512 ->   13 x  13 x1024 1.595 BF
  63 conv    512       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 512 0.177 BF
  64 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  65 Shortcut Layer: 62,  wt = 0, wn = 0, outputs:  13 x  13 x1024 0.000 BF
  66 conv    512       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 512 0.177 BF
  67 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  68 Shortcut Layer: 65,  wt = 0, wn = 0, outputs:  13 x  13 x1024 0.000 BF
  69 conv    512       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 512 0.177 BF
  70 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  71 Shortcut Layer: 68,  wt = 0, wn = 0, outputs:  13 x  13 x1024 0.000 BF
  72 conv    512       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 512 0.177 BF
  73 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  74 Shortcut Layer: 71,  wt = 0, wn = 0, outputs:  13 x  13 x1024 0.000 BF
  75 conv    512       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 512 0.177 BF
  76 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  77 conv    512       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 512 0.177 BF
  78 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  79 conv    512       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 512 0.177 BF
  80 conv   1024       3 x 3/ 1     13 x  13 x 512 ->   13 x  13 x1024 1.595 BF
  81 conv    255       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 255 0.088 BF
  82 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
  83 route  79 		                           ->   13 x  13 x 512 
  84 conv    256       1 x 1/ 1     13 x  13 x 512 ->   13 x  13 x 256 0.044 BF
  85 upsample                 2x    13 x  13 x 256 ->   26 x  26 x 256
  86 route  85 61 	                           ->   26 x  26 x 768 
  87 conv    256       1 x 1/ 1     26 x  26 x 768 ->   26 x  26 x 256 0.266 BF
  88 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  89 conv    256       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 256 0.177 BF
  90 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  91 conv    256       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 256 0.177 BF
  92 conv    512       3 x 3/ 1     26 x  26 x 256 ->   26 x  26 x 512 1.595 BF
  93 conv    255       1 x 1/ 1     26 x  26 x 512 ->   26 x  26 x 255 0.177 BF
  94 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
  95 route  91 		                           ->   26 x  26 x 256 
  96 conv    128       1 x 1/ 1     26 x  26 x 256 ->   26 x  26 x 128 0.044 BF
  97 upsample                 2x    26 x  26 x 128 ->   52 x  52 x 128
  98 route  97 36 	                           ->   52 x  52 x 384 
  99 conv    128       1 x 1/ 1     52 x  52 x 384 ->   52 x  52 x 128 0.266 BF
 100 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
 101 conv    128       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 128 0.177 BF
 102 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
 103 conv    128       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 128 0.177 BF
 104 conv    256       3 x 3/ 1     52 x  52 x 128 ->   52 x  52 x 256 1.595 BF
 105 conv    255       1 x 1/ 1     52 x  52 x 256 ->   52 x  52 x 255 0.353 BF
 106 yolo
[yolo] params: iou loss: mse (2), iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 65.879 
avg_outputs = 532444 
 Allocate additional workspace_size = 0.00 MB 
Loading weights from yolov3.weights...
 seen 64, trained: 32013 K-images (500 Kilo-batches_64) 
Done! Loaded 107 layers from weights-file 
 Detection layer: 82 - type = 27 
 Detection layer: 94 - type = 27 
 Detection layer: 106 - type = 27 
data/dog.jpg: Predicted in 7466.562000 milli-seconds.
bicycle: 99%
dog: 100%
truck: 94%

These are the detection results produced by YOLO v3.

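YOLO v3-tiny needed only about 2.66 s for the image, while YOLO v3 needed about 7.47 s. In terms of accuracy, however, YOLO v3 is far ahead: it reports the bicycle, dog, and truck with 94% to 100% confidence, whereas the YOLO v3-tiny confidences range from 38% to 81%.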
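The timing comparison can be reproduced from the logs. The sketch below pulls the "Predicted in ... milli-seconds" figure out of each darknet log line (both lines are copied from the runs above) and computes the slowdown factor; the helper name `predicted_ms` is made up for this example.

```python
import re


def predicted_ms(log_text):
    """Return the inference time in milliseconds from a darknet log, or None."""
    match = re.search(r"Predicted in ([\d.]+) milli-seconds", log_text)
    return float(match.group(1)) if match else None


# Lines copied from the two darknet runs shown in this post.
TINY_LOG = "data/dog.jpg: Predicted in 2663.842000 milli-seconds."
FULL_LOG = "data/dog.jpg: Predicted in 7466.562000 milli-seconds."

tiny_ms = predicted_ms(TINY_LOG)
full_ms = predicted_ms(FULL_LOG)
ratio = full_ms / tiny_ms

if __name__ == "__main__":
    print("YOLO v3-tiny: %.2f s" % (tiny_ms / 1000))
    print("YOLO v3:      %.2f s" % (full_ms / 1000))
    print("YOLO v3 / YOLO v3-tiny: %.1fx" % ratio)
```

On these two runs the full model is roughly 2.8 times slower than the tiny model. Both figures are single measurements, so they are only a rough indication of steady-state speed.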


sudo swapoff /swapfile

2. Resize the swap space to 4 GB

sudo dd if=/dev/zero of=/swapfile bs=1M count=4096

3. Format the file as a swap file

sudo mkswap /swapfile


sudo swapon /swapfile
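The arithmetic behind the dd call, and a way to confirm the result, can be sketched as follows. The block count is the target size divided by the block size (4 GiB with bs=1M gives count=4096). The meminfo excerpt in the example is hypothetical sample data, not output from the author's board, and both function names are illustrative.

```python
def dd_block_count(size_gib, block_mib=1):
    """Number of dd blocks of block_mib MiB needed for a size_gib GiB file."""
    return size_gib * 1024 // block_mib


def swap_total_gib(meminfo_text):
    """Return SwapTotal from /proc/meminfo-style text in GiB, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            kib = int(line.split()[1])
            return kib / (1024 * 1024)
    return None


# Hypothetical /proc/meminfo excerpt for a 4 GiB swap file (illustrative values).
SAMPLE_MEMINFO = "MemTotal:        4059712 kB\nSwapTotal:       4194300 kB\n"

if __name__ == "__main__":
    print("dd count for 4 GiB at bs=1M:", dd_block_count(4))
    print("swap in sample: %.2f GiB" % swap_total_gib(SAMPLE_MEMINFO))
```

On the device itself, the text of /proc/meminfo can be passed to `swap_total_gib` after running swapon to confirm that the new size is active.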


nvidia@nvidia-desktop:~$ df -hl
Filesystem      Size  Used Avail Use% Mounted on
/dev/mmcblk0p1   30G   22G  6.0G   79% /
none            1.8G     0  1.8G    0% /dev
tmpfs           2.0G  628M  1.4G   32% /dev/shm
tmpfs           2.0G   47M  1.9G    3% /run
tmpfs           5.0M  4.0K  5.0M    1% /run/lock
tmpfs           2.0G     0  2.0G    0% /sys/fs/cgroup
tmpfs           397M   12K  397M    1% /run/user/120
tmpfs           397M  108K  397M    1% /run/user/1000



(1) Testing a single image

nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg

(2) Testing multiple images

nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detect cfg/yolov3.cfg yolov3.weights
Enter Image Path: data/dog1.jpg
Enter Image Path: data/dog2.jpg

(3) Changing the threshold


nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg -thresh 0
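To see what a confidence threshold does to a result list, the printed detections can be filtered after the fact. This is a sketch of the idea only: it works on darknet's printed output (the YOLO v3-tiny results shown earlier in this post) and does not reproduce how -thresh is applied inside darknet. The function names are illustrative.

```python
import re


def parse_detections(log_text):
    """Return (label, confidence) pairs from darknet lines such as 'dog: 81%'."""
    pairs = re.findall(r"^([A-Za-z ]+): (\d+)%$", log_text, re.MULTILINE)
    return [(label, int(percent) / 100.0) for label, percent in pairs]


def keep_above(detections, thresh):
    """Keep only detections whose confidence is at least thresh (0.0 to 1.0)."""
    return [d for d in detections if d[1] >= thresh]


# Detections copied from the YOLO v3-tiny run shown in this post.
TINY_OUTPUT = """dog: 81%
bicycle: 38%
car: 71%
truck: 41%
truck: 62%
car: 39%"""

detections = parse_detections(TINY_OUTPUT)

if __name__ == "__main__":
    print("thresh 0.0:", keep_above(detections, 0.0))
    print("thresh 0.5:", keep_above(detections, 0.5))
```

With a threshold of 0.5 only the dog, one car, and one truck remain, while a threshold of 0 keeps every printed detection.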

(4) Live webcam

Real-time video detection requires Darknet built with CUDA and OpenCV. The -c option selects the camera index; OpenCV uses camera 0 by default:

nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detector demo cfg/ cfg/yolov3.cfg yolov3.weights

(5) Detection on a local video file

nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detector demo cfg/ cfg/yolov3.cfg yolov3.weights <video file>


nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detector test cfg/ cfg/yolov3.cfg yolov3.weights data/xxx.mp4

In an OpenCV environment:

nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ python3 --video=xxx.mp4



nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detector train cfg/ cfg/yolov3-voc.cfg darknet53.conv.74


nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detector train cfg/ cfg/yolov3.cfg darknet53.conv.74 -gpus 0,1,2,3


nvidia@nvidia-desktop:~/darknet/AlexeyAB/darknet$ ./darknet detector train cfg/ cfg/yolov3.cfg backup/yolov3.backup -gpus 0,1,2,3




  1. TensorRT (1): Introduction.

1. First, download the latest TensorRT 7 package from the official site: NVIDIA TensorRT 7.x Download


tar xzvf TensorRT-


export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/nvidia/TensorRT-


cd TensorRT-
pip3 install tensorrt-


nvidia@nvidia-desktop:~/TensorRT-$ pip install tensorrt- 
Defaulting to user installation because normal site-packages is not writeable
ERROR: tensorrt- is not a supported wheel on this platform.


nvidia@nvidia-desktop:~/TensorRT-$ pip3 --version
pip 20.2.2 from /home/nvidia/.local/lib/python3.6/site-packages/pip (python 3.6)


sudo update-alternatives --install /usr/bin/python python /usr/bin/python2 100

sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 150 


sudo update-alternatives --config python


nvidia@nvidia-desktop:~$ pip3 install TensorRT- 
Defaulting to user installation because normal site-packages is not writeable
ERROR: tensorrt- is not a supported wheel on this platform.
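pip prints "is not a supported wheel on this platform" when the tags encoded in the wheel filename do not match the running interpreter or CPU architecture. One possible cause here, stated as an assumption, is a wheel built for a different architecture than the Jetson's aarch64. The sketch below splits a wheel filename into its tags so they can be compared with the machine. Because the TensorRT wheel name is truncated in this post, the sample uses the TensorFlow wheel filename that appears later in the post; the function names are illustrative.

```python
import platform


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %s" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: %s" % filename)
    return tuple(parts[-3:])


def matches_this_machine(filename):
    """Rough check: does the wheel's platform tag mention this CPU architecture?"""
    platform_tag = wheel_tags(filename)[2]
    return platform_tag == "any" or platform_tag.endswith(platform.machine().lower())


# Filename taken from the TensorFlow installation step in this post.
SAMPLE_WHEEL = "tensorflow-1.15.3+nv20.7-cp36-cp36m-linux_aarch64.whl"

if __name__ == "__main__":
    print(wheel_tags(SAMPLE_WHEEL))
    print("matches this machine:", matches_this_machine(SAMPLE_WHEEL))
```

A wheel tagged cp36 and linux_aarch64 is the expected match for the Python 3.6 interpreter on the Jetson Nano shown in this post. The architecture check is deliberately rough and ignores the Python and ABI tags.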


nvidia@nvidia-desktop:~$ python
Python 3.6.9 (default, Jul 17 2020, 12:50:27) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import  tensorrt
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nvidia/.local/lib/python3.6/site-packages/tensorrt/", line 1, in <module>
    from .tensorrt import *
ImportError: /home/nvidia/.local/lib/python3.6/site-packages/tensorrt/ cannot open shared object file: No such file or directory


>>> import tensorflow
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.6/", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.6/", line 343, in load_dynamic
    return _load(spec)
ImportError: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.6/", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.6/", line 343, in load_dynamic
    return _load(spec)
ImportError: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.


for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.



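TensorFlow is an open-source software library for numerical computation using data-flow graphs. It is used for research in machine learning and deep neural networks, and the generality of the system makes it applicable in many other fields as well. TensorFlow lets people get started with neural networks quickly and greatly reduces the cost and difficulty of deep-learning development. It is flexible, portable, supports multiple languages, and is optimized for performance.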


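Search for TensorFlow in the search box on the TensorFlow website, download tf_gpu-2.2.0+nv20.6-py3, and install it. (The transcript below installs the tensorflow-1.15.3+nv20.7 wheel instead.)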

nvidia@nvidia-desktop:~$ sudo -H pip3 install ~/下载/tensorflow-1.15.3+nv20.7-cp36-cp36m-linux_aarch64.whl 
Processing ./下载/tensorflow-1.15.3+nv20.7-cp36-cp36m-linux_aarch64.whl
Requirement already satisfied: protobuf>=3.6.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Requirement already satisfied: keras-applications>=1.0.8 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Requirement already satisfied: wheel>=0.26; python_version >= "3" in /usr/lib/python3/dist-packages (from tensorflow==1.15.3+nv20.7)
Requirement already satisfied: astor>=0.6.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Collecting tensorflow-estimator==1.15.1 (from tensorflow==1.15.3+nv20.7)
  Using cached
Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Requirement already satisfied: absl-py>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Requirement already satisfied: gast==0.2.2 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Requirement already satisfied: google-pasta>=0.1.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Requirement already satisfied: wrapt>=1.11.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Requirement already satisfied: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Collecting opt-einsum>=2.3.2 (from tensorflow==1.15.3+nv20.7)
  Using cached
Collecting tensorboard<1.16.0,>=1.15.0 (from tensorflow==1.15.3+nv20.7)
  Using cached
Collecting numpy<2.0,>=1.16.0 (from tensorflow==1.15.3+nv20.7)
  Using cached
Requirement already satisfied: keras-preprocessing>=1.0.5 in /usr/local/lib/python3.6/dist-packages (from tensorflow==1.15.3+nv20.7)
Requirement already satisfied: setuptools in /usr/local/lib/python3.6/dist-packages (from protobuf>=3.6.1->tensorflow==1.15.3+nv20.7)
Requirement already satisfied: h5py in /usr/local/lib/python3.6/dist-packages (from keras-applications>=1.0.8->tensorflow==1.15.3+nv20.7)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.16.0,>=1.15.0->tensorflow==1.15.3+nv20.7)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.16.0,>=1.15.0->tensorflow==1.15.3+nv20.7)
Requirement already satisfied: importlib-metadata; python_version < "3.8" in /usr/local/lib/python3.6/dist-packages (from markdown>=2.6.8->tensorboard<1.16.0,>=1.15.0->tensorflow==1.15.3+nv20.7)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.6/dist-packages (from importlib-metadata; python_version < "3.8"->markdown>=2.6.8->tensorboard<1.16.0,>=1.15.0->tensorflow==1.15.3+nv20.7)
Building wheels for collected packages: numpy
  Running bdist_wheel for numpy ... error
  Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-1g5p2fm1/numpy/';f=getattr(tokenize, 'open', open)(__file__);'\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpca5hou2rpip-wheel- --python-tag cp36:
  Running from numpy source directory.
  Cythonizing sources
  Processing numpy/random/
  Processing numpy/random/_philox.pyx
  Traceback (most recent call last):
    File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 59, in process_pyx
      from Cython.Compiler.Version import version as cython_version
  ModuleNotFoundError: No module named 'Cython'
  During handling of the above exception, another exception occurred:
  Traceback (most recent call last):
    File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 235, in <module>
    File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 231, in main
    File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 222, in find_process_files
      process(root_dir, fromfile, tofile, function, hash_db)
    File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 188, in process
      processor_function(fromfile, tofile)
    File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 64, in process_pyx
      raise OSError('Cython needs to be installed in Python as a module')
  OSError: Cython needs to be installed in Python as a module
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/tmp/pip-build-1g5p2fm1/numpy/", line 499, in <module>
    File "/tmp/pip-build-1g5p2fm1/numpy/", line 479, in setup_package
    File "/tmp/pip-build-1g5p2fm1/numpy/", line 274, in generate_cython
      raise RuntimeError("Running cythonize failed!")
  RuntimeError: Running cythonize failed!
  Failed building wheel for numpy
  Running clean for numpy
  Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-1g5p2fm1/numpy/';f=getattr(tokenize, 'open', open)(__file__);'\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" clean --all:
  Running from numpy source directory.
  ` clean` is not supported, use one of the following instead:
    - `git clean -xdf` (cleans all files)
    - `git clean -Xdf` (cleans all versioned files, doesn't touch
                        files that aren't checked into the git repo)
  Add `--force` to your command to use it anyway if you must (unsupported).
  Failed cleaning build dir for numpy
Failed to build numpy
Installing collected packages: tensorflow-estimator, numpy, opt-einsum, tensorboard, tensorflow
  Found existing installation: tensorflow-estimator 1.13.0
    Uninstalling tensorflow-estimator-1.13.0:
      Successfully uninstalled tensorflow-estimator-1.13.0
  Found existing installation: numpy 1.13.3
    Not uninstalling numpy at /usr/lib/python3/dist-packages, outside environment /usr
  Running install for numpy ... error
    Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-1g5p2fm1/numpy/';f=getattr(tokenize, 'open', open)(__file__);'\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-d81_pz_r-record/install-record.txt --single-version-externally-managed --compile:
    Running from numpy source directory.
    Note: if you need reliable uninstall behavior, then install
    with pip instead of using ` install`:
      - `pip install .`       (from a git repo or downloaded source
      - `pip install numpy`   (last NumPy release on PyPi)
    Cythonizing sources
    numpy/random/ has not changed
    Processing numpy/random/_philox.pyx
    Traceback (most recent call last):
      File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 59, in process_pyx
        from Cython.Compiler.Version import version as cython_version
    ModuleNotFoundError: No module named 'Cython'
    During handling of the above exception, another exception occurred:
    Traceback (most recent call last):
      File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 235, in <module>
      File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 231, in main
      File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 222, in find_process_files
        process(root_dir, fromfile, tofile, function, hash_db)
      File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 188, in process
        processor_function(fromfile, tofile)
      File "/tmp/pip-build-1g5p2fm1/numpy/tools/", line 64, in process_pyx
        raise OSError('Cython needs to be installed in Python as a module')
    OSError: Cython needs to be installed in Python as a module
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-1g5p2fm1/numpy/", line 499, in <module>
      File "/tmp/pip-build-1g5p2fm1/numpy/", line 479, in setup_package
      File "/tmp/pip-build-1g5p2fm1/numpy/", line 274, in generate_cython
        raise RuntimeError("Running cythonize failed!")
    RuntimeError: Running cythonize failed!
  Can't rollback numpy, nothing uninstalled.
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-1g5p2fm1/numpy/';f=getattr(tokenize, 'open', open)(__file__);'\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-d81_pz_r-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-1g5p2fm1/numpy/

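1. RuntimeError: Running cythonize failed! The log above shows the underlying cause: `ModuleNotFoundError: No module named 'Cython'`. NumPy is being built from source and Cython is not installed for Python 3. Install Cython, then rerun the installation:

sudo apt install cython3

If the packaged Cython turns out to be too old for the NumPy version being built, `sudo pip3 install --upgrade Cython` installs a newer release.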
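Before rerunning the installation, it can help to confirm that Cython is visible to the same interpreter pip used (`/usr/bin/python3` in the log above). The following is a minimal sketch, not part of the original steps; the helper name `module_available` is hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if the running interpreter can locate the top-level module `name`."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Cython is what the NumPy source build needs; numpy is listed for reference.
    for mod in ("Cython", "numpy"):
        status = "found" if module_available(mod) else "MISSING"
        print(f"{mod}: {status}")
```

Run it with `python3` rather than `python`, so that the check applies to the interpreter that pip is building for.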

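2. ImportError: cannot open shared object file: No such file or directory. This error is caused by a mismatch between the TensorFlow and CUDA versions. The "Build from source" page on the TensorFlow website lists the CUDA and cuDNN versions that correspond to each TensorFlow release; check the installed versions against that table.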
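The version check can be sketched as a small script that parses the `release X.Y` field from `nvcc -V` output and compares it with the CUDA version a given TensorFlow release expects. This is only an illustration: the function names are hypothetical, and the required version passed in (`"10.0"` below) is a placeholder that has to be read from the TensorFlow table for the release actually being installed.

```python
import re


def cuda_release(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' field of `nvcc -V` output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def matches_required(nvcc_output, required):
    """Return True if the installed CUDA major.minor equals `required`, e.g. "10.0"."""
    major, minor = (int(part) for part in required.split("."))
    return cuda_release(nvcc_output) == (major, minor)


if __name__ == "__main__":
    # Sample line taken from the nvcc -V output shown earlier in this post.
    sample = "Cuda compilation tools, release 10.2, V10.2.89"
    print(cuda_release(sample))              # (10, 2)
    # "10.0" is a placeholder requirement, not a value from the TensorFlow table.
    print(matches_required(sample, "10.0"))  # False
```

On the device itself, the text passed to `cuda_release` would be the captured output of `nvcc -V`; a `None` result means the version line was not found, for example when `nvcc` is not on `PATH`.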




