This article compares NPU performance across several chips running an SSD model. The input is a 300x300 image in RGB format.
Items | RK3588 | RK3568 | RK1808 |
---|---|---|---|
ssd_inception_v2.rknn | 35 ~ 50 ms/frame | 150 ms/frame | 25 ms/frame |
FPS | 20 | 6 | 40 |
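
The latency and FPS rows in the table are two views of the same measurement: FPS is 1000 divided by the per-frame latency in milliseconds. A minimal Python sketch of that conversion, using the table's numbers (the function name `fps_from_latency_ms` is illustrative and not part of any Rockchip tool):

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert per-frame latency in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Latencies taken from the table (ms/frame).
table_latency_ms = {
    "RK3588 (fast end)": 35.0,
    "RK3588 (slow end)": 50.0,
    "RK3568": 150.0,
    "RK1808": 25.0,
}

for chip, ms in table_latency_ms.items():
    print(f"{chip}: {ms:.0f} ms/frame -> {fps_from_latency_ms(ms):.1f} FPS")
```

Under this conversion, the RK3588 figure of 20 FPS corresponds to the 50 ms end of its latency range, and 150 ms/frame works out to about 6.7 FPS, which the table lists as 6.
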
RK Benchmark results on RK3588
```
(venv) hua-chips@hua-chips:~/npu/rknpu2/examples/rknn_benchmark/install/rknn_benchmark_Linux$ ./rknn_benchmark ssd_inception_v2.rknn 10 7
rknn_api/rknnrt version: 1.4.0 (a10f100eb@2022-09-09T09:07:14), driver version: 0.7.2
total weight size: 35947840, total internal size: 7473600
total dma used size: 46678016
model input num: 1, output num: 2
input tensors:
index=0, name=Preprocessor/sub:0, n_dims=4, dims=[1, 300, 300, 3], n_elems=270000, size=270000, w_stride = 304, size_with_stride=273600, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=0, scale=0.007843
output tensors:
index=0, name=concat:0, n_dims=4, dims=[1, 1917, 1, 4], n_elems=7668, size=7668, w_stride = 0, size_with_stride=7668, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=50, scale=0.090787
index=1, name=concat_1:0, n_dims=4, dims=[1, 1917, 91, 1], n_elems=174447, size=174447, w_stride = 0, size_with_stride=174447, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=58, scale=0.140090
custom string:
Warmup ...
0: Elapse Time = 32.03ms, FPS = 31.23
1: Elapse Time = 31.38ms, FPS = 31.86
2: Elapse Time = 36.38ms, FPS = 27.48
3: Elapse Time = 43.76ms, FPS = 22.85
4: Elapse Time = 42.17ms, FPS = 23.71
5: Elapse Time = 42.99ms, FPS = 23.26
6: Elapse Time = 44.48ms, FPS = 22.48
7: Elapse Time = 44.33ms, FPS = 22.56
8: Elapse Time = 44.96ms, FPS = 22.24
9: Elapse Time = 47.09ms, FPS = 21.24
Begin perf ...
0: Elapse Time = 46.54ms, FPS = 21.49
1: Elapse Time = 47.15ms, FPS = 21.21
2: Elapse Time = 50.46ms, FPS = 19.82
3: Elapse Time = 49.54ms, FPS = 20.18
4: Elapse Time = 49.52ms, FPS = 20.19
5: Elapse Time = 49.88ms, FPS = 20.05
6: Elapse Time = 49.91ms, FPS = 20.03
7: Elapse Time = 47.64ms, FPS = 20.99
8: Elapse Time = 48.64ms, FPS = 20.56
9: Elapse Time = 50.26ms, FPS = 19.90
Avg FPS = 20.426
```
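
To cross-check the reported average, the perf-phase timings can be parsed from the log text and averaged. The sketch below is a hedged illustration: the helper names `perf_times_ms` and `average_fps` are hypothetical, and the assumption that the tool averages as total frames over total time is inferred from the numbers in the log, not from the benchmark's source code.

```python
import re

# Matches lines such as "3: Elapse Time = 49.54ms, FPS = 20.18".
ELAPSE_RE = re.compile(r"Elapse Time = ([0-9.]+)ms")


def perf_times_ms(log_text: str) -> list[float]:
    """Return the elapsed times (ms) logged after the 'Begin perf' marker."""
    _, marker, perf_part = log_text.partition("Begin perf")
    if not marker:
        raise ValueError("no 'Begin perf' marker found in log")
    return [float(m.group(1)) for m in ELAPSE_RE.finditer(perf_part)]


def average_fps(times_ms: list[float]) -> float:
    """Frames per second over the whole run: frame count / total seconds."""
    return 1000.0 * len(times_ms) / sum(times_ms)


# Excerpt of the RK3588 log: one warmup line (ignored) plus the perf section.
RK3588_LOG = """\
Warmup ...
0: Elapse Time = 32.03ms, FPS = 31.23
Begin perf ...
0: Elapse Time = 46.54ms, FPS = 21.49
1: Elapse Time = 47.15ms, FPS = 21.21
2: Elapse Time = 50.46ms, FPS = 19.82
3: Elapse Time = 49.54ms, FPS = 20.18
4: Elapse Time = 49.52ms, FPS = 20.19
5: Elapse Time = 49.88ms, FPS = 20.05
6: Elapse Time = 49.91ms, FPS = 20.03
7: Elapse Time = 47.64ms, FPS = 20.99
8: Elapse Time = 48.64ms, FPS = 20.56
9: Elapse Time = 50.26ms, FPS = 19.90
"""

times = perf_times_ms(RK3588_LOG)
print(f"{len(times)} runs, mean {sum(times) / len(times):.2f} ms, "
      f"avg FPS {average_fps(times):.2f}")
```

With the RK3588 timings this gives a mean of about 48.95 ms and roughly 20.43 FPS, in line with the reported `Avg FPS = 20.426`; the small difference is plausibly due to the logged times being rounded to two decimals.
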
RK Benchmark results on RK3568
```
(venv) hua-chips@hua-chips:~/rk/rknpu2/examples/rknn_benchmark/install/rknn_benchmark_Linux$ ./rknn_benchmark ssd_inception_v2.rknn 10
rknn_api/rknnrt version: 1.4.0 (a10f100eb@2022-09-09T09:07:14), driver version: 0.7.2
total weight size: 35009792, total internal size: 3873600
total dma used size: 40329216
model input num: 1, output num: 2
input tensors:
index=0, name=Preprocessor/sub:0, n_dims=4, dims=[1, 300, 300, 3], n_elems=270000, size=270000, w_stride = 304, size_with_stride=273600, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=0, scale=0.007843
output tensors:
index=0, name=concat:0, n_dims=4, dims=[1, 1917, 1, 4], n_elems=7668, size=7668, w_stride = 0, size_with_stride=7668, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=50, scale=0.090787
index=1, name=concat_1:0, n_dims=4, dims=[1, 1917, 91, 1], n_elems=174447, size=174447, w_stride = 0, size_with_stride=174447, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=58, scale=0.140090
custom string:
E RKNN: [07:29:48.241] rknn_set_core_mask: No implementation found for current platform!
Warmup ...
0: Elapse Time = 157.74ms, FPS = 6.34
1: Elapse Time = 176.21ms, FPS = 5.67
2: Elapse Time = 156.27ms, FPS = 6.40
3: Elapse Time = 143.38ms, FPS = 6.97
4: Elapse Time = 143.27ms, FPS = 6.98
5: Elapse Time = 143.13ms, FPS = 6.99
6: Elapse Time = 145.36ms, FPS = 6.88
7: Elapse Time = 146.58ms, FPS = 6.82
8: Elapse Time = 142.40ms, FPS = 7.02
9: Elapse Time = 145.87ms, FPS = 6.86
Begin perf ...
0: Elapse Time = 145.39ms, FPS = 6.88
1: Elapse Time = 146.34ms, FPS = 6.83
2: Elapse Time = 147.25ms, FPS = 6.79
3: Elapse Time = 143.72ms, FPS = 6.96
4: Elapse Time = 144.48ms, FPS = 6.92
5: Elapse Time = 145.42ms, FPS = 6.88
6: Elapse Time = 139.06ms, FPS = 7.19
7: Elapse Time = 173.21ms, FPS = 5.77
8: Elapse Time = 148.35ms, FPS = 6.74
9: Elapse Time = 143.97ms, FPS = 6.95
Avg FPS = 6.770
```
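
Both logs print the same tensor descriptors. As a rough sanity check, the sketch below recomputes `n_elems` and `size_with_stride` from `dims` and `w_stride`, and applies the usual affine dequantization, real = (q - zp) * scale, with the logged `zp` and `scale` values. The helper names are hypothetical, and reading `w_stride` as the padded row width in pixels is an assumption inferred from the logged numbers (300 * 304 * 3 = 273600), not taken from the RKNN documentation.

```python
from math import prod


def n_elems(dims: list[int]) -> int:
    """Number of elements in a tensor with the given dimensions."""
    return prod(dims)


def nhwc_size_with_stride(dims: list[int], w_stride: int) -> int:
    """Bytes of an INT8 NHWC tensor whose rows are padded to w_stride pixels.

    Assumption: w_stride replaces the width dimension when rows are padded.
    """
    n, h, _w, c = dims
    return n * h * w_stride * c


def dequantize(q: int, zp: int, scale: float) -> float:
    """Standard affine (asymmetric) dequantization of one INT8 value."""
    return (q - zp) * scale


# Values copied from the benchmark logs.
input_dims = [1, 300, 300, 3]
print("input n_elems:", n_elems(input_dims))
print("input size_with_stride:", nhwc_size_with_stride(input_dims, 304))
print("concat:0 n_elems:", n_elems([1, 1917, 1, 4]))
print("concat_1:0 n_elems:", n_elems([1, 1917, 91, 1]))

# The input scale 0.007843 is close to 1/127.5, so INT8 values in
# [-128, 127] map to roughly [-1, 1].
print("input q=127 ->", dequantize(127, zp=0, scale=0.007843))

# A raw output value equal to the zero point dequantizes to 0.0.
print("concat:0 q=50 ->", dequantize(50, zp=50, scale=0.090787))
```

The recomputed element counts agree with the `n_elems` fields printed in the logs (270000, 7668 and 174447), and the padded input size agrees with `size_with_stride=273600`.
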
For details, see 华芯创辉: www hua-chip com