Course link: https://edu.csdn.net/course/play/28807/427189?utm_source=blogtoedu
Performance Evaluation and Hardware Selection
- Performance metrics:
  - Throughput (FPS, inferences per second)
  - Latency
  - Efficiency (frames/sec/watt, frames/sec/$)
- Factors that affect performance:
  - Network topology and parameters: ResNet-50, SqueezeNet
  - Target device architecture: CPU, GPU, FPGA, AI accelerators
  - Precision (data format: FP16, FP32), Xeon AVX-512, VNNI, DL Boost (3x)
  - Batching
  - Synchronous vs. asynchronous execution (Sync/Async)
  - CPU throughput mode: streams, threads, number of infer requests (#ireq)
- Measuring performance with OpenVINO
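The three metrics listed above are simple to compute from per-frame timings. A minimal sketch (the helper names are illustrative, not part of any OpenVINO API):

```python
# Illustrative helpers for the metrics above; not part of OpenVINO.

def throughput_fps(num_frames, total_seconds):
    """Throughput: frames processed per second."""
    return num_frames / total_seconds

def avg_latency_ms(per_frame_seconds):
    """Latency: mean time per inference, in milliseconds."""
    return 1000.0 * sum(per_frame_seconds) / len(per_frame_seconds)

def efficiency_fps_per_watt(fps, watts):
    """Efficiency: frames/sec per watt (frames/sec/$ is analogous)."""
    return fps / watts

latencies = [0.002, 0.0063, 0.0011, 0.0013]  # seconds per frame
fps = throughput_fps(len(latencies), sum(latencies))
print('FPS: %.1f, avg latency: %.2f ms' % (fps, avg_latency_ms(latencies)))
```

Note that high throughput and low latency can pull in opposite directions: batching and multiple parallel infer requests raise FPS but increase the time any single frame waits.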
An OpenVINO performance-measurement example
Recording time with the time() function
Add timing points before and after inference with this_time = time.time(), subtract the two values to obtain the inference time, then print the result with a print/write call.
start_time = time.time()
exec_net.start_async(request_id=next_request_id, inputs=feed_dict)
# start_async() returns immediately; wait for the request to complete
# so the measurement covers the inference itself, not just submission.
exec_net.requests[next_request_id].wait(-1)
end_time = time.time()
exetime = end_time - start_time
sys.stdout.write('%.5s' % exetime)  # '%.5s' keeps the first 5 characters; '%.5f' would give 5 decimal places
python3 add-perf-object-detection.py
Frame 0 0.002
Frame 1 0.0063 3 3 3 3 3 3 3 3 3 3 3
Frame 2 0.0011 3 3 3 3 3 3 3 3 3 3 3 3
Frame 3 0.0013 3 3 3 3 3 3 3 3 3 3
Frame 4 0.0011 3 3 3 3 3 3 3 3 3
Frame 5 0.0011 3 3 3 3 3 3 3 3 3 3 3 3 3
Frame 6 0.0003 3 3 3 3 3 3 3 3 3 3 3
Frame 7 0.0011 3 3 3 3 3 3 3 3 3 3 3 3 3 3
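Per-frame numbers like those above fluctuate from frame to frame (the trailing digits are the detected class IDs printed by the script). Overall throughput is more stable when the whole loop is timed once. A sketch, where infer() is a stand-in for the real inference call:

```python
import time

def measure_fps(infer, frames):
    """Time the whole loop and report overall throughput.

    `infer` is a stand-in for the real inference call; any callable works.
    """
    start = time.time()
    for frame in frames:
        infer(frame)
    elapsed = time.time() - start
    return len(frames) / elapsed

# Example with a dummy 1 ms "inference":
fps = measure_fps(lambda f: time.sleep(0.001), range(50))
print('overall FPS: %.1f' % fps)
```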
Testing with benchmark_app.py
First, benchmark the ssd-mobilenet model to obtain its performance figures on the local device.
python3 benchmark_app.py -m models/ssd-mobilenet.xml -i images/ -t 20
[Step 1/11] Parsing and validating input arguments
[ WARNING ] -nstreams default value is determined automatically for a device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine:
API version............. 2.1.2020.3.0-3467-15f2c61a-releases/2020/3
[ INFO ] Device info
CPU
MKLDNNPlugin............ version 2.1
Build................... 2020.3.0-3467-15f2c61a-releases/2020/3
[Step 3/11] Reading the Intermediate Representation network
[ INFO ] Read network took 1177.84 ms
[Step 4/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 5/11] Configuring input of the model
[Step 6/11] Setting device configuration
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 613.86 ms
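The "took ... ms" lines above are one-time costs (model read and load), separate from the steady-state throughput benchmark_app reports later. When comparing runs, they can be pulled out of the console output; a sketch, assuming the log format shown above:

```python
import re

LOG = """\
[ INFO ] Read network took 1177.84 ms
[ INFO ] Load network took 613.86 ms
"""

# Extract "<step> took <ms> ms" pairs from benchmark_app's console output.
def parse_took(log_text):
    pattern = re.compile(r'\[ INFO \] (.+?) took ([\d.]+) ms')
    return {step: float(ms) for step, ms in pattern.findall(log_text)}

print(parse_took(LOG))
# {'Read network': 1177.84, 'Load network': 613.86}
```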