The Hailo Dataflow Compiler compiles deep learning models into HEF files that can run on the Hailo-8.
Installation requirement: more than 16 GB of system RAM.
To make installation easier, you can rent a cloud instance on featurize and install everything there.
Producing an HEF file takes roughly three steps:
1. Build a deep learning model and convert it to an HAR file.
2. Quantize the model using a representative (calibration) dataset, producing a quantized HAR file.
3. Compile the quantized HAR file into an HEF file.
First, update the package index and install the required system dependencies:
sudo apt-get update  # avoids install errors later
sudo apt-get install python3-dev graphviz libgraphviz-dev pkg-config
Then install hailo_dataflow_compiler:
pip install hailo_dataflow_compiler-3.27.0-py3-none-linux_x86_64.whl
Once installation succeeds, compile a model following the user guide.
from hailo_sdk_client import ClientRunner
import tensorflow as tf

# Building a simple Keras model
def build_small_example_net():
    inputs = tf.keras.Input(shape=(24, 24, 96), name="img")
    x = tf.keras.layers.Conv2D(24, 1, name='conv1')(inputs)
    x = tf.keras.layers.BatchNormalization(momentum=0.9, name='bn1')(x)
    outputs = tf.keras.layers.ReLU(max_value=6.0, name='relu1')(x)
    model = tf.keras.Model(inputs, outputs, name="small_example_net")
    return model
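As a sanity check, the toy network's parameter count can be worked out by hand; this sketch reproduces the total that Keras' model.count_params() would report:

```python
# Hand-counting the toy network's parameters:
in_ch, out_ch, k = 96, 24, 1
conv_params = k * k * in_ch * out_ch + out_ch  # 1x1 kernel weights + bias = 2328
bn_params = 4 * out_ch                         # gamma, beta, moving mean, moving variance = 96
total_params = conv_params + bn_params
print(total_params)  # 2424
```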
# Converting the model to TFLite
model = build_small_example_net()
model_name = 'small_example'
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops
    tf.lite.OpsSet.SELECT_TF_OPS     # enable TensorFlow ops
]
tflite_model = converter.convert()  # may emit warnings in a Jupyter notebook; these are harmless
tflite_model_path = 'small_example.tflite'
with tf.io.gfile.GFile(tflite_model_path, 'wb') as f:
    f.write(tflite_model)
chosen_hw_arch = 'hailo8'
# Parsing the model to Hailo format
runner = ClientRunner(hw_arch=chosen_hw_arch)
hn, npz = runner.translate_tf_model(tflite_model_path, model_name)
hailo_model_har_name = f'{model_name}_hailo_model.har'
runner.save_har(hailo_model_har_name)
# Visualize the saved model graph
from IPython.display import SVG
!hailo visualizer {hailo_model_har_name} --no-browser
# The calibration flow uses a representative dataset
import numpy as np
calib_dataset = np.random.randint(low=0, high=10, size=(50, 24, 24, 96))
np.save('calib_set.npy', calib_dataset)
runner.optimize(calib_dataset)
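Note that random integers are only a placeholder for demonstration: the optimizer warns (see the log below) when fewer than the recommended 1024 calibration entries are supplied, and reduces the optimization level. A minimal sketch of padding a small sample set up to 1024 entries by tiling; distinct real preprocessed inputs are always preferable to duplicates:

```python
import numpy as np

# Stand-in for real preprocessed inputs with the network's (N, H, W, C) shape
real_samples = np.random.rand(50, 24, 24, 96).astype(np.float32)
reps = -(-1024 // len(real_samples))                 # ceiling division
calib_dataset = np.tile(real_samples, (reps, 1, 1, 1))[:1024]
np.save('calib_set.npy', calib_dataset)
print(calib_dataset.shape)  # (1024, 24, 24, 96)
```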
Output printed by runner.optimize:
[info] Starting Model Optimization
[warning] Reducing optimization level to 1 (the accuracy won't be optimized and compression won't be used) because there's less data than the recommended amount (1024)
[info] Model received quantization params from the hn
[info] Starting Mixed Precision
[info] Mixed Precision is done (completion time is 00:00:00.31)
[info] create_layer_norm skipped
[info] Starting Stats Collector
[info] Using dataset with 50 entries for calibration
Calibration: 100%|██████████| 50/50 [00:05<00:00, 9.34entries/s]
[info] Stats Collector is done (completion time is 00:00:05.47)
[info] No shifts available for layer small_example/conv1/conv_op, using max shift instead. delta=0.10578701765651832
[info] Starting Bias Correction
[info] The algorithm Bias Correction will use up to 0.02 GB of storage space
[info] Using dataset with 50 entries for Bias Correction
Bias Correction: 100%|██████████| 1/1 [00:02<00:00, 2.59s/blocks, Layers=['small_example/conv1_output_0']]
[info] Bias Correction is done (completion time is 00:00:02.89)
[info] Adaround skipped
[info] Fine Tune skipped
[info] Starting Layer Noise Analysis
Full Quant Analysis: 100%|██████████| 2/2 [00:03<00:00, 1.51s/iterations]
[info] Layer Noise Analysis is done (completion time is 00:00:03.77)
[info] Output layers signal-to-noise ratio (SNR): measures the quantization noise (higher is better)
[info] small_example/output_layer1 SNR: 47.17 dB
[info] Model Optimization is done
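The reported SNR compares output signal power to quantization noise power in decibels, so a higher value means less quantization error. A small numpy illustration of the idea (not Hailo's actual measurement algorithm), quantizing a float tensor to 8 bits and computing the resulting SNR:

```python
import numpy as np

# SNR in dB: 10 * log10(signal_power / noise_power)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 6.0, size=(24, 24, 24)).astype(np.float32)  # e.g. a ReLU6 output

scale = x.max() / 255.0                      # simple per-tensor scale
x_q = np.round(x / scale).astype(np.uint8)   # quantize to 8 bits
x_dq = x_q.astype(np.float32) * scale        # dequantize

noise = x - x_dq
snr_db = 10 * np.log10(np.sum(x**2) / np.sum(noise**2))
print(f"{snr_db:.1f} dB")
```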
Converting to a quantized HAR file
# Save the result state to a Quantized HAR file
quantized_model_har_path = f'{model_name}_quantized_model.har'
runner.save_har(quantized_model_har_path)
Converting to an HEF file
from hailo_sdk_client import ClientRunner
runner = ClientRunner(har=quantized_model_har_path)
hef = runner.compile()
file_name = f'{model_name}.hef'
with open(file_name, 'wb') as f:
    f.write(hef)
Model profiling
har_path = f'{model_name}_compiled_model.har'
runner.save_har(har_path)
!hailo profiler {har_path}
Profiler output:
[info] Saved HAR to: /home/featurize/small_example_compiled_model.har
[info] Current Time: 00:52:07, 04/11/24
[info] CPU: Architecture: x86_64, Model: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, Number Of Cores: 6, Utilization: 0.7%
[info] Memory: Total: 25GB, Available: 22GB
[info] System info: OS: Linux, Kernel: 5.4.0-91-generic
[info] Hailo DFC Version: 3.27.0
[info] HailoRT Version: Not Installed
[info] PCIe: No Hailo PCIe device was found
[info] Running `hailo profiler small_example_compiled_model.har`
[info] Running profile for small_example in state compiled_model
[info]
Model Details
-------------------------------- ----------
Input Tensors Shapes 24x24x96
Operations per Input Tensor 0.00 GOPs
Operations per Input Tensor 0.00 GMACs
Pure Operations per Input Tensor 0.00 GOPs
Pure Operations per Input Tensor 0.00 GMACs
Model Parameters 0.01 M
-------------------------------- ----------
Profiler Input Settings
----------------- -----------------
Optimization Goal Reach Highest FPS
Profiler Mode Compiled
----------------- -----------------
Performance Summary
---------------------- --------------------
Number of Devices 1
Number of Contexts 1
Throughput 57463.01 FPS
Latency 0.02 ms
Operations per Second 152.52 GOP/s
MACs per Second 77.05 GMAC/s
Total Input Bandwidth 2.96 Gigabytes/sec
Total Output Bandwidth 757.57 Megabytes/sec
Context Switch Configs N/A
---------------------- --------------------
[info] Saved Profiler HTML Report to: /home/featurize/small_example_compiled_model.html
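The throughput figures in the report are consistent with the layer shapes. A quick back-of-envelope check (assuming uint8 input/output tensors and that only the 1x1 convolution contributes MACs; the reported bandwidths appear to use binary GiB/MiB units):

```python
# Values copied from the profiler report above
fps = 57463.01
h, w, cin, cout = 24, 24, 96, 24  # input 24x24x96, conv1 has 24 output channels

macs_per_frame = h * w * cin * cout       # 1x1 conv: 1,327,104 MACs per frame
gmacs = macs_per_frame * fps / 1e9        # close to the reported 77.05 GMAC/s

in_bw_gib = h * w * cin * fps / 2**30     # uint8 input bandwidth
out_bw_mib = h * w * cout * fps / 2**20   # uint8 output bandwidth
print(round(gmacs, 1), round(in_bw_gib, 2), round(out_bw_mib, 1))  # 76.3 2.96 757.6
```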
The generated report file is shown below.
An analysis of running the model on the device itself will follow in a later post...
You can also refer to my earlier articles:
Hailo-8 Accelerator Card: First Test Notes
[Hailo-8 Accelerator Card Inference Test]