Hailo-8 Accelerator Card Inference Test

Latest recommended article published on 2024-06-19 11:42:49

花月mmc

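Category column: Edge Device Deep Learning. Tags: artificial intelligence, edge computing, smart hardware, deep learning, embedded hardware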

Article link: https://blog.csdn.net/stoat04/article/details/134688597

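Hailo-8 Accelerator Card Inference

The Hailo-8 accelerator card is rated at 26 TOPS and plugs into any device with a PCIe expansion slot. This post tests the card through a practical usage workflow.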

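Hailo-8 Toolchain

1. Hailo Dataflow Compiler (converts and compiles models into the Hailo binary format)
2. HailoRT (runtime environment and driver for running networks and interacting with Hailo devices)
3. Model Zoo (pre-trained models to run and evaluate on Hailo devices)
4. TAPPAS (deployment framework, examples, and multi-network pipelines)

Although each product can be installed separately, Hailo releases a software suite every quarter in which all product versions are aligned. Using the Hailo AI Software Suite therefore ensures the best compatibility.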

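Deploying the full development toolchain requires a fairly powerful machine, and I have not gotten it working yet.
For this test I therefore install only the standalone packages instead of the full development suite.
HailoRT and pyHailoRT serve as the main device-side development tools.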

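Hailo-8 Inference Workflow (Python inference tutorial)

Besides Python, there are also C/C++ inference examples. They are more numerous than the Python ones but harder to learn, and will be covered later.

The Hailo-8 card needs only an HEF file to run inference; no TensorFlow or similar environment has to be set up.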
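
Before running the examples, it can help to confirm that these prerequisites are in place. The sketch below is only an illustration: `check_prerequisites` is a hypothetical helper, not part of HailoRT, and it assumes pyHailoRT is importable under the module name `hailo_platform` (the name used in the import statements in this post). It reports whether the module can be found and whether the HEF file exists, without touching the device.

```python
import importlib.util
from pathlib import Path


def check_prerequisites(hef_path):
    """Hypothetical helper: report whether pyHailoRT and the HEF file are present."""
    return {
        # find_spec returns None when a top-level module is not installed
        "pyhailort_installed": importlib.util.find_spec("hailo_platform") is not None,
        "hef_exists": Path(hef_path).is_file(),
    }


if __name__ == "__main__":
    # Same relative path layout as the example in this post
    print(check_prerequisites("../hefs/resnet_v1_18.hef"))
```

If either value is False, the inference code will fail at import time or when loading the HEF, so this check gives an earlier and clearer signal.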

Single-process inference

import numpy as np
from multiprocessing import Process
from hailo_platform import (HEF, VDevice, HailoStreamInterface, InferVStreams, ConfigureParams,
                            InputVStreamParams, OutputVStreamParams, InputVStreams, OutputVStreams,
                            FormatType)

# The target can be used as a context manager ("with" statement) to ensure it is released on time.
# That is avoided here for simplicity.
target = VDevice()

# Load the compiled HEF onto the device:
model_name = 'resnet_v1_18'
hef_path = '../hefs/{}.hef'.format(model_name)
hef = HEF(hef_path)

# Configure the network groups
configure_params = ConfigureParams.create_from_hef(hef=hef, interface=HailoStreamInterface.PCIe)
network_groups = target.configure(hef, configure_params)
network_group = network_groups[0]
network_group_params = network_group.create_params()

# Create the input and output virtual stream parameters
input_vstreams_params = InputVStreamParams.make(network_group, format_type=FormatType.FLOAT32)
output_vstreams_params = OutputVStreamParams.make(network_group, format_type=FormatType.UINT8)

# Define the dataset parameters
input_vstream_info = hef.get_input_vstream_infos()[0]
output_vstream_info = hef.get_output_vstream_infos()[0]
image_height, image_width, channels = input_vstream_info.shape
num_of_images = 10
low, high = 2, 20

# Generate a random dataset with shape (num_of_images, height, width, channels)
dataset = np.random.randint(low, high, (num_of_images, image_height, image_width, channels)).astype(np.float32)

Run inference on the model, then print the output shape:

# Inference
with InferVStreams(network_group, input_vstreams_params, output_vstreams_params) as infer_pipeline:
    # Map the input vstream name to the whole batch of images
    input_data = {input_vstream_info.name: dataset}
    # The network group must be active while infer() runs
    with network_group.activate(network_group_params):
        infer_results = infer_pipeline.infer(input_data)
        print('Stream output shape is {}'.format(infer_results[output_vstream_info.name].shape))

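Note: the inference above actually runs in multiple processes, and its speed depends on the number of inputs. In practice only the inference portion is timed, and the more data is submitted, the higher the throughput (larger FPS).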
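
One plausible reading of this note is that each inference call carries a fixed cost (pipeline setup, activation, transfers) that is spread over more frames as the batch grows. That is an assumption, not something measured in this post. The sketch below models it with made-up numbers and adds a small timing helper. `modeled_fps`, `measure_fps`, and the stand-in `fake_infer` are hypothetical names, and no Hailo API is called.

```python
import time


def modeled_fps(num_frames, setup_s, per_frame_s):
    """FPS under the assumed cost model: total time = setup + frames * per-frame time."""
    return num_frames / (setup_s + num_frames * per_frame_s)


def measure_fps(infer_fn, frames):
    """Time a single call to infer_fn(frames) and return frames per second."""
    start = time.perf_counter()
    infer_fn(frames)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed


def fake_infer(frames):
    """Stand-in for a real inference call: fixed overhead plus a per-frame cost."""
    time.sleep(0.01 + 0.0005 * len(frames))


if __name__ == "__main__":
    # Illustrative values only: 50 ms setup, 1 ms per frame
    for n in (1, 10, 100, 1000):
        print(n, round(modeled_fps(n, 0.05, 0.001), 1))
    print(round(measure_fps(fake_infer, list(range(100))), 1))
```

On real hardware, `infer_fn` could be a small wrapper around `infer_pipeline.infer` and `frames` the `dataset` array, so that only the inference call is timed, as the note describes.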

Streaming inference

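Note: this mode is not supported on Windows.
Here we do not use infer. Instead, we use a send-and-receive model, in which the send function and the receive function run in different processes.
1. Define the send and receive functions: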

def send(configured_network, num_frames):
	# Wait (timeout in milliseconds) until the network group is activated
	configured_network.wait_for_activation(1000)
	vstreams_params = InputVStreamParams.make(configured_network)
	with InputVStreams(configured_network, vstreams_params) as vstreams:
		# One single-frame buffer per input vstream
		vstream_to_buffer = {vstream: np.ndarray([1] + list(vstream.shape), dtype=vstream.dtype) for vstream in vstreams}
		# Sending must happen while the vstreams are still open
		for _ in range(num_frames):
			for vstream, buff in vstream_to_buffer.items():
				vstream.send(buff)

def recv(configured_network, vstreams_params, num_frames):
	configured_network.wait_for_activation(1000)
	with OutputVStreams(configured_network, vstreams_params) as vstreams:
		# Read one result per frame from every output vstream
		for _ in range(num_frames):
			for vstream in vstreams:
				data = vstream.recv()

def recv_all(configured_network, num_frames):
	vstreams_params_groups = OutputVStreamParams.make_groups(configured_network)
	recv_procs = []
	for vstreams_params in vstreams_params_groups:
		proc = Process(target=recv, args=(configured_network, vstreams_params, num_frames))
		proc.start()
		recv_procs.append(proc)
	for proc in recv_procs:
		proc.join()

2. Define the number of frames to stream, define the processes, create the target, and run the processes:

# Define the amount of frames to stream
num_of_frames = 1000
send_process = Process(target=send, args=(network_group, num_of_frames))
recv_process = Process(target=recv_all, args=(network_group, num_of_frames))
recv_process.start()
send_process.start()
print('Starting streaming (hef=\'{}\', num_of_frames={})'.format(model_name, num_of_frames))
with network_group.activate(network_group_params):
	send_process.join()
	recv_process.join()
print('Done')
target.release()
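
The send/receive structure can also be tried without a device. In the sketch below a bounded `queue.Queue` stands in for the Hailo streams and threads stand in for the separate processes, so this is a swapped-in, hardware-free analogue rather than the HailoRT flow itself. `fake_send`, `fake_recv`, and `run_stream` are illustrative names.

```python
import queue
import threading


def fake_send(stream, num_frames):
    """Producer: push num_frames dummy frames into the stand-in stream."""
    for i in range(num_frames):
        stream.put(i)  # blocks when the bounded queue is full


def fake_recv(stream, num_frames, results):
    """Consumer: pull exactly num_frames frames and record them."""
    for _ in range(num_frames):
        results.append(stream.get())


def run_stream(num_frames, depth=8):
    """Run sender and receiver concurrently and return the received frames."""
    stream = queue.Queue(maxsize=depth)
    results = []
    recv_thread = threading.Thread(target=fake_recv, args=(stream, num_frames, results))
    send_thread = threading.Thread(target=fake_send, args=(stream, num_frames))
    # Start the receiver first, mirroring the order used in the post
    recv_thread.start()
    send_thread.start()
    send_thread.join()
    recv_thread.join()
    return results


if __name__ == "__main__":
    frames = run_stream(1000)
    print(len(frames))
```

The bounded queue makes the sender wait whenever the receiver falls behind, which is the reason the two sides are run concurrently rather than one after the other.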