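I bought an EmbedFire (野火) LubanCat 5. These are my notes on converting a trained model so it can run inference on the board:
Official documentation I followed: 1. Garbage Detection and Recognition - [EmbedFire] Embedded AI Application Development Practical Guide, Based on LubanCat-RK Series Boards (embedfire.com)
The steps as I worked them out: ① Convert the trained .pt file to ONNX, using the project the official docs specify.
Project: GitHub - airockchip/ultralytics_yolov8: NEW - YOLOv8 🚀 in PyTorch > ONNX > CoreML > TFLite
Once the environment is set up, follow the official docs and edit the model path in ultralytics/cfg/default.yaml (model: <file path>). You can write an absolute path here, or run export PYTHONPATH=./ and put the file in the project root, in which case the file name alone is enough. Adjust the other parameters as you see fit; they do not affect the model structure.
Run python ultralytics/engine/exporter.py to export the ONNX file.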
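② Refer to 3. RKNN Toolkit2 Introduction - [EmbedFire] Embedded AI Application Development Practical Guide, Based on LubanCat-RK Series Boards (embedfire.com)
***Note***: what the official docs say here amounts to this: do not pull the latest code from git; stick strictly to the version packages the vendor provides. Otherwise you end up with bugs whose cause you cannot find, and at that point you should ask yourself whether it is a version problem. The Toolkit2 and Lite2 versions also have to match, otherwise the model you converted will not run inference. I chose version 1.5.0 to play it safe.
Set up the environment for converting ONNX to RKNN. One point of confusion worth recording: RKNN-Toolkit2 is the package used on Linux x86 to convert ONNX files to RKNN files, while RKNN Toolkit Lite2 is the toolkit used for inference on the RK3588 board. The docs do state this clearly, but when you are just starting out it is easy to lump the two together and lose track of the difference.
Installation is simple. Find the whl files under packages and install the one that matches your Python version: pip3 install packages/rknn_toolkit2-1.5.0+1fa95b5c-cp310-cp310-linux_x86_64.whl (this is the toolkit that converts ONNX on Linux x86). Without a mirror the install can be slow, or the Tsinghua mirror may be fast but missing a required package. In that case add -i https://mirror.baidu.com/pypi/simple . One pitfall: there are many requirements txt files in there; ignore them and install the whl directly. I wasted a long time on those requirements files before realizing the whl can simply be installed on its own. Follow the docs strictly and you avoid detours.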
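To make the "match your Python version" and "Toolkit2 and Lite2 versions must match" checks concrete, here is a minimal sketch that only parses wheel file names. The helper names (parse_wheel, current_python_tag, versions_match) are hypothetical and not part of any RKNN tool; the sketch assumes the file names follow the usual name-version-pythontag pattern.

```python
import re
import sys

# Hypothetical helper: split a wheel file name into (package, version, python tag).
WHEEL_RE = re.compile(r"^(?P<name>[A-Za-z0-9_]+)-(?P<version>[0-9][^-]*)-(?P<py>cp\d+)-")


def parse_wheel(filename):
    m = WHEEL_RE.match(filename)
    if m is None:
        raise ValueError(f"unrecognized wheel name: {filename}")
    # Drop a local build suffix such as "+1fa95b5c" before comparing versions.
    version = m.group("version").split("+")[0]
    return m.group("name"), version, m.group("py")


def current_python_tag(version_info=sys.version_info):
    # For example (3, 10) gives "cp310".
    return f"cp{version_info[0]}{version_info[1]}"


def versions_match(toolkit_wheel, lite_wheel):
    # Compare only the release version, not the Python tag or architecture.
    return parse_wheel(toolkit_wheel)[1] == parse_wheel(lite_wheel)[1]


toolkit = "rknn_toolkit2-1.5.0+1fa95b5c-cp310-cp310-linux_x86_64.whl"
lite = "rknn_toolkit_lite2-1.5.0-cp310-cp310-linux_aarch64.whl"
print(parse_wheel(toolkit), versions_match(toolkit, lite))
```

Comparing parse_wheel(...)[2] with current_python_tag() on each machine tells you whether a given wheel is the right one for that interpreter.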
***Note: at this point you should be working on an x86 Linux machine. Don't take the wrong path.***
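Since mixing up the two machines is the easy mistake here, a small guard at the top of a script can help. This is only a sketch: the mapping below reflects the split described in this post (conversion on an x86_64 PC, inference on the aarch64 board), and expected_package is a hypothetical helper name.

```python
import platform

# Assumed mapping for this write-up: convert on x86_64, run inference on aarch64.
PACKAGE_BY_MACHINE = {
    "x86_64": "rknn.api (RKNN-Toolkit2, converts ONNX to RKNN)",
    "aarch64": "rknnlite.api (RKNN Toolkit Lite2, runs inference on the board)",
}


def expected_package(machine=None):
    # platform.machine() reports the CPU architecture of the current host.
    machine = machine or platform.machine()
    return PACKAGE_BY_MACHINE.get(machine, f"unknown machine type: {machine}")


if __name__ == "__main__":
    print(expected_package())
```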
③ Refer to the https://github.com/airockchip/rknn_model_zoo project.
My project uses yolov8-seg, so the conversion code I referred to is: rknn_model_zoo/examples/yolov8_seg/python at main · airockchip/rknn_model_zoo (github.com)
import sys
from rknn.api import RKNN

DATASET_PATH = '../../../datasets/tomo/tomo_img.txt'  # txt listing validation-set image paths, made following the yolo example
DEFAULT_RKNN_PATH = '../model/feng1.rknn'  # name of the exported file
DEFAULT_QUANT = True


def parse_arg():
    if len(sys.argv) < 3:
        print("Usage: python3 {} onnx_model_path [platform] [dtype(optional)] [output_rknn_path(optional)]".format(sys.argv[0]))
        print(" platform choose from [rk3562,rk3566,rk3568,rk3588,rk1808,rv1109,rv1126]")
        print(" dtype choose from [i8, fp] for [rk3562,rk3566,rk3568,rk3588]")
        print(" dtype choose from [u8, fp] for [rk1808,rv1109,rv1126]")
        exit(1)

    model_path = sys.argv[1]
    platform = sys.argv[2]

    do_quant = DEFAULT_QUANT
    if len(sys.argv) > 3:
        model_type = sys.argv[3]
        if model_type not in ['i8', 'u8', 'fp']:
            print("ERROR: Invalid model type: {}".format(model_type))
            exit(1)
        elif model_type in ['i8', 'u8']:
            do_quant = True
        else:
            do_quant = False

    if len(sys.argv) > 4:
        output_path = sys.argv[4]
    else:
        output_path = DEFAULT_RKNN_PATH

    return model_path, platform, do_quant, output_path


if __name__ == '__main__':
    model_path, platform, do_quant, output_path = parse_arg()

    # Create RKNN object
    rknn = RKNN(verbose=False)

    # Pre-process config
    print('--> Config model')
    rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]], target_platform=platform)
    print('done')

    # Load model
    print('--> Loading model')
    ret = rknn.load_onnx(model=model_path)
    if ret != 0:
        print('Load model failed!')
        exit(ret)
    print('done')

    # Build model
    print('--> Building model')
    ret = rknn.build(do_quantization=do_quant, dataset=DATASET_PATH)
    if ret != 0:
        print('Build model failed!')
        exit(ret)
    print('done')

    # Export rknn model
    print('--> Export rknn model')
    ret = rknn.export_rknn(output_path)
    if ret != 0:
        print('Export rknn model failed!')
        exit(ret)
    print('done')

    # python3 convert.py ../model/best.onnx rk3588
    # Release
    rknn.release()
Run python3 convert.py ../model/best.onnx rk3588. The first argument is the ONNX path and the second is the chip name. That performs the export.
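The script above reads a quantization image list from DATASET_PATH. As far as I understand the RKNN convention, this is a plain text file with one image path per line. A minimal sketch for generating such a file follows; write_dataset_list and IMAGE_SUFFIXES are hypothetical names of my own, and the folder layout is whatever you use for your validation images.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if your dataset uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def write_dataset_list(image_dir, list_path, limit=None):
    """Write one absolute image path per line; return the number of paths written."""
    images = sorted(
        p for p in Path(image_dir).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    if limit is not None:
        images = images[:limit]
    text = "".join(f"{p.resolve()}\n" for p in images)
    Path(list_path).write_text(text, encoding="utf-8")
    return len(images)
```

For example, write_dataset_list("val_images", "tomo_img.txt", limit=50) would list at most 50 images from a hypothetical val_images folder. Absolute paths are used here so the list does not depend on the directory the conversion script is run from.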
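# One issue to record here: yolo8_seg.py, which sits alongside convert.py, is a validation script. I could not get it to run; it fails with a torch version problem. Since inference happens on the board anyway, I went straight to Lite2.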
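One detail of the conversion script that matters later on the board: rknn.config is called with mean_values of 0 and std_values of 255. As I understand the RKNN-Toolkit2 semantics, each input channel is preprocessed as (value - mean) / std inside the converted model, so these settings scale pixels from 0..255 to 0..1. The tiny sketch below just works through that arithmetic; normalize_pixel is a hypothetical name, not an RKNN function.

```python
def normalize_pixel(value, mean=0.0, std=255.0):
    """Apply (value - mean) / std, the per-channel preprocessing assumed here."""
    return (value - mean) / std


print(normalize_pixel(0))    # 0.0
print(normalize_pixel(255))  # 1.0
```

If that reading is right, it also explains why the inference script on the board feeds the raw uint8 RGB image without dividing by 255 itself.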
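④ Set up the Lite2 environment. Still in the Toolkit2 project, go into rknn-toolkit-lite2; the packages folder there contains several whl files. ***Note***: at this point you should be working on the board:
pip3 install packages/rknn_toolkit_lite2-1.5.0-cp310-cp310-linux_aarch64.whl
Once it is installed, the code I used as a reference is: example/yolov8/yolov8_seg/onnx2rknn.py · LubanCat/lubancat_ai_manual_code - Gitee (gitee.com)
My modified code is as follows: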
import os
import time
import cv2
import sys
import argparse
import torch
import torchvision
import torch.nn.functional as F
import numpy as np
from copy import copy
from rknnlite.api import RKNNLite

OBJ_THRESH = 0.25
NMS_THRESH = 0.45
MAX_DETECT = 100

# The following two params are for mAP test
# OBJ_THRESH = 0.001
# NMS_THRESH = 0.65

IMG_SIZE = (640, 640)  # (width, height), such as (1280, 736)
target = "rk3588"
rknn_model_path = "./feng1.rknn"

CLASSES = ("b_fully_ripened", "b_green", "b_half_ripened",
           "l_fully_ripened", "l_green", "l_half_ripened",)


class Colors:
    # Ultralytics color palette https://ultralytics.com/
    def __init__(self):
        # hex = matplotlib.colors.TABLEAU_COLORS.values()
        hexs = ('FF3838', 'FF9D97', 'FF701F', 'FFB21D', 'CFD231', '48F90A', '92CC17', '3DDB86', '1A9334', '00D4BB',
                '2C99A8', '00C2FF', '344593', '6473FF', '0018EC', '8438FF', '520085', 'CB38FF', 'FF95C8', 'FF37C7')
        self.palette = [self.hex2rgb(f'#{c}') for c in hexs]
        self.n = len(self.palette)

    def __call__(self, i, bgr=False):
        c = self.palette[int(i) % self.n]
        return (c[2], c[1], c[0]) if bgr else c

    @staticmethod
    def hex2rgb(h):  # rgb order (PIL)
        return tuple(int(h[1 + i:1 + i + 2], 16) for i in (0, 2, 4))

def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def filter_boxes(boxes, box_confidences, box_class_probs, seg_part):
    """Filter boxes with object threshold.
    """
    box_confidences = box_confidences.reshape(-1)
    candidate, class_num = box_class_probs.shape

    class_max_score = np.max(box_class_probs, axis=-1)
    classes = np.argmax(box_class_probs, axis=-1)

    _class_pos = np.where(class_max_score * box_confidences >= OBJ_THRESH)
    scores = (class_max_score * box_confidences)[_class_pos]

    boxes = boxes[_class_pos]
    classes = classes[_class_pos]
    seg_part = (seg_part * box_confidences.reshape(-1, 1))[_class_pos]

    return boxes, classes, scores, seg_part


def dfl(position):
    # Distribution Focal Loss (DFL): softmax over the bins, then the expected bin index
    x = torch.tensor(position)
    n, c, h, w = x.shape
    p_num = 4
    mc = c // p_num
    y = x.reshape(n, p_num, mc, h, w)
    y = y.softmax(2)
    acc_matrix = torch.tensor(range(mc)).float().reshape(1, 1, mc, 1, 1)
    y = (y * acc_matrix).sum(2)
    return y.numpy()


def box_process(position):
    grid_h, grid_w = position.shape[2:4]
    col, row = np.meshgrid(np.arange(0, grid_w), np.arange(0, grid_h))
    col = col.reshape(1, 1, grid_h, grid_w)
    row = row.reshape(1, 1, grid_h, grid_w)
    grid = np.concatenate((col, row), axis=1)
    stride = np.array([IMG_SIZE[1] // grid_h, IMG_SIZE[0] // grid_w]).reshape(1, 2, 1, 1)

    position = dfl(position)
    box_xy = grid + 0.5 - position[:, 0:2, :, :]
    box_xy2 = grid + 0.5 + position[:, 2:4, :, :]
    xyxy = np.concatenate((box_xy * stride, box_xy2 * stride), axis=1)

    return xyxy

def post_process(input_data):
    # input_data[0], input_data[4], and input_data[8] are detection box information
    # input_data[1], input_data[5], and input_data[9] are category score information
    # input_data[2], input_data[6], and input_data[10] are confidence score information
    # input_data[3], input_data[7], and input_data[11] are segmentation information
    # input_data[12] is the proto information
    proto = input_data[-1]
    boxes, scores, classes_conf, seg_part = [], [], [], []
    default_branch = 3
    pair_per_branch = len(input_data) // default_branch

    for i in range(default_branch):
        boxes.append(box_process(input_data[pair_per_branch * i]))
        classes_conf.append(input_data[pair_per_branch * i + 1])
        scores.append(np.ones_like(input_data[pair_per_branch * i + 1][:, :1, :, :], dtype=np.float32))
        seg_part.append(input_data[pair_per_branch * i + 3])

    def sp_flatten(_in):
        ch = _in.shape[1]
        _in = _in.transpose(0, 2, 3, 1)
        return _in.reshape(-1, ch)

    boxes = [sp_flatten(_v) for _v in boxes]
    classes_conf = [sp_flatten(_v) for _v in classes_conf]
    scores = [sp_flatten(_v) for _v in scores]
    seg_part = [sp_flatten(_v) for _v in seg_part]

    boxes = np.concatenate(boxes)
    classes_conf = np.concatenate(classes_conf)
    scores = np.concatenate(scores)
    seg_part = np.concatenate(seg_part)

    # filter according to threshold
    boxes, classes, scores, seg_part = filter_boxes(boxes, scores, classes_conf, seg_part)

    # sort candidates by score, highest first
    zipped = zip(boxes, classes, scores, seg_part)
    sort_zipped = sorted(zipped, key=lambda x: (x[2]), reverse=True)
    result = zip(*sort_zipped)

    max_nms = 30000
    n = boxes.shape[0]  # number of boxes
    if not n:
        return None, None, None, None
    elif n > max_nms:  # excess boxes
        boxes, classes, scores, seg_part = [np.array(x[:max_nms]) for x in result]
    else:
        boxes, classes, scores, seg_part = [np.array(x) for x in result]

    # nms: offset boxes by class so that boxes of different classes never suppress each other
    nboxes, nclasses, nscores, nseg_part = [], [], [], []
    agnostic = 0
    max_wh = 7680
    c = classes * (0 if agnostic else max_wh)
    ids = torchvision.ops.nms(
        torch.tensor(boxes, dtype=torch.float32) + torch.tensor(c, dtype=torch.float32).unsqueeze(-1),
        torch.tensor(scores, dtype=torch.float32), NMS_THRESH)
    real_keeps = ids.tolist()[:MAX_DETECT]
    nboxes.append(boxes[real_keeps])
    nclasses.append(classes[real_keeps])
    nscores.append(scores[real_keeps])
    nseg_part.append(seg_part[real_keeps])

    if not nclasses and not nscores:
        return None, None, None, None

    boxes = np.concatenate(nboxes)
    classes = np.concatenate(nclasses)
    scores = np.concatenate(nscores)
    seg_part = np.concatenate(nseg_part)

    ph, pw = proto.shape[-2:]
    proto = proto.reshape(seg_part.shape[-1], -1)
    seg_img = np.matmul(seg_part, proto)
    seg_img = sigmoid(seg_img)
    seg_img = seg_img.reshape(-1, ph, pw)

    seg_threshold = 0.5

    # crop seg outside box
    seg_img = F.interpolate(torch.tensor(seg_img)[None], torch.Size([640, 640]), mode='bilinear', align_corners=False)[0]
    seg_img_t = _crop_mask(seg_img, torch.tensor(boxes))

    seg_img = seg_img_t.numpy()
    seg_img = seg_img > seg_threshold
    return boxes, classes, scores, seg_img

def draw(image, boxes, scores, classes):
    for box, score, cl in zip(boxes, scores, classes):
        # boxes are in (x1, y1, x2, y2) order
        x1, y1, x2, y2 = [int(_b) for _b in box]
        print("%s @ (%d %d %d %d) %.3f" % (CLASSES[cl], x1, y1, x2, y2, score))
        cv2.rectangle(image, (x1, y1), (x2, y2), (255, 0, 0), 2)
        cv2.putText(image, '{0} {1:.2f}'.format(CLASSES[cl], score),
                    (x1, y1 - 6), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2, cv2.LINE_AA)


def _crop_mask(masks, boxes):
    """
    "Crop" predicted masks by zeroing out everything not in the predicted bbox.
    Vectorized by Chong (thanks Chong).
    Args:
        - masks should be a size [n, h, w] tensor of masks
        - boxes should be a size [n, 4] tensor of bbox coords (x1, y1, x2, y2)
    """
    n, h, w = masks.shape
    x1, y1, x2, y2 = torch.chunk(boxes[:, :, None], 4, 1)  # x1 shape(n,1,1)
    r = torch.arange(w, device=masks.device, dtype=x1.dtype)[None, None, :]  # x positions, shape(1,1,w)
    c = torch.arange(h, device=masks.device, dtype=x1.dtype)[None, :, None]  # y positions, shape(1,h,1)

    return masks * ((r >= x1) * (r < x2) * (c >= y1) * (c < y2))


def merge_seg(image, seg_img, classes):
    color = Colors()
    print(len(seg_img), "length")
    for i in range(len(seg_img)):
        seg = seg_img[i]
        seg = seg.astype(np.uint8)
        seg = cv2.cvtColor(seg, cv2.COLOR_GRAY2BGR)
        seg = seg * color(classes[i])
        seg = seg.astype(np.uint8)
        image = cv2.add(image, seg)
    return image

def letter_box(im, new_shape, color=(0, 0, 0)):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])

    # Compute padding
    ratio = r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)


def get_real_box(src_shape, box, dw, dh, ratio):
    bbox = copy(box)
    # unletter_box result
    bbox[:, 0] -= dw
    bbox[:, 0] /= ratio
    bbox[:, 0] = np.clip(bbox[:, 0], 0, src_shape[1])

    bbox[:, 1] -= dh
    bbox[:, 1] /= ratio
    bbox[:, 1] = np.clip(bbox[:, 1], 0, src_shape[0])

    bbox[:, 2] -= dw
    bbox[:, 2] /= ratio
    bbox[:, 2] = np.clip(bbox[:, 2], 0, src_shape[1])

    bbox[:, 3] -= dh
    bbox[:, 3] /= ratio
    bbox[:, 3] = np.clip(bbox[:, 3], 0, src_shape[0])
    return bbox


def get_real_seg(origin_shape, new_shape, dw, dh, seg):
    # ! fix side effect
    dw = int(dw)
    dh = int(dh)
    if (dh == 0) and (dw == 0) and origin_shape == new_shape:
        return seg
    elif dh == 0 and dw != 0:
        seg = seg[:, :, dw:-dw]  # a[0:-0] = []
    elif dw == 0 and dh != 0:
        seg = seg[:, dh:-dh, :]
    seg = np.where(seg, 1, 0).astype(np.uint8).transpose(1, 2, 0)
    seg = cv2.resize(seg, (origin_shape[1], origin_shape[0]), interpolation=cv2.INTER_LINEAR)
    if len(seg.shape) < 3:
        return seg[None, :, :]
    else:
        return seg.transpose(2, 0, 1)

if __name__ == '__main__':
    img_path = "./x1.jpg"

    rknn = RKNNLite()
    rknn.list_devices()

    # Load the RKNN model
    rknn.load_rknn(path=rknn_model_path)

    # Set up the runtime; the target defaults to rk3588
    rknn.init_runtime(core_mask=RKNNLite.NPU_CORE_0_1_2)  # or NPU_CORE_0

    # Input image
    img_src = cv2.imread(img_path)
    if img_src is None:
        print("Load image failed!")
        exit()

    src_shape = img_src.shape[:2]
    img, ratio, (dw, dh) = letter_box(img_src, IMG_SIZE)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    # Run inference
    print('--> Running model')
    outputs = rknn.inference(inputs=[img])
    print('done')

    start = time.time()
    # Post-processing
    boxes, classes, scores, seg_img = post_process(outputs)
    print("use_time:", time.time() - start)

    if boxes is not None:
        real_boxs = get_real_box(src_shape, boxes, dw, dh, ratio)
        real_segs = get_real_seg(src_shape, IMG_SIZE, dw, dh, seg_img)
        print(real_boxs, "rrrrr")
        img_p = merge_seg(img_src, real_segs, classes)
        draw(img_p, real_boxs, scores, classes)
        cv2.imwrite("t_res1.jpg", img_p)

    # Release the runtime
    rknn.release()
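The letter_box and get_real_box functions in the script are two halves of one coordinate mapping: scale and pad the image to the model input size, then undo the padding and the scaling on the predicted boxes. The sketch below redoes that arithmetic in plain Python for a single box so the numbers can be checked by hand. The function names letterbox_params and unletterbox_box are mine, not from the referenced example code.

```python
def letterbox_params(src_hw, dst_hw):
    """Return (ratio, dw, dh) for fitting src (h, w) into dst (h, w) with equal padding."""
    src_h, src_w = src_hw
    dst_h, dst_w = dst_hw
    ratio = min(dst_h / src_h, dst_w / src_w)
    new_w, new_h = int(round(src_w * ratio)), int(round(src_h * ratio))
    dw = (dst_w - new_w) / 2  # padding on the left and on the right
    dh = (dst_h - new_h) / 2  # padding on the top and on the bottom
    return ratio, dw, dh


def unletterbox_box(box, ratio, dw, dh, src_hw):
    """Map an (x1, y1, x2, y2) box from model-input coordinates back to the source image."""
    src_h, src_w = src_hw
    x1, y1, x2, y2 = box

    def clamp(v, hi):
        return max(0.0, min(float(hi), v))

    return (
        clamp((x1 - dw) / ratio, src_w),
        clamp((y1 - dh) / ratio, src_h),
        clamp((x2 - dw) / ratio, src_w),
        clamp((y2 - dh) / ratio, src_h),
    )


# A 1280x720 source image (h=720, w=1280) letterboxed into a 640x640 input.
ratio, dw, dh = letterbox_params((720, 1280), (640, 640))
print(ratio, dw, dh)  # 0.5 0.0 140.0
print(unletterbox_box((0, 140, 640, 500), ratio, dw, dh, (720, 1280)))  # (0.0, 0.0, 1280.0, 720.0)
```

One thing to watch in the script itself: IMG_SIZE is commented as (width, height), while letter_box compares new_shape[0] against the image height. With a square 640x640 input the two orders coincide, but a non-square input size would need the order checked.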
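Haha, now that you have the code you probably want to run it right away. Not so fast, there is one more step:
Copy librknnrt.so into /usr/lib on the board. This file is also in the AI toolkit the vendor provides; pick the .so that matches your version. The path is: Linux\librknn_api\aarch64\librknnrt.so
Once it is copied, you can run the code: python3 xxx.py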
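If you want the script to fail early with a clear message when the runtime library is missing, a check like the following can go before creating the RKNNLite object. It is only a sketch: find_runtime_lib is a hypothetical helper, and the default search list contains just /usr/lib because that is the only location this post mentions.

```python
from pathlib import Path

# Assumed search locations; the post itself only mentions /usr/lib.
DEFAULT_LIB_DIRS = ("/usr/lib",)


def find_runtime_lib(lib_dirs=DEFAULT_LIB_DIRS, name="librknnrt.so"):
    """Return the first existing path to the runtime library, or None."""
    for d in lib_dirs:
        candidate = Path(d) / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_runtime_lib()
    print(found if found else "librknnrt.so not found; copy it to /usr/lib first")
```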
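2024-05-17 Correction 1: version 1.5.0 produces a large amount of noisy log output during inference, so I switched in a hurry to rknn-toolkit2-2.0.0-beta0.
After switching, inference failed with this error: The input [0] need 4dims input, but 3dims input buffer feed
Fix: add a dimension to the input with img = np.expand_dims(img, axis=0). That resolved it.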
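To keep one script working whether or not the installed version wants the batch dimension supplied, the fix can be wrapped in a small guard. This is a sketch under the assumption that the model expects a single image in (1, H, W, C) layout, as the error message suggests; ensure_batch_dim is a hypothetical helper name.

```python
import numpy as np


def ensure_batch_dim(img):
    """Return img as a 4-D (1, H, W, C) array; 4-D inputs pass through unchanged."""
    if img.ndim == 3:
        return np.expand_dims(img, axis=0)
    if img.ndim == 4:
        return img
    raise ValueError(f"expected a 3-D or 4-D image array, got {img.ndim} dimensions")


img = np.zeros((640, 640, 3), dtype=np.uint8)
print(ensure_batch_dim(img).shape)  # (1, 640, 640, 3)
```

In the inference script this would be applied to img right before the rknn.inference call.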