1. Prefer the ReLU activation function wherever possible.
2. Remember to change the original forward function to the following before exporting to ONNX.
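Recent YOLOv5 releases let you switch the activation network-wide through an `activation:` key in the model yaml (parse_model then overrides `Conv.default_act`). A sketch of the change, assuming a release that supports this key; note the swap has to be made before training, not at export time:

```yaml
# models/yolov5s.yaml (top level), set before training:
activation: nn.ReLU()   # replaces the default nn.SiLU() in every Conv
```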
# In models/yolo.py, modify the forward function of the Detect class
# def forward(self, x):
#     z = []  # inference output
#     for i in range(self.nl):
#         x[i] = self.m[i](x[i])  # conv
#         bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
#         x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()
#         if not self.training:  # inference
#             if self.dynamic or self.grid[i].shape[2:4] != x[i].shape[2:4]:
#                 self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
#             if isinstance(self, Segment):  # (boxes + masks)
#                 xy, wh, conf, mask = x[i].split((2, 2, self.nc + 1, self.no - self.nc - 5), 4)
#                 xy = (xy.sigmoid() * 2 + self.grid[i]) * self.stride[i]  # xy
#                 wh = (wh.sigmoid() * 2) ** 2 * self.anchor_grid[i]  # wh
#                 y = torch.cat((xy, wh, conf.sigmoid(), mask), 4)
#             else:  # Detect (boxes only)
#                 xy, wh, conf = x[i].sigmoid().split((2, 2, self.nc + 1), 4)
#                 xy = (xy * 2 + self.grid[i]) * self.stride[i]  # xy
#                 wh = (wh * 2) ** 2 * self.anchor_grid[i]  # wh
#                 y = torch.cat((xy, wh, conf), 4)
#             z.append(y.view(bs, self.na * nx * ny, self.no))
#     return x if self.training else (torch.cat(z, 1),) if self.export else (torch.cat(z, 1), x)
def forward(self, x):
    z = []  # inference output
    for i in range(self.nl):
        z.append(torch.sigmoid(self.m[i](x[i])))
    return z
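Because the simplified forward returns only the raw, sigmoid-activated feature maps, the grid/anchor decoding that the original forward performed must now happen on the host after NPU inference. Below is a minimal NumPy sketch of that decoding for one output head, matching the arithmetic of the original Detect.forward shown above; the anchor and stride values in the usage line are illustrative, and the real ones come from the model configuration:

```python
import numpy as np

def decode_head(out, anchors, stride, nc=80):
    # out: sigmoid-activated head output, shape (bs, na*(nc+5), ny, nx)
    # anchors: list of (w, h) pairs in input-image pixels for this head
    bs, _, ny, nx = out.shape
    na, no = len(anchors), nc + 5
    out = out.reshape(bs, na, no, ny, nx).transpose(0, 1, 3, 4, 2)  # (bs,na,ny,nx,no)
    # build the cell grid; the -0.5 offset matches YOLOv5's _make_grid
    yv, xv = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    grid = np.stack((xv, yv), -1).reshape(1, 1, ny, nx, 2) - 0.5
    anchor_grid = np.asarray(anchors, dtype=np.float32).reshape(1, na, 1, 1, 2)
    xy = (out[..., 0:2] * 2 + grid) * stride     # box centers in pixels
    wh = (out[..., 2:4] * 2) ** 2 * anchor_grid  # box sizes in pixels
    return np.concatenate((xy, wh, out[..., 4:]), -1).reshape(bs, na * ny * nx, no)

# example: stride-8 head of a 640x640 COCO model (illustrative values)
dec = decode_head(np.random.rand(1, 3 * 85, 80, 80).astype(np.float32),
                  anchors=[(10, 13), (16, 30), (33, 23)], stride=8, nc=80)
```

Run this once per head (three heads for a standard YOLOv5 model), concatenate the results along axis 1, then apply confidence thresholding and NMS as usual.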
Optional change (the export runs fine without it, but some manuals recommend it; check for yourself. The author did not apply it.)
# In export.py, inside run(), change:
    if half and not coreml:
        im, model = im.half(), model.half()  # to FP16
-   shape = tuple((y[0] if isinstance(y, tuple) else y).shape)  # model output shape
+   shape = tuple((y[0] if (isinstance(y, tuple) or isinstance(y, list)) else y).shape)  # model output shape
    metadata = {'stride': int(max(model.stride)), 'names': model.names}  # model metadata
    LOGGER.info(f"\n{colorstr('PyTorch:')} starting from {file} with output shape {shape} ({file_size(file):.1f} MB)")
Set opset to 11 or 12; higher versions are error-prone.
3. Do the model conversion on x86, using the official Docker image. On the board, use RKNN Toolkit Lite.
4. Rockchip does provide rknn_lite.init_runtime(core_mask=RKNNLite.NPU_CORE_0_1_2), which lets an RKNN model run inference across all three NPU cores, but NPU utilization still stays low with it. Consider a multithreaded/multiprocess design instead: build a thread pool and bind video frames to individual cores, with one context per core:
rknn_lite.init_runtime(core_mask=RKNNLite.NPU_CORE_0)
rknn_lite.init_runtime(core_mask=RKNNLite.NPU_CORE_1)
rknn_lite.init_runtime(core_mask=RKNNLite.NPU_CORE_2)
This makes full use of the hardware. In addition, lock the NPU frequency so performance is more stable.
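The per-core binding described above can be sketched as a small worker pool: each thread owns one inference context initialized with its own core mask, and frames are queued to whichever worker is free. This is a minimal sketch; `NPUPool` and `make_ctx` are hypothetical names, and the commented RKNNLite calls follow the RKNN Toolkit Lite API:

```python
import queue
import threading

# Hypothetical per-core pool. With RKNN Toolkit Lite, make_ctx would be e.g.:
#   def make_ctx(core_mask):
#       rknn_lite = RKNNLite()
#       rknn_lite.load_rknn('yolov5s.rknn')
#       rknn_lite.init_runtime(core_mask=core_mask)
#       return lambda frame: rknn_lite.inference(inputs=[frame])
class NPUPool:
    def __init__(self, core_masks, make_ctx):
        self.tasks = queue.Queue()
        self.results = queue.Queue()
        self.threads = []
        for mask in core_masks:  # one worker (and one runtime context) per core
            infer = make_ctx(mask)
            t = threading.Thread(target=self._worker, args=(infer,), daemon=True)
            t.start()
            self.threads.append(t)

    def _worker(self, infer):
        while True:
            item = self.tasks.get()
            if item is None:  # shutdown sentinel
                break
            idx, frame = item
            self.results.put((idx, infer(frame)))  # keep frame index for reordering

    def submit(self, idx, frame):
        self.tasks.put((idx, frame))

    def close(self):
        for _ in self.threads:
            self.tasks.put(None)
        for t in self.threads:
            t.join()
```

On an RK3588-class chip you would pass [RKNNLite.NPU_CORE_0, RKNNLite.NPU_CORE_1, RKNNLite.NPU_CORE_2] as core_masks; results come back out of order, so the index carried with each frame is used to restore the video sequence.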