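Getting Hands-On with the RK3588 (2): RKNN Model Conversion, Deployment, and Performance Testing, and Fixing High Latency in Video-Stream Processing (2)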

八级玄仙

Last modified: 2024-07-18 16:56:30

Category: rk3588. Tags: python, linux, programming languages

First published: 2023-12-26 16:53:14

Article link: https://blog.csdn.net/qq_32636415/article/details/135146473

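1. Environment setup

Board environment

Item	Version / configuration
Board	Firefly RK3588J
Operating system	openEuler 20.03 LTS
Python version	3.9.18

Virtual machine environment

Item	Version / configuration
Operating system	Ubuntu 18.04.6 LTS (check with: lsb_release -a)
Python version	3.6
rknn-toolkit version	rknn-toolkit2-1.4.0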

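2. Model conversion

Approach: on the virtual machine, convert the .pt model to an ONNX model, then use rknn-toolkit2 to convert the ONNX model to an RKNN model.

Virtual machine setup

Download rknn-toolkit2-1.4.0.zip and yolov5 v6.0.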

# Create the Python environment
conda create -n rk3588 python=3.6
conda activate rk3588

pip install numpy==1.19.5
# Download rknn-toolkit2
git clone https://gitcode.net/mirrors/rockchip-linux/rknn-toolkit2.git
cd rknn-toolkit2/doc
pip install -r requirements_cp36-1.4.0.txt
pip install pandas pyyaml matplotlib seaborn

# Install the rknn-toolkit2 package
cd ../packages
pip install rknn_toolkit2-1.4.0_22dcfef4-cp36-cp36m-linux_x86_64.whl

Verify the installation from Python; the import below should succeed without errors:

from rknn.api import RKNN

Download yolov5 v6.0

The upstream project is ultralytics/yolov5 on GitHub; the clone command below uses the GitCode mirror. The repository does not include the pretrained model yolov5s.pt.

git clone https://gitcode.com/mirrors/ultralytics/yolov5.git -b v6.0

Download yolov5s.pt from the address below, either directly in a browser or with a download manager such as Xunlei:

https://github.com/ultralytics/yolov5/releases/download/v6.0/yolov5s.pt

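(listed on the Releases page of ultralytics/yolov5 on GitHub)

Place yolov5s.pt under yolov5/weights.

Generate the ONNX model

(1) Modify the yolov5/models/yolo.py script

Change the forward method of the Detect class to: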

def forward(self, x): 
    z = []  # inference output
    for i in range(self.nl):
        x[i] = self.m[i](x[i])  # conv
    return x

After this change the script can only be used for model conversion; revert it before training a model.
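To make clear what the modified forward hands over to the post-processing code, the sketch below computes the shapes of the three raw feature maps it returns. It assumes the stock yolov5s head (3 anchors per scale, strides 8/16/32, 80 COCO classes) and a 640x640 input; the function name expected_head_shapes is made up for this illustration.

```python
def expected_head_shapes(img_size=640, num_classes=80, num_anchors=3, strides=(8, 16, 32)):
    """Shapes (batch, channels, height, width) of the raw Detect outputs for one image."""
    # Each anchor predicts x, y, w, h, objectness, plus one score per class.
    channels = num_anchors * (5 + num_classes)
    return [(1, channels, img_size // s, img_size // s) for s in strides]


for shape in expected_head_shapes():
    print(shape)
```

The 255 channels match the (255, ...) weights of the model.24.m.* convolutions in the RKNN build log further down. Sigmoid, grid offsets, and anchors are no longer applied inside the model, which is why the deployment script has to do them itself.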

(2) Modify the yolov5/export.py script

--- opset_version=opset 
+++ opset_version=12

(3) Generate the ONNX model

python export.py --weights weights/yolov5s.pt --img 640 --batch 1 --include onnx

export: data=data/coco128.yaml, weights=weights/yolov5s.pt, imgsz=[640], batch_size=1, device=cpu, half=False, inplace=False, train=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=13, topk_per_class=100, topk_all=100, iou_thres=0.45, conf_thres=0.25, include=['onnx']
YOLOv5 🚀 v6.0-0-g956be8e torch 1.10.1+cu102 CPU

Fusing layers...
Model Summary: 213 layers, 7225885 parameters, 0 gradients

PyTorch: starting from weights/yolov5s.pt (14.7 MB)

ONNX: starting export with onnx 1.9.0...
ONNX: export success, saved as weights/yolov5s.onnx (28.9 MB)
ONNX: run --dynamic ONNX model inference with: 'python detect.py --weights weights/yolov5s.onnx'

Export complete (2.55s)
Results saved to /home/xxx/work/opt/rk3588/yolov5/weights
Visualize with https://netron.app

When the export finishes, yolov5s.onnx is generated under yolov5/weights.

Generate the RKNN model

Modify the script rknn-toolkit2-1.4.0/examples/onnx/yolov5/test.py to match your setup.
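The exact lines changed in test.py are not preserved in the text, so the following is only a sketch of the conversion flow such a script follows, using the rknn-toolkit2 calls config, load_onnx, build, and export_rknn. Every value in conversion_settings (file names, normalisation values, the dataset.txt calibration list, the target platform) is an assumption for illustration, and both function names are hypothetical.

```python
def conversion_settings(onnx_path="yolov5s.onnx", rknn_path="yolov5s.rknn"):
    """Values that usually need adjusting; all of them are assumptions for this sketch."""
    return {
        "onnx_model": onnx_path,
        "rknn_model": rknn_path,
        "target_platform": "rk3588",      # assumed: the board used in this article
        "mean_values": [[0, 0, 0]],       # assumed: pixel values scaled from 0..255 to 0..1
        "std_values": [[255, 255, 255]],
        "do_quantization": True,          # INT8 weights, as seen in the build log
        "dataset": "./dataset.txt",       # assumed: text file listing calibration images
    }


def convert(settings):
    """Convert ONNX to RKNN. Needs rknn-toolkit2, so the import is deferred to call time."""
    from rknn.api import RKNN

    rknn = RKNN(verbose=True)
    rknn.config(mean_values=settings["mean_values"],
                std_values=settings["std_values"],
                target_platform=settings["target_platform"])
    steps = [
        ("load_onnx", lambda: rknn.load_onnx(model=settings["onnx_model"])),
        ("build", lambda: rknn.build(do_quantization=settings["do_quantization"],
                                     dataset=settings["dataset"])),
        ("export_rknn", lambda: rknn.export_rknn(settings["rknn_model"])),
    ]
    try:
        for name, step in steps:
            if step() != 0:  # each call returns 0 on success
                raise RuntimeError(name + " failed")
    finally:
        rknn.release()
```

On the virtual machine this would be called as convert(conversion_settings("weights/yolov5s.onnx")). Setting the target platform explicitly matters because a model built for a different Rockchip SoC is not expected to load on the RK3588.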

Run python test.py:


D RKNN: [14:09:36.249] 138 Conv          model.23.cv3.conv.bias         INT32    (512)          | 0x0061e100 0x0061f100 0x00001000
D RKNN: [14:09:36.249] 140 Conv          model.24.m.2.weight            INT8     (255,512,1,1)  | 0x006d9100 0x006f9100 0x00020000
D RKNN: [14:09:36.249] 140 Conv          model.24.m.2.bias              INT32    (255)          | 0x006f9100 0x006f9900 0x00000800
D RKNN: [14:09:36.249] 141 Conv          model.24.m.1.weight            INT8     (255,256,1,1)  | 0x006c8900 0x006d8900 0x00010000
D RKNN: [14:09:36.249] 141 Conv          model.24.m.1.bias              INT32    (255)          | 0x006d8900 0x006d9100 0x00000800
D RKNN: [14:09:36.249] 142 Conv          model.24.m.0.weight            INT8     (255,128,1,1)  | 0x006c0100 0x006c8100 0x00008000
D RKNN: [14:09:36.249] 142 Conv          model.24.m.0.bias              INT32    (255)          | 0x006c8100 0x006c8900 0x00000800
D RKNN: [14:09:36.249] -------------------------------------------------------------------------+---------------------------------
D RKNN: [14:09:36.265] ----------------------------------------
D RKNN: [14:09:36.265] Total Weight Memory Size: 7322880
D RKNN: [14:09:36.265] Total Internal Memory Size: 7782400
D RKNN: [14:09:36.265] Predict Internal Memory RW Amount: 134144000
D RKNN: [14:09:36.265] Predict Weight Memory RW Amount: 7322880
D RKNN: [14:09:36.265] ----------------------------------------
D RKNN: [14:09:36.265] <<<<<<<< end: N4rknn21RKNNMemStatisticsPassE
I rknn buiding done
done
--> Export rknn model
done
--> Init runtime environment
W init_runtime: Target is None, use simulator!
done
--> Running model
Analysing : 100%|███████████████████████████████████████████████| 146/146 [00:00<00:00, 1480.19it/s]
Preparing : 100%|████████████████████████████████████████████████| 146/146 [00:01<00:00, 129.98it/s]
W inference: The dims of input(ndarray) shape (640, 640, 3) is wrong, expect dims is 4! Try expand dims to (1, 640, 640, 3)!
done
class: person, score: 0.8515207767486572
box coordinate left,top,right,down: [211.582561314106, 243.80196797847748, 285.89955538511276, 509.6801487207413]
class: person, score: 0.8174995183944702
box coordinate left,top,right,down: [106.79295039176941, 235.57937800884247, 227.72493290901184, 537.9245892763138]
class: person, score: 0.7538331747055054
box coordinate left,top,right,down: [484.5194814801216, 232.34051251411438, 558.8364755511284, 516.3035304546356]
class: person, score: 0.32313376665115356
box coordinate left,top,right,down: [79.61335974931717, 332.74662923812866, 129.30076378583908, 521.0901017189026]
class: bus , score: 0.7289537787437439
box coordinate left,top,right,down: [93.63216829299927, 114.85382556915283, 558.5412936210632, 473.31963634490967]

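yolov5s.rknn is generated in the same directory.

3. Deployment and testing on the RK3588

See: Getting Hands-On with the RK3588 (2): creating a Python environment on openEuler and verifying an RKNN model (CSDN blog)

cd rknn-toolkit2/rknn_toolkit_lite2/examples/inference_with_lite

python test.py

Once the test environment passes, verify your own model.

The test below reads from a camera stream; the code is as follows.

Reference: https://github.com/ChuanSe/yolov5-PT-to-RKNN/blob/main/detect.py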

import urllib
import time
import sys
import numpy as np
import cv2
from rknnlite.api import RKNNLite


RKNN_MODEL = 'yolov5s.rknn'
IMG_PATH = './1.jpeg'
OBJ_THRESH = 0.25
NMS_THRESH = 0.45
IMG_SIZE = 640
CLASSES = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
        'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
        'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
        'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
        'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
        'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
        'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
        'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
        'hair drier', 'toothbrush']

def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def xywh2xyxy(x):
    # Convert [x, y, w, h] to [x1, y1, x2, y2]
    y = np.copy(x)
    y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
    y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
    y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
    y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
    return y


def process(input, mask, anchors):

    anchors = [anchors[i] for i in mask]
    grid_h, grid_w = map(int, input.shape[0:2])

    box_confidence = sigmoid(input[..., 4])
    box_confidence = np.expand_dims(box_confidence, axis=-1)

    box_class_probs = sigmoid(input[..., 5:])

    box_xy = sigmoid(input[..., :2])*2 - 0.5

    col = np.tile(np.arange(0, grid_w), grid_w).reshape(-1, grid_w)
    row = np.tile(np.arange(0, grid_h).reshape(-1, 1), grid_h)
    col = col.reshape(grid_h, grid_w, 1, 1).repeat(3, axis=-2)
    row = row.reshape(grid_h, grid_w, 1, 1).repeat(3, axis=-2)
    grid = np.concatenate((col, row), axis=-1)
    box_xy += grid
    box_xy *= int(IMG_SIZE/grid_h)

    box_wh = pow(sigmoid(input[..., 2:4])*2, 2)
    box_wh = box_wh * anchors

    box = np.concatenate((box_xy, box_wh), axis=-1)

    return box, box_confidence, box_class_probs


def filter_boxes(boxes, box_confidences, box_class_probs):
    boxes = boxes.reshape(-1, 4)
    box_confidences = box_confidences.reshape(-1)
    box_class_probs = box_class_probs.reshape(-1, box_class_probs.shape[-1])

    _box_pos = np.where(box_confidences >= OBJ_THRESH)
    boxes = boxes[_box_pos]
    box_confidences = box_confidences[_box_pos]
    box_class_probs = box_class_probs[_box_pos]

    class_max_score = np.max(box_class_probs, axis=-1)
    classes = np.argmax(box_class_probs, axis=-1)
    _class_pos = np.where(class_max_score >= OBJ_THRESH)

    boxes = boxes[_class_pos]
    classes = classes[_class_pos]
    scores = (class_max_score* box_confidences)[_class_pos]

    return boxes, classes, scores


def nms_boxes(boxes, scores):
    x = boxes[:, 0]
    y = boxes[:, 1]
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]

    areas = w * h
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)

        xx1 = np.maximum(x[i], x[order[1:]])
        yy1 = np.maximum(y[i], y[order[1:]])
        xx2 = np.minimum(x[i] + w[i], x[order[1:]] + w[order[1:]])
        yy2 = np.minimum(y[i] + h[i], y[order[1:]] + h[order[1:]])

        w1 = np.maximum(0.0, xx2 - xx1 + 0.00001)
        h1 = np.maximum(0.0, yy2 - yy1 + 0.00001)
        inter = w1 * h1

        ovr = inter / (areas[i] + areas[order[1:]] - inter)
        inds = np.where(ovr <= NMS_THRESH)[0]
        order = order[inds + 1]
    keep = np.array(keep)
    return keep


def yolov5_post_process(input_data):
    masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    anchors = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
               [59, 119], [116, 90], [156, 198], [373, 326]]

    boxes, classes, scores = [], [], []
    for input, mask in zip(input_data, masks):
        b, c, s = process(input, mask, anchors)
        b, c, s = filter_boxes(b, c, s)
        boxes.append(b)
        classes.append(c)
        scores.append(s)

    boxes = np.concatenate(boxes)
    boxes = xywh2xyxy(boxes)
    classes = np.concatenate(classes)
    scores = np.concatenate(scores)

    nboxes, nclasses, nscores = [], [], []
    for c in set(classes):
        inds = np.where(classes == c)
        b = boxes[inds]
        c = classes[inds]
        s = scores[inds]

        keep = nms_boxes(b, s)

        nboxes.append(b[keep])
        nclasses.append(c[keep])
        nscores.append(s[keep])

    if not nclasses and not nscores:
        return None, None, None

    boxes = np.concatenate(nboxes)
    classes = np.concatenate(nclasses)
    scores = np.concatenate(nscores)

    return boxes, classes, scores


def draw1(image, boxes, scores, classes):
    for box, score, cl in zip(boxes, scores, classes):
        # Each box is [x1, y1, x2, y2], i.e. left, top, right, bottom
        left, top, right, bottom = box

        print('class: {}, score: {}'.format(CLASSES[cl], score))
        print('box coordinate left,top,right,down: [{}, {}, {}, {}]'.format(left, top, right, bottom))
        left = int(left)
        top = int(top)
        right = int(right)
        bottom = int(bottom)

        cv2.rectangle(image, (left, top), (right, bottom), (255, 0, 0), 2)
        cv2.putText(image, '{0} {1:.2f}'.format(CLASSES[cl], score),
                    (left, top - 6),
                    cv2.FONT_HERSHEY_SIMPLEX,
                    0.6, (0, 0, 255), 2)


def letterbox(im, new_shape=(640, 640), color=(0, 0, 0)):
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])

    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)


if __name__ == '__main__':
    rknn = RKNNLite()

    print('--> Load RKNN model')
    ret = rknn.load_rknn(RKNN_MODEL)
    if ret != 0:
        print('Load RKNN model failed')
        exit(ret)
    print('done')

    print('--> Init runtime environment')
    ret = rknn.init_runtime()
    if ret != 0:
        print('Init runtime environment failed!')
        exit(ret)
    print('done')

    capture = cv2.VideoCapture("rtsp://xxx")

    # Read one frame up front to confirm that the stream can be opened
    ref, frame = capture.read()
    if not ref:
        raise ValueError("error reading")

    fps = 0.0
    while True:
        t1 = time.time()
        # Read the next frame
        ref, frame = capture.read()
        if not ref:
            break

        # Letterbox to the model input size, then convert BGR (OpenCV) to RGB (model input).
        # The conversion is done exactly once; converting twice would feed BGR to the model.
        img, ratio, (dw, dh) = letterbox(frame, new_shape=(IMG_SIZE, IMG_SIZE))
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        # Inference
        print('--> Running model')
        outputs = rknn.inference(inputs=[img])

        # Each output is (1, 255, H, W): split it into 3 anchors x 85 values,
        # then move H and W to the front for post-processing
        input_data = []
        for output in outputs[:3]:
            output = output.reshape([3, -1] + list(output.shape[-2:]))
            input_data.append(np.transpose(output, (2, 3, 0, 1)))

        boxes, classes, scores = yolov5_post_process(input_data)

        # Back to BGR for drawing and display with OpenCV
        img_1 = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
        if boxes is not None:
            draw1(img_1, boxes, scores, classes)

        fps = (fps + (1. / (time.time() - t1))) / 2
        print("fps= %.2f" % (fps))
        cv2.imshow("video", img_1)
        c = cv2.waitKey(1) & 0xff
        if c == 27:  # Esc quits
            break
    print("Video Detection Done!")
    capture.release()
    cv2.destroyAllWindows()
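
The script draws its boxes on the letterboxed 640x640 image, so the ratio and (dw, dh) values returned by letterbox() go unused. If boxes are needed in the coordinates of the original frame, they can be mapped back as in this minimal sketch. The two helper names are hypothetical, and the mapping ignores the rounding of the padding done inside letterbox(), so results may be off by about one pixel.

```python
def letterbox_params(frame_w, frame_h, new_size=640):
    """Scale ratio and per-side padding, computed the same way as in letterbox()."""
    r = min(new_size / frame_h, new_size / frame_w)
    unpad_w, unpad_h = int(round(frame_w * r)), int(round(frame_h * r))
    dw, dh = (new_size - unpad_w) / 2, (new_size - unpad_h) / 2
    return r, (dw, dh)


def to_original_coords(box, ratio, pad):
    """Map [x1, y1, x2, y2] from the letterboxed image back to the source frame."""
    dw, dh = pad
    x1, y1, x2, y2 = box
    # Undo the padding first, then undo the scaling.
    return [(x1 - dw) / ratio, (y1 - dh) / ratio, (x2 - dw) / ratio, (y2 - dh) / ratio]


r, pad = letterbox_params(1920, 1080)
print(r, pad)
print(to_original_coords([100, 200, 300, 400], r, pad))
```

For a 1920x1080 frame the scale is 1/3 with 140 pixels of padding above and below, so a box at [100, 200, 300, 400] in the letterboxed image corresponds to roughly [300, 180, 900, 780] in the original frame. letterbox() returns the ratio as the pair (r, r); either element can be passed as ratio here.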

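The video stream runs at 25 fps while detection averages 12 fps, so the processing delay is very large; see Section 4 for the fix.
NPU load with this code: Core0: 34%, Core1: 0%, Core2: 0%. Multi-threading can raise the NPU load and, with it, the yolov5 detection frame rate.

With multi-threading the NPU load rises and detection reaches 25 fps.

Reference: Raising RK3588 NPU utilisation with asynchronous multi-threading to increase the yolov5s frame rate (CSDN blog)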
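
The multi-threaded code itself is in the referenced post, not here. As a rough sketch of the idea, frames can be distributed round-robin over several workers and collected in submission order. The InferencePool class below is hypothetical and uses plain callables as stand-ins for the per-thread models.

```python
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class InferencePool:
    """Spread frames over several workers and hand results back in submission order."""

    def __init__(self, workers):
        # Each worker is a callable frame -> result that owns its own model instance.
        self.workers = list(workers)
        self.locks = [threading.Lock() for _ in self.workers]
        self.executor = ThreadPoolExecutor(max_workers=len(self.workers))
        self.pending = deque()
        self.submitted = 0

    def _run(self, index, frame):
        # The lock keeps one model instance from being used by two threads at once.
        with self.locks[index]:
            return self.workers[index](frame)

    def put(self, frame):
        index = self.submitted % len(self.workers)  # round-robin over the workers
        self.pending.append(self.executor.submit(self._run, index, frame))
        self.submitted += 1

    def get(self):
        # Blocks until the oldest submitted frame is done, so output order matches input order.
        return self.pending.popleft().result()

    def close(self):
        self.executor.shutdown(wait=True)


pool = InferencePool([lambda frame: frame * 2 for _ in range(3)])  # stand-in workers
for frame in range(6):
    pool.put(frame)
results = [pool.get() for _ in range(6)]
pool.close()
print(results)
```

On the board, each worker would wrap its own RKNNLite instance. Assuming the installed rknn-toolkit-lite2 accepts a core_mask argument in init_runtime (for example RKNNLite.NPU_CORE_0, NPU_CORE_1, and NPU_CORE_2), each instance can be pinned to a different NPU core. In a video loop, submit a few frames first and then alternate put and get so that every worker stays busy.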

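4. Fixing high latency when OpenCV reads an RTSP stream for image processing

Problem: the code above has very high latency.

Analysis: when processing one frame takes longer than the frame interval, unprocessed frames pile up, and unless frames are skipped the delay keeps growing.

Solutions:

1. Decide from the current delay whether a frame should be skipped.

2. Create two threads: one grabs video frames and the other processes them (recommended).

Core code for method 1: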

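FPS = 25  # frame rate of the live stream
maxDelay = 0.05  # maximum tolerated delay, in seconds
startTime = time.time()  # start time
frames = 0
while True:
    # Read every frame, including the ones that get skipped, so the stream keeps draining
    ref, frame = capture.read()
    if not ref:
        break
    frames += 1
    t1 = time.time()

    # Run detection only while the delay is below the maximum tolerated delay
    if frames > (time.time() - startTime - maxDelay) * FPS:
        """
        # ---------- data processing --------------
        """
    else:
        print("============= skipped one frame =====")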

Testing confirmed that this resolves the problem.
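
The skip condition can be isolated into a small function to see how it behaves. This is a sketch with a hypothetical function name; the defaults mirror the FPS and maxDelay values above.

```python
def should_process(frames_read, elapsed, fps=25, max_delay=0.05):
    """True if the frame just read is at most max_delay seconds behind the live stream."""
    # The stream has produced (elapsed - max_delay) * fps frames by the cut-off time;
    # a frame with a higher index is still recent enough to be worth processing.
    return frames_read > (elapsed - max_delay) * fps


print(should_process(frames_read=100, elapsed=4.0))  # frame 100 is due at 4.0 s: on schedule
print(should_process(frames_read=100, elapsed=5.0))  # one second behind: skip
```

At 25 fps the 100th frame is due 4.0 s after the start, so it is processed in the first call. In the second call 5.0 s have passed, the reader is a full second behind, and the frame is skipped.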

For method 2, see: Fixing high latency when OpenCV reads RTSP for image processing (CSDN blog)
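
Method 2 is only referenced, not shown. A minimal sketch of the idea, under the assumption that the processing side always wants the newest frame only, is a reader thread that keeps overwriting a single slot. LatestFrameReader and FakeCapture are hypothetical names; FakeCapture stands in for cv2.VideoCapture so the example runs without a camera.

```python
import threading


class LatestFrameReader:
    """Read frames in a background thread and keep only the newest one."""

    def __init__(self, capture):
        # capture: any object with read() -> (ok, frame), such as cv2.VideoCapture
        self.capture = capture
        self.lock = threading.Lock()
        self.frame = None
        self.seq = 0  # number of frames read so far
        self.stop_event = threading.Event()
        self.thread = threading.Thread(target=self._loop, daemon=True)
        self.thread.start()

    def _loop(self):
        while not self.stop_event.is_set():
            ok, frame = self.capture.read()
            if not ok:  # stream ended or failed
                break
            with self.lock:
                self.frame = frame  # overwrite: older frames are dropped
                self.seq += 1

    def latest(self):
        """Return (sequence number, newest frame); the frame is None before the first read."""
        with self.lock:
            return self.seq, self.frame

    def stop(self):
        self.stop_event.set()
        self.thread.join()


class FakeCapture:
    """Stand-in for cv2.VideoCapture that yields the integers 0..n-1 as frames."""

    def __init__(self, n):
        self.n, self.i = n, 0

    def read(self):
        if self.i >= self.n:
            return False, None
        self.i += 1
        return True, self.i - 1


reader = LatestFrameReader(FakeCapture(100))
reader.thread.join()  # the fake stream ends on its own
print(reader.latest())
```

In the detection loop, the processing thread would call latest(), compare the sequence number with the last one it handled to avoid processing the same frame twice, and run inference on the returned frame. Because the reader drains the stream at its native rate, slow inference lowers the processed frame rate without adding delay.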

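5. Related errors

1. Converting xxx.onnx to xxx.rknn fails with: E build: ImportError: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found

This error appears in a Python 3.8 environment. Switching to Python 3.6 resolves it, with no need to upgrade to glibc 2.29.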

I
I sparse_weight ...
I sparse_weight done.
I
Analysing : 100%|███████████████████████████████████████████████| 142/142 [00:00<00:00, 1332.79it/s]
Quantizating : 100%|█████████████████████████████████████████████| 142/142 [00:00<00:00, 154.40it/s]
I
I quant_optimizer ...
I quant_optimizer results:
I     adjust_no_change_node: ['MaxPool_109', 'MaxPool_108', 'MaxPool_107']
I quant_optimizer done.
I
W build: The default input dtype of 'images' is changed from 'float32' to 'int8' in rknn model for performance!
                      Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'output' is changed from 'float32' to 'int8' in rknn model for performance!
                      Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '327' is changed from 'float32' to 'int8' in rknn model for performance!
                      Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of '328' is changed from 'float32' to 'int8' in rknn model for performance!
                      Please take care of this change when deploy rknn model with Runtime API!
E build: Catch exception when building RKNN model!
E build: Traceback (most recent call last):
E build:   File "rknn/api/rknn_base.py", line 1580, in rknn.api.rknn_base.RKNNBase.build
E build:   File "rknn/api/rknn_base.py", line 341, in rknn.api.rknn_base.RKNNBase._generate_rknn
E build:   File "rknn/api/rknn_base.py", line 204, in rknn.api.rknn_base.RKNNBase._build_rknn
E build: ImportError: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /home/xxx/anaconda3/envs/rk3588/lib/python3.8/site-packages/rknn/api/lib/linux-x86_64/cp38/librknnc.so)
Build model failed!

Analysis:

Check on the server:

strings /lib/x86_64-linux-gnu/libm.so.6 | grep GLIBC_

Output:

GLIBC_2.2.5
GLIBC_2.4
GLIBC_2.15
GLIBC_2.18
GLIBC_2.23
GLIBC_2.24
GLIBC_2.25
GLIBC_2.26
GLIBC_2.27
GLIBC_PRIVATE

This shows that the server does not provide GLIBC_2.29, so it would have to be installed.
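
The same check can be scripted. The sketch below parses the symbol versions listed above and compares the newest one with the required 2.29; the function name max_glibc_version is made up for this example.

```python
import re


def max_glibc_version(lines):
    """Highest GLIBC_x.y[.z] version found in `strings libm.so.6 | grep GLIBC_` output."""
    versions = []
    for line in lines:
        match = re.fullmatch(r"GLIBC_(\d+(?:\.\d+)*)", line.strip())
        if match:  # skips entries such as GLIBC_PRIVATE
            versions.append(tuple(int(part) for part in match.group(1).split(".")))
    return max(versions) if versions else None


listed = ["GLIBC_2.2.5", "GLIBC_2.4", "GLIBC_2.15", "GLIBC_2.18", "GLIBC_2.23",
          "GLIBC_2.24", "GLIBC_2.25", "GLIBC_2.26", "GLIBC_2.27", "GLIBC_PRIVATE"]
newest = max_glibc_version(listed)
print(newest, newest >= (2, 29))
```

Versions are compared as integer tuples, so 2.27 correctly ranks above 2.4 and 2.2.5, which a plain string comparison would get wrong. The newest version here is 2.27, below the 2.29 that the cp38 build of librknnc.so asks for.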

Installing GLIBC_2.29:

cd ~/Downloads
wget http://ftp.gnu.org/gnu/glibc/glibc-2.29.tar.gz

tar -zxvf glibc-2.29.tar.gz
cd glibc-2.29
mkdir build
cd build/
../configure --prefix=/usr/local/glibc2.9 --disable-sanity-checks
 
make -j8
sudo make install

--disable-sanity-checks must not be omitted.

Additional note:

../configure --prefix=/usr/local --disable-sanity-checks fails with: LD_LIBRARY_PATH shouldn't contain the current directory when building glibc

checking for python3... python3
checking version of python3... 3.8.16, ok
configure: WARNING:
*** These auxiliary programs are missing or incompatible versions: makeinfo
*** some features or tests will be disabled.
*** Check the INSTALL file for required versions.
checking LD_LIBRARY_PATH variable... contains current directory
configure: error:
*** LD_LIBRARY_PATH shouldn't contain the current directory when
*** building glibc. Please change the environment variable
*** and run configure again.

Fix: clear LD_LIBRARY_PATH, then run configure again:

echo $LD_LIBRARY_PATH

export LD_LIBRARY_PATH=

echo $LD_LIBRARY_PATH

sudo make install then reports: /usr/local/etc/ld.so.conf: No such file or directory

test ! -x /home/xxx/downloads/glibc-2.29/build/elf/ldconfig || LC_ALL=C \
  /home/xxx/downloads/glibc-2.29/build/elf/ldconfig  \
                        /usr/local/lib /usr/local/lib
/home/xxx/downloads/glibc-2.29/build/elf/ldconfig: Warning: ignoring configuration file that cannot be opened: /usr/local/etc/ld.so.conf: No such file or directory
make[1]: Leaving directory '/home/xxx/downloads/glibc-2.29'

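In the end, switching to Python 3.6 was enough; upgrading to glibc 2.29 was not necessary.

6. References

Upgrading GLIBC_2.29 on Ubuntu 18.04 to resolve ImportError: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' (CSDN blog)

https://www.cnblogs.com/relax-zw/p/11328435.html

yolov5: training a .pt model, converting it to an RKNN model, and deploying it on an RK3588 board, from training to deployment (CSDN blog)

1. NPU usage, Firefly Wiki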
