(3) DepthAI Python API: OAK Nodes

OAK China (official)

Published 2022-11-30 10:00:00

Article link: https://blog.csdn.net/oakchina/article/details/128100017

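Editor: OAK China
First published at: oakchina.cn
If you like it, please 👍⭐️✍
This content may be updated from time to time. The official website always has the latest version, so please check the first-published address above.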

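▌Preface

Hello everyone, this is OAK China, and I am your assistant.

I recently came across a series of posts on Zhihu about the DepthAI Python API. The content is very good, so I have tidied it up to share with you.

This series consists of four blog posts. Original source: 石满 on Zhihu.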

▌Nodes API

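In DepthAI, each node provides a specific piece of functionality, a set of configurable properties, and inputs and outputs.

A node can have zero, one, or multiple inputs and outputs.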

Node input

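A node input is a message queue that can be linked to the output of another node. If the input is in blocking mode and its queue is full, new messages from an upstream node such as a camera cannot enter the queue. The camera then blocks and waits to send its message until it can push it into the input queue. If the camera's preview output is linked to several inputs, the same behavior applies, and the message is pushed to each input queue in turn. If the input is non-blocking, new messages replace old ones. This removes the risk of the pipeline freezing, but some messages are lost.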
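The difference between the two modes can be seen in a small host-side model. This is only a sketch in plain Python, not the DepthAI implementation: the class `InputQueueModel` and its method `try_push` are hypothetical names invented for illustration.

```python
from collections import deque


class InputQueueModel:
    """Toy model of a node input queue (illustration only, not DepthAI code)."""

    def __init__(self, size, blocking):
        self.size = size
        self.blocking = blocking
        self.items = deque()

    def try_push(self, msg):
        """Return True if the message was accepted, False if the sender must wait."""
        if len(self.items) < self.size:
            self.items.append(msg)
            return True
        if self.blocking:
            # Queue full and blocking: the producer keeps the message and retries later.
            return False
        # Queue full and non-blocking: the oldest message is dropped to make room.
        self.items.popleft()
        self.items.append(msg)
        return True


blocking_q = InputQueueModel(size=2, blocking=True)
dropping_q = InputQueueModel(size=2, blocking=False)

# Push four frames (0..3) into each queue without ever consuming any.
blocking_results = [blocking_q.try_push(frame) for frame in range(4)]
dropping_results = [dropping_q.try_push(frame) for frame in range(4)]

print(blocking_results, list(blocking_q.items))
print(dropping_results, list(dropping_q.items))
```

In the blocking queue, frames 2 and 3 are refused and the producer stalls. In the non-blocking queue every push succeeds, but frames 0 and 1 are lost. On a real node the mode is selected with calls such as `input.setBlocking(False)`, as used in the snippets later in this post.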

Node output

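A node outputs messages, and some nodes have a configurable output message pool. When a node creates an output message, it sends it to the nodes linked to that output. The pool size specifies how many messages can be created and sent while other messages are still somewhere in the pipeline.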

ColorCamera node

The ColorCamera node is a source of image frames. It can be controlled at runtime through inputControl and inputConfig.

import depthai as dai

pipeline = dai.Pipeline()
cam = pipeline.create(dai.node.ColorCamera)

cam.setPreviewSize(300, 300)
cam.setBoardSocket(dai.CameraBoardSocket.RGB)
cam.setResolution(dai.ColorCameraProperties.SensorResolution.THE_1080_P)
cam.setInterleaved(False)
cam.setColorOrder(dai.ColorCameraProperties.ColorOrder.RGB)

EdgeDetector node

The EdgeDetector node uses a Sobel filter to create an image that emphasizes edges.

pipeline = dai.Pipeline()
edgeDetector = pipeline.create(dai.node.EdgeDetector)

sobelHorizontalKernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
sobelVerticalKernel = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
edgeDetector.initialConfig.setSobelFilterKernels(sobelHorizontalKernel, sobelVerticalKernel)
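To see what these two kernels respond to, here is a minimal host-side sketch in plain Python. It is not what runs on the device: the helper `correlate3x3` is a hypothetical name, and combining the two responses as |gx| + |gy| is an assumption made only for this illustration.

```python
def correlate3x3(image, kernel):
    """Apply a 3x3 kernel to the interior pixels of a 2D list (no padding)."""
    height, width = len(image), len(image[0])
    out = [[0] * (width - 2) for _ in range(height - 2)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y - 1][x - 1] = acc
    return out


sobelHorizontalKernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
sobelVerticalKernel = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

# 4x4 test image: dark left half, bright right half (one vertical edge).
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]

gx = correlate3x3(image, sobelHorizontalKernel)  # responds to left/right changes
gy = correlate3x3(image, sobelVerticalKernel)    # responds to up/down changes

# Assumed combination of both responses into one edge strength per pixel.
magnitude = [
    [abs(a) + abs(b) for a, b in zip(row_x, row_y)]
    for row_x, row_y in zip(gx, gy)
]
print(magnitude)
```

The horizontal kernel reacts strongly to the vertical edge, while the vertical kernel returns zero because every row of the test image is identical.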

FeatureTracker node

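The FeatureTracker node detects key points (features) in a frame and tracks them in the next one. Valid features are selected using the Harris score or Shi-Tomasi. The default number of target features is 320 and the default maximum number of features is 480. Two resolutions are supported: 720p and 480p.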

pipeline = dai.Pipeline()
featureTracker = pipeline.create(dai.node.FeatureTracker)

# Set number of shaves and number of memory slices to maximum
featureTracker.setHardwareResources(2, 2)
# Specify to wait until configuration message arrives to inputConfig Input.
featureTracker.setWaitForConfigInput(True)

# You have to use Feature tracker in combination with
# an image frame source - mono/color camera or xlinkIn node

ImageManip node

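The ImageManip node can crop or rotate a rectangular area, or perform various image transforms: rotate, mirror, flip, perspective transform, and so on.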

pipeline = dai.Pipeline()
manip = pipeline.create(dai.node.ImageManip)

manip.initialConfig.setResize(300, 300)
manip.initialConfig.setFrameType(dai.ImgFrame.Type.BGR888p)

IMU node

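The IMU (inertial measurement unit) node receives data from the IMU chip on the device. DepthAI devices use a BNO086 9-axis sensor, which supports sensor fusion on the IMU chip itself. The IMU chip is connected to the Myriad X (VPU) over SPI.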

pipeline = dai.Pipeline()
imu = pipeline.create(dai.node.IMU)

# enable RAW_ACCELEROMETER and RAW_GYROSCOPE at 100 hz rate
imu.enableIMUSensor([dai.IMUSensor.RAW_ACCELEROMETER, dai.IMUSensor.RAW_GYROSCOPE], 100)
# above this threshold packets will be sent in batch of X, if the host is not blocked and USB bandwidth is available
imu.setBatchReportThreshold(1)
# maximum number of IMU packets in a batch, if it's reached device will block sending until host can receive it
# if lower or equal to batchReportThreshold then the sending is always blocking on device
# useful to reduce device's CPU load  and number of lost packets, if CPU load is high on device side due to multiple nodes
imu.setMaxBatchReports(10)

MobileNetDetectionNetwork node

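The MobileNetDetectionNetwork node is very similar to the NeuralNetwork node. The difference is that this node is specific to MobileNet networks and decodes the network output on the device. This means the out output of this node is not a byte array but a set of detection results that is easy to use in your code.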

pipeline = dai.Pipeline()
mobilenetDet = pipeline.create(dai.node.MobileNetDetectionNetwork)

mobilenetDet.setConfidenceThreshold(0.5)
mobilenetDet.setBlobPath(nnBlobPath)
mobilenetDet.setNumInferenceThreads(2)
mobilenetDet.input.setBlocking(False)

MobileNetSpatialDetectionNetwork node

Spatial detection for MobileNet networks. It is similar to a combination of MobileNetDetectionNetwork and SpatialLocationCalculator.

pipeline = dai.Pipeline()
mobilenetSpatial = pipeline.create(dai.node.MobileNetSpatialDetectionNetwork)

mobilenetSpatial.setBlobPath(nnBlobPath)
# Will ignore all detections whose confidence is below 50%
mobilenetSpatial.setConfidenceThreshold(0.5)
mobilenetSpatial.input.setBlocking(False)
# How big the ROI will be (smaller value can provide a more stable reading)
mobilenetSpatial.setBoundingBoxScaleFactor(0.5)
# Min/Max threshold. Values out of range will be set to 0 (invalid)
mobilenetSpatial.setDepthLowerThreshold(100)
mobilenetSpatial.setDepthUpperThreshold(5000)

# Link depth from the StereoDepth node
stereo.depth.link(mobilenetSpatial.inputDepth)
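The comment on setBoundingBoxScaleFactor says a smaller value gives a smaller ROI. A plausible reading, sketched below in plain Python, is that the detection box is shrunk around its own center before depth is sampled. This is an assumption about the behavior, not a confirmed description of the firmware, and `scale_bbox` is a hypothetical helper.

```python
def scale_bbox(xmin, ymin, xmax, ymax, factor):
    """Shrink (factor < 1) or grow (factor > 1) a box around its own center."""
    cx = (xmin + xmax) / 2
    cy = (ymin + ymax) / 2
    half_w = (xmax - xmin) / 2 * factor
    half_h = (ymax - ymin) / 2 * factor
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


# Normalized detection box (values chosen for illustration) and a 0.5 scale factor.
roi = scale_bbox(0.2, 0.2, 0.6, 0.8, 0.5)
print(roi)
```

A smaller ROI samples depth nearer the middle of the object, which tends to exclude background pixels along the box border.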

MonoCamera node

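The MonoCamera node is a source of image frames and can be controlled through inputControl. Some DepthAI device models do not have mono cameras. Two mono cameras can be used to compute stereo depth (through the StereoDepth node).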

pipeline = dai.Pipeline()
mono = pipeline.create(dai.node.MonoCamera)
mono.setBoardSocket(dai.CameraBoardSocket.RIGHT)
mono.setResolution(dai.MonoCameraProperties.SensorResolution.THE_720_P)

NeuralNetwork node

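This node runs neural inference on its input data. Any OpenVINO neural network can be run with this node, as long as the VPU supports all of its layers. This lets you pick from the 200+ pre-trained models in the Open Model Zoo and the DepthAI Model Zoo and run them directly.

The neural network has to be in .blob format to be compatible with the VPU. Custom models can be converted to .blob format.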

pipeline = dai.Pipeline()
nn = pipeline.create(dai.node.NeuralNetwork)
nn.setBlobPath(nnBlobPath)
# cam is a camera node created beforehand; for a ColorCamera, link cam.preview instead of cam.out
cam.out.link(nn.input)

# Send NN out to the host via XLink
nnXout = pipeline.create(dai.node.XLinkOut)
nnXout.setStreamName("nn")
nn.out.link(nnXout.input)

with dai.Device(pipeline) as device:
  qNn = device.getOutputQueue("nn")

  nnData = qNn.get() # Blocking

  # NN can output from multiple layers. Print all layer names:
  print(nnData.getAllLayerNames())

  # Get layer named "Layer1_FP16" as FP16
  layer1Data = nnData.getLayerFp16("Layer1_FP16")

  # You can now decode the output of your NN

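Passthrough mechanism

The passthrough mechanism is very useful when a node sets its input to non-blocking, where messages can be overwritten. In that case we do not know which message the node actually processed (for a NeuralNetwork node, for example, whether inference ran on frame 25, or frame 25 was skipped and inference ran on frame 26). It also means that if the XLink and host queues are blocking and we receive both the passthrough and the output, we can do a blocking get on both queues and be sure to get matching frames. They may not arrive at the same time, but both will arrive and wait in the right position in their queues to be taken out together.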
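The pairing idea can be modeled on the host with two ordinary queues. The sketch below uses plain Python dictionaries as stand-ins for messages; the field names `seq` and `result` are hypothetical and only illustrate that the passthrough frame and the inference output for the same frame sit at the same position in their queues.

```python
from collections import deque

# Frames 25..28 leave the camera.
frames = [{"seq": n} for n in range(25, 29)]

# Assume the non-blocking input overwrote frames 25 and 27 before inference started,
# so only frames 26 and 28 were actually processed.
processed = [frame for frame in frames if frame["seq"] in (26, 28)]

# For every processed frame the node emits the frame itself (passthrough)
# and the inference result (out), in the same order.
passthrough_q = deque(processed)
out_q = deque({"seq": f["seq"], "result": f"inference-{f['seq']}"} for f in processed)

pairs = []
while passthrough_q and out_q:
    frame = passthrough_q.popleft()   # stands in for a blocking get on the passthrough queue
    result = out_q.popleft()          # stands in for a blocking get on the output queue
    pairs.append((frame["seq"], result["result"]))

print(pairs)
```

Because both queues are filled in processing order and neither drops messages, taking one item from each always yields a frame and the result computed from that same frame.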

ObjectTracker node

The ObjectTracker node tracks detected objects in the image using a Kalman filter and the Hungarian algorithm.

pipeline = dai.Pipeline()
objectTracker = pipeline.create(dai.node.ObjectTracker)

objectTracker.setDetectionLabelsToTrack([15])  # Track only person
# Possible tracking types: ZERO_TERM_COLOR_HISTOGRAM, ZERO_TERM_IMAGELESS, SHORT_TERM_IMAGELESS, SHORT_TERM_KCF
objectTracker.setTrackerType(dai.TrackerType.ZERO_TERM_COLOR_HISTOGRAM)
# Take the smallest ID when new object is tracked, possible options: SMALLEST_ID, UNIQUE_ID
objectTracker.setTrackerIdAssignmentPolicy(dai.TrackerIdAssignmentPolicy.SMALLEST_ID)

# You have to use Object tracker in combination with detection network
# and an image frame source - mono/color camera or xlinkIn node

Zero term tracking

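Zero term tracking performs object association, which means it does not predict or track based on previous tracking history. Object association means that objects detected by an external detector are mapped to the tracked objects, that is, objects that were detected in earlier frames and are already being tracked.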

Short term tracking

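Short term tracking allows objects to be tracked between frames, which reduces the need to run object detection on every frame. This suits neural network models that cannot reach 30 fps, such as YOLOv5. The tracker can provide tracklets for frames with no inference, so the whole system can still run at 30 fps.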

Supported object tracker types

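SHORT_TERM_KCF: Kernelized Correlation Filter tracking. KCF exploits the properties of circulant matrices to speed up tracking.
SHORT_TERM_IMAGELESS: When object detection is skipped, objects are tracked between frames by extrapolating their trajectory from previous detections.
ZERO_TERM_COLOR_HISTOGRAM: Uses position, shape, and input image information such as the RGB histogram to track objects.
ZERO_TERM_IMAGELESS: Uses only the rectangle shape and position of detected objects. It does not use color information, so it achieves higher throughput than ZERO_TERM_COLOR_HISTOGRAM. Users need to weigh throughput against accuracy when choosing a tracker type.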
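As a rough illustration of what "extrapolating the trajectory from previous detections" can mean, the sketch below predicts a box center with a constant-velocity assumption. The tracker's real algorithm is not described in this post, so this is an assumed simplification, and `extrapolate_center` is a hypothetical helper.

```python
def extrapolate_center(prev, last, target_frame):
    """Predict a box center at target_frame from the two most recent detections.

    prev and last are (frame_index, cx, cy) tuples; constant velocity is assumed.
    """
    f0, x0, y0 = prev
    f1, x1, y1 = last
    dt = f1 - f0
    vx = (x1 - x0) / dt
    vy = (y1 - y0) / dt
    steps = target_frame - f1
    return (x1 + vx * steps, y1 + vy * steps)


# Detections ran on frames 0 and 2; frame 3 has no detection, so it is predicted.
predicted = extrapolate_center((0, 10.0, 10.0), (2, 14.0, 12.0), 3)
print(predicted)
```

No image data is needed for this kind of prediction, which is consistent with the "imageless" name.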

Maximum number of tracked objects

SHORT_TERM_KCF can track up to 60 objects at once, while the other tracker types can in theory track up to 1000 objects at once.

Script node

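The Script node lets you run Python scripts on the device. Because of computational resource constraints, the Script node should not be used for heavy computation (for example image processing or CV), but for managing the flow of the pipeline. Examples include controlling nodes such as ImageManip, ColorCamera and SpatialLocationCalculator, decoding neural network results, or interfacing with GPIOs. For debugging scripts, Script node logging is recommended.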

script = pipeline.create(dai.node.Script)
script.setScript("""
    import time
    import marshal
    num = 123
    node.warn(f"Number {num}") # Print to host
    x = [1, "Hello", {"Foo": "Bar"}]
    x_serial = marshal.dumps(x)
    b = Buffer(len(x_serial))
    while True:
        time.sleep(1)
        b.setData(x_serial)
        node.io['out'].send(b)
""")
# xout is an XLinkOut node created beforehand
script.outputs['out'].link(xout.input)

# ...
# After initializing the device, enable log levels
device.setLogLevel(dai.LogLevel.WARN)
device.setLogOutputLevel(dai.LogLevel.WARN)
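The script above serializes a Python object with marshal before sending it. The host side is not shown in this post; a minimal sketch of the matching decode step looks like this. The variable `received` stands in for the raw bytes read from the host output queue, and how those bytes are obtained is left out on purpose.

```python
import marshal

# Same object and serialization as in the device script.
x = [1, "Hello", {"Foo": "Bar"}]
x_serial = marshal.dumps(x)

# Stand-in for the payload of the Buffer message as received on the host.
received = bytes(bytearray(x_serial))

# Decode the payload back into Python objects.
decoded = marshal.loads(received)
print(decoded)
```

Note that the marshal format is not guaranteed to be stable across Python versions, so this approach assumes the interpreter on the device and the one on the host use compatible formats.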

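SpatialLocationCalculator node

The SpatialLocationCalculator node computes the spatial coordinates of a region of interest (ROI) based on the depth image. It averages the depth values inside the ROI and discards values that are out of range. Spatial coordinates can also be computed on the host side.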

pipeline = dai.Pipeline()
spatialCalc = pipeline.create(dai.node.SpatialLocationCalculator)
spatialCalc.setWaitForConfigInput(False)

# Set initial config
config = dai.SpatialLocationCalculatorConfigData()
config.depthThresholds.lowerThreshold = 100
config.depthThresholds.upperThreshold = 10000

topLeft = dai.Point2f(0.4, 0.4)
bottomRight = dai.Point2f(0.6, 0.6)
config.roi = dai.Rect(topLeft, bottomRight)

spatialCalc.initialConfig.addROI(config)

# You can later send configs from the host (XLinkIn) / Script node to the InputConfig
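The averaging step can be reproduced on the host with a few lines of plain Python. This sketch only covers the depth (Z) value; turning it into X and Y coordinates needs camera intrinsics and is not shown. The function name `average_roi_depth` and the sample depth map are made up for illustration.

```python
def average_roi_depth(depth_mm, roi, lower=100, upper=10000):
    """Average the in-range depth values inside a normalized ROI.

    depth_mm: 2D list of depth values in millimeters.
    roi: (x0, y0, x1, y1) with coordinates normalized to 0..1.
    Returns None when the ROI contains no valid value.
    """
    height, width = len(depth_mm), len(depth_mm[0])
    x0, x1 = int(roi[0] * width), int(roi[2] * width)
    y0, y1 = int(roi[1] * height), int(roi[3] * height)
    values = [
        depth_mm[y][x]
        for y in range(y0, y1)
        for x in range(x0, x1)
        if lower <= depth_mm[y][x] <= upper  # drop out-of-range readings
    ]
    return sum(values) / len(values) if values else None


depth_map = [
    [500, 500, 500, 500],
    [500, 1000, 0, 500],       # 0 marks an invalid pixel
    [500, 1200, 20000, 500],   # 20000 is beyond the upper threshold
    [500, 500, 500, 500],
]

z_mm = average_roi_depth(depth_map, (0.25, 0.25, 0.75, 0.75))
print(z_mm)
```

Only the two in-range pixels (1000 mm and 1200 mm) contribute to the result, which mirrors the lowerThreshold and upperThreshold settings in the config above.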

SPIIn node

Receives data from an MCU (over SPI). The MCU can control a ColorCamera or ImageManip node, or send a Buffer to a Script node.

SPIOut is used to send data from the VPU to the MCU (over SPI).

pipeline = dai.Pipeline()
spi = pipeline.create(dai.node.SPIIn)

spi.setStreamName("control")
spi.setBusId(0)

SPIOut node

Used to send messages to the MCU.

pipeline = dai.Pipeline()
spi = pipeline.create(dai.node.SPIOut)

spi.setStreamName("spimetaout")
spi.setBusId(0)

StereoDepth node

Computes disparity and/or depth from two mono cameras.

pipeline = dai.Pipeline()
stereo = pipeline.create(dai.node.StereoDepth)

# Better handling for occlusions:
stereo.setLeftRightCheck(False)
# Closer-in minimum depth, disparity range is doubled:
stereo.setExtendedDisparity(False)
# Better accuracy for longer distance, fractional disparity 32-levels:
stereo.setSubpixel(False)

# Define and configure MonoCamera nodes beforehand
left.out.link(stereo.left)
right.out.link(stereo.right)
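The relation between disparity and depth explains why doubling the disparity range brings the minimum depth closer. The sketch below uses the standard pinhole stereo formula depth = focal length × baseline / disparity. The focal length, baseline and maximum disparity values are illustrative assumptions, not measured values for any particular device.

```python
def depth_from_disparity(focal_px, baseline_mm, disparity_px):
    """Convert a disparity in pixels to a depth in millimeters.

    Returns None for a non-positive disparity, which carries no depth information.
    """
    if disparity_px <= 0:
        return None
    return focal_px * baseline_mm / disparity_px


FOCAL_PX = 440.0      # assumed focal length in pixels
BASELINE_MM = 75.0    # assumed distance between the two mono cameras

# Assumed maximum disparity of 95 pixels, doubled to 190 with extended disparity.
min_depth_default = depth_from_disparity(FOCAL_PX, BASELINE_MM, 95)
min_depth_extended = depth_from_disparity(FOCAL_PX, BASELINE_MM, 190)

print(round(min_depth_default, 1), round(min_depth_extended, 1))
```

Under these assumptions the closest measurable distance is halved when the disparity range is doubled.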

SystemLogger node

Retrieves system information from the device.

pipeline = dai.Pipeline()
logger = pipeline.create(dai.node.SystemLogger)

logger.setRate(1)  # 1 Hz

# Send system info to the host via XLink
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("sysinfo")
logger.out.link(xout.input)

VideoEncoder node

Encodes image frames into H.264/H.265/JPEG. The bitstream encoded on the device can be saved directly into an .mp4 container without extra computation on the host.

pipeline = dai.Pipeline()

# Create ColorCamera beforehand
# Set H265 encoding for the ColorCamera video output
videoEncoder = pipeline.create(dai.node.VideoEncoder)
videoEncoder.setDefaultProfilePreset(cam.getVideoSize(), cam.getFps(), dai.VideoEncoderProperties.Profile.H265_MAIN)

# Create MJPEG encoding for still images
stillEncoder = pipeline.create(dai.node.VideoEncoder)
stillEncoder.setDefaultProfilePreset(cam.getStillSize(), 1, dai.VideoEncoderProperties.Profile.MJPEG)

cam.still.link(stillEncoder.input)
cam.video.link(videoEncoder.input)

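Limitations

For H.264/H.265 encoding, the following limits apply. The encoder can handle 248 million pixels per second, that is 3840x2160 at 30 fps. The resolution and frame rate can be split across several streams, but the total for all streams must not exceed 248 million pixels per second. Due to a hardware constraint, video encoding only works on frames whose width is a multiple of 32. The maximum frame width is 4096 pixels, and at most three encoding streams can run in parallel.

The MJPEG encoder supports resolutions up to 16384x8192 at 500 million pixels per second. In our tests it could encode 4K at 30 fps and two 800P streams at 55 fps.

The encoder's processing resources are shared between H.26x and JPEG.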
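These limits are easy to check before building a pipeline. The sketch below is a plain Python helper, not part of the DepthAI API: `check_h26x_streams` is a hypothetical name, and the pixel budget is taken as the 4K at 30 fps product quoted above.

```python
# Budget assumed equal to 3840 x 2160 at 30 fps (248,832,000 pixels per second).
MAX_PIXELS_PER_SECOND = 3840 * 2160 * 30
MAX_WIDTH = 4096
MAX_STREAMS = 3


def check_h26x_streams(streams):
    """Check a list of (width, height, fps) streams against the stated limits.

    Returns a list of human-readable problems; an empty list means no limit is exceeded.
    """
    problems = []
    if len(streams) > MAX_STREAMS:
        problems.append(f"{len(streams)} streams requested, at most {MAX_STREAMS} allowed")
    total = 0
    for width, height, fps in streams:
        if width % 32 != 0:
            problems.append(f"width {width} is not a multiple of 32")
        if width > MAX_WIDTH:
            problems.append(f"width {width} exceeds {MAX_WIDTH}")
        total += width * height * fps
    if total > MAX_PIXELS_PER_SECOND:
        problems.append(f"{total} pixels/s exceeds the budget of {MAX_PIXELS_PER_SECOND}")
    return problems


print(check_h26x_streams([(3840, 2160, 30)]))
print(check_h26x_streams([(1920, 1080, 30), (1280, 720, 30)]))
print(check_h26x_streams([(1000, 720, 30)]))
```

The first two configurations fit within the limits, while the third is rejected because 1000 is not a multiple of 32.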

XLinkIn node

Sends data from the host to the device over XLink.

pipeline = dai.Pipeline()
xIn = pipeline.create(dai.node.XLinkIn)
xIn.setStreamName("camControl")

# Create ColorCamera beforehand
xIn.out.link(cam.inputControl)

with dai.Device(pipeline) as device:
  device.startPipeline()
  qCamControl = device.getInputQueue("camControl")

  # Send a message to the ColorCamera to capture a still image
  ctrl = dai.CameraControl()
  ctrl.setCaptureStill(True)
  qCamControl.send(ctrl)

XLinkOut node

Sends data from the device to the host over XLink.

pipeline = dai.Pipeline()
xOut = pipeline.create(dai.node.XLinkOut)
xOut.setStreamName("camOut")

# Here we will send camera preview (ImgFrame) to the host via XLink.
# Host can then display the frame to the user
cam.preview.link(xOut.input)

YoloDetectionNetwork node

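Similar to the NeuralNetwork node. The difference is that this node is specific to YOLO V3/V4 networks and decodes the network results on the device. This means the out output of this node is not NNData but ImgDetections, which is easier to use in your code.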

pipeline = dai.Pipeline()
yoloDet = pipeline.create(dai.node.YoloDetectionNetwork)
yoloDet.setBlobPath(nnBlobPath)

# Yolo specific parameters
yoloDet.setConfidenceThreshold(0.5)
yoloDet.setNumClasses(80)
yoloDet.setCoordinateSize(4)
yoloDet.setAnchors([10,14, 23,27, 37,58, 81,82, 135,169, 344,319])
yoloDet.setAnchorMasks({"side26": [1, 2, 3], "side13": [3, 4, 5]})
yoloDet.setIouThreshold(0.5)
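The anchor and mask parameters are easier to read once they are unpacked. The sketch below follows the usual YOLO convention, where the flat anchor list holds (width, height) pairs and each mask lists the pair indices used by one output grid. It also shows the intersection over union (IoU) measure that the IoU threshold refers to. Both helpers are plain Python written for illustration and are not part of the DepthAI API.

```python
anchors = [10, 14, 23, 27, 37, 58, 81, 82, 135, 169, 344, 319]
anchor_masks = {"side26": [1, 2, 3], "side13": [3, 4, 5]}

# Group the flat list into (width, height) pairs.
anchor_pairs = list(zip(anchors[0::2], anchors[1::2]))

# Pick the pairs each output grid uses, according to its mask.
selected = {
    side: [anchor_pairs[i] for i in indices]
    for side, indices in anchor_masks.items()
}
print(selected)


def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
print(overlap)
```

With an IoU threshold of 0.5, two boxes that overlap as little as the pair above would typically be kept as separate detections rather than merged.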

YoloSpatialDetectionNetwork node

Spatial detection for YOLO networks. It is similar to a combination of YoloDetectionNetwork and SpatialLocationCalculator.

pipeline = dai.Pipeline()
yoloSpatial = pipeline.create(dai.node.YoloSpatialDetectionNetwork)
yoloSpatial.setBlobPath(nnBlobPath)

# Spatial detection specific parameters
yoloSpatial.setConfidenceThreshold(0.5)
yoloSpatial.input.setBlocking(False)
yoloSpatial.setBoundingBoxScaleFactor(0.5)
yoloSpatial.setDepthLowerThreshold(100) # Min 10 centimeters
yoloSpatial.setDepthUpperThreshold(5000) # Max 5 meters

# Yolo specific parameters
yoloSpatial.setNumClasses(80)
yoloSpatial.setCoordinateSize(4)
yoloSpatial.setAnchors([10,14, 23,27, 37,58, 81,82, 135,169, 344,319])
yoloSpatial.setAnchorMasks({ "side26": [1,2,3], "side13": [3,4,5] })
yoloSpatial.setIouThreshold(0.5)