Optical flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and the scene. In general, optical flow arises from the motion of foreground objects in the scene, the motion of the observer, or both. In SLAM it is usually the observer, i.e. the camera, that moves. Optical-flow tracking is used to find corresponding 2D-2D point pairs, from which the camera pose can subsequently be solved.
LK Optical Flow Estimation
The Lucas–Kanade algorithm estimates optical flow from the difference between two frames and rests on three assumptions:
1. Brightness constancy assumption: the intensity of a pixel does not change as it moves between frames,

$$I(x, y, t) = I(x + dx, y + dy, t + dt)$$

Expanding the right-hand side with a first-order Taylor series gives:

$$I(x + dx, y + dy, t + dt) \approx I(x, y, t) + \frac{\partial I}{\partial x}dx + \frac{\partial I}{\partial y}dy + \frac{\partial I}{\partial t}dt$$

Then:

$$\frac{\partial I}{\partial x}dx + \frac{\partial I}{\partial y}dy + \frac{\partial I}{\partial t}dt = 0$$

Dividing through by $dt$:

$$I_x u + I_y v = -I_t$$

where $I_x = \frac{\partial I}{\partial x}$, $I_y = \frac{\partial I}{\partial y}$, $I_t = \frac{\partial I}{\partial t}$, $u = \frac{dx}{dt}$, $v = \frac{dy}{dt}$. This single equation contains two unknowns $(u, v)$, so more equations are needed to solve for them.
2. Small-motion (temporal persistence) assumption + spatial coherence assumption

These two assumptions extend the motion of a single pixel to the motion of the pixels in its neighborhood: nearby pixels are assumed to share the same $(u, v)$. For example, a 5×5 pixel neighborhood yields 25 equations:

$$\begin{bmatrix} I_x(p_1) & I_y(p_1) \\ I_x(p_2) & I_y(p_2) \\ \vdots & \vdots \\ I_x(p_{25}) & I_y(p_{25}) \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = -\begin{bmatrix} I_t(p_1) \\ I_t(p_2) \\ \vdots \\ I_t(p_{25}) \end{bmatrix}$$

abbreviated as

$$A\mathbf{x} = b, \qquad \mathbf{x} = \begin{bmatrix} u \\ v \end{bmatrix}$$

This overdetermined system can be solved by least squares over all pixels in the neighborhood:

$$\min_{\mathbf{x}} \| A\mathbf{x} - b \|^2$$

However, the pixels in the neighborhood sit at different positions, and we want pixels closer to the center to contribute more. Introducing a diagonal weight matrix $W$ (e.g. Gaussian weights), the normal equations become

$$A^T W A\, \mathbf{x} = A^T W b$$

and the weighted least-squares solution is

$$\mathbf{x} = (A^T W A)^{-1} A^T W b$$

where

$$A^T W A = \begin{bmatrix} \sum_i w_i I_x(p_i)^2 & \sum_i w_i I_x(p_i) I_y(p_i) \\ \sum_i w_i I_x(p_i) I_y(p_i) & \sum_i w_i I_y(p_i)^2 \end{bmatrix}$$
If the image intensity varies slowly, the partial derivatives may be (close to) zero, making the matrix above non-invertible and the least-squares solution unusable. This is why optical-flow tracking uses corner points, where the intensity changes sharply.
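To make the closed form above concrete, here is a minimal NumPy sketch of a single-window LK solve. It is not the code used in the experiment below; the window size (15×15, matching the winSize used later), the Gaussian weights, and the helper name lk_single_window are illustrative assumptions.

# lk_single_window.py -- illustrative sketch, not the experiment code
import numpy as np

def lk_single_window(img0, img1, cx, cy, half=7):
    # img0, img1: consecutive grayscale frames as float arrays
    # Spatial gradients of img0 (central differences) and the temporal gradient
    Ix = (np.roll(img0, -1, axis=1) - np.roll(img0, 1, axis=1)) / 2.0
    Iy = (np.roll(img0, -1, axis=0) - np.roll(img0, 1, axis=0)) / 2.0
    It = img1 - img0
    # 15x15 window centered on (cx, cy); assumes the window stays inside the image
    ys, xs = np.mgrid[cy - half:cy + half + 1, cx - half:cx + half + 1]
    A = np.stack([Ix[ys, xs].ravel(), Iy[ys, xs].ravel()], axis=1)  # N x 2
    b = -It[ys, xs].ravel()
    # Gaussian weights: pixels nearer the window center contribute more
    d2 = ((ys - cy) ** 2 + (xs - cx) ** 2).ravel()
    w = np.exp(-d2 / (2.0 * (half / 2.0) ** 2))
    AtWA = A.T @ (w[:, None] * A)
    # A flat patch makes A^T W A (near-)singular -- the reason corners are tracked
    if abs(np.linalg.det(AtWA)) < 1e-6:
        return None
    return np.linalg.solve(AtWA, A.T @ (w * b))  # the flow (u, v)

cv2.calcOpticalFlowPyrLK, used in the experiment below, performs the same per-window solve but adds image pyramids and iterative refinement so that larger motions can also be tracked.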
Optical-flow tracking experiment on a RealSense 435i:
# read.py
import pyrealsense2 as rs
import numpy as np
import cv2

# List the connected RealSense devices
ctx = rs.context()
if len(ctx.devices) > 0:
    for d in ctx.devices:
        print('Found device: ',
              d.get_info(rs.camera_info.name), ' ',
              d.get_info(rs.camera_info.serial_number))
else:
    print("No Intel Device connected")

num = 0
pipeline = rs.pipeline()
config = rs.config()
w = 640
h = 480
fps = 15
config.enable_stream(rs.stream.depth, w, h, rs.format.z16, fps)
config.enable_stream(rs.stream.color, w, h, rs.format.bgr8, fps)
config.enable_stream(rs.stream.infrared, 1, w, h, rs.format.y8, fps)
config.enable_stream(rs.stream.infrared, 2, w, h, rs.format.y8, fps)
pipeline_profile = pipeline.start(config)

# Turn off the IR emitter so its dot pattern does not disturb the images
sensor = pipeline.get_active_profile().get_device().query_sensors()[0]
for s in sensor.get_supported_options():
    print(s)
sensor.set_option(rs.option.emitter_enabled, 0)
print('option.stereo_baseline: ', sensor.get_option(rs.option.stereo_baseline))

default_align = rs.align(rs.stream.color)

# Shi-Tomasi corner detection parameters
feature_params = dict(maxCorners=100, qualityLevel=0.2,
                      minDistance=10, blockSize=5)
# Pyramidal LK parameters
lk_params = dict(winSize=(15, 15), maxLevel=8,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))

frame_count = 0
try:
    while True:
        frames = pipeline.wait_for_frames()
        aligned_frame = default_align.process(frames)
        # depth_frame = aligned_frame.get_depth_frame()  # depth
        color_frame = aligned_frame.get_color_frame()  # color
        # intr = color_frame.profile.as_video_stream_profile().intrinsics
        # intr_matrix = np.array([
        #     [intr.fx, 0, intr.ppx], [0, intr.fy, intr.ppy], [0, 0, 1]
        # ])
        # print(intr_matrix)
        # ir_frame_left = aligned_frame.get_infrared_frame(1)
        # ir_frame_right = aligned_frame.get_infrared_frame(2)
        # if not depth_frame or not color_frame:
        #     continue
        color_image = np.asanyarray(color_frame.get_data())
        frame_gray0 = cv2.cvtColor(color_image, cv2.COLOR_BGR2GRAY)

        if frame_count == 0:
            # Detect corners to track in the first frame
            p0 = cv2.goodFeaturesToTrack(frame_gray0, mask=None, **feature_params)
            print("FeatureNumber: " + str(len(p0)))
            frame_count = 1
            last_frame = frame_gray0
            continue
        frame_count += 1

        # p0: features in the previous frame; p1: their tracked positions in the
        # current one. Note the argument order: previous frame first, current second.
        p1, st, err = cv2.calcOpticalFlowPyrLK(last_frame, frame_gray0, p0, None, **lk_params)
        good_new = p1[st == 1]
        good_old = p0[st == 1]
        for new, old in zip(good_new, good_old):
            a, b = new.ravel()
            cv2.circle(color_image, (int(a), int(b)), 3, (255, 255, 0), -1)
        p0 = good_new.reshape(-1, 1, 2)
        last_frame = frame_gray0

        cv2.imshow('1', color_image)
        # TODO: re-detect features once too many tracks have been lost
        key = cv2.waitKey(1)
        # Press Esc or 'q' to close the image window
        if key & 0xFF == ord('q') or key == 27:
            cv2.destroyAllWindows()
            break
        if num > 1000:
            break
        num += 1
finally:
    # Stop streaming
    pipeline.stop()
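As noted in the introduction, the tracked 2D-2D pairs are the input to the camera-pose solve. The following is a minimal sketch of that follow-up step, not part of the script above: it assumes good_old / good_new from the tracking loop and an intrinsic matrix K, which can be obtained from the commented-out intr_matrix code in read.py; the numeric intrinsics below are placeholders.

# Sketch: relative pose from tracked 2D-2D pairs (assumed follow-up, not in read.py)
import cv2
import numpy as np

K = np.array([[615.0,   0.0, 320.0],   # placeholder intrinsics; use intr_matrix instead
              [  0.0, 615.0, 240.0],
              [  0.0,   0.0,   1.0]])
pts_old = good_old.reshape(-1, 2).astype(np.float64)
pts_new = good_new.reshape(-1, 2).astype(np.float64)
E, inlier_mask = cv2.findEssentialMat(pts_old, pts_new, K,
                                      method=cv2.RANSAC, prob=0.999, threshold=1.0)
# Decompose E into rotation R and unit-scale translation t via the cheirality check
_, R, t, _ = cv2.recoverPose(E, pts_old, pts_new, K, mask=inlier_mask)
print("R =\n", R, "\nt =\n", t)  # t is only up to scale with a monocular stream

Since monocular translation is recovered only up to scale, the depth stream enabled in the script could in principle be used to restore metric scale.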