Introduction to Optical Flow (OFM)

C小C

Category: OpenCV. Tags: OFM, optical flow, calcOpticalFlowFarneback

Article link: https://blog.csdn.net/C_chuxin/article/details/82867019

【Date】2018.09.27

【Title】Introduction to Optical Flow (OFM)

【Reference】https://blog.csdn.net/zouxy09/article/details/8683859

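Overview

Today I was reading the paper "Integration of image quality and motion cues for face anti-spoofing: A neural network approach", which introduces the concept of OFM (optical flow-based motion feature). I looked up various materials online and have organized them below. If you spot any mistakes, corrections are welcome.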

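I. The Concept of Optical Flow

【Intuitive Understanding】

What is optical flow (also written "optic flow")? The name sounds technical and unfamiliar, but it describes something we know very well, because we experience this visual phenomenon every day. In essence, optical flow is the apparent visual motion that you perceive as you move through the world.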

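When we ourselves are moving, for example when looking out of the window of a moving vehicle, different objects appear to move at different speeds. This gives us an interesting piece of information: we can judge how far away objects are from how fast they appear to move. Distant objects such as clouds and mountains move so slowly that they seem to stand still. Closer objects such as buildings and trees slide backwards noticeably, and the closer they are, the faster they recede. Very near things, such as road markings or roadside grass, rush past so quickly that we can almost hear them whoosh by.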

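Besides distance, optical flow also provides angular information. An object moving at 90 degrees to our line of sight appears to move faster than one moving at any other angle. As the angle shrinks to 0 degrees, that is, when the object is heading straight towards us, we perceive no motion (no optical flow) from it and it looks stationary; it only appears larger and larger as it approaches. (In practice we still sense some speed, because a real object has a finite size and its edges make an angle greater than 0 with our line of sight.)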
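Both observations, distance and angle, can be summarized with a small piece of geometry. As an idealized sketch (the symbols $r$, $V$, $\theta$ and $\omega$ are introduced here for illustration and do not come from the original sources), consider a single point at distance $r$ from the eye, moving with speed $V$ at angle $\theta$ to the line of sight. Only the velocity component perpendicular to the line of sight changes the viewing direction, so the angular speed seen by the observer is:

```latex
\omega = \frac{V \sin\theta}{r}
```

For fixed $V$ and $\theta$, the apparent motion falls off as $1/r$, which is why distant mountains seem to stand still. For fixed $V$ and $r$, it is largest at $\theta = 90^\circ$ and vanishes at $\theta = 0^\circ$, which matches the head-on case described above.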

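【Formal Definition】

The concept of optical flow was first proposed by Gibson in 1950. Optical flow is the instantaneous velocity, on the observed imaging plane, of the pixels of an object moving in space. Optical flow methods use the change of pixels over time in an image sequence, together with the correlation between adjacent frames, to find the correspondence between the previous frame and the current frame, and from that compute the motion of objects between adjacent frames. In general, optical flow is produced by the movement of foreground objects in the scene, by the motion of the camera, or by both together.

When the human eye observes a moving object, the scene forms a series of continuously changing images on the retina. This continuously changing information keeps "flowing" across the retina (the image plane) like a stream of light, hence the name optical flow. Optical flow expresses how the image changes; because it carries information about the target's motion, an observer can use it to determine how the target is moving.

The purpose of studying the optical flow field is to approximate, from an image sequence, the motion field that cannot be obtained directly. The motion field is the motion of objects in the real three-dimensional world; the optical flow field is the projection of the motion field onto the two-dimensional image plane (the human eye or a camera).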

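【Concrete Picture】

Put simply, the optical flow field is what you get when, given an image sequence, you find the speed and direction of motion of every pixel in every image. How do we find it? The intuitive idea is this: if point A is at (x1, y1) in frame t and we find the same point A at (x2, y2) in frame t+1, then the motion of A is (ux, vy) = (x2, y2) - (x1, y1).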
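As a tiny worked example of that subtraction, the snippet below computes the flow vector of a single point and converts it to a speed and a direction. The coordinates are made-up values chosen for illustration.

```python
import numpy as np

# Hypothetical positions of point A in frame t and frame t+1 (in pixels).
p_t = np.array([120.0, 80.0])    # (x1, y1)
p_t1 = np.array([123.0, 76.0])   # (x2, y2)

# Flow vector: displacement of the point between the two frames.
flow = p_t1 - p_t                # (ux, vy)

# Speed in pixels per frame, and direction in degrees.
# Image y grows downwards, so a negative vy means upward motion on screen.
speed = np.hypot(flow[0], flow[1])
direction_deg = np.degrees(np.arctan2(flow[1], flow[0]))

print(flow, speed, direction_deg)
```

Doing this for every pixel, rather than for one hand-picked point, gives the dense flow field; the hard part in practice is finding where each point went, which is what the algorithms below address.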

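【Methods for Computing Optical Flow】

In 1981, Horn and Schunck creatively linked the two-dimensional velocity field with image intensity and introduced the optical flow constraint equation, which gave the basic algorithm for computing optical flow. Since then, many optical flow methods have been proposed on different theoretical foundations, with varying performance. Barron et al. surveyed a range of optical flow techniques and, according to their theoretical basis and mathematical approach, divided them into four classes: gradient-based, matching-based, energy-based, and phase-based methods. In recent years neurodynamic methods have also attracted considerable attention from researchers.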
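To make the gradient-based family concrete: the optical flow constraint equation says that if a pixel keeps its brightness while it moves by (u, v) between two frames, then to first order Ix*u + Iy*v + It = 0, where Ix, Iy and It are the spatial and temporal derivatives of the image intensity. That is one equation in two unknowns, so an extra assumption is needed. The sketch below uses the Lucas-Kanade style assumption that (u, v) is constant over a window and solves the stacked equations by least squares. The synthetic pattern, the choice of the whole image as the window, and the names `estimate_flow_least_squares` and `smooth_pattern` are illustrative assumptions, not code from the original article or from OpenCV.

```python
import numpy as np

def estimate_flow_least_squares(frame1, frame2):
    """Estimate a single (u, v) for the window from Ix*u + Iy*v + It = 0."""
    # Spatial derivatives; np.gradient returns (d/drow, d/dcol) = (Iy, Ix).
    Iy, Ix = np.gradient(frame1)
    # Temporal derivative approximated by the frame difference.
    It = frame2 - frame1
    # One constraint per pixel: [Ix Iy] @ [u v]^T = -It.
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

def smooth_pattern(x, y):
    """Hypothetical smooth test image with well-behaved derivatives."""
    return np.sin(0.2 * x) + np.cos(0.15 * y) + np.sin(0.1 * (x + y))

# Two frames of the same pattern; the second is moved by a sub-pixel amount.
true_u, true_v = 0.3, -0.2
y, x = np.mgrid[0:64, 0:64].astype(float)
frame1 = smooth_pattern(x, y)
frame2 = smooth_pattern(x - true_u, y - true_v)  # content moved by (u, v)

u, v = estimate_flow_least_squares(frame1, frame2)
print(f"estimated flow: ({u:.3f}, {v:.3f}), true flow: ({true_u}, {true_v})")
```

The estimate is only approximate because the constraint is a first-order linearization; it degrades as the displacement grows, which is one motivation for the image pyramids used by the OpenCV functions below.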

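II. Optical Flow Algorithms

OpenCV implements quite a few optical flow algorithms.

1) calcOpticalFlowPyrLK

Computes the optical flow for a given set of points (sparse optical flow) using the pyramidal Lucas-Kanade method. For the details, see the paper "Pyramidal Implementation of the Lucas Kanade Feature Tracker: Description of the algorithm".

2) calcOpticalFlowFarneback

Computes dense optical flow (that is, the flow of every pixel in the image) using Gunnar Farneback's algorithm. The related paper is "Two-Frame Motion Estimation Based on Polynomial Expansion".

3) CalcOpticalFlowBM

Computes optical flow by block matching.

4) CalcOpticalFlowHS

Computes dense optical flow using the Horn-Schunck algorithm. The related paper appears to be "Determining Optical Flow".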

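5) calcOpticalFlowSF

This is an implementation of the 2012 paper "SimpleFlow: A Non-iterative, Sublinear Optical Flow Algorithm", published in Computer Graphics Forum (Eurographics 2012). The project page is http://graphics.berkeley.edu/papers/Tao-SAN-2012-05/ and the function was introduced in newer versions of OpenCV.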
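Of the approaches listed above, block matching is the easiest to write out by hand, so here is a minimal NumPy sketch of the idea. It is not OpenCV's CalcOpticalFlowBM implementation; the function name `match_block`, the block size, the search radius and the random test image are all assumptions made for illustration. For one block in the previous frame, it tries every integer displacement within a search radius and keeps the one with the smallest sum of absolute differences (SAD).

```python
import numpy as np

def match_block(prev, nxt, top, left, block=16, radius=5):
    """Return the integer (dx, dy) that best moves one block from prev to nxt."""
    # Use a signed type so differences of uint8 pixels do not wrap around.
    ref = prev[top:top + block, left:left + block].astype(np.int32)
    best_sad, best_dx, best_dy = None, 0, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            r, c = top + dy, left + dx
            # Skip candidate positions that fall outside the image.
            if r < 0 or c < 0 or r + block > nxt.shape[0] or c + block > nxt.shape[1]:
                continue
            cand = nxt[r:r + block, c:c + block].astype(np.int32)
            sad = int(np.abs(cand - ref).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_dx, best_dy = sad, dx, dy
    return best_dx, best_dy, best_sad

rng = np.random.default_rng(0)
prev = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
# Move the whole image 3 px right and 2 px up (np.roll wraps at the borders).
nxt = np.roll(prev, shift=(-2, 3), axis=(0, 1))

dx, dy, sad = match_block(prev, nxt, top=24, left=24)
print(dx, dy, sad)
```

Repeating the search for every block of the image gives a coarse flow field with one vector per block. The exhaustive search costs (2 * radius + 1)^2 comparisons per block, so the search radius is the main knob trading speed against the largest motion that can be found.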

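A supplementary note: dense optical flow needs some form of interpolation between pixels that are easy to track in order to resolve the pixels whose motion is ambiguous, so its computational cost is considerable. Sparse optical flow, in contrast, needs a set of points to be specified before tracking (points that are easy to track, such as corners). So before using the LK method we first call cvGoodFeaturesToTrack() to find corners, and then use the pyramidal LK optical flow algorithm to track their motion. In my own experience, though, for targets with little texture, such as a human hand, LK sparse optical flow loses track fairly easily.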

【Using the API】

For how to use these APIs, we can simply refer to the official OpenCV manual:

http://www.opencv.org.cn/opencvdoc/2.3.2/html/modules/video/doc/motion_analysis_and_object_tracking.html

Taking calcOpticalFlowFarneback as an example:

Computes a dense optical flow using the Gunnar Farneback’s algorithm.

C++: void calcOpticalFlowFarneback(InputArray prevImg, InputArray nextImg, InputOutputArray flow, double pyrScale, int levels, int winsize, int iterations, int polyN, double polySigma, int flags)

C: void cvCalcOpticalFlowFarneback(const CvArr* prevImg, const CvArr* nextImg, CvArr* flow, double pyrScale, int levels, int winsize, int iterations, int polyN, double polySigma, int flags)

Python: cv2.calcOpticalFlowFarneback(prevImg, nextImg, pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags[, flow]) → flow

Parameters:

prevImg – First 8-bit single-channel input image.
nextImg – Second input image of the same size and the same type as prevImg .
flow – Computed flow image that has the same size as prevImg and type CV_32FC2 .
pyrScale – Parameter specifying the image scale (<1) to build pyramids for each image. pyrScale=0.5 means a classical pyramid, where each next layer is twice smaller than the previous one.
levels – Number of pyramid layers including the initial image. levels=1 means that no extra layers are created and only the original images are used.
winsize – Averaging window size. Larger values increase the algorithm robustness to image noise and give more chances for fast motion detection, but yield more blurred motion field.
iterations – Number of iterations the algorithm does at each pyramid level.
polyN – Size of the pixel neighborhood used to find polynomial expansion in each pixel. Larger values mean that the image will be approximated with smoother surfaces, yielding more robust algorithm and more blurred motion field. Typically, polyN =5 or 7.
polySigma – Standard deviation of the Gaussian that is used to smooth derivatives used as a basis for the polynomial expansion. For polyN=5 , you can set polySigma=1.1 . For polyN=7 , a good value would be polySigma=1.5 .
flags – Operation flags that can be a combination of the following:
- OPTFLOW_USE_INITIAL_FLOW Use the input flow as an initial flow approximation.
- OPTFLOW_FARNEBACK_GAUSSIAN Use the Gaussian filter instead of a box filter of the same size for optical flow estimation. Usually, this option gives a more accurate flow than with a box filter, at the cost of lower speed. Normally, winsize for a Gaussian window should be set to a larger value to achieve the same level of robustness.