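The purpose of studying the optical flow field is to approximate, from an image sequence, the motion field, which cannot be observed directly. The motion field is the motion of objects in the real three-dimensional world; the optical flow field is the projection of that motion field onto a two-dimensional image plane (the human eye or a camera). Put simply, given an image sequence, finding the speed and direction of motion of every pixel in every image gives the optical flow field. How do we find it? Intuitively: if point A is at position (x1, y1) in frame t and we locate the same point A at position (x2, y2) in frame t+1, then the motion of A is (ux, vy) = (x2, y2) - (x1, y1).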


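  1. Gradient-based methods: these rest on the assumption that image brightness is constant and use the spatio-temporal gradients of the image sequence brightness to compute the 2D velocity field (the optical flow). Because they are simple to compute and give fairly good results, they have become the most widely used class of optical flow estimators. The most representative is the Horn-Schunck method, which adds a further constraint to the basic optical flow equation, namely the assumption of global smoothness of the flow. Many improved algorithms have since been built on this idea. Gradient-based optical flow methods nevertheless have some problems in practice.
  2. Matching-based methods: these comprise feature-based and region-based approaches. Feature-based approaches repeatedly locate and track the main features of the target and are robust to large target motion and brightness changes. Their drawbacks are that the resulting flow is usually sparse and that feature extraction and exact matching are difficult. Region-based approaches first locate similar regions and then compute the flow from the displacement between them. This approach is widely used in video coding, but the flow it yields is still not dense.
  3. Frequency-domain (energy-based) methods: to obtain an accurate velocity estimate of a uniform flow field, the input images must be filtered spatio-temporally, that is, integrated over time and space, which lowers the temporal and spatial resolution of the flow. Frequency-based methods also tend to involve a large amount of computation, and their reliability is hard to assess.
  4. Phase-based methods: Fleet and Jepson were the first to propose using phase information to compute optical flow. Phase information is more reliable than brightness information for this purpose, so flow fields obtained from phase are more robust. Phase-based algorithms apply to a wide range of image sequences and give fairly accurate velocity estimates, but they also have some problems.
  5. Neurodynamic methods: these use neural networks to build a neurodynamic model of visual motion perception, a fairly direct imitation of the function and structure of biological visual systems.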








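  1. Brightness constancy: the brightness of a given point does not change over time. This is the assumption of the basic optical flow method (every variant must satisfy it) and is used to derive the basic optical flow equation;
  2. Small motion: this must also hold. The position must not change drastically over time, so that the gray value can be differentiated with respect to position (in other words, only under small motion can the gray value change per unit displacement between consecutive frames approximate the partial derivative of the gray value with respect to position). This assumption is likewise indispensable;
  3. Spatial coherence: neighboring points in the scene project to neighboring points in the image, and neighboring points have the same velocity. This assumption is specific to the Lucas-Kanade method: the basic optical flow equation provides only one constraint, but the velocity in the x and y directions gives two unknowns. Assuming that the points in the neighborhood of a feature point move alike, we can set up n simultaneous equations to solve for the velocity in x and y (n is the number of points in the neighborhood, including the feature point itself).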
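
The third assumption can be made concrete with a small sketch. Under brightness constancy and small motion, each point contributes one linear constraint Ix·u + Iy·v + It = 0, and the n constraints of a neighborhood are solved in the least-squares sense. The function name `lucas_kanade_window` and the sample derivative values below are made up for illustration; this is a minimal sketch, not a full Lucas-Kanade tracker.

```python
def lucas_kanade_window(ix, iy, it):
    """Least-squares solution of Ix*u + Iy*v = -It over one neighborhood.

    ix, iy, it: equal-length sequences with the spatial and temporal gray
    value derivatives at the n points of the neighborhood.
    Returns (u, v), or None if the 2x2 normal matrix is (near) singular.
    """
    # Entries of the normal equations (A^T A) [u, v]^T = -A^T It
    sxx = sum(gx * gx for gx in ix)
    sxy = sum(gx * gy for gx, gy in zip(ix, iy))
    syy = sum(gy * gy for gy in iy)
    sxt = sum(gx * gt for gx, gt in zip(ix, it))
    syt = sum(gy * gt for gy, gt in zip(iy, it))

    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        return None  # all gradients parallel: the aperture problem
    # Closed-form inverse of the symmetric 2x2 matrix
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Synthetic derivatives that are consistent with a true flow of (0.5, -0.25)
ix = [1.0, 0.0, 2.0, -1.0]
iy = [0.0, 1.0, 1.0, 3.0]
it = [-(gx * 0.5 + gy * -0.25) for gx, gy in zip(ix, iy)]
print(lucas_kanade_window(ix, iy, it))
```

If all gradients in the neighborhood point in the same direction, the system is singular and the function returns None, which is why corner-like points are usually preferred as feature points.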


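  1. Process a continuous sequence of video frames;
  2. For each video sequence, apply some object detection method to detect foreground targets that may appear;
  3. If a foreground target appears in a frame, find representative key feature points on it (these can be generated randomly, or corner points can be used);
  4. For every subsequent pair of adjacent frames, find the best position in the current frame of the key feature points from the previous frame, which gives the position of the foreground target in the current frame;
  5. Iterating in this way tracks the target.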





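  1. Constant brightness. The appearance of a target pixel stays the same as it moves between frames; for grayscale images, the pixel brightness is assumed to stay constant throughout tracking;
  2. Temporal continuity, or "small motion". Image motion is slow relative to time; in practice, the time step must be small enough relative to the motion in the image that the target moves only slightly between adjacent frames;
  3. Spatial coherence. Neighboring points on the same surface in a scene move similarly, and their projections in the image also lie in a neighboring region.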


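Research on optical flow algorithms dates back to the 1950s, when Gibson, Wallach, and others proposed the SFM (Structure From Motion) hypothesis: based on psychological experiments, they pioneered the idea that three-dimensional motion and structure parameters can be recovered from the optical flow field in the two-dimensional image plane. The hypothesis was not verified until the late 1970s, by Ullman and others. The first effective method for computing optical flow is credited to Horn and Schunck, who in 1981 linked the two-dimensional velocity field to gray values and introduced the optical flow constraint equation, which became the cornerstone of later work. Dozens of optical flow algorithms now exist, many of them based on first-order spatio-temporal gradients, which are both efficient and easy to implement. On closer analysis, optical flow algorithms can be divided into four classes.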




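In work on fast optical flow algorithms, Braillon proposed an obstacle detection method based on an optical flow model: a classical pinhole model is set up for the camera and the flow model is built on top of it, so the full optical flow field need not be computed. Valentinotti implemented a phase-based optical flow algorithm on a parallel DSP system, achieving fast processing of 128×128 and 64×64 image sequences. Andres Bruhn et al. introduced multiscale analysis into the optical flow computation, improving efficiency. Chang Meng et al. added an inertia factor to the recursive equations of the Horn-Schunck algorithm; performance does not degrade, while convergence time is shortened by one third to one half, which also greatly improves efficiency.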


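Because the derivation of the basic optical flow equation uses a Taylor series expansion, it effectively assumes that the image gray values and the brightness field vary continuously. In real scenes, however, the separate surfaces of objects make the velocity field of the optical flow discontinuous, so what happens when the basic optical flow equation meets a discontinuity deserves discussion. The Japanese researcher Mukauwa took into account that the field to which the Taylor expansion is applied is in fact discontinuous and introduced a correction factor q, which handles the discontinuity well.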























optical_flow_mg — Compute the optical flow between two images.


optical_flow_mg computes the optical flow between two images. The optical flow represents information about the movement between two consecutive images of a monocular image sequence. The movement in the images can be caused by objects that move in the world or by a movement of the camera (or both) between the acquisition of the two images. The projection of these 3D movements into the 2D image plane is called the optical flow.


The two consecutive images of the image sequence are passed in ImageT1 and ImageT2. The computed optical flow is returned in VectorField. The vectors in the vector field VectorField represent the movement in the image plane between ImageT1 and ImageT2. The point in ImageT2 that corresponds to the point (r,c) in ImageT1 is given by (r',c') = (r+u(r,c),c+v(r,c)), where u(r,c) and v(r,c) denote the value of the row and column components of the vector field image VectorField at the point (r,c).
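
The correspondence rule can be illustrated with a few lines of plain Python. The nested lists `u_field` and `v_field` below are hypothetical stand-ins for the row and column components of VectorField; the sketch does not call HALCON.

```python
def corresponding_point(r, c, u_field, v_field):
    """Return (r', c') in the second image for pixel (r, c) in the first.

    u_field[r][c] is the row component and v_field[r][c] the column
    component of the flow vector at (r, c).
    """
    return r + u_field[r][c], c + v_field[r][c]


# A made-up 2x3 flow field: everything moves 0.5 rows down and 2 columns right
u_field = [[0.5, 0.5, 0.5],
           [0.5, 0.5, 0.5]]
v_field = [[2.0, 2.0, 2.0],
           [2.0, 2.0, 2.0]]

print(corresponding_point(1, 0, u_field, v_field))
```

The result is in general a subpixel position, so reading the gray value at (r',c') in the second image requires interpolation.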

The parameter Algorithm allows the selection of three different algorithms for computing the optical flow. All three algorithms are implemented by using multigrid solvers to ensure an efficient solution of the underlying partial differential equations.


For Algorithm = 'fdrig', the method proposed by Brox, Bruhn, Papenberg, and Weickert is used. This approach is flow-driven, robust, isotropic, and uses a gradient constancy term.

For Algorithm = 'ddraw', a robust variant of the method proposed by Nagel and Enkelmann is used. This approach is data-driven, robust, anisotropic, and uses warping (in contrast to the original approach).

For Algorithm = 'clg' the combined local-global method proposed by Bruhn, Weickert, Feddern, Kohlberger, and Schnörr is used.

In all three algorithms, the input images can first be smoothed by a Gaussian filter with a standard deviation of SmoothingSigma (see derivate_gauss).


All three approaches are variational approaches that compute the optical flow as the minimizer of a suitable energy functional. In general, the energy functionals have the following form:

$$E(w) = E_D(w) + \alpha\,E_S(w)$$

where w=(u,v,1) is the optical flow vector field to be determined (with a time step of 1 in the third coordinate). The image sequence is regarded as a continuous function f(x), where x=(r,c,t) and (r,c) denotes the position and t the time. Furthermore, $E_D(w)$ denotes the data term, while $E_S(w)$ denotes the smoothness term, and α is a regularization parameter that determines the smoothness of the solution. The regularization parameter α is passed in FlowSmoothness.

While the data term encodes assumptions about the constancy of the object features in consecutive images, e.g., the constancy of the gray values or the constancy of the first spatial derivative of the gray values, the smoothness term encodes assumptions about the (piecewise) smoothness of the solution, i.e., the smoothness of the vector field to be determined.




The FDRIG algorithm is based on the minimization of an energy functional that contains the following assumptions:

Constancy of the gray values: It is assumed that corresponding pixels in consecutive images of an image sequence have the same gray value, i.e., that f(r+u,c+v,t+1) = f(r,c,t). This can be written more compactly as f(x+w) = f(x) using vector notation.

Constancy of the spatial gray value derivatives: It is assumed that corresponding pixels in consecutive images of an image sequence additionally have the same spatial gray value derivatives, i.e., that $\nabla_2 f(r+u,c+v,t+1) = \nabla_2 f(r,c,t)$ also holds, where $\nabla_2 f = (\partial_r f, \partial_c f)^\top$. This can be written more compactly as $\nabla_2 f(x+w) = \nabla_2 f(x)$. In contrast to the gray value constancy, the gradient constancy has the advantage that it is invariant to additive global illumination changes.

Large displacements: It is assumed that large displacements, i.e., displacements larger than one pixel, occur. Under this assumption, it makes sense to consciously abstain from using the linearization of the constancy assumptions in the model that is typically proposed in the literature.

Statistical robustness in the data term: To reduce the influence of outliers, i.e., points that violate the constancy assumptions, they are penalized in a statistically robust manner, i.e., the customary non-robust quadratic penalization $\Psi(s^2) = s^2$ is replaced by a linear penalization via $\Psi(s^2) = \sqrt{s^2 + \epsilon^2}$, where $\epsilon$ is a fixed regularization constant.

Preservation of discontinuities in the flow field I: The solution is assumed to be piecewise smooth. While the actual smoothness is achieved by penalizing the first derivatives of the flow $|\nabla_2 u|^2 + |\nabla_2 v|^2$, the use of a statistically robust (linear) penalty function $\Psi(s^2) = \sqrt{s^2 + \epsilon^2}$ with a small fixed constant $\epsilon$ provides the desired preservation of edges in the movement in the flow field to be determined. This type of smoothness term is called flow-driven and isotropic.

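Taking into account all of the above assumptions, the energy functional of the FDRIG algorithm can be written as

$$E_{\mathrm{FDRIG}}(w) = \int \Psi\left(|f(x+w) - f(x)|^2 + \gamma\,|\nabla_2 f(x+w) - \nabla_2 f(x)|^2\right) + \alpha\,\Psi\left(|\nabla_2 u|^2 + |\nabla_2 v|^2\right)\,dr\,dc$$

Here, α is the regularization parameter passed in FlowSmoothness, while γ is the gradient constancy weight passed in GradientConstancy. These two parameters, which constitute the model parameters of the FDRIG algorithm, are described in more detail below.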
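
The effect of the robust penalization in the data and smoothness terms can be seen numerically. The sketch below compares a quadratic penalty with a penalty of the form sqrt(s² + ε²); the value ε = 1e-3 is an assumption chosen for illustration, and the function names are hypothetical.

```python
import math


def psi_quadratic(s2):
    """Non-robust penalty: grows with the squared residual."""
    return s2


def psi_robust(s2, eps=1e-3):
    """Robust penalty: grows roughly like |s| for large residuals.

    eps is an assumed small constant that keeps the function
    differentiable at s = 0.
    """
    return math.sqrt(s2 + eps * eps)


for residual in (0.1, 1.0, 10.0, 100.0):
    s2 = residual * residual
    print(residual, psi_quadratic(s2), round(psi_robust(s2), 4))
```

A residual that is 100 times larger costs 10000 times more under the quadratic penalty but only about 100 times more under the robust one, so a few outliers cannot dominate the energy.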








The DDRAW algorithm is based on the minimization of an energy functional that contains the following assumptions:

Constancy of the gray values: It is assumed that corresponding pixels in consecutive images of an image sequence have the same gray value, i.e., that f(x+w) = f(x).

Large displacements: It is assumed that large displacements, i.e., displacements larger than one pixel, occur. Under this assumption, it makes sense to consciously abstain from using the linearization of the constancy assumptions in the model that is typically proposed in the literature.

Statistical robustness in the data term: To reduce the influence of outliers, i.e., points that violate the constancy assumptions, they are penalized in a statistically robust manner, i.e., the customary non-robust quadratic penalization $\Psi(s^2) = s^2$ is replaced by a linear penalization via $\Psi(s^2) = \sqrt{s^2 + \epsilon^2}$, where $\epsilon$ is a fixed regularization constant.

Preservation of discontinuities in the flow field II: The solution is assumed to be piecewise smooth. In contrast to the FDRIG algorithm, which allows discontinuities everywhere, the DDRAW algorithm only allows discontinuities at the edges in the original image. Here, the local smoothness is controlled in such a way that the flow field is sharp across image edges, while it is smooth along the image edges. This type of smoothness term is called data-driven and anisotropic.

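All assumptions of the DDRAW algorithm can be combined into the following energy functional:

$$E_{\mathrm{DDRAW}}(w) = \int \Psi\left(|f(x+w) - f(x)|^2\right) + \alpha\left(\nabla_2 u^\top D(\nabla_2 f)\,\nabla_2 u + \nabla_2 v^\top D(\nabla_2 f)\,\nabla_2 v\right)\,dr\,dc$$

where $D(\nabla_2 f)$ is a normalized projection matrix orthogonal to $\nabla_2 f$, for which

$$D(\nabla_2 f) = \frac{1}{|\nabla_2 f|^2 + 2\epsilon_S^2}\left(\begin{pmatrix}\partial_c f\\ -\partial_r f\end{pmatrix}\begin{pmatrix}\partial_c f\\ -\partial_r f\end{pmatrix}^{\top} + \epsilon_S^2 I\right)$$

holds. This matrix ensures that the smoothness of the flow field is only assumed along the image edges. In contrast, no assumption is made with respect to the smoothness across the image edges, resulting in the fact that discontinuities in the solution may occur across the image edges. In this respect, $\epsilon_S$ serves as a regularization parameter that prevents the projection matrix $D(\nabla_2 f)$ from becoming singular. In contrast to the FDRIG algorithm, there is only one model parameter for the DDRAW algorithm: the regularization parameter α. As mentioned above, α is described in more detail below.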
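
The behavior of such a projection matrix can be checked with a small sketch. It assumes the usual Nagel-Enkelmann form D = (p pᵀ + ε²I) / (|∇f|² + 2ε²) with p = (∂c f, −∂r f) perpendicular to the image gradient; the function name and the value of `eps_s` are illustrative assumptions, not values taken from the operator.

```python
def projection_matrix(f_r, f_c, eps_s=1e-3):
    """Regularized, normalized projection onto the direction along the edge.

    f_r, f_c: gray value derivatives in row and column direction.
    eps_s:    assumed regularization constant that keeps D non-singular.
    Returns D as a 2x2 nested list acting on (row, column) vectors.
    """
    norm = f_r * f_r + f_c * f_c + 2.0 * eps_s * eps_s
    d_rr = (f_c * f_c + eps_s * eps_s) / norm
    d_rc = (-f_r * f_c) / norm
    d_cc = (f_r * f_r + eps_s * eps_s) / norm
    return [[d_rr, d_rc], [d_rc, d_cc]]


# Strong gradient in row direction: the edge runs along the columns
print(projection_matrix(10.0, 0.0))
# Flat region: no preferred direction, D is half the identity
print(projection_matrix(0.0, 0.0))
```

In the first case D is close to [[0, 0], [0, 1]], so only changes of the flow along the edge are penalized; in the flat case the smoothing is isotropic. The trace is always 1, which is the normalization.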









As for the two approaches described above, the CLG algorithm uses certain assumptions:

Constancy of the gray values: It is assumed that corresponding pixels in consecutive images of an image sequence have the same gray value, i.e., that f(x+w) = f(x).

Small displacements: In contrast to the two approaches above, it is assumed that only small displacements can occur, i.e., displacements in the order of a few pixels. This facilitates a linearization of the constancy assumptions in the model, and leads to the approximation $f(x+w) \approx f(x) + \nabla_3 f(x)^\top w$, i.e., $\nabla_3 f(x)^\top w = 0$ should hold. Here, $\nabla_3$ denotes the gradient in the spatial as well as the temporal domain.

Local constancy of the solution: Furthermore, it is assumed that the flow field to be computed is locally constant. This facilitates the integration of the image data in the data term over the respective neighborhood of each pixel. This, in turn, increases the robustness of the algorithm against noise. Mathematically, this can be achieved by reformulating the quadratic data term as $(\nabla_3 f^\top w)^2 = w^\top \nabla_3 f\,\nabla_3 f^\top w$. By performing a local Gaussian-weighted integration over a neighborhood specified by ρ (passed in IntegrationSigma), the following data term is obtained: $w^\top \left(G_\rho * (\nabla_3 f\,\nabla_3 f^\top)\right) w$. Here, $G_\rho * (\nabla_3 f\,\nabla_3 f^\top)$ denotes a convolution of the 3x3 matrix $\nabla_3 f\,\nabla_3 f^\top$ with a Gaussian filter with a standard deviation of ρ (see derivate_gauss).

General smoothness of the flow field: Finally, the solution is assumed to be smooth everywhere in the image. This particular type of smoothness term is called homogeneous.

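All of the above assumptions can be combined into the following energy functional:

$$E_{\mathrm{CLG}}(w) = \int w^\top \left(G_\rho * (\nabla_3 f\,\nabla_3 f^\top)\right) w + \alpha\left(|\nabla_2 u|^2 + |\nabla_2 v|^2\right)\,dr\,dc$$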








To compute the optical flow vector field for two consecutive images of an image sequence with the FDRIG, DDRAW, or CLG algorithm, the solution that best fulfills the assumptions of the respective algorithm must be determined. From a mathematical point of view, this means that a minimization of the above energy functionals should be performed. For the FDRIG and DDRAW algorithms, so called coarse-to-fine warping strategies play an important role in this minimization, because they enable the calculation of large displacements. Thus, they are a suitable means to handle the omission of the linearization of the constancy assumptions numerically in these two approaches.

To calculate large displacements, coarse-to-fine warping strategies use two concepts that are closely interlocked: The successive refinement of the problem (coarse-to-fine) and the successive compensation of the current image pair by already computed displacements (warping). Algorithmically, such coarse-to-fine warping strategies can be described as follows:

1. First, both images of the current image pair are zoomed down to a very coarse resolution level.

2. Then, the optical flow vector field is computed on this coarse resolution.

3. The vector field is required on the next resolution level: It is applied there to the second image of the image sequence, i.e., the problem on the finer resolution level is compensated by the already computed optical flow field. This step is also known as warping.

4. The modified problem (difference problem) is now solved on the finer resolution level, i.e., the optical flow vector field is computed there.

5. The steps 3-4 are repeated until the finest resolution level is reached.

6. The final result is computed by adding up the vector fields from all resolution levels.

This incremental computation of the optical flow vector field has the following advantage: While the coarse-to-fine strategy ensures that the displacements on the finest resolution level are very small, the warping strategy ensures that the displacements remain small for the incremental displacements (optical flow vector fields of the difference problems). Since small displacements can be computed much more accurately than larger displacements, the accuracy of the results typically increases significantly by using such a coarse-to-fine warping strategy.
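
The interplay of coarse-to-fine refinement and compensation can be illustrated with a deliberately simplified 1D sketch. It swaps the variational solver for an integer shift search: on every level only an increment of -1, 0, or +1 samples is estimated, on top of the estimate carried over from the coarser level. All names are hypothetical and the example is not the FDRIG or DDRAW algorithm.

```python
import math


def downsample(signal):
    """Halve the resolution by averaging neighboring pairs."""
    return [0.5 * (signal[i] + signal[i + 1]) for i in range(0, len(signal) - 1, 2)]


def mean_squared_difference(a, b, shift):
    """Mean squared difference between a[i] and b[i + shift] on the overlap."""
    total, count = 0.0, 0
    for i in range(len(a)):
        j = i + shift
        if 0 <= j < len(b):
            total += (a[i] - b[j]) ** 2
            count += 1
    return total / count if count else float("inf")


def coarse_to_fine_shift(first, second, levels=3):
    """Estimate the shift between two signals with small increments per level."""
    pyramid = [(first, second)]
    for _ in range(levels - 1):                 # step 1: build coarser levels
        a, b = pyramid[-1]
        pyramid.append((downsample(a), downsample(b)))
    shift = 0
    for a, b in reversed(pyramid):              # coarsest level first
        shift *= 2                              # step 3: carry estimate to finer level
        increment = min((-1, 0, 1),             # step 4: solve the difference problem
                        key=lambda d: mean_squared_difference(a, b, shift + d))
        shift += increment                      # step 6: accumulate the increments
    return shift


def bump(i, center):
    return math.exp(-((i - center) / 6.0) ** 2)


first = [bump(i, 30) for i in range(64)]
second = [bump(i, 35) for i in range(64)]       # same bump, moved by 5 samples
print(coarse_to_fine_shift(first, second, levels=3))
```

With a single level the search range of one sample cannot reach a displacement of five samples; with three levels each increment stays small, which mirrors the argument made above.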




However, instead of having to solve a single correspondence problem, an entire hierarchy of these problems must now be solved. For the CLG algorithm, such a coarse-to-fine warping strategy is unnecessary since the model already assumes small displacements.

The maximum number of resolution levels (warping levels), the resolution ratio between two consecutive resolution levels, as well as the finest resolution level can be specified for the FDRIG as well as the DDRAW algorithm. Details can be found below.

The minimization of functionals is mathematically very closely related to the minimization of functions: Like the fact that the zero crossing of the first derivative is a necessary condition for the minimum of a function, the fulfillment of the so called Euler-Lagrange equations is a necessary condition for the minimizing function of a functional (the minimizing function corresponds to the desired optical flow vector field in this case). The Euler-Lagrange equations are partial differential equations. By discretizing these Euler-Lagrange equations using finite differences, large sparse nonlinear equation systems result for the FDRIG and DDRAW algorithms. Because coarse-to-fine warping strategies are used, such an equation system must be solved for each resolution level, i.e., for each warping level. For the CLG algorithm, a single sparse linear equation system must be solved.
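
For the CLG model the Euler-Lagrange equations can be written out explicitly. The following is a sketch that assumes the CLG energy has the standard combined local-global form $\int w^\top J_\rho\, w + \alpha(|\nabla_2 u|^2 + |\nabla_2 v|^2)\,dr\,dc$ with $J_\rho = G_\rho * (\nabla_3 f\,\nabla_3 f^\top)$; the symbol $J_\rho$ and its entries $J_{ij}$ are notation introduced here for illustration. With $w = (u,v,1)^\top$ the data term expands to

$$w^\top J_\rho\, w = J_{11}u^2 + 2J_{12}uv + J_{22}v^2 + 2J_{13}u + 2J_{23}v + J_{33},$$

and setting the first variation with respect to $u$ and $v$ to zero gives

$$\begin{aligned}
J_{11}\,u + J_{12}\,v + J_{13} - \alpha\,\Delta u &= 0,\\
J_{12}\,u + J_{22}\,v + J_{23} - \alpha\,\Delta v &= 0,
\end{aligned}$$

where $\Delta$ is the spatial Laplacian. Both equations are linear in $u$ and $v$, which is consistent with the statement that the CLG algorithm leads to a single sparse linear equation system after discretization.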




To ensure that the above nonlinear equation systems can be solved efficiently, the FDRIG and DDRAW use bidirectional multigrid methods. From a numerical point of view, these strategies are among the fastest methods for solving large linear and nonlinear equation systems. In contrast to conventional nonhierarchical iterative methods, e.g., the different linear and nonlinear Gauss-Seidel variants, the multigrid methods have the advantage that corrections to the solution can be determined efficiently on coarser resolution levels. This, in turn, leads to a significantly faster convergence. The basic idea of multigrid methods additionally consists of hierarchically computing these correction steps, i.e., the computation of the error on a coarser resolution level itself uses the same strategy and efficiently computes its error (i.e., the error of the error) by correction steps on an even coarser resolution level. Depending on whether one or two error correction steps are performed per cycle, a so called V or W cycle is obtained. The corresponding strategies for stepping through the resolution hierarchy are as follows for two to four resolution levels:

In the corresponding cycle diagrams (not reproduced here), iterations on the original problem are denoted by large markers, while small markers denote iterations on error correction problems.



Algorithmically, a correction cycle can be described as follows:

1. In the first step, several (few) iterations using an iterative linear or nonlinear basic solver are performed (e.g., a variant of the Gauss-Seidel solver). This step is called the pre-relaxation step.

2. In the second step, the current error is computed to correct the current solution (the solution after step 1). For efficiency reasons, the error is calculated on a coarser resolution level. This step, which can be performed iteratively several times, is called the coarse grid correction step.

3. In a final step, again several (few) iterations using the iterative linear or nonlinear basic solver of step 1 are performed. This step is called the post-relaxation step.
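
The three steps can be sketched on a simple model problem. The code below applies them to the 1D Poisson equation -u'' = f with zero boundary values, not to the optical flow equation systems; the grid transfer operators (full weighting and linear interpolation) and all function names are choices made for this illustration.

```python
def gauss_seidel(u, f, h, sweeps):
    """In-place Gauss-Seidel sweeps for -u'' = f with zero boundary values."""
    n = len(u)
    for _ in range(sweeps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            u[i] = 0.5 * (left + right + h * h * f[i])


def residual(u, f, h):
    """Residual f - A u of the discretized equation."""
    n = len(u)
    res = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        res.append(f[i] - (2.0 * u[i] - left - right) / (h * h))
    return res


def restrict(fine):
    """Full weighting: coarse point j sits on fine point 2j+1."""
    return [0.25 * fine[2 * j] + 0.5 * fine[2 * j + 1] + 0.25 * fine[2 * j + 2]
            for j in range((len(fine) - 1) // 2)]


def prolong(coarse, n_fine):
    """Linear interpolation of a coarse grid function to the fine grid."""
    fine = [0.0] * n_fine
    for j, value in enumerate(coarse):
        fine[2 * j + 1] += value
        fine[2 * j] += 0.5 * value
        fine[2 * j + 2] += 0.5 * value
    return fine


def correction_cycle(u, f, h, pre=2, post=2, gamma=1):
    """One V cycle (gamma=1) or W cycle (gamma=2); len(u) must be 2**k - 1."""
    n = len(u)
    if n == 1:
        u[0] = 0.5 * h * h * f[0]               # exact solve on the coarsest grid
        return u
    gauss_seidel(u, f, h, pre)                  # 1. pre-relaxation
    coarse_rhs = restrict(residual(u, f, h))    # 2. error equation on coarser grid
    error = [0.0] * len(coarse_rhs)
    for _ in range(gamma):                      #    coarse grid correction step(s)
        correction_cycle(error, coarse_rhs, 2.0 * h, pre, post, gamma)
    for i, e in enumerate(prolong(error, n)):
        u[i] += e
    gauss_seidel(u, f, h, post)                 # 3. post-relaxation
    return u


n = 63
h = 1.0 / (n + 1)
f = [1.0] * n
u = [0.0] * n
for cycle in range(5):
    correction_cycle(u, f, h)
    print(cycle + 1, max(abs(r) for r in residual(u, f, h)))
```

The printed residual should drop by a large factor per cycle, whereas plain Gauss-Seidel sweeps on the finest grid alone would need far more iterations for the same reduction.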



In addition, the solution can be initialized in a hierarchical manner. Starting from a very coarse variant of the original (non)linear equation system, the solution is successively refined. To do so, interpolated solutions of coarser variants of the equation system are used as the initialization of the next finer variant. On each resolution level itself, the V or W cycles described above are used to efficiently solve the (non)linear equation system on that resolution level. The corresponding multigrid methods are called full multigrid methods in the literature. The full multigrid algorithm can be visualized as follows:

The example diagram (not reproduced here) represents a full multigrid algorithm that uses two W correction cycles per resolution level of the hierarchical initialization. The interpolation steps of the solution from one resolution level to the next are denoted by i and the two W correction cycles by W1 and W2. Iterations on the original problem are denoted by large markers, while small markers denote iterations on error correction problems.



In the multigrid implementation of the FDRIG, DDRAW, and CLG algorithm, the following parameters can be set: whether a hierarchical initialization is performed; the number of coarse grid correction steps; the maximum number of correction levels (resolution levels); the number of pre-relaxation steps; the number of post-relaxation steps. These parameters are described in more detail below.

The basic solver for the FDRIG algorithm is a point-coupled fixed-point variant of the linear Gauss-Seidel algorithm. The basic solver for the DDRAW algorithm is an alternating line-coupled fixed-point variant of the same type. The number of fixed-point steps can be specified for both algorithms with a further parameter. The basic solver for the CLG algorithm is a point-coupled linear Gauss-Seidel algorithm. The transfer of the data between the different resolution levels is performed by area-based interpolation and area-based averaging, respectively.



After the algorithms have been described, the effects of the individual parameters are discussed in the following.


The input images, along with their domains (regions of interest), are passed in ImageT1 and ImageT2. The computation of the optical flow vector field VectorField is performed on the smallest surrounding rectangle of the intersection of the domains of ImageT1 and ImageT2. The domain of VectorField is the intersection of the two domains. Hence, by specifying reduced domains for ImageT1 and ImageT2, the processing can be focused and runtime can potentially be saved. It should be noted, however, that all methods compute a global solution of the optical flow. In particular, it follows that the solution on a reduced domain need not (and cannot) be identical to the solution on the full domain restricted to the reduced domain.


SmoothingSigma specifies the standard deviation of the Gaussian kernel that is used to smooth both input images. The larger the value of SmoothingSigma, the larger the low-pass effect of the Gaussian kernel, i.e., the smoother the preprocessed image. Usually, SmoothingSigma = 0.8 is a suitable choice. However, other values in the interval [0,2] are also possible. Larger standard deviations should only be considered if the input images are very noisy. It should be noted that larger values of SmoothingSigma lead to slightly longer execution times.

IntegrationSigma specifies the standard deviation ρ of the Gaussian kernel Gρ that is used for the local integration of the neighborhood information of the data term. This parameter is used only in the CLG algorithm and has no effect on the other two algorithms. Usually, IntegrationSigma = 1.0 is a suitable choice. However, other values in the interval [0,3] are also possible. Larger standard deviations should only be considered if the input images are very noisy. It should be noted that larger values of IntegrationSigma lead to slightly longer execution times.

FlowSmoothness specifies the weight α of the smoothness term with respect to the data term. The larger the value of FlowSmoothness, the smoother the computed optical flow field. It should be noted that choosing FlowSmoothness too small can lead to unusable results, even though statistically robust penalty functions are used, in particular if the warping strategy needs to predict too much information outside of the image. For byte images with a gray value range of [0,255], values of FlowSmoothness around 20 for the flow-driven FDRIG algorithm and around 1000 for the data-driven DDRAW algorithm and the homogeneous CLG algorithm typically yield good results.

GradientConstancy specifies the weight γ of the gradient constancy with respect to the gray value constancy. This parameter is used only in the FDRIG algorithm. For the other two algorithms, it does not influence the results. For byte images with a gray value range of [0,255], a value of GradientConstancy = 5 is typically a good choice, since then both constancy assumptions are used to the same extent. For large changes in illumination, however, significantly larger values of GradientConstancy may be necessary to achieve good results. It should be noted that for large values of the gradient constancy weight the smoothness parameter FlowSmoothness must also be chosen larger.

The parameters of the multigrid solver and of the coarse-to-fine warping strategy can be specified with the generic parameters MGParamName and MGParamValue. Usually, it suffices to use one of the four default parameter sets via MGParamName = 'default_parameters' and MGParamValue = 'very_accurate', 'accurate', 'fast_accurate', or 'fast'. The default parameter sets are described below. If the parameters should be specified individually, MGParamName and MGParamValue must be set to tuples of the same length. The values corresponding to the parameters specified in MGParamName must be specified at the corresponding position in MGParamValue.

MGParamName = 'warp_zoom_factor' can be used to specify the resolution ratio between two consecutive warping levels in the coarse-to-fine warping hierarchy. 'warp_zoom_factor' must be selected from the open interval (0,1). For performance reasons, 'warp_zoom_factor' is typically set to 0.5, i.e., the number of pixels is halved in each direction for each coarser warping level. This leads to an increase of 33% in the calculations that need to be performed with respect to an algorithm that does not use warping. Values for 'warp_zoom_factor' close to 1 can lead to slightly better results. However, they require a disproportionately larger computation time, e.g., 426% for 'warp_zoom_factor' = 0.9.
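
The 33% and 426% figures can be reproduced with a geometric series. The sketch assumes that the work per warping level is proportional to its pixel count and that the hierarchy has enough levels for the series to be summed in closed form; the function name is hypothetical.

```python
def warping_overhead(zoom_factor):
    """Extra work of all coarser levels relative to the finest level alone.

    Each coarser level has zoom_factor**2 times the pixels of the previous
    one, so the total work is 1 + q + q**2 + ... = 1 / (1 - q), q = zoom**2.
    """
    q = zoom_factor ** 2
    return q / (1.0 - q)


for zoom in (0.5, 0.9):
    print(f"warp_zoom_factor = {zoom}: about {100 * warping_overhead(zoom):.0f}% more work")
```

For 0.5 this gives about 33% and for 0.9 about 426%, matching the numbers quoted above.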

MGParamName = 'warp_levels' can be used to restrict the warping hierarchy to a maximum number of levels. For 'warp_levels' = 0, the largest possible number of levels is used. If the image size does not allow the specified number of levels to be used (taking the resolution ratio 'warp_zoom_factor' into account), the largest possible number of levels is used. Usually, 'warp_levels' should be set to 0.

MGParamName = 'warp_last_level' can be used to specify the number of warping levels for which the flow increment should no longer be computed. Usually, 'warp_last_level' is set to 1 or 2, i.e., a flow increment is computed for each warping level, or the finest warping level is skipped in the computation. Since in the latter case the computation is performed on an image of half the resolution of the original image, the gained computation time can be used to compute a more accurate solution, e.g., by using a full multigrid algorithm with additional iterations. The more accurate solution is then interpolated to the full resolution.

The three parameters that specify the coarse-to-fine warping strategy are only used in the FDRIG and DDRAW algorithms. They are ignored for the CLG algorithm.


MGParamName = 'mg_solver' can be used to specify the general multigrid strategy for solving the (non)linear equation system (in each warping level). For 'mg_solver' = 'multigrid', a normal multigrid algorithm (without coarse-to-fine initialization) is used, while for 'mg_solver' = 'full_multigrid' a full multigrid algorithm (with coarse-to-fine initialization) is used. Since a resolution reduction of 0.5 is used between two consecutive levels of the coarse-to-fine initialization (in contrast to the resolution reduction in the warping strategy, this value is hard-coded into the algorithm), the use of a full multigrid algorithm results in an increase of the computation time by approximately 33% with respect to the normal multigrid algorithm. Using 'mg_solver' = 'full_multigrid' typically yields numerically more accurate results than 'mg_solver' = 'multigrid'.

MGParamName = 'mg_cycle_type' can be used to specify whether a V or W correction cycle is used per multigrid level. Since a resolution reduction of 0.5 is used between two consecutive levels of the respective correction cycle, using a W cycle instead of a V cycle increases the computation time by approximately 50%. Using 'mg_cycle_type' = 'w' typically yields numerically more accurate results than 'mg_cycle_type' = 'v'.
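
The approximate 50% (W versus V cycle) and 33% (full multigrid versus multigrid) increases follow from the same kind of counting argument. The sketch assumes 2D images, a resolution ratio of 0.5 per level, and work proportional to the number of pixels visited; it is a cost model for illustration, not a measurement.

```python
def relative_work(visits_growth, levels=40):
    """Work over a hierarchy, in units of one visit to the finest level.

    Level k has 4**-k of the finest level's pixels (ratio 0.5 in both
    directions) and is visited visits_growth**k times.
    """
    return sum((visits_growth / 4.0) ** k for k in range(levels))


v_cycle = relative_work(1)        # V cycle: every level visited once
w_cycle = relative_work(2)        # W cycle: visits double on each coarser level
full_multigrid = relative_work(1) # cycles repeated on each initialization level

print(f"W cycle vs. V cycle: +{100 * (w_cycle / v_cycle - 1):.0f}%")
print(f"full multigrid vs. multigrid: +{100 * (full_multigrid - 1):.0f}%")
```

The V cycle sums to about 4/3 and the W cycle to about 2, a ratio of 1.5; running the cycles on every level of the coarse-to-fine initialization multiplies the work by about 4/3.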

MGParamName = 'mg_levels' can be used to restrict the multigrid hierarchy for the coarse-to-fine initialization as well as for the actual V or W correction cycles. For 'mg_levels' = 0, the largest possible number of levels is used. If the image size does not allow the specified number of levels to be used, the largest possible number of levels is used. Usually, 'mg_levels' should be set to 0.


MGParamName = 'mg_cycles' can be used to specify the total number of V or W correction cycles that are being performed. If a full multigrid algorithm is used, 'mg_cycles' refers to each level of the coarse-to-fine initialization. Usually, one or two cycles are sufficient to yield a sufficiently accurate solution of the equation system. Typically, the larger 'mg_cycles', the more accurate the numerical results. This parameter enters almost linearly into the computation time, i.e., doubling the number of cycles leads approximately to twice the computation time.

MGParamName = 'mg_pre_relax' can be used to specify the number of iterations that are performed on each level of the V or W correction cycles using the iterative basic solver before the actual error correction is performed. Usually, one or two pre-relaxation steps are sufficient. Typically, the larger 'mg_pre_relax', the more accurate the numerical results.


MGParamName = 'mg_post_relax' can be used to specify the number of iterations that are performed on each level of the V or W correction cycles using the iterative basic solver after the actual error correction is performed. Usually, one or two post-relaxation steps are sufficient. Typically, the larger 'mg_post_relax', the more accurate the numerical results.


As with the number of correction cycles, increasing the number of pre- and post-relaxation steps increases the computation time asymptotically linearly. However, no additional restriction and prolongation operations (zooming down and up of the error correction images) are performed. Consequently, a moderate increase in the number of relaxation steps only leads to a slight increase in the computation times.


MGParamName = 'mg_inner_iter' specifies the number of iterations used to solve the linear equation systems in each fixed-point iteration of the nonlinear basic solver. Usually, one iteration is enough for the multigrid algorithm to converge sufficiently fast. The increase in computation time is slightly smaller than for an increase in the number of relaxation steps. This parameter only influences the FDRIG and DDRAW algorithms, since the CLG algorithm does not need to solve a nonlinear equation system.

As described above, it is usually sufficient to use one of the default parameter sets by setting MGParamName = 'default_parameters' and MGParamValue = 'very_accurate', 'accurate', 'fast_accurate', or 'fast'. If necessary, individual parameters can be modified after the default parameter set has been chosen by specifying a subset of the above parameters and the corresponding values after 'default_parameters' in MGParamName and MGParamValue (e.g., MGParamName = ['default_parameters','warp_zoom_factor'] and MGParamValue = ['accurate',0.6]).

The default parameter sets use the following values for the above parameters:

'default_parameters' = 'very_accurate': 'warp_zoom_factor' = 0.5, 'warp_levels' = 0, 'warp_last_level' = 1, 'mg_solver' = 'full_multigrid', 'mg_cycle_type' = 'w', 'mg_levels' = 0, 'mg_cycles' = 1, 'mg_pre_relax' = 2, 'mg_post_relax' = 2, 'mg_inner_iter' = 1.

'default_parameters' = 'accurate': 'warp_zoom_factor' = 0.5, 'warp_levels' = 0, 'warp_last_level' = 1, 'mg_solver' = 'multigrid', 'mg_cycle_type' = 'v', 'mg_levels' = 0, 'mg_cycles' = 1, 'mg_pre_relax' = 1, 'mg_post_relax' = 1, 'mg_inner_iter' = 1.

'default_parameters' = 'fast_accurate': 'warp_zoom_factor' = 0.5, 'warp_levels' = 0, 'warp_last_level' = 2, 'mg_solver' = 'full_multigrid', 'mg_cycle_type' = 'w', 'mg_levels' = 0, 'mg_cycles' = 1, 'mg_pre_relax' = 2, 'mg_post_relax' = 2, 'mg_inner_iter' = 1.

'default_parameters' = 'fast': 'warp_zoom_factor' = 0.5, 'warp_levels' = 0, 'warp_last_level' = 2, 'mg_solver' = 'multigrid', 'mg_cycle_type' = 'v', 'mg_levels' = 0, 'mg_cycles' = 1, 'mg_pre_relax' = 1, 'mg_post_relax' = 1, 'mg_inner_iter' = 1.



It should be noted that for the CLG algorithm the two modes 'fast_accurate' and 'fast' are identical to the modes 'very_accurate' and 'accurate' since the CLG algorithm does not use a coarse-to-fine warping strategy.
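The way a default parameter set and later overrides combine can be sketched in Python. The preset values below are copied from the list above; the helper resolve_mg_params and the dictionary layout are hypothetical illustrations, not part of HALCON's API. The sketch assumes that entries are applied in order, so values that follow 'default_parameters' override the preset.

```python
# Settings shared by all four presets (values taken from the list above).
_COMMON = {"warp_zoom_factor": 0.5, "warp_levels": 0, "mg_levels": 0,
           "mg_cycles": 1, "mg_inner_iter": 1}
# Solver settings of the two "accurate" style presets using full multigrid.
_FULL = {"mg_solver": "full_multigrid", "mg_cycle_type": "w",
         "mg_pre_relax": 2, "mg_post_relax": 2}
# Solver settings of the two presets using plain multigrid.
_PLAIN = {"mg_solver": "multigrid", "mg_cycle_type": "v",
          "mg_pre_relax": 1, "mg_post_relax": 1}

DEFAULT_SETS = {
    "very_accurate": {**_COMMON, **_FULL, "warp_last_level": 1},
    "accurate": {**_COMMON, **_PLAIN, "warp_last_level": 1},
    "fast_accurate": {**_COMMON, **_FULL, "warp_last_level": 2},
    "fast": {**_COMMON, **_PLAIN, "warp_last_level": 2},
}


def resolve_mg_params(names, values):
    """Hypothetical helper: apply (name, value) pairs in order.

    'default_parameters' loads a whole preset; any other name overrides
    a single entry of whatever has been collected so far.
    """
    if len(names) != len(values):
        raise ValueError("names and values must have the same length")
    params = {}
    for name, value in zip(names, values):
        if name == "default_parameters":
            params = dict(DEFAULT_SETS[value])
        else:
            params[name] = value
    return params


# The example from the text: start from 'accurate', then change one value.
params = resolve_mg_params(["default_parameters", "warp_zoom_factor"],
                           ["accurate", 0.6])
```

Here params keeps every 'accurate' setting except 'warp_zoom_factor', which becomes 0.6.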





derivate_vector_field — Convolve a vector field with derivatives of the Gaussian.


derivate_vector_field convolves the components of a vector field with the derivatives of a Gaussian and calculates various features derived from them. derivate_vector_field only accepts vector fields of the semantic type 'vector_field_relative'. The vector field VectorField, F(r,c)=(u(r,c),v(r,c)), is defined as in optical_flow_mg. Sigma is the parameter of the Gaussian (i.e., the amount of smoothing). If a single value is passed in Sigma, the amount of smoothing in the column and row directions is identical. If two values are passed in Sigma, the first value specifies the amount of smoothing in the column direction, while the second value specifies the amount of smoothing in the row direction. The possible values for Component are:

'curl': The curl of the vector field. One application of 'curl' is the analysis of optical flow fields. Metaphorically speaking, the curl is how much a small boat would rotate if the vector field were a fluid.


'divergence': The divergence of the vector field. One application of 'divergence' is the analysis of optical flow fields. Metaphorically speaking, the divergence indicates where the sources and sinks would be if the vector field were a fluid.
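As a rough illustration of these two features, the sketch below evaluates curl and divergence at a single interior pixel. It swaps in plain central differences for the Gaussian derivative filters used by the operator. It also assumes that u is the row component and v the column component, with curl = dv/dr - du/dc and divergence = du/dr + dv/dc. The function name and the sign convention are assumptions made for this example.

```python
def curl_and_divergence(u, v, r, c):
    """Central-difference curl and divergence of F = (u, v) at interior pixel (r, c).

    u and v are lists of rows; u is taken as the row component and v as the
    column component of the vector field (an assumption of this sketch).
    """
    du_dr = (u[r + 1][c] - u[r - 1][c]) / 2.0
    du_dc = (u[r][c + 1] - u[r][c - 1]) / 2.0
    dv_dr = (v[r + 1][c] - v[r - 1][c]) / 2.0
    dv_dc = (v[r][c + 1] - v[r][c - 1]) / 2.0
    return dv_dr - du_dc, du_dr + dv_dc


size = 5
center = 2
# A field that rotates around the image center: nonzero curl, no divergence.
rot_u = [[-(c - center) for c in range(size)] for r in range(size)]
rot_v = [[(r - center) for c in range(size)] for r in range(size)]
# A field that points away from the center (a "source"): divergence only.
rad_u = [[(r - center) for c in range(size)] for r in range(size)]
rad_v = [[(c - center) for c in range(size)] for r in range(size)]

rot_curl, rot_div = curl_and_divergence(rot_u, rot_v, 2, 2)
rad_curl, rad_div = curl_and_divergence(rad_u, rad_v, 2, 2)
```

The rotating field is the "boat spinning in place" case, while the radial field behaves like a source at the image center.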


When used in context of photometric stereo, the operator derivate_vector_field offers two more parameters, which are especially designed to process the gradient field that is returned by photometric_stereo. In this case, we interpret the input vector field as gradient of the underlying surface.

In the following formulas, the input vector field is therefore denoted as

(f_r(r,c), f_c(r,c)),

where the first and second components of the input form the gradient field of the surface f(r,c). In the formulas below, f_rc denotes the first derivative in the column direction of the first component of the gradient field.




'mean_curvature': Mean curvature H of the underlying surface when the input vector field VectorField is interpreted as gradient field. One application of using 'mean_curvature' is to process the vector field that is returned by photometric_stereo. After filtering the vector field, even tiny scratches or bumps can be segmented.


'gauss_curvature': Gaussian curvature K of the underlying surface when the input vector field VectorField is interpreted as gradient field. One application of using 'gauss_curvature' is to process the vector field that is returned by photometric_stereo. After filtering the vector field, even tiny scratches or bumps can be segmented. If the underlying surface of the vector field is developable, the Gaussian curvature is zero.
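For orientation, the sketch below uses the standard differential-geometry formulas for the mean and Gaussian curvature of a surface given as a height function f(r,c). The source does not show the operator's own formulas, so the exact normalization and sign used by derivate_vector_field may differ. The function name surface_curvatures is hypothetical, and the derivative values are passed in directly rather than computed by Gaussian filtering.

```python
def surface_curvatures(f_r, f_c, f_rr, f_rc, f_cc):
    """Mean curvature H and Gaussian curvature K of the graph of f(r, c).

    f_r, f_c are the gradient components; f_rr, f_rc, f_cc are the second
    derivatives (f_rc is the column derivative of the first component).
    Standard textbook formulas, assumed here for illustration.
    """
    g = 1.0 + f_r * f_r + f_c * f_c
    mean = ((1.0 + f_r * f_r) * f_cc
            - 2.0 * f_r * f_c * f_rc
            + (1.0 + f_c * f_c) * f_rr) / (2.0 * g ** 1.5)
    gauss = (f_rr * f_cc - f_rc * f_rc) / (g * g)
    return mean, gauss


# Tilted plane: no second derivatives, so both curvatures vanish.
plane = surface_curvatures(0.3, -0.2, 0.0, 0.0, 0.0)
# Paraboloid f = (r^2 + c^2) / 2 at its apex: curved in both directions.
bowl = surface_curvatures(0.0, 0.0, 1.0, 0.0, 1.0)
# Parabolic cylinder f = r^2 / 2 at its ridge: a developable surface.
cylinder = surface_curvatures(0.0, 0.0, 1.0, 0.0, 0.0)
```

The cylinder case illustrates the remark about developable surfaces: the surface bends in one direction only, so its Gaussian curvature is zero while its mean curvature is not.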




unwarp_image_vector_field — Unwarp an image using a vector field.


unwarp_image_vector_field unwarps the image Image using the vector field VectorField and returns the unwarped image in ImageUnwarped. The vector field must be of the semantic type 'vector_field_relative' and is typically determined with optical_flow_mg. Hence, unwarp_image_vector_field can be used to unwarp the second input image of optical_flow_mg to the first input image. It should be noted that because of the above semantics the vector field image represents an inverse transformation from the destination image of the vector field to the source image.
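The inverse-transformation semantics can be sketched as follows: each pixel (r,c) of the unwarped image is looked up in the input image at (r + u(r,c), c + v(r,c)). The sketch rests on several assumptions: nearest-neighbor sampling instead of proper interpolation, u as the row displacement and v as the column displacement, and a constant fill value for lookups outside the image. The function name unwarp_nearest is hypothetical.

```python
def unwarp_nearest(image, u, v, fill=0):
    """Unwarp `image` with a relative vector field (u, v), nearest neighbor.

    For every destination pixel (r, c) the value is fetched from the source
    position (r + u[r][c], c + v[r][c]); positions outside the image get
    `fill`. Illustrative only, not the operator's actual implementation.
    """
    rows, cols = len(image), len(image[0])
    out = [[fill] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            src_r = int(round(r + u[r][c]))
            src_c = int(round(c + v[r][c]))
            if 0 <= src_r < rows and 0 <= src_c < cols:
                out[r][c] = image[src_r][src_c]
    return out


# The second frame is the first frame shifted one pixel to the right.
first = [[1, 2, 3, 0],
         [4, 5, 6, 0]]
second = [[0, 1, 2, 3],
          [0, 4, 5, 6]]
# Flow from first to second: no row motion, one column to the right.
u = [[0.0] * 4 for _ in range(2)]
v = [[1.0] * 4 for _ in range(2)]

unwarped = unwarp_nearest(second, u, v)
```

Because the field is stored per pixel of the first image and points to the matching position in the second image, looking up the second image through the field reproduces the first image.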




vector_field_length — Compute the length of the vectors of a vector field.


vector_field_length computes the lengths of the vectors of the vector field VectorField and returns them in Length. vector_field_length only accepts vector fields of the semantic type 'vector_field_relative'. The parameter Mode specifies how the lengths are computed. For Mode = 'length', the Euclidean length of the vectors is computed. For Mode = 'squared_length', the square of the length of the vectors is computed. This avoids computing a square root internally, which is a costly operation on many processors, and hence saves runtime on these processors. Note that VectorField must be in relative coordinates, as returned by optical_flow_mg.
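The two modes can be sketched in a few lines of Python. The function below borrows the operator's name for readability, but it is an illustrative stand-in operating on nested lists, not a HALCON binding. The representation of the field as separate u and v row lists is an assumption.

```python
import math


def vector_field_length(u, v, mode="length"):
    """Per-pixel length of the vectors (u, v) of a relative vector field.

    mode='length' returns the Euclidean length; mode='squared_length'
    returns its square and skips the square root.
    """
    if mode not in ("length", "squared_length"):
        raise ValueError("mode must be 'length' or 'squared_length'")
    result = []
    for row_u, row_v in zip(u, v):
        row = []
        for a, b in zip(row_u, row_v):
            squared = a * a + b * b
            row.append(squared if mode == "squared_length" else math.sqrt(squared))
        result.append(row)
    return result


u = [[3.0, 0.0]]
v = [[4.0, 1.0]]
lengths = vector_field_length(u, v, "length")
squared = vector_field_length(u, v, "squared_length")
```

When the lengths are only compared against a threshold, for example to segment moving regions, the squared mode can be used with a squared threshold and gives the same decision without the square root.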






