泛函极值 变分法 梯度下降流
在上一篇文章基于区域的主动轮廓模型-Chan Vese模型1中,我们通过构造能量泛函,从而巧妙的把图像中的目标分割问题转化为求解能量泛函极值的问题,然后利用变分法获得曲线演化方程。有人提出了这个推导原理或者推导过程是怎么回事,所以我今天专门写篇博客(参考重庆大学唐利明博士论文2 )分析以下这个问题。
1、积分泛函 欧拉-拉格朗日方程
假设能量泛函为:
E
(
u
)
=
∫
a
b
F
(
x
,
u
,
u
′
)
d
x
(1)
E(u)=\int_{a}^{b} F\left(x, u, u^{\prime}\right) d x \tag{1}
E(u)=∫abF(x,u,u′)dx(1)我们的目标则是寻找目标函数
u
(
x
)
u(x)
u(x)使得能量函数
E
(
u
)
E(u)
E(u)取极小值,假设函数
u
(
x
)
u(x)
u(x)能够使得
E
(
u
)
E(u)
E(u)取极小值,那么对函数
u
(
x
)
u(x)
u(x)引入任意小的扰动后,或者说对于任意的
u
(
x
)
+
α
v
(
x
)
u(x)+\alpha v(x)
u(x)+αv(x)都有:
E
(
u
)
⩽
E
(
u
+
α
v
)
E(u) \leqslant E(u+\alpha v)
E(u)⩽E(u+αv)其中
α
\alpha
α为常数,
v
(
a
)
=
v
(
b
)
=
0
v(a) =v(b)=0
v(a)=v(b)=0,那么将
E
(
u
+
α
v
)
E(u+\alpha v)
E(u+αv)看成是
α
\alpha
α的函数
ϕ
(
α
)
\phi(\alpha)
ϕ(α),那么对于一元函数
ϕ
(
α
)
\phi(\alpha)
ϕ(α)在
α
=
0
\alpha=0
α=0的导数也就是泛函
E
E
E的一阶变分
δ
E
\delta E
δE应等于0,即
δ
E
=
d
ϕ
d
α
∣
α
=
0
=
∫
a
b
(
∂
F
(
x
,
u
,
u
′
)
∂
u
v
+
∂
F
(
x
,
u
,
u
′
)
∂
u
′
v
′
)
d
x
=
0
\delta E =\left.\frac{d \phi}{d \alpha}\right|_{\alpha=0}=\int_{a}^{b}\left(\frac{\partial F\left(x, u, u^{\prime}\right)}{\partial u} v+\frac{\partial F\left(x, u, u^{\prime}\right)}{\partial u^{\prime}} v^{\prime}\right) d x = 0
δE=dαdϕ∣∣∣∣α=0=∫ab(∂u∂F(x,u,u′)v+∂u′∂F(x,u,u′)v′)dx=0
进一步整理可得:
δ
E
=
∫
a
b
(
∂
F
∂
u
v
+
∂
F
∂
u
′
v
′
)
d
x
=
0
\delta E =\int_{a}^{b}\left(\frac{\partial F}{\partial u} v+\frac{\partial F}{\partial u^{\prime}} v^{\prime}\right) d x=0
δE=∫ab(∂u∂Fv+∂u′∂Fv′)dx=0,然后利用分步积分法对上式进一步分析推导:
δ
E
=
∫
a
b
(
∂
F
∂
u
v
+
∂
F
∂
u
′
v
′
)
d
x
=
∫
a
b
∂
F
∂
u
v
d
x
+
∫
a
b
∂
F
∂
u
′
d
v
=
∫
a
b
[
∂
F
d
u
v
−
d
d
x
(
∂
F
∂
u
′
)
v
]
d
x
+
v
∂
F
∂
u
′
∣
a
b
=
0
\begin{aligned} \delta E &=\int_{a}^{b}\left(\frac{\partial F}{\partial u} v+\frac{\partial F}{\partial u^{\prime}} v^{\prime}\right) d x=\int_{a}^{b} \frac{\partial F}{\partial u} v d x+\int_{a}^{b} \frac{\partial F}{\partial u^{\prime}} dv \\ & =\int_{a}^{b}\left[\frac{\partial F}{d u} v-\frac{d}{d x}\left(\frac{\partial F}{\partial u^{\prime}}\right) v\right] dx+\left.v \frac{\partial F}{\partial u^{\prime}}\right|_{a} ^{b}=0 \end{aligned}
δE=∫ab(∂u∂Fv+∂u′∂Fv′)dx=∫ab∂u∂Fvdx+∫ab∂u′∂Fdv=∫ab[du∂Fv−dxd(∂u′∂F)v]dx+v∂u′∂F∣∣∣∣ab=0因为
v
(
a
)
=
v
(
b
)
=
0
v(a) = v(b) = 0
v(a)=v(b)=0,那么
δ
E
=
∫
a
b
[
∂
F
∂
u
−
d
d
x
(
∂
F
∂
u
′
)
]
v
d
x
=
0
\delta E=\int_{a}^{b}\left[\frac{\partial F}{\partial u}-\frac{d}{d x}\left(\frac{\partial F}{\partial u^{\prime}}\right)\right] v d x=0
δE=∫ab[∂u∂F−dxd(∂u′∂F)]vdx=0,如果对于任意函数
v
v
v都能成立,那么根据变分法引理,
∂
F
∂
u
−
d
d
x
(
∂
F
∂
u
′
)
=
0
\frac{\partial F}{\partial u}-\frac{d}{d x}\left(\frac{\partial F}{\partial u^{\prime}}\right)=0
∂u∂F−dxd(∂u′∂F)=0 这个引理的证明也很简单,利用反证法,只需假设
v
=
(
∂
F
∂
u
−
d
d
x
(
∂
F
∂
u
′
)
)
(
x
−
a
)
(
x
−
b
)
v = (\frac{\partial F}{\partial u}-\frac{d}{d x}(\frac{\partial F}{\partial u^{\prime}}) )(x-a)(x-b)
v=(∂u∂F−dxd(∂u′∂F))(x−a)(x−b),因为
(
x
−
a
)
(
x
−
b
)
<
0
(x-a)(x-b)<0
(x−a)(x−b)<0,所以
δ
E
<
0
\delta E < 0
δE<0,与已知矛盾,故
∂
F
∂
u
−
d
d
x
(
∂
F
∂
u
′
)
=
0
(2)
\frac{\partial F}{\partial u}-\frac{d}{d x}\left(\frac{\partial F}{\partial u^{\prime}}\right)=0 \tag{2}
∂u∂F−dxd(∂u′∂F)=0(2) 上式就是大名鼎鼎的欧拉-拉格朗日方程(Euler-Lagrange),对于积分泛函内包含高阶导数的问题,比如
E
(
u
)
=
∫
a
b
F
(
x
,
u
,
u
′
,
u
′
′
)
d
x
E(u)=\int_{a}^{b} F\left(x, u, u^{\prime}, u^{\prime \prime}\right) d x
E(u)=∫abF(x,u,u′,u′′)dx利用上述变分法,完全可以获得对应的Euler-Lagrange方程
∂
F
∂
u
−
d
d
x
(
∂
F
∂
u
′
)
+
d
2
d
x
2
(
∂
F
∂
u
′
′
)
=
0
(3)
\frac{\partial F}{\partial u}-\frac{d}{d x}\left(\frac{\partial F}{\partial u^{\prime}}\right)+\frac{d^2}{d x^2}\left(\frac{\partial F}{\partial u^{\prime \prime}}\right)=0 \tag{3}
∂u∂F−dxd(∂u′∂F)+dx2d2(∂u′′∂F)=0(3)对于变函积分内包括多元函数的问题,比如
E
(
u
)
=
∬
Ω
F
(
x
,
y
,
u
,
u
x
,
u
y
,
u
x
,
u
y
y
)
d
x
d
y
(4)
E(u)=\iint_{\Omega} F\left(x, y, u, u_{x}, u_{y}, u_{x}, u_{y y}\right) d x d y \tag{4}
E(u)=∬ΩF(x,y,u,ux,uy,ux,uyy)dxdy(4)同样可以推导出其对应的Euler-Lagrange
∂
F
d
u
−
d
d
x
(
∂
F
∂
u
x
)
−
d
d
y
(
∂
F
∂
u
y
)
+
d
2
d
x
2
(
∂
F
∂
u
x
x
)
+
d
2
d
y
2
(
∂
F
∂
u
y
y
)
=
0
(5)
\frac{\partial F}{d u}-\frac{d}{d x}\left(\frac{\partial F}{\partial u_{x}}\right)-\frac{d}{d y}\left(\frac{\partial F}{\partial u_{y}}\right)+\frac{d^{2}}{d x^{2}}\left(\frac{\partial F}{\partial u_{x x}}\right)+\frac{d^{2}}{d y^{2}}\left(\frac{\partial F}{\partial u_{y y}}\right)=0 \tag{5}
du∂F−dxd(∂ux∂F)−dyd(∂uy∂F)+dx2d2(∂uxx∂F)+dy2d2(∂uyy∂F)=0(5)不过对于在大部分情况下,我们获得的Euler-Langarange方程一般都是非线性的偏微分方程,很难直接获得对应的解析解,因此,我们需要引入辅组的时间变量,将求解静态偏微分方程问题转化为动态偏微分方程问题,当演化至稳态时,就可以得到我们想要的偏微分方程的解了。
2、积分泛函 梯度下降流
假设随时间变化的函数
u
(
⋅
,
t
)
u(\cdot ,t)
u(⋅,t)能够使得
δ
E
<
0
\delta E < 0
δE<0,那么能量泛函
E
(
u
(
⋅
,
t
)
)
E(u(\cdot, t))
E(u(⋅,t))则会不断减小,那么
u
(
⋅
,
t
)
u(\cdot ,t)
u(⋅,t)从时间
t
t
t 到时间
t
+
Δ
t
t + \Delta t
t+Δt则会给能量函数引入一个新的扰动
V
=
∂
u
∂
t
Δ
t
V=\frac{\partial u}{\partial t} \Delta t
V=∂t∂uΔt那么对应的
δ
E
=
∫
a
b
[
∂
F
∂
u
−
d
d
x
(
∂
F
∂
u
′
)
]
v
d
x
=
Δ
t
∫
a
b
[
∂
F
∂
u
−
d
d
x
(
∂
F
∂
u
′
)
]
∂
u
∂
t
d
x
<
0
\delta E=\int_{a}^{b}\left[\frac{\partial F}{\partial u}-\frac{d}{d x}\left(\frac{\partial F}{\partial u^{\prime}}\right)\right] v d x=\Delta t \int_{a}^{b}\left[\frac{\partial F}{\partial u}-\frac{d}{d x}\left(\frac{\partial F}{\partial u^{\prime}}\right)\right] \frac{\partial u}{\partial t} d x<0
δE=∫ab[∂u∂F−dxd(∂u′∂F)]vdx=Δt∫ab[∂u∂F−dxd(∂u′∂F)]∂t∂udx<0
要想使上式恒成立,则可取(充分不必要但是最简洁)
∂
u
∂
t
=
−
∂
F
∂
u
+
d
d
x
(
∂
F
∂
u
′
)
(6)
\frac{\partial u}{\partial t}=-\frac{\partial F}{\partial u}+\frac{d}{d x}\left(\frac{\partial F}{\partial u^{\prime}}\right) \tag{6}
∂t∂u=−∂u∂F+dxd(∂u′∂F)(6)
那么对于能量泛函
E
(
u
)
=
∫
a
b
F
(
x
,
u
,
u
′
)
d
x
E(u)=\int_{a}^{b} F\left(x, u, u^{\prime}\right) d x
E(u)=∫abF(x,u,u′)dx,想要获得其数值解,则需要构造偏微分方程$
{
∂
u
∂
t
=
δ
E
(
u
)
u
(
x
,
0
)
=
u
0
(
x
)
\left\{\begin{array}{l}\frac{\partial u}{\partial t}=\delta E(u) \\ u(x, 0)=u_{0}(x)\end{array}\right.
{∂t∂u=δE(u)u(x,0)=u0(x)通过计算上述方程组,不断迭代计算,就能够找到使得能量函数
E
(
u
)
E(u)
E(u)取得极小值的函数
u
(
x
)
u(x)
u(x)。
3、以Chan-Vese模型为例的推导过程
关于Chan-Vese模型的详细内容大家可以看这一篇我的另一篇博文Chan-Vese模型。已知模型的能量泛函为:
E
(
ϕ
(
x
,
y
)
)
=
μ
∫
Ω
∣
∇
H
(
ϕ
(
x
,
y
)
)
∣
d
x
d
y
+
ν
∫
Ω
H
(
ϕ
(
x
,
y
)
d
x
d
y
+
λ
1
∫
Ω
∣
I
0
(
x
,
y
)
−
C
1
∣
2
H
(
ϕ
(
x
,
y
)
d
x
d
y
+
λ
2
∫
o
u
t
s
i
d
e
(
C
)
∣
I
0
(
x
,
y
)
−
C
2
∣
2
(
1
−
H
(
ϕ
(
x
,
y
)
)
d
x
d
y
(7)
E(\phi(x,y)) =\mu \int_\Omega{|\nabla H(\phi(x,y))|}dxdy + \nu \int_\Omega{H( \phi(x,y)}dxdy \\ + \lambda_1 \int_{\Omega}{|I_0(x,y)-C_1|^2 H( \phi(x,y)}\,{\rm d}x{\rm d}y + \\ \lambda_2 \int_{outside(C)}{|I_0(x,y)-C_2|^2 (1-H( \phi(x,y))}\,{\rm d}x{\rm d}y \tag{7}
E(ϕ(x,y))=μ∫Ω∣∇H(ϕ(x,y))∣dxdy+ν∫ΩH(ϕ(x,y)dxdy+λ1∫Ω∣I0(x,y)−C1∣2H(ϕ(x,y)dxdy+λ2∫outside(C)∣I0(x,y)−C2∣2(1−H(ϕ(x,y))dxdy(7) 那么怎样借助上面的梯度下降流公式来获得对应的用来演化的偏微方程:
∂
ϕ
∂
t
=
δ
ϵ
(
ϕ
)
[
μ
d
i
v
(
∇
ϕ
∣
∇
ϕ
∣
)
−
ν
−
λ
1
(
I
−
C
1
)
2
+
λ
2
(
I
−
C
2
)
2
]
(8)
\frac{\partial \phi}{\partial t} = \delta_{\epsilon}(\phi)[\mu div(\frac{\nabla\phi}{|\nabla\phi|})-\nu - \lambda_1(I-C_1)^2+\lambda_2(I-C_2)^2] \tag{8}
∂t∂ϕ=δϵ(ϕ)[μdiv(∣∇ϕ∣∇ϕ)−ν−λ1(I−C1)2+λ2(I−C2)2](8)在开始推导之前我首先需要补充几点内容:
1.
d
i
v
div
div是表示散度的意思,是一个标量值,比如说对于一个梯度向量
(
f
x
,
f
y
)
(f_x,f_y)
(fx,fy),则该梯度向量对应的散度值为
∂
∂
x
f
x
+
∂
∂
y
f
y
\frac{\partial}{\partial x}f_x + \frac{\partial}{\partial y}f_y
∂x∂fx+∂y∂fy或者说
f
x
x
+
f
y
y
f_{xx}+f_{yy}
fxx+fyy;
2. 公式(7)中的函数
H
(
ϕ
)
H(\phi)
H(ϕ)为Heaviside函数(也就是阶跃函数),其对应的导函数为
δ
(
ϕ
)
\delta(\phi)
δ(ϕ);
3. 公式(7)中的梯度的模值即
∣
∇
H
(
ϕ
(
x
,
y
)
)
∣
|\nabla H(\phi(x,y))|
∣∇H(ϕ(x,y))∣的展开式为:
∣
∇
H
(
ϕ
(
x
,
y
)
)
∣
=
(
δ
(
ϕ
)
ϕ
x
)
2
+
(
δ
(
ϕ
)
ϕ
y
)
2
=
(
(
δ
(
ϕ
)
ϕ
x
)
2
+
(
δ
(
ϕ
)
ϕ
y
)
2
)
1
2
|\nabla H(\phi(x,y))| = \sqrt{\left(\delta(\phi) \phi_{x}\right)^{2}+\left(\delta(\phi) \phi_{y}\right)^{2}} = ({\left(\delta(\phi) \phi_{x}\right)^{2}+\left(\delta(\phi) \phi_{y}\right)^{2}}) ^{\frac{1}{2}}
∣∇H(ϕ(x,y))∣=(δ(ϕ)ϕx)2+(δ(ϕ)ϕy)2=((δ(ϕ)ϕx)2+(δ(ϕ)ϕy)2)21
4. 公式(7)的能量函数中,
ϕ
(
x
,
y
)
\phi(x,y)
ϕ(x,y)是一个二元函数,且公式(7)中只存在
ϕ
(
x
,
y
)
\phi(x,y)
ϕ(x,y)的一阶导数,所以公式(7)对应的梯度下降流公式需要参照如下公式:
∂
u
∂
t
=
−
∂
F
∂
u
+
d
d
x
(
∂
F
∂
u
x
)
+
d
d
y
(
∂
F
∂
u
y
)
(9)
\frac{\partial u}{\partial t} = - \frac{\partial F}{\partial u}+\frac{d}{d x}\left(\frac{\partial F}{\partial u_x}\right)+\frac{d}{d y}\left(\frac{\partial F}{\partial u_y}\right) \tag{9}
∂t∂u=−∂u∂F+dxd(∂ux∂F)+dyd(∂uy∂F)(9)
因此对于能量函数公式(7),参照公式(9),可得:
F
:
μ
∣
∇
H
(
ϕ
)
∣
+
ν
H
(
ϕ
)
+
λ
1
(
I
0
−
C
1
)
2
H
(
ϕ
)
+
λ
2
(
I
0
−
C
2
)
2
(
1
−
H
(
ϕ
)
)
u
:
ϕ
\begin{array}{l} F: \mu|\nabla H(\phi)|+\nu H(\phi)+\lambda_{1}\left(I_{0}-C_{1}\right)^{2} H(\phi)+\lambda_{2}\left(I_{0}-C_{2}\right)^{2}(1-H(\phi)) \\ u: \phi \end{array}
F:μ∣∇H(ϕ)∣+νH(ϕ)+λ1(I0−C1)2H(ϕ)+λ2(I0−C2)2(1−H(ϕ))u:ϕ进一步求导计算可得:
∂
F
∂
ϕ
=
v
δ
(
ϕ
)
+
λ
1
(
I
0
−
C
1
)
2
δ
(
ϕ
)
−
λ
2
(
I
0
−
C
2
)
2
δ
(
ϕ
)
(10)
\frac{\partial F}{\partial \phi}=v \delta(\phi)+\lambda_{1}\left(I_{0}-C_{1}\right)^{2} \delta(\phi)-\lambda_{2}\left(I_{0}-C_{2}\right)^{2} \delta(\phi) \tag{10}
∂ϕ∂F=vδ(ϕ)+λ1(I0−C1)2δ(ϕ)−λ2(I0−C2)2δ(ϕ)(10) 以及
d
d
x
(
∂
F
∂
ϕ
x
)
+
d
d
y
(
∂
F
∂
ϕ
y
)
=
μ
(
d
d
x
(
δ
(
ϕ
)
ϕ
x
(
δ
(
ϕ
)
ϕ
x
)
2
+
(
δ
(
ϕ
)
ϕ
y
)
2
)
+
d
d
y
(
δ
(
ϕ
)
ϕ
y
(
δ
(
ϕ
)
ϕ
x
)
2
+
(
δ
(
ϕ
)
ϕ
y
)
2
)
)
=
μ
δ
(
ϕ
)
div
(
∇
ϕ
∣
∇
ϕ
∣
)
(11)
\begin{aligned} \frac{d}{d x}\left(\frac{\partial F}{\partial \phi_{x}}\right)+\frac{d}{d y}\left(\frac{\partial F}{\partial \phi_{y}}\right) &=\mu\left(\frac{d}{d x}\left(\frac{\delta(\phi) \phi_{x}}{\sqrt{\left(\delta(\phi) \phi_{x}\right)^{2}+\left(\delta(\phi) \phi_{y}\right)^{2}}}\right)+\frac{d}{d y}\left(\frac{\delta(\phi) \phi_{y}}{\sqrt{\left(\delta(\phi) \phi_{x}\right)^{2}+\left(\delta(\phi) \phi_{y}\right)^{2}}}\right)\right) \\ &=\mu \delta(\phi) \operatorname{div}\left(\frac{\nabla \phi}{|\nabla \phi|}\right) \end{aligned} \tag{11}
dxd(∂ϕx∂F)+dyd(∂ϕy∂F)=μ⎝⎛dxd⎝⎛(δ(ϕ)ϕx)2+(δ(ϕ)ϕy)2δ(ϕ)ϕx⎠⎞+dyd⎝⎛(δ(ϕ)ϕx)2+(δ(ϕ)ϕy)2δ(ϕ)ϕy⎠⎞⎠⎞=μδ(ϕ)div(∣∇ϕ∣∇ϕ)(11)将公式(10)和公式(11)和整合进公式(9)后就可得到用来演化计算的偏微分方程:
∂
ϕ
∂
t
=
δ
(
ϕ
)
[
μ
d
i
v
(
∇
ϕ
∣
∇
ϕ
∣
)
−
ν
−
λ
1
(
I
−
C
1
)
2
+
λ
2
(
I
−
C
2
)
2
]
(12)
\frac{\partial \phi}{\partial t} = \delta(\phi)[\mu div(\frac{\nabla\phi}{|\nabla\phi|})-\nu - \lambda_1(I-C_1)^2+\lambda_2(I-C_2)^2] \tag{12}
∂t∂ϕ=δ(ϕ)[μdiv(∣∇ϕ∣∇ϕ)−ν−λ1(I−C1)2+λ2(I−C2)2](12) 对上述计算或者推导过程有任何疑问或者觉得我没有表达清楚的,欢迎在评论区留言评论!!
参考文献