Optimal Control (1): Overview of Optimal Control Problems and Solutions to Static Optimization Problems
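An optimal control problem is a particular kind of optimization problem. If the variable $\boldsymbol{x}$ being optimized does not depend on time, or is constant over the time range under discussion, the problem is called a static optimization problem; otherwise it is called a dynamic optimization problem. In optimal control the state of the controlled plant changes with time, so the problem is a dynamic one. In a dynamic optimization problem the objective is no longer an ordinary function but a function of time functions, which is called a functional.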
1. Prerequisites for studying optimal control
- Give a dynamic description of the controlled system, i.e. its state-space representation.
- Specify the admissible control domain:
  - The set of values that the control vector $\boldsymbol{u}(t)$ may take,
    $$U=\{\boldsymbol{u}(t)|\varphi_j(\boldsymbol{x},\boldsymbol{u})\leqslant0,j=1,2,\dots m,(m\leqslant r)\}$$
    is called the control set, and its elements are called admissible controls.
- Specify the initial conditions:
  - If the initial condition $\boldsymbol{x}(t_0)$ is given, the problem is said to have a fixed initial point; otherwise it has a free initial point.
  - If, in a free-initial-point problem, $\boldsymbol{x}(t_0)$ must satisfy certain constraints, the set of initial conditions that satisfy them is called the initial set:
    $$\Omega_0=\{\boldsymbol{x}(t_0)|\rho_j[\boldsymbol{x}(t_0)]=0,j=1,2,\dots m,(m\leqslant r)\}$$
    and its elements are called variable initial points.
- Specify the terminal conditions:
  - A fixed terminal point means that both the terminal time $t_f$ and the terminal state $\boldsymbol{x}(t_f)$ are given.
  - A variable terminal point means that the terminal state satisfies $\boldsymbol{x}(t_f)\in\Omega_f$, where
    $$\Omega_f=\{\boldsymbol{x}(t_f)|\varphi_j[\boldsymbol{x}(t_f)]=0,j=1,2,\dots m,(m\leqslant r)\}$$
    is the target set defined by the constraints.
- Give the objective functional, also called the performance index:
  - For a continuous-time system it is generally written as:
    $$J=\varPhi[\boldsymbol{x}(t_f)]+\int_{t_0}^{t_f}L[\boldsymbol{x}(t),\boldsymbol{u}(t)]\textrm{d}t$$
  - For a discrete-time system it is generally written as:
    $$J=\varPhi[\boldsymbol{x}(N)]+\sum_{k=k_0}^{N-1}L[\boldsymbol{x}(k),\boldsymbol{u}(k),k]$$
  - The forms above are called the composite type or Bolza type. They consist of a terminal cost function, which expresses the requirements on terminal performance, and a running cost function, which expresses the requirements on dynamic quality and on energy or fuel consumption.
  - If the terminal cost term is dropped, then $\varPhi=0$ and:
    $$J=\int_{t_0}^{t_f}L[\boldsymbol{x}(t),\boldsymbol{u}(t)]\textrm{d}t\\ J=\sum_{k=k_0}^{N-1}L[\boldsymbol{x}(k),\boldsymbol{u}(k),k]$$
    A performance index of this form is called the integral type or Lagrange type.
  - If the running cost term is dropped, then $L=0$ and:
    $$J=\varPhi[\boldsymbol{x}(t_f)]\\ J=\varPhi[\boldsymbol{x}(N)]$$
    This form is called the terminal type or Mayer type.
The optimal control problem is to find, within the admissible control set $U$, a control vector $\boldsymbol{u}(t)$ such that, as the controlled system moves over the time interval $[t_0,t_f]$ from the initial state $\boldsymbol{x}(t_0)$ to the terminal state $\boldsymbol{x}(t_f)$ or into the target set $\boldsymbol{x}(t_f)\in\Omega_f$, the performance index $J$ attains its maximum or minimum. A control $\boldsymbol{u}(t)$ that satisfies these conditions is called the optimal control $\boldsymbol{u}^*(t)$. The solution of the state equation under $\boldsymbol{u}^*(t)$ is called the optimal trajectory $\boldsymbol{x}^*(t)$, and the value of $J$ attained along $\boldsymbol{x}^*(t)$ is called the optimal index $J^*$.
For ease of engineering implementation, systems are usually designed against a quadratic performance index. The general form of the linear-quadratic performance index is:
$$J=\frac{1}{2}\boldsymbol{x}^T(t_f)\boldsymbol{Q}_0\boldsymbol{x}(t_f)+\frac{1}{2}\int_{t_0}^{t_f}\left[\boldsymbol{x}^T(t)\boldsymbol{Q}_1\boldsymbol{x}(t)+\boldsymbol{u}^T(t)\boldsymbol{Q}_2\boldsymbol{u}(t)\right]\textrm{d}t\\ J=\frac{1}{2}\boldsymbol{x}^T(N)\boldsymbol{Q}_0(N)\boldsymbol{x}(N)+\frac{1}{2}\sum_{k=k_0}^{N-1}\left[\boldsymbol{x}^T(k)\boldsymbol{Q}_1(k)\boldsymbol{x}(k)+\boldsymbol{u}^T(k)\boldsymbol{Q}_2(k)\boldsymbol{u}(k)\right]$$
Here $\boldsymbol{Q}_0,\boldsymbol{Q}_1,\boldsymbol{Q}_2$ and $\boldsymbol{Q}_0(N),\boldsymbol{Q}_1(k),\boldsymbol{Q}_2(k)$ are all called weighting matrices.
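As a concrete illustration of the discrete-time quadratic index, the following minimal sketch evaluates $J$ for a given state and control sequence with $k_0=0$. The function name `quadratic_cost`, the constant weighting matrices and the sample trajectory are assumptions made for this example and do not come from the original text.

```python
import numpy as np

def quadratic_cost(xs, us, Q0, Q1, Q2):
    """Discrete quadratic index with constant weights (an assumption of this sketch).

    xs has shape (N+1, n) and holds x(0), ..., x(N);
    us has shape (N, r) and holds u(0), ..., u(N-1).
    """
    terminal = 0.5 * xs[-1] @ Q0 @ xs[-1]
    running = 0.5 * sum(x @ Q1 @ x + u @ Q2 @ u for x, u in zip(xs[:-1], us))
    return terminal + running

# Hypothetical two-step trajectory that ends at the origin
xs = np.array([[1.0, 0.0], [0.5, -0.5], [0.0, 0.0]])
us = np.array([[-1.0], [1.0]])
J = quadratic_cost(xs, us, Q0=np.eye(2), Q1=np.eye(2), Q2=2.0 * np.eye(1))
print(J)
```

Larger entries in $\boldsymbol{Q}_2$ penalize control effort more heavily, while larger entries in $\boldsymbol{Q}_0$ and $\boldsymbol{Q}_1$ penalize state deviation.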
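2. Solutions to static optimization problems
The objective of a static optimization problem is an ordinary function of several variables, so the problem can be solved directly with multivariable differential calculus.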
2.1 Extrema of a function of one variable
If a real function of one variable $J=f(u)$ is continuously differentiable on the closed interval $[a,b]$, a necessary condition for it to have an extremum at an interior point $u^*$ is:
$$f'(u)\bigg|_{u=u^*}=0$$
If $f$ is also twice differentiable, a sufficient condition for $u^*$ to be a local minimum is:
$$f'(u^*)=0,\quad f''(u^*)>0$$
and a sufficient condition for $u^*$ to be a local maximum is:
$$f'(u^*)=0,\quad f''(u^*)<0$$
Points where $f'(u)=0$ are called stationary points, and the extrema found in this way are only local. Comparing the values at all local minima in the interval, together with the values at the endpoints $a$ and $b$, gives the smallest value, which is the global minimum:
$$J^*=f(u^*)=\min_{u\in U}f(u)$$
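The procedure above can be sketched for a polynomial objective: find the interior stationary points, classify them with the second derivative, then compare them with the endpoints. The helper name `minimize_poly_on_interval` and the test function $f(u)=u^3-3u$ on $[-3,3]$ are assumptions chosen for illustration.

```python
import numpy as np

def minimize_poly_on_interval(coeffs, a, b):
    """Global minimum of a polynomial on [a, b]; coeffs are highest degree first."""
    f = np.poly1d(coeffs)
    df = f.deriv()
    d2f = df.deriv()
    # Real roots of f'(u) = 0 that lie strictly inside (a, b)
    roots = [r.real for r in df.roots if abs(r.imag) < 1e-12 and a < r.real < b]
    # Second-derivative test: only a sufficient condition, hence "undetermined"
    stationary = [
        (u, "local min" if d2f(u) > 0 else "local max" if d2f(u) < 0 else "undetermined")
        for u in roots
    ]
    # The endpoints are candidates too: the global minimum may sit on the boundary
    candidates = roots + [a, b]
    u_star = min(candidates, key=f)
    return u_star, f(u_star), stationary

u_star, J_star, stationary = minimize_poly_on_interval([1, 0, -3, 0], -3.0, 3.0)
print(u_star, J_star, stationary)
```

In this example the only interior local minimum is at $u=1$, yet the global minimum on the interval is attained at the endpoint $u=-3$, which is why the endpoints have to be included in the comparison.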
2.2 Extrema of a function of several variables
Consider a function of $n$ variables $f(\boldsymbol{u})$, where $\boldsymbol{u}=\begin{pmatrix} u_1,&u_2,&\cdots,&u_n \end{pmatrix}^T$ is an $n$-dimensional column vector. A necessary condition for this function to attain an extremum is:
$$\frac{\partial f}{\partial \boldsymbol{u}}=\boldsymbol{0}\quad\textrm{or}\quad\nabla_{\boldsymbol{u}}f=\boldsymbol{0}$$
For the stationary point to be a minimum, it is sufficient in addition that the Hessian matrix be positive definite there:
$$\frac{\partial^2f}{\partial \boldsymbol{u}^2}=\begin{pmatrix} \displaystyle\frac{\partial^2f}{\partial u_1^2} & \displaystyle\frac{\partial^2f}{\partial u_1\partial u_2} & \cdots & \displaystyle\frac{\partial^2f}{\partial u_1\partial u_n}\\ \displaystyle\frac{\partial^2f}{\partial u_2\partial u_1} & \displaystyle\frac{\partial^2f}{\partial u_2^2} & \cdots & \displaystyle\frac{\partial^2f}{\partial u_2\partial u_n}\\ \vdots & \vdots & & \vdots\\ \displaystyle\frac{\partial^2f}{\partial u_n\partial u_1} & \displaystyle\frac{\partial^2f}{\partial u_n\partial u_2} & \cdots & \displaystyle\frac{\partial^2f}{\partial u_n^2} \end{pmatrix}>0$$
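A minimal numerical sketch of these two conditions, assuming a quadratic test function $f(\boldsymbol{u})=\frac{1}{2}\boldsymbol{u}^T\boldsymbol{A}\boldsymbol{u}-\boldsymbol{b}^T\boldsymbol{u}$ whose gradient is $\boldsymbol{A}\boldsymbol{u}-\boldsymbol{b}$ and whose Hessian is the symmetric matrix $\boldsymbol{A}$. The matrix `A` and vector `b` are hypothetical values.

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])   # Hessian of f (symmetric, hypothetical values)
b = np.array([1.0, 2.0])

def f(u):
    return 0.5 * u @ A @ u - b @ u

def grad(u):
    return A @ u - b

# Necessary condition: the gradient vanishes at the stationary point
u_star = np.linalg.solve(A, b)

# Sufficient condition: all eigenvalues of the symmetric Hessian are positive
hessian_eigs = np.linalg.eigvalsh(A)
is_minimum = bool(np.all(hessian_eigs > 0))
print(u_star, hessian_eigs, is_minimum)
```

For a general nonlinear $f$ the Hessian depends on $\boldsymbol{u}$, so the eigenvalue check would have to be carried out at the stationary point itself.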
2.3 Constrained extrema with equality constraints
The two main approaches are elimination and the method of Lagrange multipliers; the latter is the more general. Consider a continuously differentiable function
$$J=f(\boldsymbol{x},\boldsymbol{u})$$
and the equality constraint
$$\boldsymbol{g}(\boldsymbol{x},\boldsymbol{u})=\boldsymbol{0}$$
where $\boldsymbol{x}$ is an $n$-dimensional column vector, $\boldsymbol{u}$ is an $r$-dimensional column vector, and $\boldsymbol{g}$ is an $n$-dimensional vector function.
Next, a multiplier vector $\boldsymbol{\lambda}$ of the same dimension as $\boldsymbol{g}$ is multiplied with the constraint and added to the objective function, which gives the Lagrangian function:
$$H=J+\boldsymbol{\lambda}^T\boldsymbol{g}=f(\boldsymbol{x},\boldsymbol{u})+\boldsymbol{\lambda}^T\boldsymbol{g}(\boldsymbol{x},\boldsymbol{u})$$
The problem can then be treated like an unconstrained one. The necessary conditions for the objective function to have an extremum are:
$$\begin{cases} \displaystyle\frac{\partial H}{\partial\boldsymbol{x}}=\boldsymbol{0}\\ \displaystyle\frac{\partial H}{\partial\boldsymbol{u}}=\boldsymbol{0}\\ \displaystyle\frac{\partial H}{\partial\boldsymbol{\lambda}}=\boldsymbol{0} \end{cases}\\ \Rightarrow\begin{cases} \displaystyle\frac{\partial f}{\partial\boldsymbol{x}}+\left(\displaystyle\frac{\partial\boldsymbol{g}}{\partial\boldsymbol{x}}\right)^T\boldsymbol{\lambda}=\boldsymbol{0}\\ \displaystyle\frac{\partial f}{\partial\boldsymbol{u}}+\left(\displaystyle\frac{\partial\boldsymbol{g}}{\partial\boldsymbol{u}}\right)^T\boldsymbol{\lambda}=\boldsymbol{0}\\ \boldsymbol{g}(\boldsymbol{x},\boldsymbol{u})=\boldsymbol{0} \end{cases}$$
where the Jacobian matrices are, with $\boldsymbol{u}$ having $r$ components:
$$\frac{\partial\boldsymbol{g}}{\partial\boldsymbol{x}}=\begin{pmatrix} \displaystyle\frac{\partial g_1}{\partial x_1} & \displaystyle\frac{\partial g_1}{\partial x_2} & \cdots & \displaystyle\frac{\partial g_1}{\partial x_n}\\ \displaystyle\frac{\partial g_2}{\partial x_1} & \displaystyle\frac{\partial g_2}{\partial x_2} & \cdots & \displaystyle\frac{\partial g_2}{\partial x_n}\\ \vdots & \vdots & & \vdots\\ \displaystyle\frac{\partial g_n}{\partial x_1} & \displaystyle\frac{\partial g_n}{\partial x_2} & \cdots & \displaystyle\frac{\partial g_n}{\partial x_n} \end{pmatrix}\\\textrm{ }\\ \frac{\partial\boldsymbol{g}}{\partial\boldsymbol{u}}=\begin{pmatrix} \displaystyle\frac{\partial g_1}{\partial u_1} & \displaystyle\frac{\partial g_1}{\partial u_2} & \cdots & \displaystyle\frac{\partial g_1}{\partial u_r}\\ \displaystyle\frac{\partial g_2}{\partial u_1} & \displaystyle\frac{\partial g_2}{\partial u_2} & \cdots & \displaystyle\frac{\partial g_2}{\partial u_r}\\ \vdots & \vdots & & \vdots\\ \displaystyle\frac{\partial g_n}{\partial u_1} & \displaystyle\frac{\partial g_n}{\partial u_2} & \cdots & \displaystyle\frac{\partial g_n}{\partial u_r} \end{pmatrix}$$
[Example] Find the $\boldsymbol{x}^*$ and $\boldsymbol{u}^*$ at which $J=f(\boldsymbol{x},\boldsymbol{u})=\displaystyle\frac{1}{2}\boldsymbol{x}^T\boldsymbol{Q}_1\boldsymbol{x}+\displaystyle\frac{1}{2}\boldsymbol{u}^T\boldsymbol{Q}_2\boldsymbol{u}$ attains an extremum, subject to the constraint $\boldsymbol{g}(\boldsymbol{x},\boldsymbol{u})=\boldsymbol{x}+\boldsymbol{F}\boldsymbol{u}+\boldsymbol{d}=\boldsymbol{0}$, where $\boldsymbol{Q}_1, \boldsymbol{Q}_2$ are both positive definite matrices and $\boldsymbol{F}$ is an arbitrary matrix of compatible dimensions.
[Solution] Construct the Lagrangian function:
$$H=J+\boldsymbol{\lambda}^T\boldsymbol{g}=\displaystyle\frac{1}{2}\boldsymbol{x}^T\boldsymbol{Q}_1\boldsymbol{x}+\displaystyle\frac{1}{2}\boldsymbol{u}^T\boldsymbol{Q}_2\boldsymbol{u}+\boldsymbol{\lambda}^T(\boldsymbol{x}+\boldsymbol{F}\boldsymbol{u}+\boldsymbol{d})$$
The necessary conditions for an extremum give:
$$\begin{cases} \displaystyle\frac{\partial H}{\partial\boldsymbol{x}}=\boldsymbol{Q}_1\boldsymbol{x}+\boldsymbol{\lambda}=\boldsymbol{0}\\ \displaystyle\frac{\partial H}{\partial\boldsymbol{u}}=\boldsymbol{Q}_2\boldsymbol{u}+\boldsymbol{F}^T\boldsymbol{\lambda}=\boldsymbol{0}\\ \displaystyle\frac{\partial H}{\partial\boldsymbol{\lambda}}=\boldsymbol{x}+\boldsymbol{F}\boldsymbol{u}+\boldsymbol{d}=\boldsymbol{0} \end{cases}$$
Because $\boldsymbol{Q}_1,\boldsymbol{Q}_2$ are positive definite, the extremum exists. Solving these equations simultaneously gives:
$$\begin{cases} \boldsymbol{x}^*=-[\boldsymbol{I}-\boldsymbol{F}(\boldsymbol{Q}_2+\boldsymbol{F}^T\boldsymbol{Q}_1\boldsymbol{F})^{-1}\boldsymbol{F}^T\boldsymbol{Q}_1]\boldsymbol{d}\\ \boldsymbol{u}^*=-(\boldsymbol{Q}_2+\boldsymbol{F}^T\boldsymbol{Q}_1\boldsymbol{F})^{-1}\boldsymbol{F}^T\boldsymbol{Q}_1\boldsymbol{d}\\ \boldsymbol{\lambda}^*=\boldsymbol{Q}_1[\boldsymbol{I}-\boldsymbol{F}(\boldsymbol{Q}_2+\boldsymbol{F}^T\boldsymbol{Q}_1\boldsymbol{F})^{-1}\boldsymbol{F}^T\boldsymbol{Q}_1]\boldsymbol{d} \end{cases}$$
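The closed-form result of this example can be cross-checked numerically. The sketch below stacks the three stationarity conditions into one linear system in $(\boldsymbol{x},\boldsymbol{u},\boldsymbol{\lambda})$ and compares its solution with the formulas above. The dimensions and the random test data are assumptions made for the check.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 3, 2
# Hypothetical data: random symmetric positive definite Q1, Q2 and arbitrary F, d
M1, M2 = rng.standard_normal((n, n)), rng.standard_normal((r, r))
Q1 = M1 @ M1.T + n * np.eye(n)
Q2 = M2 @ M2.T + r * np.eye(r)
F = rng.standard_normal((n, r))
d = rng.standard_normal(n)

# Closed-form solution from the example
S = np.linalg.inv(Q2 + F.T @ Q1 @ F)
u_star = -S @ F.T @ Q1 @ d
x_star = -(np.eye(n) - F @ S @ F.T @ Q1) @ d
lam_star = Q1 @ (np.eye(n) - F @ S @ F.T @ Q1) @ d

# The same conditions written as one linear system K [x; u; lambda] = rhs
K = np.block([
    [Q1,               np.zeros((n, r)), np.eye(n)],
    [np.zeros((r, n)), Q2,               F.T],
    [np.eye(n),        F,                np.zeros((n, n))],
])
rhs = np.concatenate([np.zeros(n), np.zeros(r), -d])
sol = np.linalg.solve(K, rhs)
print(np.allclose(sol, np.concatenate([x_star, u_star, lam_star])))
```

Solving the stacked system directly is often the more convenient route in practice, while the closed form makes the structure of the solution visible.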
3. Optimal control of discrete-time systems
3.1 Basic form
Consider the following discrete-time system:
$$\begin{cases} \boldsymbol{x}(k+1)=\boldsymbol{f}[\boldsymbol{x}(k),\boldsymbol{u}(k),k],(k=0,1,\cdots,N-1)\\ \boldsymbol{x}(0)=\boldsymbol{x}_0 \end{cases}$$
The optimal control problem is to determine the vector sequence $\{\boldsymbol{u}(0),\boldsymbol{u}(1),\cdots,\boldsymbol{u}(N-1)\}$ that minimizes the following function:
$$J=\varPhi[\boldsymbol{x}(N)]+\sum_{k=0}^{N-1}L[\boldsymbol{x}(k),\boldsymbol{u}(k),k]$$
For now, assume that $\boldsymbol{x}(N)$ is a free terminal state. The problem is then essentially the same as the equality-constrained extremum problem of Section 2.3, with the constraint equations:
$$\boldsymbol{f}[\boldsymbol{x}(k),\boldsymbol{u}(k),k]-\boldsymbol{x}(k+1)=\boldsymbol{0},(k=0,1,\cdots,N-1)$$
This optimization problem has $N(n+r)$ variables to be optimized in total:
$$\{x_1(k),x_2(k),\cdots,x_n(k)\},(k=1,2,\cdots,N)\\ \{u_1(k),u_2(k),\cdots,u_r(k)\},(k=0,1,\cdots,N-1)$$
The number of Lagrange multipliers has to grow accordingly. Here we define $Nn$ of them:
$$\{\lambda_1(k),\lambda_2(k),\cdots,\lambda_n(k)\},(k=1,2,\cdots,N)$$
A new Lagrangian function is then constructed. Writing $L_k[\cdot]=L[\cdot,k]$ and $\boldsymbol{f}_k[\cdot]=\boldsymbol{f}[\cdot,k]$ for short, and defining $H_k=L_k[\boldsymbol{x}(k),\boldsymbol{u}(k)]+\boldsymbol{\lambda}^T(k+1)\boldsymbol{f}_k[\boldsymbol{x}(k),\boldsymbol{u}(k)]$, it reads:
$$\begin{aligned} V&=\varPhi[\boldsymbol{x}(N)]+\sum_{k=0}^{N-1}\left\{L[\boldsymbol{x}(k),\boldsymbol{u}(k),k]+\boldsymbol{\lambda}^T(k+1)[\boldsymbol{f}[\boldsymbol{x}(k),\boldsymbol{u}(k),k]-\boldsymbol{x}(k+1)]\right\}\\ &\equiv\varPhi[\boldsymbol{x}(N)]+\sum_{k=0}^{N-1}\left[L_k[\boldsymbol{x}(k),\boldsymbol{u}(k)]+\boldsymbol{\lambda}^T(k+1)\boldsymbol{f}_k[\boldsymbol{x}(k),\boldsymbol{u}(k)]-\boldsymbol{\lambda}^T(k+1)\boldsymbol{x}(k+1)\right]\\ &\equiv\varPhi[\boldsymbol{x}(N)]+\sum_{k=0}^{N-1}\left[H_k-\boldsymbol{\lambda}^T(k+1)\boldsymbol{x}(k+1)\right]\\ &=\varPhi[\boldsymbol{x}(N)]+H_0-\boldsymbol{\lambda}^T(N)\boldsymbol{x}(N)+\sum_{k=1}^{N-1}\left[H_k-\boldsymbol{\lambda}^T(k)\boldsymbol{x}(k)\right] \end{aligned}$$
Setting the partial derivatives of $V$ to zero gives the extremum conditions:
$$\begin{cases} \displaystyle\frac{\partial V}{\partial\boldsymbol{x}(k)}=\frac{\partial H_k}{\partial\boldsymbol{x}(k)}-\boldsymbol{\lambda}(k)=\boldsymbol{0},(k=1,2,\cdots,N-1)\\ \displaystyle\frac{\partial V}{\partial\boldsymbol{x}(N)}=\frac{\partial\varPhi[\boldsymbol{x}(N)]}{\partial\boldsymbol{x}(N)}-\boldsymbol{\lambda}(N)=\boldsymbol{0}\\ \displaystyle\frac{\partial V}{\partial\boldsymbol{u}(k)}=\frac{\partial H_k}{\partial\boldsymbol{u}(k)}=\boldsymbol{0},(k=0,1,\cdots,N-1)\\ \displaystyle\frac{\partial V}{\partial\boldsymbol{\lambda}(k)}=\boldsymbol{f}_{k-1}[\boldsymbol{x}(k-1),\boldsymbol{u}(k-1)]-\boldsymbol{x}(k)=\boldsymbol{0},(k=1,2,\cdots,N) \end{cases}$$
Together with the initial condition, these can be rearranged as:
$$\begin{cases} \boldsymbol{x}(0)=\boldsymbol{x}_0\\ \displaystyle\frac{\partial H_k}{\partial\boldsymbol{x}(k)}=\frac{\partial L_k[\boldsymbol{x}(k),\boldsymbol{u}(k)]}{\partial\boldsymbol{x}(k)}+\left[\frac{\partial\boldsymbol{f}_k[\boldsymbol{x}(k),\boldsymbol{u}(k)]}{\partial\boldsymbol{x}(k)}\right]^T\boldsymbol{\lambda}(k+1)=\boldsymbol{\lambda}(k),(k=1,2,\cdots,N-1)\\ \displaystyle\frac{\partial H_k}{\partial\boldsymbol{u}(k)}=\frac{\partial L_k[\boldsymbol{x}(k),\boldsymbol{u}(k)]}{\partial\boldsymbol{u}(k)}+\left[\frac{\partial\boldsymbol{f}_k[\boldsymbol{x}(k),\boldsymbol{u}(k)]}{\partial\boldsymbol{u}(k)}\right]^T\boldsymbol{\lambda}(k+1)=\boldsymbol{0},(k=0,1,\cdots,N-1)\\ \displaystyle\frac{\partial H_k}{\partial\boldsymbol{\lambda}(k+1)}=\boldsymbol{f}_k[\boldsymbol{x}(k),\boldsymbol{u}(k)]=\boldsymbol{x}(k+1),(k=0,1,\cdots,N-1)\\ \displaystyle\frac{\partial\varPhi[\boldsymbol{x}(N)]}{\partial\boldsymbol{x}(N)}=\boldsymbol{\lambda}(N) \end{cases}$$
The five equations of the system above contribute $n,(N-1)n,Nr,Nn,n$ scalar equations respectively, for a total of $(2n+r)N+n$. The unknowns $\boldsymbol{x}(k)$ and $\boldsymbol{\lambda}(k)$ for $k=1,\cdots,N$ and $\boldsymbol{u}(k)$ for $k=0,\cdots,N-1$ number:
$$N\dim(\boldsymbol{x})+N\dim(\boldsymbol{\lambda})+N\dim(\boldsymbol{u})=(2n+r)N$$
If the $n$ components of $\boldsymbol{x}(0)$, which the first equation fixes, are counted as unknowns as well, the number of unknowns equals the number of equations.
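In the problem above, the state $\boldsymbol{x}(0)$ is prescribed at the initial time while the costate $\boldsymbol{\lambda}(N)$ is prescribed at the terminal time $N$. A problem whose boundary conditions are split between two points in this way is called a two-point boundary value problem.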
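As a small worked illustration of these conditions (the scalar system and cost below are assumptions chosen for simplicity), take $N=1$ with

$$x(1)=x(0)+u(0),\quad x(0)=x_0,\quad J=\frac{1}{2}x^2(1)+\frac{1}{2}u^2(0)$$

so that $\varPhi=\frac{1}{2}x^2(1)$, $L_0=\frac{1}{2}u^2(0)$ and $H_0=\frac{1}{2}u^2(0)+\lambda(1)[x(0)+u(0)]$. Since $N-1=0$, there is no costate recursion to solve, and the remaining conditions are:

$$\begin{cases} \displaystyle\frac{\partial H_0}{\partial u(0)}=u(0)+\lambda(1)=0\\ x(1)=x_0+u(0)\\ \lambda(1)=\displaystyle\frac{\partial\varPhi}{\partial x(1)}=x(1) \end{cases}$$

Eliminating $\lambda(1)$ and $x(1)$ gives $u(0)=-[x_0+u(0)]$, hence

$$u^*(0)=-\frac{x_0}{2},\quad x^*(1)=\frac{x_0}{2},\quad \lambda^*(1)=\frac{x_0}{2},\quad J^*=\frac{x_0^2}{4}$$

The same result follows from minimizing $\frac{1}{2}[x_0+u(0)]^2+\frac{1}{2}u^2(0)$ directly over $u(0)$, which confirms the conditions in this simple case.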
3.2 Linear time-invariant systems with a quadratic performance index
Consider the discrete-time linear time-invariant system:
$$\begin{cases} \boldsymbol{x}(k+1)=\boldsymbol{G}\boldsymbol{x}(k)+\boldsymbol{H}\boldsymbol{u}(k),(k=0,1,\cdots,N-1)\\ \boldsymbol{x}(0)=\boldsymbol{x}_0 \end{cases}$$
The performance index to be minimized is (the sum starts at $k=0$, consistent with the general form in Section 3.1 and with the conditions on $\boldsymbol{u}(0)$ used below):
$$J=\sum_{k=0}^{N-1}\frac{1}{2}\left[\boldsymbol{x}^T(k)\boldsymbol{Q}_1(k)\boldsymbol{x}(k)+\boldsymbol{u}^T(k)\boldsymbol{Q}_2(k)\boldsymbol{u}(k)\right]+\frac{1}{2}\boldsymbol{x}^T(N)\boldsymbol{Q}_0(N)\boldsymbol{x}(N)$$
where $\boldsymbol{Q}_0(N),\boldsymbol{Q}_1(k),\boldsymbol{Q}_2(k)$ are all positive definite matrices and the state transition matrix $\boldsymbol{G}$ is invertible.
Following the derivation of the previous subsection, we can write:
$$H_k=\frac{1}{2}[\boldsymbol{x}^T(k)\boldsymbol{Q}_1(k)\boldsymbol{x}(k)+\boldsymbol{u}^T(k)\boldsymbol{Q}_2(k)\boldsymbol{u}(k)]+\boldsymbol{\lambda}^T(k+1)[\boldsymbol{Gx}(k)+\boldsymbol{Hu}(k)]\\ \varPhi[\boldsymbol{x}(N)]=\frac{1}{2}\boldsymbol{x}^T(N)\boldsymbol{Q}_0(N)\boldsymbol{x}(N)$$
Note that the scalar $H_k$ is the Hamiltonian-type function of Section 3.1, whereas the bold $\boldsymbol{H}$ is the input matrix of the system.
The constrained extremum conditions then follow from the earlier equations:
$$\begin{cases} \boldsymbol{x}(0)=\boldsymbol{x}_0\\ \boldsymbol{Q}_1(k)\boldsymbol{x}(k)+\boldsymbol{G}^T\boldsymbol{\lambda}(k+1)=\boldsymbol{\lambda}(k)\\ \boldsymbol{Q}_2(k)\boldsymbol{u}(k)+\boldsymbol{H}^T\boldsymbol{\lambda}(k+1)=\boldsymbol{0}\\ \boldsymbol{Gx}(k)+\boldsymbol{Hu}(k)-\boldsymbol{x}(k+1)=\boldsymbol{0}\\ \boldsymbol{Q}_0(N)\boldsymbol{x}(N)=\boldsymbol{\lambda}(N) \end{cases}\\ k=0,1,\cdots,N-1$$
From the third equation:
$$\boldsymbol{u}(k)=-\boldsymbol{Q}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{\lambda}(k+1)$$
Substituting this into the fourth equation:
$$\boldsymbol{x}(k+1)=\boldsymbol{G}\boldsymbol{x}(k)-\boldsymbol{HQ}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{\lambda}(k+1)$$
From the last equation, the matrix $\boldsymbol{P}(N)=\boldsymbol{Q}_0(N)$ satisfies $\boldsymbol{\lambda}(N)=\boldsymbol{P}(N)\boldsymbol{x}(N)$.
Suppose that for $k=n+1$ (here $n$ denotes a time index, not the state dimension) there is a matrix $\boldsymbol{P}(n+1)$ such that $\boldsymbol{\lambda}(n+1)=\boldsymbol{P}(n+1)\boldsymbol{x}(n+1)$ holds. We now show that the statement also holds for $k=n$:
$$\begin{aligned} \boldsymbol{x}(n+1)&=\boldsymbol{G}\boldsymbol{x}(n)-\boldsymbol{HQ}_2^{-1}(n)\boldsymbol{H}^T\boldsymbol{\lambda}(n+1)\\ &=\boldsymbol{G}\boldsymbol{x}(n)-\boldsymbol{HQ}_2^{-1}(n)\boldsymbol{H}^T\boldsymbol{P}(n+1)\boldsymbol{x}(n+1) \end{aligned}\\ \Rightarrow\boldsymbol{x}(n+1)=\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(n)\boldsymbol{H}^T\boldsymbol{P}(n+1)\right]^{-1}\boldsymbol{G}\boldsymbol{x}(n)\\ \Rightarrow\boldsymbol{\lambda}(n+1)=\boldsymbol{P}(n+1)\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(n)\boldsymbol{H}^T\boldsymbol{P}(n+1)\right]^{-1}\boldsymbol{G}\boldsymbol{x}(n)$$
Substituting this into the second equation above gives:
$$\boldsymbol{Q}_1(n)\boldsymbol{x}(n)+\boldsymbol{G}^T\boldsymbol{P}(n+1)\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(n)\boldsymbol{H}^T\boldsymbol{P}(n+1)\right]^{-1}\boldsymbol{G}\boldsymbol{x}(n)=\boldsymbol{\lambda}(n)\\ \Rightarrow\boldsymbol{\lambda}(n)=\left\{\boldsymbol{Q}_1(n)+\boldsymbol{G}^T\boldsymbol{P}(n+1)\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(n)\boldsymbol{H}^T\boldsymbol{P}(n+1)\right]^{-1}\boldsymbol{G}\right\}\boldsymbol{x}(n)$$
That is, the matrix $\boldsymbol{P}(n)=\boldsymbol{Q}_1(n)+\boldsymbol{G}^T\boldsymbol{P}(n+1)\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(n)\boldsymbol{H}^T\boldsymbol{P}(n+1)\right]^{-1}\boldsymbol{G}$ makes the relation $\boldsymbol{\lambda}(k)=\boldsymbol{P}(k)\boldsymbol{x}(k)$ hold for $k=n$.
By backward induction from $k=N$, it follows that $\boldsymbol{\lambda}(k)=\boldsymbol{P}(k)\boldsymbol{x}(k)$ for every $k=1,2,\cdots,N$. Moreover, starting from $\boldsymbol{P}(N)=\boldsymbol{Q}_0(N)$, the recursion above can be iterated backwards to obtain $\boldsymbol{P}(N-1),\boldsymbol{P}(N-2),\cdots,\boldsymbol{P}(1)$ in turn.
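The backward recursion for $\boldsymbol{P}(k)$ can be sketched as follows. The function name `riccati_backward`, the constant weighting matrices and the sample system (a sampled double integrator) are assumptions made for illustration. For convenience the loop also evaluates the recursion at $k=0$.

```python
import numpy as np

def riccati_backward(G, H, Q0, Q1, Q2, N):
    """Return [P(0), ..., P(N)] with P(N) = Q0 and the recursion derived above.

    Constant Q1 and Q2 are assumed here; time-varying weights would be indexed by k.
    """
    n = G.shape[0]
    P = [None] * (N + 1)
    P[N] = Q0
    Q2_inv = np.linalg.inv(Q2)
    for k in range(N - 1, -1, -1):
        M = np.eye(n) + H @ Q2_inv @ H.T @ P[k + 1]
        # P(k) = Q1 + G^T P(k+1) [I + H Q2^{-1} H^T P(k+1)]^{-1} G
        P[k] = Q1 + G.T @ P[k + 1] @ np.linalg.solve(M, G)
    return P

# Hypothetical system and weights
G = np.array([[1.0, 0.1], [0.0, 1.0]])
H = np.array([[0.005], [0.1]])
Q0, Q1, Q2 = np.eye(2), np.eye(2), np.array([[0.1]])
N = 20
P_seq = riccati_backward(G, H, Q0, Q1, Q2, N)
print(P_seq[0])
```

By the matrix inversion lemma, this recursion agrees with the more familiar Riccati difference equation $\boldsymbol{P}(k)=\boldsymbol{Q}_1+\boldsymbol{G}^T\boldsymbol{P}\boldsymbol{G}-\boldsymbol{G}^T\boldsymbol{P}\boldsymbol{H}(\boldsymbol{Q}_2+\boldsymbol{H}^T\boldsymbol{P}\boldsymbol{H})^{-1}\boldsymbol{H}^T\boldsymbol{P}\boldsymbol{G}$ with $\boldsymbol{P}=\boldsymbol{P}(k+1)$, a form the text itself does not state.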
Finally, combining this result with the earlier equations gives the optimal control law:
$$\boldsymbol{Q}_2(k)\boldsymbol{u}(k)=-\boldsymbol{H}^T\boldsymbol{\lambda}(k+1)=-\boldsymbol{H}^T\boldsymbol{P}(k+1)\boldsymbol{x}(k+1)\\ \Rightarrow\boldsymbol{u}(k)=-\boldsymbol{Q}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\boldsymbol{x}(k+1)\\ \Rightarrow\boldsymbol{Gx}(k)-\boldsymbol{HQ}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\boldsymbol{x}(k+1)-\boldsymbol{x}(k+1)=\boldsymbol{0}\\ \Rightarrow\boldsymbol{x}(k+1)=\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\right]^{-1}\boldsymbol{Gx}(k)\\ \Rightarrow\boldsymbol{u}(k)=-\boldsymbol{Q}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\right]^{-1}\boldsymbol{Gx}(k)$$
Rearranging then gives the commonly used form of the optimal control law:
$$\begin{aligned} \boldsymbol{u}(k)&=-\boldsymbol{Q}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\right]^{-1}\boldsymbol{Gx}(k)\\ &=-\left[\boldsymbol{Q}_2(k)+\boldsymbol{H}^T\boldsymbol{P}(k+1)\boldsymbol{H}\right]^{-1}\left[\boldsymbol{Q}_2(k)+\boldsymbol{H}^T\boldsymbol{P}(k+1)\boldsymbol{H}\right]\boldsymbol{Q}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\cdot\\ &\qquad\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\right]^{-1}\boldsymbol{Gx}(k)\\ &=-\left[\boldsymbol{Q}_2(k)+\boldsymbol{H}^T\boldsymbol{P}(k+1)\boldsymbol{H}\right]^{-1}\left[\boldsymbol{H}^T\boldsymbol{P}(k+1)+\boldsymbol{H}^T\boldsymbol{P}(k+1)\boldsymbol{H}\boldsymbol{Q}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\right]\cdot\\ &\qquad\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\right]^{-1}\boldsymbol{Gx}(k)\\ &=-\left[\boldsymbol{Q}_2(k)+\boldsymbol{H}^T\boldsymbol{P}(k+1)\boldsymbol{H}\right]^{-1}\boldsymbol{H}^T\boldsymbol{P}(k+1)\left[\boldsymbol{I}+\boldsymbol{H}\boldsymbol{Q}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\right]\cdot\\ &\qquad\left[\boldsymbol{I}+\boldsymbol{HQ}_2^{-1}(k)\boldsymbol{H}^T\boldsymbol{P}(k+1)\right]^{-1}\boldsymbol{Gx}(k)\\ &=-\left[\boldsymbol{Q}_2(k)+\boldsymbol{H}^T\boldsymbol{P}(k+1)\boldsymbol{H}\right]^{-1}\boldsymbol{H}^T\boldsymbol{P}(k+1)\boldsymbol{Gx}(k) \end{aligned}$$
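The following sketch puts the control law to work on a hypothetical sampled double integrator with constant weights (all numerical values are assumptions). It computes the feedback gains backwards in time, checks that the first and the final form of the gain in the derivation coincide, and simulates the closed loop. As a sanity check it also compares the simulated cost with $\frac{1}{2}\boldsymbol{x}_0^T\boldsymbol{P}(0)\boldsymbol{x}_0$, a standard identity for this problem that the text does not derive.

```python
import numpy as np

# Hypothetical plant and weights (illustrative values only)
G = np.array([[1.0, 0.1], [0.0, 1.0]])
H = np.array([[0.005], [0.1]])
Q0, Q1, Q2 = np.eye(2), np.eye(2), np.array([[0.1]])
N = 30

# Backward pass: start from P(N) = Q0 and store the gain of every step
P = Q0
gains = [None] * N
Q2_inv = np.linalg.inv(Q2)
for k in range(N - 1, -1, -1):
    M = np.eye(2) + H @ Q2_inv @ H.T @ P
    K_long = Q2_inv @ H.T @ P @ np.linalg.solve(M, G)          # first line of the derivation
    K_short = np.linalg.solve(Q2 + H.T @ P @ H, H.T @ P @ G)   # final simplified form
    assert np.allclose(K_long, K_short)
    gains[k] = K_short                                          # u(k) = -K(k) x(k)
    P = Q1 + G.T @ P @ np.linalg.solve(M, G)                    # P(k) from P(k+1)
P0 = P

def rollout_cost(x0, offset=0.0):
    """Simulate u(k) = -K(k) x(k) + offset and return the quadratic index J."""
    x, J = np.array(x0, dtype=float), 0.0
    for k in range(N):
        u = -gains[k] @ x + offset
        J += 0.5 * (x @ Q1 @ x + u @ Q2 @ u)
        x = G @ x + H @ u
    return J + 0.5 * x @ Q0 @ x

x0 = np.array([1.0, 0.0])
J_opt = rollout_cost(x0)
J_perturbed = rollout_cost(x0, offset=0.05)
print(J_opt, J_perturbed)
```

The simplified gain needs only one inverse of an $r\times r$ matrix per step, which is why it is the form usually implemented.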
4. Discretization of continuous-time optimal control problems
Let the state equation of the continuous-time system be:
$$\begin{cases} \dot{\boldsymbol{x}}(t)=\boldsymbol{f}\left[\boldsymbol{x}(t),\boldsymbol{u}(t),t\right]\\ \boldsymbol{x}(t_0)=\boldsymbol{x}_0 \end{cases}$$
The objective functional is:
$$J=\int_{t_0}^{t_f}L\left[\boldsymbol{x}(t),\boldsymbol{u}(t),t\right]\textrm{d}t+\varPhi\left[\boldsymbol{x}(t_f)\right]$$
Here $\boldsymbol{x}(t_f)$ is assumed to be a free terminal state, and $\varPhi[\boldsymbol{x}(t_f)]$ is the terminal cost function.
Discretizing the system directly with sampling period $\Delta t$ (taking $t_0=0$, so that $N=t_f/\Delta t$) turns the problem into a static optimization problem over finitely many variables:
$$\begin{cases} \boldsymbol{f}\left[\boldsymbol{x}(k\Delta t),\boldsymbol{u}(k\Delta t),k\Delta t\right]\Delta t-\left[\boldsymbol{x}(k\Delta t+\Delta t)-\boldsymbol{x}(k\Delta t)\right]=\boldsymbol{0},(k=0,1,\cdots,\frac{t_f}{\Delta t}-1)\\ J=\displaystyle\sum_{k=0}^{\frac{t_f}{\Delta t}-1}L\left[\boldsymbol{x}(k\Delta t),\boldsymbol{u}(k\Delta t),k\Delta t\right]\Delta t+\varPhi\left[\boldsymbol{x}(t_f)\right] \end{cases}\\ \Rightarrow\begin{cases} \boldsymbol{f}\left[\boldsymbol{x}(k),\boldsymbol{u}(k),k\right]\Delta t-\left[\boldsymbol{x}(k+1)-\boldsymbol{x}(k)\right]=\boldsymbol{0},(k=0,1,\cdots,N-1)\\ J=\displaystyle\sum_{k=0}^{N-1}L\left[\boldsymbol{x}(k),\boldsymbol{u}(k),k\right]\Delta t+\varPhi\left[\boldsymbol{x}(N)\right] \end{cases}$$
Once this problem has been solved, taking the limit $\Delta t\to0$ yields the optimal solution of the continuous-time system.
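The discretized state equation is the forward-Euler rule $\boldsymbol{x}(k+1)=\boldsymbol{x}(k)+\boldsymbol{f}[\boldsymbol{x}(k),\boldsymbol{u}(k),k]\Delta t$. The sketch below applies it to a hypothetical scalar system $\dot{x}=-x+u$ with $u=0$, whose exact solution is $x(t)=e^{-t}$, to show the discretization error shrinking as $\Delta t\to0$. The helper name `euler_rollout` and the test system are assumptions.

```python
import numpy as np

def euler_rollout(f, x0, us, dt, t0=0.0):
    """Forward-Euler states x(0), ..., x(N) for x' = f(x, u, t), one control per step."""
    xs = [np.asarray(x0, dtype=float)]
    for k, u in enumerate(us):
        xs.append(xs[-1] + dt * f(xs[-1], u, t0 + k * dt))
    return np.array(xs)

def f(x, u, t):
    # Hypothetical scalar test system x' = -x + u
    return -x + u

errors = []
for N in (10, 100, 1000):
    dt = 1.0 / N                                   # t_f = 1, t_0 = 0
    xs = euler_rollout(f, [1.0], np.zeros((N, 1)), dt)
    errors.append(abs(xs[-1, 0] - np.exp(-1.0)))   # compare with the exact x(1)
print(errors)
```

The error decreases roughly in proportion to $\Delta t$, as expected for a first-order scheme.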
As before, we adopt analogous definitions:
$$\begin{aligned} H_k&=L_k\left[\boldsymbol{x}(k),\boldsymbol{u}(k)\right]+\boldsymbol{\lambda}^T(k+1)\boldsymbol{f}_k\left[\boldsymbol{x}(k),\boldsymbol{u}(k)\right]\\ V&=\varPhi[\boldsymbol{x}(N)]+\sum_{k=0}^{N-1}\left\{L_k\left[\boldsymbol{x}(k),\boldsymbol{u}(k)\right]\Delta t+\boldsymbol{\lambda}^T(k+1)\left[\boldsymbol{f}_k[\boldsymbol{x}(k),\boldsymbol{u}(k)]\Delta t-\boldsymbol{x}(k+1)+\boldsymbol{x}(k)\right]\right\}\\ &=\varPhi[\boldsymbol{x}(N)]+\sum_{k=0}^{N-1}\left\{H_k\Delta t-\boldsymbol{\lambda}^T(k+1)\left[\boldsymbol{x}(k+1)-\boldsymbol{x}(k)\right]\right\}\\ &=\varPhi[\boldsymbol{x}(N)]+\boldsymbol{\lambda}^T(1)\boldsymbol{x}(0)+H_0\Delta t-\boldsymbol{\lambda}^T(N)\boldsymbol{x}(N)+\sum_{k=1}^{N-1}\left[H_k\Delta t+(\boldsymbol{\lambda}^T(k+1)-\boldsymbol{\lambda}^T(k))\boldsymbol{x}(k)\right] \end{aligned}$$
The extremum conditions are then:
$$\begin{cases} \displaystyle\frac{\partial V}{\partial\boldsymbol{x}(k)}=\frac{\partial H_k}{\partial\boldsymbol{x}(k)}\Delta t+\left[\boldsymbol{\lambda}(k+1)-\boldsymbol{\lambda}(k)\right]=\boldsymbol{0},(k=1,2,\cdots,N-1)\\ \displaystyle\frac{\partial V}{\partial\boldsymbol{x}(N)}=\frac{\partial\varPhi[\boldsymbol{x}(N)]}{\partial\boldsymbol{x}(N)}-\boldsymbol{\lambda}(N)=\boldsymbol{0}\\ \displaystyle\frac{\partial V}{\partial\boldsymbol{u}(k)}=\frac{\partial H_k}{\partial\boldsymbol{u}(k)}\Delta t=\boldsymbol{0},(k=0,1,\cdots,N-1)\\ \displaystyle\frac{\partial V}{\partial\boldsymbol{\lambda}(k)}=\boldsymbol{f}_{k-1}[\boldsymbol{x}(k-1),\boldsymbol{u}(k-1)]\Delta t-\boldsymbol{x}(k)+\boldsymbol{x}(k-1)=\boldsymbol{0},(k=1,2,\cdots,N) \end{cases}$$
Dividing the first and fourth conditions by $\Delta t$ turns the differences into difference quotients. Taking the limit $\Delta t\to0$ then gives the necessary conditions for the optimal control problem:
$$\begin{cases} \displaystyle\frac{\partial H(t)}{\partial\boldsymbol{x}(t)}=-\dot{\boldsymbol{\lambda}}(t)\\ \displaystyle\frac{\partial\varPhi[\boldsymbol{x}(t_f)]}{\partial\boldsymbol{x}(t_f)}=\boldsymbol{\lambda}(t_f)\\ \displaystyle\frac{\partial H(t)}{\partial\boldsymbol{u}(t)}=\boldsymbol{0}\\ \dot{\boldsymbol{x}}(t)=\boldsymbol{f}\left[\boldsymbol{x}(t),\boldsymbol{u}(t),t\right] \end{cases}$$
where $H(t)=L\left[\boldsymbol{x}(t),\boldsymbol{u}(t),t\right]+\boldsymbol{\lambda}^T(t)\boldsymbol{f}\left[\boldsymbol{x}(t),\boldsymbol{u}(t),t\right]$.
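To see these conditions at work, consider a scalar example (the system, the cost and the weight $s>0$ are assumptions chosen for illustration):

$$\dot{x}(t)=u(t),\quad x(0)=x_0,\quad J=\frac{1}{2}s\,x^2(t_f)+\frac{1}{2}\int_0^{t_f}u^2(t)\,\textrm{d}t$$

Here $H(t)=\frac{1}{2}u^2(t)+\lambda(t)u(t)$, and the necessary conditions become:

$$\begin{cases} \dot{\lambda}(t)=-\displaystyle\frac{\partial H}{\partial x}=0\\ \lambda(t_f)=s\,x(t_f)\\ \displaystyle\frac{\partial H}{\partial u}=u(t)+\lambda(t)=0\\ \dot{x}(t)=u(t) \end{cases}$$

The costate is therefore constant, $\lambda(t)=s\,x(t_f)$, so the control is constant as well, $u(t)=-s\,x(t_f)$. Integrating the state equation gives $x(t_f)=x_0-s\,x(t_f)\,t_f$, hence:

$$x^*(t_f)=\frac{x_0}{1+s\,t_f},\quad u^*(t)=-\frac{s\,x_0}{1+s\,t_f},\quad J^*=\frac{s\,x_0^2}{2(1+s\,t_f)}$$

As $s\to\infty$ the terminal state is driven to zero with $u^*=-x_0/t_f$, which matches the intuition that a heavy terminal penalty behaves like a fixed terminal constraint.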