Robust and Stochastic Model Predictive Control
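Feedback versus Open-Loop Control
It is well known that feedback control is essential when uncertainty is present; when there is no uncertainty, feedback and open-loop control can be regarded as equivalent. Indeed, in the absence of uncertainty, the optimal control for a given initial state can be computed either by dynamic programming (DP), which yields an optimal control policy (a sequence of feedback control laws), or by open-loop optimal control, which yields only a sequence of control actions. As an example, consider the deterministic linear system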
$$x^+ = x + u$$
The optimal control problem with horizon 3 is
$$\mathbb{P}_3(x): \quad V_3^0(x) = \min_{\mathbf{u}} V_3(x, \mathbf{u})$$
where $\mathbf{u} = (u(0), u(1), u(2))$ and

$$V_3(x, \mathbf{u}) := \frac{1}{2}\sum_{i=0}^{2}\left[x(i)^2 + u(i)^2\right] + \frac{1}{2}x(3)^2$$

Here, for each $i$, $x(i) = \phi(i; x, \mathbf{u}) = x + u(0) + u(1) + \cdots + u(i-1)$ is the solution of the difference equation $x^+ = x + u$ at time $i$ when the initial state is $x(0) = x$ and the control sequence is $\mathbf{u} = (u(0), u(1), u(2))$. In matrix expressions, $\mathbf{u}$ is treated as a column vector. Hence
$$V_3(x, \mathbf{u}) = 2x^2 + x\begin{bmatrix} 3 & 2 & 1 \end{bmatrix}\mathbf{u} + \frac{1}{2}\mathbf{u}^T P_3 \mathbf{u}$$

where

$$P_3 = \begin{bmatrix} 4 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 2 \end{bmatrix}$$

The coefficient of $x^2$ is 2 because each of the four states $x(0), \ldots, x(3)$ contributes $x^2/2$. This term does not depend on $\mathbf{u}$, so it has no effect on the minimizer.
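As a worked check (a sketch of the expansion, written out here rather than taken from the source), the entries of $P_3$ and the row vector $\begin{bmatrix} 3 & 2 & 1 \end{bmatrix}$ follow from substituting $x(i) = x + u(0) + \cdots + u(i-1)$ into $V_3$ and collecting the terms that are quadratic and bilinear in the controls:

```latex
\begin{aligned}
\mathbf{u}^T P_3 \mathbf{u}
  &= \underbrace{u(0)^2 + u(1)^2 + u(2)^2}_{\text{control cost}}
   + \underbrace{u(0)^2 + \bigl(u(0)+u(1)\bigr)^2 + \bigl(u(0)+u(1)+u(2)\bigr)^2}_{\text{from } x(1),\ x(2),\ x(3)} \\
  &= 4u(0)^2 + 3u(1)^2 + 2u(2)^2 + 4u(0)u(1) + 2u(0)u(2) + 2u(1)u(2), \\[4pt]
x\begin{bmatrix} 3 & 2 & 1 \end{bmatrix}\mathbf{u}
  &= x\,u(0) + x\bigl(u(0)+u(1)\bigr) + x\bigl(u(0)+u(1)+u(2)\bigr).
\end{aligned}
```

The diagonal of $P_3$ holds the coefficients of the squared terms, and each off-diagonal entry is half the coefficient of the corresponding cross term.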
Therefore, for an initial state $x$, the optimal open-loop control sequence in vector form is

$$\mathbf{u}^0(x) = -P_3^{-1}\begin{bmatrix} 3 & 2 & 1 \end{bmatrix}^T x = -\begin{bmatrix} 0.615 & 0.231 & 0.077 \end{bmatrix}^T x$$
The optimal control and state sequences are

$$\mathbf{u}^0(x) = -\begin{bmatrix} 0.615x & 0.231x & 0.077x \end{bmatrix}^T$$

$$\mathbf{x}^0(x) = \begin{bmatrix} x & 0.385x & 0.154x & 0.077x \end{bmatrix}$$
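The open-loop computation can be reproduced numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `open_loop_solution` and the lifted matrix `S` are illustrative choices, not notation from the text. It builds $P_3$ from the stacked prediction $x(i) = x + u(0) + \cdots + u(i-1)$ and solves the resulting linear system.

```python
import numpy as np

N = 3  # horizon

# S maps the control sequence to the predicted states x(1), ..., x(N):
# [x(1), x(2), x(3)]^T = x * ones + S @ u, with S lower-triangular ones.
S = np.tril(np.ones((N, N)))

P3 = np.eye(N) + S.T @ S   # Hessian of V_3 with respect to u
b = S.T @ np.ones(N)       # coefficients of the bilinear term, [3, 2, 1]


def open_loop_solution(x):
    """Return (u0, x0): optimal open-loop controls and the resulting states."""
    # Minimizer of x * b^T u + (1/2) u^T P3 u.
    u0 = -np.linalg.solve(P3, b) * x
    # State trajectory x(0), ..., x(N) under x+ = x + u.
    x0 = x + np.concatenate(([0.0], np.cumsum(u0)))
    return u0, x0


u0, x0 = open_loop_solution(1.0)
print(np.round(u0, 3))  # approximately -[0.615, 0.231, 0.077]
print(np.round(x0, 3))  # approximately [1, 0.385, 0.154, 0.077]
```

Using `np.linalg.solve` rather than forming $P_3^{-1}$ explicitly is a common numerical choice; for a 3x3 matrix either approach gives the same result to rounding.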
Next we compute the optimal feedback control and compare it with the open-loop optimal control above. We use the DP recursion

$$V_i^0(x) = \min_{u \in \mathbb{R}} \{ x^2/2 + u^2/2 + V_{i-1}^0(x + u) \}$$

$$\kappa_i^0(x) = \arg\min_{u \in \mathbb{R}} \{ x^2/2 + u^2/2 + V_{i-1}^0(x + u) \}$$
Boundary condition:

$$V_0^0(x) = (1/2)x^2$$
Solving the recursion for all $x \in \mathbb{R}$ and $i \in \{1, 2, 3\}$ gives

$$\begin{aligned} V_1^0(x) &= (3/4)x^2, & \kappa_1^0(x) &= -(1/2)x \\ V_2^0(x) &= (4/5)x^2, & \kappa_2^0(x) &= -(3/5)x \\ V_3^0(x) &= (21/26)x^2, & \kappa_3^0(x) &= -(8/13)x \end{aligned}$$
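The recursion can be checked in exact arithmetic. The sketch below assumes the value function stays quadratic, $V_i^0(x) = p_i x^2$ with $p_0 = 1/2$. Under that assumption, setting the derivative of $u^2/2 + p_{i-1}(x+u)^2$ with respect to $u$ to zero gives $\kappa_i^0(x) = -\frac{2p_{i-1}}{1 + 2p_{i-1}}x$. The names `dp_recursion`, `p`, and `k` are illustrative and do not appear in the text.

```python
from fractions import Fraction


def dp_recursion(horizon):
    """Return [(i, p_i, k_i)] with V_i^0(x) = p_i x^2 and kappa_i^0(x) = k_i x."""
    p = Fraction(1, 2)  # boundary condition V_0^0(x) = (1/2) x^2
    results = []
    for i in range(1, horizon + 1):
        # Stationarity of u^2/2 + p (x + u)^2:  u + 2 p (x + u) = 0.
        k = -2 * p / (1 + 2 * p)
        # Substitute u = k x into x^2/2 + u^2/2 + p (x + u)^2.
        p = Fraction(1, 2) + k * k / 2 + p * (1 + k) ** 2
        results.append((i, p, k))
    return results


for i, p, k in dp_recursion(3):
    print(f"V_{i}^0(x) = ({p}) x^2,  kappa_{i}^0(x) = ({k}) x")
```

Working with `Fraction` keeps the coefficients as exact rationals, so they can be compared directly with the values listed above.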
Starting at the initial time and applying the optimal control laws iteratively to the deterministic system $x^+ = x + u$ (at time $i$ the optimal control law is $\kappa_{3-i}^0(\cdot)$) gives

$$\begin{aligned} x^0(0) &= x, & u^0(0) &= -(8/13)x \\ x^0(1) &= (5/13)x, & u^0(1) &= -(3/13)x \\ x^0(2) &= (2/13)x, & u^0(2) &= -(1/13)x \\ x^0(3) &= (1/13)x \end{aligned}$$
so that the optimal control and state sequences are, respectively,

$$\mathbf{u}^0(x) = -\begin{bmatrix} 0.615x & 0.231x & 0.077x \end{bmatrix}^T$$

$$\mathbf{x}^0(x) = \begin{bmatrix} x & 0.385x & 0.154x & 0.077x \end{bmatrix}$$
These are identical to the optimal open-loop sequences computed above.
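The equivalence can also be checked by simulating both controllers. The following is a minimal sketch using the gains and control sequence derived in this section; the function names and the optional additive disturbance `w` are assumptions introduced only for illustration (the system in this section has no disturbance, so `w` defaults to zero).

```python
from fractions import Fraction as F

# u^0(i) / x for the open-loop sequence, and kappa_j^0(x) = GAINS[j] * x.
U_OPEN_LOOP = [F(-8, 13), F(-3, 13), F(-1, 13)]
GAINS = {3: F(-8, 13), 2: F(-3, 5), 1: F(-1, 2)}


def rollout_open_loop(x, w=(0, 0, 0)):
    """Apply the precomputed sequence, which depends only on the initial state."""
    x_init = x
    states = [x]
    for i in range(3):
        x = x + U_OPEN_LOOP[i] * x_init + w[i]
        states.append(x)
    return states


def rollout_feedback(x, w=(0, 0, 0)):
    """Apply kappa_{3-i}^0 to the current state at each time i."""
    states = [x]
    for i in range(3):
        x = x + GAINS[3 - i] * x + w[i]
        states.append(x)
    return states


# Nominal case: both rollouts give x, (5/13)x, (2/13)x, (1/13)x.
print(rollout_open_loop(F(1)) == rollout_feedback(F(1)))

# Hypothetical disturbance at time 0: the two rollouts no longer agree,
# because only the feedback law reacts to the perturbed state.
w = (F(1, 10), 0, 0)
print(rollout_open_loop(F(1), w)[-1], rollout_feedback(F(1), w)[-1])
```

With `w` left at zero the two trajectories coincide exactly; a nonzero `w` is the simplest way to see why the choice between the two matters once uncertainty enters.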
The next section considers uncertain systems subject to disturbances.