前面一章最优控制理论 二、哈密尔顿函数法,我们已经利用Hamilton函数法对连续控制系统的Euler方程和横截条件进行了推导。这篇文章讲述如何求解
Hamilton函数法回顾
对于一个包含终端等式约束、Lagrange型性能指标的最优控制问题:
x
˙
=
f
[
x
(
t
)
,
u
(
t
)
,
t
]
min
u
(
t
)
J
=
∫
t
0
t
f
L
[
x
(
t
)
,
u
(
t
)
,
t
]
d
t
x
(
t
0
)
=
x
0
t
0
≤
t
≤
t
f
ψ
(
x
(
t
f
)
,
t
f
)
=
0
(1)
\begin{aligned} \dot{x}&=f[x(t), u(t), t]\\ \min_{u(t)}J&=\int_{t_{0}}^{t_{f}} L[x(t), u(t), t] d t \\ \quad x\left(t_{0}\right)&=x_0\quad t_{0} \leq t \leq t_{f}\\ &\psi(x(t_f),t_f)=0 \tag 1\end{aligned}
x˙u(t)minJx(t0)=f[x(t),u(t),t]=∫t0tfL[x(t),u(t),t]dt=x0t0≤t≤tfψ(x(tf),tf)=0(1)
Hamilton函数法的Euler方程和横截条件为:
x
˙
=
f
(
x
,
u
,
t
)
x
(
t
0
)
=
x
0
,
ψ
(
x
(
t
f
)
,
t
f
)
=
0
λ
˙
=
−
∂
H
∂
x
=
−
∂
L
∂
x
−
λ
T
∂
f
∂
x
λ
(
t
f
)
=
μ
T
∂
ψ
∂
x
0
=
∂
H
∂
u
=
∂
L
∂
u
+
λ
T
∂
f
∂
u
(2)
\begin{aligned} \dot{x}&=f(x,u,t)\\ x(t_0)&=x_0,\psi(x(t_f),t_f)=0\\ \dot{\lambda}&=-\frac{\partial H}{\partial x}=-\frac{\partial L}{\partial x}-\lambda^T\frac{\partial f}{\partial x}\\ \lambda(t_f)&=\mu^T\frac{\partial\psi}{\partial x}\\ 0&=\frac{\partial H}{\partial u}=\frac{\partial L}{\partial u}+\lambda^T\frac{\partial f}{\partial u} \end{aligned}\tag{2}
x˙x(t0)λ˙λ(tf)0=f(x,u,t)=x0,ψ(x(tf),tf)=0=−∂x∂H=−∂x∂L−λT∂x∂f=μT∂x∂ψ=∂u∂H=∂u∂L+λT∂u∂f(2)
公式
(
2
)
(2)
(2)的前四个构成一个两点边值问题,控制
u
(
t
)
u(t)
u(t)由状态变量
x
(
t
)
x(t)
x(t)和协态变量
λ
(
t
)
\lambda(t)
λ(t)决定。
二维火箭上升问题
二维火箭上升,推力加速度
a
a
a恒定。求最优的喷射角度
β
(
t
)
\beta(t)
β(t),使指定时刻达到预定高度,并水平速度
V
x
V_x
Vx最大(Dreyfus Rocket Ascent Problem)
min
β
(
t
)
J
=
−
V
x
(
t
f
)
=
V
x
(
0
)
−
∫
0
t
f
a
cos
(
β
)
d
t
x
(
t
0
)
=
[
0
,
0
,
0
,
0
]
T
(
0
≤
t
≤
t
f
)
ψ
(
x
(
t
f
)
,
t
f
)
=
[
y
(
t
f
)
−
h
V
y
(
t
f
)
]
=
0
\begin{aligned} \min_{\beta(t)}J&=-V_x(t_f)=V_x(0)-\int_{0}^{t_{f}} a\cos(\beta) d t \\ \quad x\left(t_{0}\right)&=[0,0,0,0]^T\quad (0 \leq t \leq t_{f})\\ \psi(x(t_f),t_f)&=\begin{bmatrix}y(t_f)-h\\V_y(t_f)\end{bmatrix}=0 \end{aligned}
β(t)minJx(t0)ψ(x(tf),tf)=−Vx(tf)=Vx(0)−∫0tfacos(β)dt=[0,0,0,0]T(0≤t≤tf)=[y(tf)−hVy(tf)]=0
其中参数设定
t
f
=
100
s
,
h
=
30000
m
,
g
=
10
m
/
s
2
,
a
=
20
m
/
s
2
t_f=100s,h=30000m,g=10m/s^2,a=20m/s^2
tf=100s,h=30000m,g=10m/s2,a=20m/s2,
H
=
−
a
cos
β
+
λ
r
T
v
+
λ
3
a
cos
β
+
λ
4
(
a
sin
β
−
g
)
H=-a \cos \beta+\lambda_{r}^{T} v+\lambda_{3} a \cos \beta +\lambda_{4}(a \sin \beta-g)
H=−acosβ+λrTv+λ3acosβ+λ4(asinβ−g)
λ ˙ = − ∂ H ∂ x ⟹ λ 1 = 0 λ 2 = 0 λ 3 = − λ 1 λ 4 = − λ 2 λ ( t f ) = μ ⊤ ∂ ψ ∂ x = μ 1 [ 0 1 0 0 ] + μ 2 [ 0 0 0 1 ] λ 1 ( t f ) = 0 , λ 2 ( t f ) = μ 1 λ 3 ( t 1 ) = 0 λ 4 ( t 1 ) = μ 2 ∗ λ 1 = 0 λ 2 = μ 1 λ 3 = 0 , λ 4 = − λ 2 t + μ 2 {\dot{\lambda}=-\frac{\partial H}{\partial x}}\implies {\lambda_{1}=0 \quad \lambda_{2}=0\quad\lambda_{3}=-\lambda_{1} \quad \lambda_{4}=-\lambda_{2}}\\ \lambda\left(t_{f}\right)=\mu^{\top} \frac{\partial \psi}{\partial x}=\mu_{1}\left[\begin{array}{l} 0\\1\\0\\0 \end{array}\right]+\mu_{2}\left[\begin{array}{l} 0 \\0 \\0\\1 \end{array}\right]\\ \lambda_{1}\left(t_{f}\right)=0, \lambda_{2}\left(t_{f}\right)=\mu_{1} \quad\lambda_{3}\left(t_{1}\right)=0 \quad \lambda_{4}\left(t_{1}\right)=\mu_{2}^{*} \\ \lambda_{1}=0\quad \lambda_{2}=\mu_{1} \quad\lambda_{3}=0, \quad \lambda_{4}=-\lambda_2 t+\mu_{2} λ˙=−∂x∂H⟹λ1=0λ2=0λ3=−λ1λ4=−λ2λ(tf)=μ⊤∂x∂ψ=μ1⎣ ⎡0100⎦ ⎤+μ2⎣ ⎡0001⎦ ⎤λ1(tf)=0,λ2(tf)=μ1λ3(t1)=0λ4(t1)=μ2∗λ1=0λ2=μ1λ3=0,λ4=−λ2t+μ2
∂ H ∂ β = a { sin β − λ 3 sin β + λ 4 cos β } = 0 ⟹ tan β = λ 4 λ 3 − 1 = − λ 4 \frac{\partial H}{\partial\beta}=a\{\sin\beta-\lambda_3\sin\beta+\lambda_4\cos\beta\}=0\\ \implies\tan\beta=\frac{\lambda_4}{\lambda_3-1}=-\lambda_4 ∂β∂H=a{sinβ−λ3sinβ+λ4cosβ}=0⟹tanβ=λ3−1λ4=−λ4
共有6个未知变量,构成两点边值问题。
两点边值问题求解
两点边值问题(Two Point Boundary Value Problem)是对常微分方程组(
n
n
n维)而言的,如果定解问题的边界条件既有左边界的初值(
p
p
p个)、又有右边界的终端值(
q
q
q个),且问题定解(
p
+
q
=
n
p+q=n
p+q=n)。典型情况如下:
对ODE系统:
x
˙
=
f
(
t
,
x
)
,
已知
x
(
t
0
)
,
x
(
t
f
)
,
求解
x
(
t
)
\text{对ODE系统}:\dot x=f(t,x), \ 已知x(t_0),\ x(t_f),\ 求解x(t)
对ODE系统:x˙=f(t,x), 已知x(t0), x(tf), 求解x(t)
这样的常微分方程是不能直接按照求解常微分方程组的初值问题的方法(如4阶Runge-Kutta法)来求解的。两点边值问题在轨迹规划里有很多典型的例子。
直接调用MATLABbvpsolver
获得最优控制的求解效率大大高于自己编写fmincon
求解非线性规划(NLP)问题、或者转化成打靶(shooting)问题。当然这可能和自己编写求解算法存在一些没有意识到的缺陷有关。对我而言,在无约束情况下,应该尽量把最优控制问题转化成两点边值问题。
TPBVP问题求解的典型方法有1. 打靶法2. 有限差分法3. 有限元法,算法可以参考百度文库-第5章两点边值问题求解方法介绍。下面给出上面的问题对应的MATLAB代码
function sol = RocketAscentIndirectBVP
%indirect method get optimal control
%Edited by Siyang Meng in Northwestern Polytechnical University
% parameters
a=20;tf=100;h=30000;g=10;
% initial guess
ti=linspace(0,tf,101);
solinit=bvpinit(ti,@init);
% settings
opts=bvpset('Stats','on','RelTol',1e-6);% ,'BCJacobian',@pFpXf
% calculation
sol=bvp4c(@ode,@bc,solinit,opts);
t=sol.x; y=sol.y;
% plot result
figure(1)
plot(y(1,:),y(2,:),'k-'),title('trj'),xlabel('x/m'),ylabel('y/m');
figure(2)
plot(t,y(3:4,:));xlabel('t/s'),ylabel('velocity');
legend('V_x','V_y')
figure(3)
plot(t,-57.3*atan(y(6,:)));xlabel('t/s'),ylabel('\beta°');
function dydx=ode(t,y,p)
% state and costate ode functions of OCP
u=-atan(y(6));
dydx=[y(3);
y(4);
a*cos(u);
a*sin(u)-g;
0;
-y(5)];
end
function [dbcdya,dbcdyb,dbcdp]=pFpXf(t,y,p)
% Jocabian matrix of decision variables
end
function res=bc(ya,yb,p)
% boundary value known, at initial and terminal time, on state or costates
res=[ya(1);
ya(2);
ya(3);
ya(4);
yb(2)-h;
yb(4)];
% yb(4)];
end
function y=init(t,p)
% initialize BVP ode system
y=[400*(t); % y(-1)=-1; y(1)=1;
300*(t);
8*t;
6*t;
1.2*ones(size(t));
-0.1*(120*ones(size(t))-t)];
end
end