一个简单的微分对策问题求解及其Matlab实现

一个简单的微分对策问题求解及其Matlab实现

理论知识


注:本文为《最优控制方法与Matlab实现》第八章 微分对策问题的实现。给出了文中例题的解法,以及matlab仿真代码,解题步骤和matlab均按照我习惯的方式给出。

全文的公式使用LaTex公式编辑器编辑而成。


首先对微分对策的一般形式进行介绍。

PE 是对策双方,动态系统状态方程为
x ˙ ( t ) = f [ x ( t ) , u ( t ) , v ( t ) , t ] \dot {\boldsymbol x}(t)=\boldsymbol f[\boldsymbol x(t),\boldsymbol u(t),\boldsymbol v(t),t] x˙(t)=f[x(t),u(t),v(t),t]
给定初始状态为
x ( t ) = x 0 \boldsymbol x(t)=\boldsymbol x_0 x(t)=x0
式中, x ( t ) \boldsymbol x(t) x(t) n n n P P P E E E 双方的状态变量; u ( t ) \boldsymbol u(t) u(t) m u m_u muP 方控制变量, v ( t ) \boldsymbol v(t) v(t) m v m_v mv E E E 方控制变量, u ( t ) \boldsymbol u(t) u(t) v ( t ) \boldsymbol v(t) v(t)的各分量 u i ( t ) u_i(t) ui(t) v i ( t ) v_i(t) vi(t) 均为 t t t 的分段连续函数, [ u ( t ) , v ( t ) ] [\boldsymbol u(t),\boldsymbol v(t)] [u(t),v(t)] 为容许控制策略或简称控制策略; f ( ⋅ ) \boldsymbol f(\cdot) f() n n n 维连续可微的向量函数; t ∈ [ t 0 , t f ] t \in [t_0,t_f] t[t0,tf] 。取性能指标为
J [ u ( t ) , v ( t ) ] = θ [ x ( t f ) , t f ] + ∫ t 0 t f L [ x ( t ) , u ( t ) , v ( t ) , t ] d t J[\boldsymbol u(t),\boldsymbol v(t)]=\theta[\boldsymbol x(t_f),t_f]+\int_{t_0}^{t_f}L[\boldsymbol x(t),\boldsymbol u(t),\boldsymbol v(t),t]\mathrm{d}t J[u(t),v(t)]=θ[x(tf),tf]+t0tfL[x(t),u(t),v(t),t]dt
【纳什-庞特里亚金最大最小原理】

定义Hamilton函数如下:
H [ x ( t ) , u ( t ) , v ( t ) , λ ( t ) , t ] = L [ x ( t ) , u ( t ) , v ( t ) , t ] + λ T f [ x ( t ) , u ( t ) , v ( t ) , t ] H[\boldsymbol x(t),\boldsymbol u(t),\boldsymbol v(t),\boldsymbol \lambda(t),t]=L[\boldsymbol x(t),\boldsymbol u(t),\boldsymbol v(t),t]+\boldsymbol \lambda^{\mathrm T} \boldsymbol f[\boldsymbol x(t),\boldsymbol u(t),\boldsymbol v(t),t] H[x(t),u(t),v(t),λ(t),t]=L[x(t),u(t),v(t),t]+λTf[x(t),u(t),v(t),t]
【定理】 若 u ∗ ( t ) \boldsymbol u^*(t) u(t) v ∗ ( t ) \boldsymbol v^*(t) v(t) 是最优控制,则有:

(1)状态变量 x ( t ) \boldsymbol x(t) x(t) 与协态变量 λ ( t ) \boldsymbol \lambda(t) λ(t) 满足以下协态方程
x ˙ ( t ) = ∂ H ∂ λ λ ˙ ( t ) = − ∂ H ∂ x \dot {\boldsymbol x}(t)=\frac{\partial H}{\partial \lambda} \\ \dot {\boldsymbol \lambda}(t)=-\frac{\partial H}{\partial \boldsymbol x} x˙(t)=λHλ˙(t)=xH
(2)边界条件
x ( t 0 ) = x 0 λ ( t f ) = ∂ θ [ x ( t f ) , t f ] ∂ x ( t f ) \boldsymbol x(t_0)=\boldsymbol x_0 \\ \boldsymbol \lambda(t_f)=\frac{\partial \theta[\boldsymbol x(t_f),t_f]}{\partial \boldsymbol x(t_f)} x(t0)=x0λ(tf)=x(tf)θ[x(tf),tf]
(3)对于任意 t ∈ [ t 0 , t f ] t \in [t_0,t_f] t[t0,tf],Hamilton函数满足下述条件
H [ x ∗ ( t ) , u ∗ ( t ) , v ∗ ( t ) , λ ∗ ( t ) , t ] = min ⁡ u max ⁡ v H [ x ∗ ( t ) , u ( t ) , v ( t ) , λ ∗ ( t ) , t ] = max ⁡ v min ⁡ u H [ x ∗ ( t ) , u ( t ) , v ( t ) , λ ∗ ( t ) , t ] H[\boldsymbol x^*(t),\boldsymbol u^*(t),\boldsymbol v^*(t),\boldsymbol \lambda^*(t),t]=\min_{\boldsymbol u}\max_{\boldsymbol v}H[\boldsymbol x^*(t),\boldsymbol u(t),\boldsymbol v(t),\boldsymbol \lambda^*(t),t]\\ =\max_{\boldsymbol v}\min_{\boldsymbol u}H[\boldsymbol x^*(t),\boldsymbol u(t),\boldsymbol v(t),\boldsymbol \lambda^*(t),t] H[x(t),u(t),v(t),λ(t),t]=uminvmaxH[x(t),u(t),v(t),λ(t),t]=vmaxuminH[x(t),u(t),v(t),λ(t),t]


【例题】

在追逃对局中, y ( t ) y(t) y(t) v ( t ) v(t) v(t)分别表示追赶者与逃跑者的相对位移与速度,追赶者希望终端脱靶 ∣ y ( t f ) ∣ |y(t_f)| y(tf)最小,而逃跑者希望此值最大。该问题可描述为如下微分对策问题:

系统状态方程为和初值为
{ y ˙ ( t ) = v ( t ) v ˙ ( t ) = a 1 ( t ) − a 2 ( t ) , { y ( t 0 ) = 0 v ( t 0 ) = v 0 \left\{\begin{matrix} \dot y(t)=v(t) \\ \dot v(t)=a_1(t)-a_2(t) \end{matrix}\right.,\quad \left\{\begin{matrix} y(t_0)=0 \\ v(t_0)=v_0 \end{matrix}\right. {y˙(t)=v(t)v˙(t)=a1(t)a2(t),{y(t0)=0v(t0)=v0
t f t_f tf 给定,目标函数为
J = 1 2 y 2 ( t f ) J=\frac{1}{2}y^2(t_f) J=21y2(tf)
式中,追与逃的加速度 a 1 ( t ) a_1(t) a1(t) a 2 ( t ) a_2(t) a2(t) 是该问题的两个控制变量,其范围为
∣ a 1 ( t ) ∣ ≤ a 1 m , ∣ a 2 ( t ) ∣ ≤ a 2 m , a 1 m > a 2 m ≥ 0 |a_1(t)| \le a_{1m},\quad |a_2(t)| \le a_{2m},\quad a_{1m}>a_{2m}\ge0 a1(t)a1m,a2(t)a2m,a1m>a2m0


【解】

① 构造Hamilton函数
H = λ 1 v + λ 2 ( a 1 − a 2 ) H=\lambda_1v+\lambda_2(a_1-a_2) H=λ1v+λ2(a1a2)
λ 1 \lambda_1 λ1 λ 2 \lambda_2 λ2 满足如下协态方程
{ λ ˙ 1 ( t ) = − ∂ H ∂ y = 0 λ ˙ 2 ( t ) = − ∂ H ∂ v = − λ 2 ⇒ { λ 1 ( t ) = C 1 λ 2 ( t ) = − C 1 t + C 2 (1) \left\{\begin{matrix} \dot \lambda_1(t)=-\frac{\partial H}{\partial y}=0 \\ \dot \lambda_2(t)=-\frac{\partial H}{\partial v}=-\lambda_2 \end{matrix}\right. \Rightarrow \left\{\begin{matrix} \lambda_1(t)=C_1 \\ \lambda_2(t)=-C_1t+C_2 \end{matrix}\right. \tag{1} {λ˙1(t)=yH=0λ˙2(t)=vH=λ2{λ1(t)=C1λ2(t)=C1t+C2(1)
② 边界条件

可知
φ = 1 2 y 2 ( t f ) \varphi=\frac{1}{2}y^2(t_f) φ=21y2(tf)
那么边界条件为
{ y ( t 0 ) = 0 v ( t 0 ) = v 0 , { λ 1 ( t f ) = ∂ φ ∂ y ( t f ) = y ( t f ) λ 2 ( t f ) = ∂ φ ∂ v ( t f ) = 0 (2) \left\{\begin{matrix} y(t_0)=0 \\ v(t_0)=v_0 \end{matrix}\right.,\quad \left\{\begin{matrix} \lambda_1(t_f)=\frac{\partial \varphi}{\partial y(t_f)}=y(t_f) \\ \lambda_2(t_f)=\frac{\partial \varphi}{\partial v(t_f)}=0 \end{matrix}\right. \tag{2} {y(t0)=0v(t0)=v0,{λ1(tf)=y(tf)φ=y(tf)λ2(tf)=v(tf)φ=0(2)
将式 ( 2 ) (2) (2) 与式 ( 1 ) (1) (1) 作对比,得到
C 1 = y ( t f ) C_1=y(t_f) C1=y(tf)

λ 2 ( t f ) = − y ( t f ) ⋅ t f + C 2 = 0 ⇒ C 2 = y ( t f ) ⋅ t f \lambda_2(t_f)=-y(t_f)\cdot t_f+C_2=0 \\ \Rightarrow C_2=y(t_f)\cdot t_f λ2(tf)=y(tf)tf+C2=0C2=y(tf)tf

整理得到
{ λ 1 ( t ) = y ( t f ) λ 2 ( t ) = y ( t f ) ⋅ ( t f − t ) (3) \left\{\begin{matrix} \lambda_1(t)=y(t_f) \\ \lambda_2(t)=y(t_f)\cdot (t_f-t) \end{matrix}\right. \tag{3} {λ1(t)=y(tf)λ2(t)=y(tf)(tft)(3)
③ 极值条件

根据纳什-庞特里亚金最大最小定理,可得到控制量 a 1 ( t ) a_1(t) a1(t) a 2 ( t ) a_2(t) a2(t) 的最优轨线如下
{ a 1 ∗ ( t ) = − a 1 m s g n [ λ 2 ( t ) ] a 2 ∗ ( t ) = − a 2 m s g n [ λ 2 ( t ) ] (4) \left\{\begin{matrix} a_1^*(t)=-a_{1m}\mathrm{sgn}[\lambda_2(t)] \\ a_2^*(t)=-a_{2m}\mathrm{sgn}[\lambda_2(t)] \end{matrix}\right. \tag{4} {a1(t)=a1msgn[λ2(t)]a2(t)=a2msgn[λ2(t)](4)
将式 ( 3 ) (3) (3) 代入式 ( 4 ) (4) (4),则有
{ a 1 ∗ ( t ) = − a 1 m s g n [ y ( t f ) ( t f − t ) ] a 2 ∗ ( t ) = − a 2 m s g n [ y ( t f ) ( t f − t ) ] \left\{\begin{matrix} a_1^*(t)=-a_{1m}\mathrm{sgn}[y(t_f)(t_f-t)] \\ a_2^*(t)=-a_{2m}\mathrm{sgn}[y(t_f)(t_f-t)] \end{matrix}\right. {a1(t)=a1msgn[y(tf)(tft)]a2(t)=a2msgn[y(tf)(tft)]
因为 t f − t > 0 t_f-t>0 tft>0 恒成立,上式可化简为
{ a 1 ∗ ( t ) = − a 1 m s g n [ y ( t f ) ] a 2 ∗ ( t ) = − a 2 m s g n [ y ( t f ) ] \left\{\begin{matrix} a_1^*(t)=-a_{1m}\mathrm{sgn}[y(t_f)] \\ a_2^*(t)=-a_{2m}\mathrm{sgn}[y(t_f)] \end{matrix}\right. {a1(t)=a1msgn[y(tf)]a2(t)=a2msgn[y(tf)]
更清晰地,将上式改写为
{ a 1 ∗ ( t ) = − a 1 m , y ( t f ) > 0 a 1 ∗ ( t ) = a 1 m , y ( t f ) < 0 , { a 2 ∗ ( t ) = − a 2 m , y ( t f ) > 0 a 2 ∗ ( t ) = a 2 m , y ( t f ) < 0 (5) \left\{\begin{matrix} a_1^*(t)=-a_{1m},\quad y(t_f) > 0 \\ a_1^*(t)=a_{1m},\quad y(t_f) < 0 \end{matrix}\right.,\quad \left\{\begin{matrix} a_2^*(t)=-a_{2m},\quad y(t_f) > 0 \\ a_2^*(t)=a_{2m},\quad y(t_f) < 0 \end{matrix}\right. \tag{5} {a1(t)=a1m,y(tf)>0a1(t)=a1m,y(tf)<0,{a2(t)=a2m,y(tf)>0a2(t)=a2m,y(tf)<0(5)
④ 求解结果

因为原状态方程为
{ y ˙ ( t ) = v ( t ) v ˙ ( t ) = ( a 2 m − a 1 m ) s g n [ y ( t f ) ] , { y ( t 0 ) = 0 v ( t 0 ) = v 0 \left\{\begin{matrix} \dot y(t)=v(t) \\ \dot v(t)=(a_{2m}-a_{1m})\mathrm{sgn}[y(t_f)] \end{matrix}\right.,\quad \left\{\begin{matrix} y(t_0)=0 \\ v(t_0)=v_0 \end{matrix}\right. {y˙(t)=v(t)v˙(t)=(a2ma1m)sgn[y(tf)],{y(t0)=0v(t0)=v0
注意, ( a 1 m − a 2 m ) s g n [ y ( t f ) ] (a_{1m}-a_{2m})\mathrm{sgn}[y(t_f)] (a1ma2m)sgn[y(tf)]视作常数。对方程求积分,得到
{ y ( t ) = v 0 ( t − t 0 ) + 1 2 ( a 2 m − a 1 m ) s g n [ y ( t f ) ] ( t − t 0 ) 2 v ( t ) = ( a 2 m − a 1 m ) s g n [ y ( t f ) ] ( t − t 0 ) (6) \left\{\begin{matrix} y(t)=v_0(t-t_0)+\frac{1}{2}(a_{2m}-a_{1m})\mathrm{sgn}[y(t_f)](t-t_0)^2 \\ v(t)=(a_{2m}-a_{1m})\mathrm{sgn}[y(t_f)](t-t_0) \end{matrix}\right. \tag{6} {y(t)=v0(tt0)+21(a2ma1m)sgn[y(tf)](tt0)2v(t)=(a2ma1m)sgn[y(tf)](tt0)(6)
疑问:我不懂为什么 y ( t ) y(t) y(t) 中还有 v ( t − t 0 ) v(t-t_0) v(tt0) 这一项,难道和初值条件有关系吗?

所以,末端条件为
y ( t f ) = v 0 ( t f − t 0 ) + 1 2 ( a 2 m − a 1 m ) s g n [ y ( t f ) ] ( t f − t 0 ) 2 y(t_f)=v_0(t_f-t_0)+\frac{1}{2}(a_{2m}-a_{1m})\mathrm{sgn}[y(t_f)](t_f-t_0)^2 y(tf)=v0(tft0)+21(a2ma1m)sgn[y(tf)](tft0)2
根据式 ( 5 ) (5) (5) 的分段条件,有
y ( t f ) > 0 时 , y ( t f ) = v 0 ( t f − t 0 ) + 1 2 ( a 2 m − a 1 m ) s g n [ y ( t f ) ] ( t f − t 0 ) 2 > 0 ⇒ 2 v 0 > − ( a 2 m − a 1 m ) ⋅ 1 ⋅ ( t f − t 0 ) ⇒ 2 v 0 > ( a 1 m − a 2 m ) ⋅ 1 ⋅ ( t f − t 0 ) ⇒ 2 v 0 ( a 1 m − a 2 m ) ( t f − t 0 ) > 1 y(t_f)>0时,\\ y(t_f)=v_0(t_f-t_0)+\frac{1}{2}(a_{2m}-a_{1m})\mathrm{sgn}[y(t_f)](t_f-t_0)^2>0 \\ \Rightarrow 2v_0>-(a_{2m}-a_{1m})\cdot 1 \cdot (t_f-t_0) \\ \Rightarrow 2v_0>(a_{1m}-a_{2m})\cdot 1 \cdot (t_f-t_0) \\ \Rightarrow \frac{2v_0}{(a_{1m}-a_{2m})(t_f-t_0)}>1\\ y(tf)>0,y(tf)=v0(tft0)+21(a2ma1m)sgn[y(tf)](tft0)2>02v0>(a2ma1m)1(tft0)2v0>(a1ma2m)1(tft0)(a1ma2m)(tft0)2v0>1

y ( t f ) < 0 时 , y ( t f ) = v 0 ( t f − t 0 ) + 1 2 ( a 2 m − a 1 m ) s g n [ y ( t f ) ] ( t f − t 0 ) 2 < 0 ⇒ 2 v 0 < − ( a 2 m − a 1 m ) ⋅ ( − 1 ) ⋅ ( t f − t 0 ) ⇒ 2 v 0 < ( a 1 m − a 2 m ) ⋅ ( − 1 ) ⋅ ( t f − t 0 ) ⇒ 2 v 0 ( a 1 m − a 2 m ) ( t f − t 0 ) < − 1 y(t_f)<0时,\\ y(t_f)=v_0(t_f-t_0)+\frac{1}{2}(a_{2m}-a_{1m})\mathrm{sgn}[y(t_f)](t_f-t_0)^2<0 \\ \Rightarrow 2v_0<-(a_{2m}-a_{1m})\cdot (-1) \cdot (t_f-t_0) \\ \Rightarrow 2v_0<(a_{1m}-a_{2m})\cdot (-1) \cdot (t_f-t_0) \\ \Rightarrow \frac{2v_0}{(a_{1m}-a_{2m})(t_f-t_0)}<-1\\ y(tf)<0,y(tf)=v0(tft0)+21(a2ma1m)sgn[y(tf)](tft0)2<02v0<(a2ma1m)(1)(tft0)2v0<(a1ma2m)(1)(tft0)(a1ma2m)(tft0)2v0<1

写在一起,有
{ y ( t f ) > 0 ⇒ 2 v 0 ( a 1 m − a 2 m ) ( t f − t 0 ) > 1 y ( t f ) < 0 ⇒ 2 v 0 ( a 1 m − a 2 m ) ( t f − t 0 ) < − 1 (7) \left\{\begin{matrix} y(t_f)>0\Rightarrow \frac{2v_0}{(a_{1m}-a_{2m})(t_f-t_0)}>1\\ y(t_f)<0\Rightarrow \frac{2v_0}{(a_{1m}-a_{2m})(t_f-t_0)}<-1 \end{matrix}\right. \tag{7} {y(tf)>0(a1ma2m)(tft0)2v0>1y(tf)<0(a1ma2m)(tft0)2v0<1(7)
最后,最优控制为
{ a 1 ∗ ( t ) = − a 1 m , 2 v 0 ( a 1 m − a 2 m ) ( t f − t 0 ) > 1 a 1 ∗ ( t ) = a 1 m , 2 v 0 ( a 1 m − a 2 m ) ( t f − t 0 ) < − 1 { a 2 ∗ ( t ) = − a 2 m , 2 v 0 ( a 1 m − a 2 m ) ( t f − t 0 ) > 1 a 2 ∗ ( t ) = a 2 m , 2 v 0 ( a 1 m − a 2 m ) ( t f − t 0 ) < − 1 (8) \left\{\begin{matrix} a_1^*(t)=-a_{1m},\quad \frac{2v_0}{(a_{1m}-a_{2m})(t_f-t_0)}>1 \\ a_1^*(t)=a_{1m},\quad \frac{2v_0}{(a_{1m}-a_{2m})(t_f-t_0)}<-1 \end{matrix}\right.\\ \left\{\begin{matrix} a_2^*(t)=-a_{2m},\quad \frac{2v_0}{(a_{1m}-a_{2m})(t_f-t_0)}>1 \\ a_2^*(t)=a_{2m},\quad \frac{2v_0}{(a_{1m}-a_{2m})(t_f-t_0)}<-1 \end{matrix}\right. \tag{8} {a1(t)=a1m,(a1ma2m)(tft0)2v0>1a1(t)=a1m,(a1ma2m)(tft0)2v0<1{a2(t)=a2m,(a1ma2m)(tft0)2v0>1a2(t)=a2m,(a1ma2m)(tft0)2v0<1(8)

思考

就本文最简单的微分对策问题而言,其解题步骤与最优控制类似,参考最优控制的解题方法解微分对策即可。

这里给出Matlab仿真代码:

【例题2】

现有追逃问题,题设【例题1】,其起始时刻 t 0 = 0 t_0=0 t0=0,终端时刻 t f = 1 t_f=1 tf=1,初始相对速度 v 0 = 3 m / s v_0=3m/s v0=3m/s,追赶者加速度 ∣ a 1 ( t ) ∣ ≤ a 1 m = 2 |a_1(t)| \le a_{1m}=2 a1(t)a1m=2,逃跑者加速度 ∣ a 2 ( t ) ∣ ≤ a 2 m = 1.5 |a_2(t)| \le a_{2m}=1.5 a2(t)a2m=1.5

clear;clc;close all;
% 题目条件
t_0 = 0; t_f = 1;
v_0 = 3;
a_1m = 2; a_2m = 1.5;

%% 01 构建Hamilton函数
syms H lambda_1 lambda_2 y v a_1 a_2
H = lambda_1*v + lambda_2*(a_1-a_2);
% 求协态方程
Dlambda_1 = -diff(H,y);
Dlambda_2 = -diff(H,v);

%% 02 边界条件
% 性能指标中的常值型性能指标项
syms varphi
varphi = 0.5*y(1)^2;

lambda_1 = diff(varphi,y);
lambda_2 = diff(varphi,v);

%% 03 极值条件
% 省略没写

%% 04 求解结果
eq_1 = strcat('Dlambda_1=',char(Dlambda_1));
eq_2 = strcat('Dlambda_2=',char(Dlambda_2));
con_1 = strcat('lambda_1(1)=',char(y(1)));
con_2 = strcat('lambda_2(1)=',char(lambda_2));
[lambda_1,lambda_2] = dsolve(eq_1,eq_2,con_1,con_2);
[v,y] = dsolve('Dy=v,Dv=(a2m-a1m)*sgnyf','y(0)=0,v(0)=3');
% 求终端时刻相对位移
% t = tf;         % 这个位置有问题,代码错误
% 经网友【onion_zh】更正,此处改为下列形式。谢谢onion_zh。
t = t_f;
y = simplify(y);
y = expand(y);
yf = subs(y);

但是t = tf这一步存在问题,会提示t = Empty transfer function.。导致代码执行至yf = subs(y)处出错,显示Conversion to 'sym' from 'tf' is not possible.

不知道什么原因,暂时无法解释。

但是对于 λ ˙ 1 \dot \lambda_1 λ˙1 λ ˙ 2 \dot \lambda_2 λ˙2 λ 1 \lambda_1 λ1 λ 2 \lambda_2 λ2的求解是正确的,如下图所示。
计算结果
− − − E n d − − − ---\mathrm{End}--- End

  • 32
    点赞
  • 136
    收藏
    觉得还不错? 一键收藏
  • 13
    评论
评论 13
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值