3.1 Solve the problem of Example 3.2.1 for the case where the cost function is
$$(x(T))^2+\int_0^T(u(t))^2\,dt.$$
Also, calculate the cost-to-go function $J^*(t,x)$ and verify that it satisfies the HJB equation.
Solution. The scalar system is $\dot x(t)=u(t)$ with the constraint $|u(t)|\leq 1$ for all $t\in[0,T]$.
$\qquad$ Fix $(t,x)$ and write $s=T-t$ for the time to go. Here $H(x,u,p)=u^2+pu$, so the adjoint equation is $\dot p(\tau)=0$ and the optimal control $u^*(\tau)=\arg\min_{|u|\leq1}(u^2+pu)$ is constant in $\tau$, say $u^*(\tau)\equiv u$. A constant control steers the state to $x(T)=x+us$ at cost $(x+us)^2+u^2s$. If the constraint is inactive, minimizing over $u$ gives $u^*=-\frac{x}{s+1}$, which is admissible when $|x|\leq s+1$ and yields
$$J^*(t,x)=\frac{x^2}{T-t+1},\qquad |x|\leq T-t+1.$$
If $|x|>s+1$, the constraint is active, $u^*=-\mathrm{sgn}(x)$, and $J^*(t,x)=\left(|x|-(T-t)\right)^2+(T-t)$.
$\qquad$ To verify the HJB equation
$$0=\min_{|u|\leq1}\left[u^2+\nabla_tJ^*(t,x)+\nabla_xJ^*(t,x)\,u\right]$$
in the region $|x|\leq T-t+1$: we have $\nabla_tJ^*=\frac{x^2}{(T-t+1)^2}$ and $\nabla_xJ^*=\frac{2x}{T-t+1}$, so the minimizing control $u=-\frac{1}{2}\nabla_xJ^*=-\frac{x}{T-t+1}$ satisfies $|u|\leq1$ there and gives
$$\min_{|u|\leq1}\left[u^2+\nabla_xJ^*\,u\right]=-\frac{x^2}{(T-t+1)^2}=-\nabla_tJ^*,$$
so the HJB equation holds, and the boundary condition $J^*(T,x)=x^2$ is immediate. An analogous computation verifies the HJB equation in the region $|x|>T-t+1$, and the two expressions agree on the boundary $|x|=T-t+1$. $\qquad\Box$
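One can cross-check the cost-to-go numerically: the sketch below (grid sizes and evaluation points are illustrative choices, not from the original text) discretizes the backward recursion $J(t,x)=\min_{|u|\leq1}\left[u^2\,dt+J(t+dt,\,x+u\,dt)\right]$ with terminal condition $J(T,x)=x^2$, and compares the result at $t=0$ with the candidate closed form $J^*(t,x)=x^2/(T-t+1)$ (valid for $|x|\leq T-t+1$) obtained from the minimum principle with a constant control.

```python
def solve_hjb(T=1.0, L=3.0, nx=151, nt=100, nu=21):
    """Backward DP for J(t,x) = min_{|u|<=1} [u^2 dt + J(t+dt, x+u dt)]
    with terminal condition J(T,x) = x^2, linear interpolation in x."""
    dt = T / nt
    dx = 2 * L / (nx - 1)
    xs = [-L + i * dx for i in range(nx)]          # state grid on [-L, L]
    us = [-1 + 2 * j / (nu - 1) for j in range(nu)]  # control grid on [-1, 1]
    J = [x * x for x in xs]                         # J(T, x) = x^2
    for _ in range(nt):
        new = []
        for x in xs:
            best = float("inf")
            for u in us:
                xn = min(max(x + u * dt, -L), L)    # clamp to the grid
                f = (xn + L) / dx
                k = min(int(f), nx - 2)
                w = f - k
                cand = u * u * dt + (1 - w) * J[k] + w * J[k + 1]
                best = min(best, cand)
            new.append(best)
        J = new
    return xs, J                                    # J(0, x) on the grid
```

With $T=1$ the closed form predicts $J^*(0,\pm1)=1/(T+1)=0.5$ and $J^*(0,0)=0$, which the grid solution reproduces up to discretization error.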
3.2 A young investor has earned in the stock market a large amount of money $S$ and plans to spend it so as to maximize his enjoyment through the rest of his life without working. He estimates that he will live exactly $T$ more years and that his capital $x(t)$ should be reduced to zero at time $T$, i.e., $x(T)=0$. Also he models the evolution of his capital by the differential equation
$$\frac{dx(t)}{dt}=\alpha x(t)-u(t)$$
where $x(0)=S$ is his initial capital, $\alpha>0$ is a given interest rate, and $u(t)\geq0$ is his rate of expenditure. The total enjoyment he will obtain is given by
$$\int_0^Te^{-\beta t}\sqrt{u(t)}\,dt.$$
Here $\beta$ is some positive scalar, which serves to discount future enjoyment. Find the optimal $\{u(t)\mid t\in[0,T]\}$.
Solution. We have
$$f(x,u)=\alpha x-u,\qquad g(x,u)=e^{-\beta t}\sqrt{u},$$
giving the Hamiltonian
$$H(x,u,p)=e^{-\beta t}\sqrt{u}+p(\alpha x-u),$$
and the adjoint equation is
$$\dot p(t)=-\alpha p(t),$$
yielding
$$p(t)=C_1e^{-\alpha t}\qquad\text{for some constant }C_1.$$
Notice that here $x(T)=0$ is given, so the usual terminal condition $p(T)=\nabla h(x^*(T))=0$ no longer applies.
$\qquad$ The optimal control is obtained by maximizing the Hamiltonian with respect to $u$:
$$u^*(t)=\arg\max_u\left[e^{-\beta t}\sqrt{u}+C_1e^{-\alpha t}(\alpha x^*-u)\right].$$
Setting the derivative with respect to $u$ to zero gives $\frac{e^{-\beta t}}{2\sqrt{u}}=C_1e^{-\alpha t}$, i.e.,
$$u^*(t)=\frac{e^{2(\alpha-\beta)t}}{4C_1^2}.\qquad(3.2.1)$$
Then by the differential equation of the system we get
$$\dot x^*(t)=\alpha x^*(t)-\frac{e^{2(\alpha-\beta)t}}{4C_1^2}.$$
Solving this linear equation (assuming $\alpha\neq2\beta$), we obtain
$$x^*(t)=C_2e^{\alpha t}+\frac{e^{2(\alpha-\beta)t}}{4C_1^2(2\beta-\alpha)}\qquad\text{for some constant }C_2.$$
Together with the initial condition $x^*(0)=S$ and the final condition $x^*(T)=0$, we can determine the exact values of $C_1$ and $C_2$. Then $u^*(t)$ in (3.2.1) gives the optimal control. $\qquad\Box$
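The boundary conditions can also be checked numerically by shooting on $C_1$. The sketch below (parameter values $\alpha=0.05$, $\beta=0.1$, $T=10$, $S=1$ are illustrative assumptions) uses the control from the first-order condition $\frac{e^{-\beta t}}{2\sqrt{u}}=C_1e^{-\alpha t}$, i.e. $u^*(t)=e^{2(\alpha-\beta)t}/(4C_1^2)$, integrates the capital forward, and bisects on $C_1$ until the terminal condition $x(T)=0$ holds.

```python
import math

def terminal_state(c1, alpha=0.05, beta=0.1, T=10.0, S=1.0, steps=2000):
    """Forward-Euler integration of x' = alpha*x - u*(t), x(0) = S,
    with the first-order-condition control u*(t) = e^{2(alpha-beta)t}/(4 c1^2)."""
    dt = T / steps
    x = S
    for i in range(steps):
        t = i * dt
        u = math.exp(2 * (alpha - beta) * t) / (4 * c1 * c1)
        x += dt * (alpha * x - u)
    return x

def solve_c1(lo=1e-3, hi=1e3, iters=80):
    """Bisect on C1: x(T) is increasing in C1 (larger C1 means less spending)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if terminal_state(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Bisection is valid here because $x(T)=Se^{\alpha T}-\int_0^Te^{\alpha(T-t)}u^*(t)\,dt$ is strictly increasing in $C_1$.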
3.9 Use the Minimum Principle to solve the linear-quadratic problem of Example 3.2.2.
Solution. The $n$-dimensional linear-quadratic system is given by
$$\dot x(t)=Ax(t)+Bu(t)$$
where $A$ and $B$ are given matrices, and the quadratic cost is
$$x(T)'Q_Tx(T)+\int_0^T\left(x(t)'Qx(t)+u(t)'Ru(t)\right)dt$$
where the matrices $Q_T$ and $Q$ are symmetric positive semidefinite, and the matrix $R$ is symmetric positive definite.
$\qquad$ The Hamiltonian here is
$$H(x,u,p)=x'Qx+u'Ru+p'(Ax+Bu)$$
and the adjoint equation is
$$\dot p(t)=-\nabla_xH=-2Qx^*(t)-A'p(t)\qquad(1)$$
with the terminal condition
$$p(T)=\nabla h(x^*(T))=2Q_Tx^*(T).$$
The optimal control is obtained by minimizing the Hamiltonian with respect to $u$:
$$u^*(t)=\arg\min_{u}\left\{x^*(t)'Qx^*(t)+u'Ru+p(t)'(Ax^*(t)+Bu)\right\}.$$
Since
$$\nabla_u\left\{x^*(t)'Qx^*(t)+u'Ru+p(t)'(Ax^*(t)+Bu)\right\}=2Ru+B'p(t),$$
we get
$$u^*(t)=-\frac{1}{2}R^{-1}B'p(t),\qquad(2)$$
which together with the system equation leads to
$$\dot x^*(t)=Ax^*(t)-\frac{1}{2}BR^{-1}B'p(t).\qquad(3)$$
$\qquad$ Equations (1) and (3) form a two-point boundary value problem. It can be decoupled by the ansatz $p(t)=2K(t)x^*(t)$ for a symmetric matrix function $K(t)$: substituting into (1) and (3) and matching terms gives the Riccati equation
$$\dot K(t)=-K(t)A-A'K(t)+K(t)BR^{-1}B'K(t)-Q,\qquad K(T)=Q_T,$$
and then (2) becomes the linear feedback law
$$u^*(t)=-R^{-1}B'K(t)x^*(t),$$
which agrees with the solution of Example 3.2.2.
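As a numerical sanity check (a sketch with illustrative scalar data, all values assumed), one can integrate the LQR Riccati equation $\dot K=-KA-A'K+KBR^{-1}B'K-Q$, $K(T)=Q_T$, backward in time, simulate the closed loop under the feedback $u^*(t)=-R^{-1}B'K(t)x^*(t)$, and confirm that the incurred cost matches the predicted optimal value $x_0'K(0)x_0$.

```python
def lqr_check(a=1.0, b=1.0, q=1.0, r=1.0, qT=0.5, T=2.0, x0=1.0, n=50_000):
    """Scalar LQR: integrate the Riccati equation backward, simulate the
    closed loop, and return (predicted cost K(0)*x0^2, simulated cost)."""
    dt = T / n
    # Backward Euler sweep for dK/dt = -2aK + (b^2/r) K^2 - q, K(T) = qT,
    # storing K on the whole time grid for use in the forward simulation.
    K = [0.0] * (n + 1)
    K[n] = qT
    for i in range(n, 0, -1):
        dK = -2 * a * K[i] + (b * b / r) * K[i] * K[i] - q
        K[i - 1] = K[i] - dt * dK
    # Forward simulation of x' = a x + b u with u = -(b/r) K(t) x,
    # accumulating the running quadratic cost plus the terminal cost.
    x, cost = x0, 0.0
    for i in range(n):
        u = -(b / r) * K[i] * x
        cost += dt * (q * x * x + r * u * u)
        x += dt * (a * x + b * u)
    cost += qT * x * x
    return K[0] * x0 * x0, cost
```

The agreement between the two returned numbers reflects the standard identity $\frac{d}{dt}\left(x^{*\prime}Kx^*\right)=-\left(x^{*\prime}Qx^*+u^{*\prime}Ru^*\right)$ along the optimal closed loop.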
3.11 Use the discrete-time Minimum Principle to solve Exercise 1.14 of Chapter 1, assuming that each $w_k$ is fixed at a known deterministic value.
Solution. Let $w_k=\overline{w}$ for some fixed number $\overline{w}>0$. The system is characterized by
$$x_{k+1}=f_k(x_k,u_k)=x_k+\overline{w}u_kx_k$$
and the cost function becomes
$$J(u)=x_N+\sum_{k=0}^{N-1}(1-u_k)x_k.$$
Then the Hamiltonian function can be written as
$$H_k(x_k,u_k,p_{k+1})=(1-u_k)x_k+p_{k+1}(x_k+\overline{w}u_kx_k).$$
By the discrete-time Minimum Principle, for $k=0,1,\cdots,N-1$, we have
$$u_k^*=\arg\max_{u_k}H_k(x_k^*,u_k,p_{k+1})=\arg\max_{u_k}\left[(p_{k+1}\overline{w}-1)u_kx_k^*+(p_{k+1}+1)x_k^*\right],$$
and since $x_k^*>0$,
$$u_k^*=\begin{cases} 1, & \text{if }\; p_{k+1}\overline{w}>1,\\ 0, & \text{if }\; p_{k+1}\overline{w}\leq1. \end{cases}\qquad(3.11.1)$$
On the other hand, for $k=0,1,\cdots,N-1$, the adjoint equation reads
$$p_k=\nabla_{x_k}H_k(x_k^*,u_k^*,p_{k+1})=(p_{k+1}\overline{w}-1)u_k^*+p_{k+1}+1\qquad(3.11.2)$$
with the terminal condition
$$p_N=\nabla g_N(x_N^*)=1.$$
Combining (3.11.1) with (3.11.2), we obtain
$$p_{k+1}\overline{w}>1\;\Rightarrow\;u_k^*=1\;\Rightarrow\;p_k=(\overline{w}+1)p_{k+1},\qquad(3.11.3)$$
$$p_{k+1}\overline{w}\leq1\;\Rightarrow\;u_k^*=0\;\Rightarrow\;p_k=p_{k+1}+1.\qquad(3.11.4)$$
So by backward induction starting from $p_N=1$, we can conclude the following optimal control results:
(1) If $\overline{w}>1$, then $u_0^*=\cdots=u_{N-1}^*=1$.
(2) If $0<\overline{w}<1/N$, then $u_0^*=\cdots=u_{N-1}^*=0$.
(3) If $1/N\leq\overline{w}\leq1$, then
$$u_0^*=\cdots=u_{N-\bar{k}-1}^*=1,\qquad u_{N-\bar{k}}^*=\cdots=u_{N-1}^*=0,$$
where $\bar{k}$ is such that $1/(\bar{k}+1)<\overline{w}\leq1/\bar{k}$. $\qquad\Box$
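The backward recursion (3.11.1)–(3.11.2) is easy to check by computer. The sketch below (parameter values are illustrative) computes the controls from the recursion and compares the resulting cost against brute-force enumeration of all $u\in\{0,1\}^N$ for small $N$.

```python
from itertools import product

def mp_controls(w, N):
    """Controls from the Minimum Principle recursion: p_N = 1, backward in k."""
    p = 1.0
    u = [0] * N
    for k in range(N - 1, -1, -1):
        u[k] = 1 if p * w > 1 else 0          # rule (3.11.1)
        p = (p * w - 1) * u[k] + p + 1        # adjoint update (3.11.2)
    return u

def cost(u, w, x0=1.0):
    """J(u) = x_N + sum_k (1 - u_k) x_k  with  x_{k+1} = x_k + w u_k x_k."""
    x, J = x0, 0.0
    for uk in u:
        J += (1 - uk) * x
        x *= 1 + w * uk
    return J + x

def brute_best(w, N):
    """Maximum cost over all binary control sequences of length N."""
    return max(cost(list(u), w, 1.0) for u in product([0, 1], repeat=N))
```

For example, with $N=5$ and $\overline{w}=0.5$ we have $1/3<\overline{w}\leq1/2$, so $\bar{k}=2$ and the recursion yields $u^*=(1,1,1,0,0)$, matching case (3).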
3.12 Use the discrete-time Minimum Principle to solve Exercise 1.15 of Chapter 1, assuming that each $\gamma_k$ and $\delta_k$ is fixed at a known deterministic value.