Dynamic Programming and Optimal Control: Chapter 3 Exercises

3.1 Solve the problem of Example 3.2.1 for the case where the cost function is
$$(x(T))^2+\int_0^T(u(t))^2\,dt.$$
Also, calculate the cost-to-go function $J^*(t,x)$ and verify that it satisfies the HJB equation.
Solution. The scalar system is $\dot x(t)=u(t)$ with the constraint $|u(t)|\leq 1$ for all $t\in[0,T]$.
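The solution above only sets up the problem. As a hedged numerical aside (not the author's derivation), one natural candidate for the cost-to-go in the region where the constraint $|u|\le 1$ is inactive is the unconstrained linear-quadratic form $J(t,x)=x^2/(1+T-t)$; the sketch below checks by finite differences that this candidate makes the HJB residual $\min_{|u|\le 1}\{u^2+\partial_t J+\partial_x J\,u\}$ vanish on a grid with $|x|<1+T-t$, and that $J(T,x)=x^2$. The form of $J$, the horizon, and the grids are assumptions for illustration only.

```python
import numpy as np

T = 1.0       # illustrative horizon
h = 1e-5      # finite-difference step

def J(t, x):
    # Candidate cost-to-go (assumed form, used only where |x| <= 1 + T - t)
    return x**2 / (1.0 + T - t)

us = np.linspace(-1.0, 1.0, 2001)   # control grid for the constraint |u| <= 1

max_residual = 0.0
for t in np.linspace(0.0, 0.9 * T, 10):
    for x in np.linspace(-0.5 * (1 + T - t), 0.5 * (1 + T - t), 11):
        Jt = (J(t + h, x) - J(t - h, x)) / (2 * h)   # partial J / partial t
        Jx = (J(t, x + h) - J(t, x - h)) / (2 * h)   # partial J / partial x
        residual = np.min(us**2 + Jt + Jx * us)      # HJB: min over u should be ~0
        max_residual = max(max_residual, abs(residual))

print("max |HJB residual| on the grid:", max_residual)
print("terminal condition J(T, 0.7) vs 0.7^2:", J(T, 0.7), 0.7**2)
```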

3.2 A young investor has earned in the stock market a large amount of money $S$ and plans to spend it so as to maximize his enjoyment through the rest of his life without working. He estimates that he will live exactly $T$ more years and that his capital $x(t)$ should be reduced to zero at time $T$, i.e., $x(T)=0$. Also, he models the evolution of his capital by the differential equation
$$\frac{dx(t)}{dt}=\alpha x(t)-u(t),$$
where $x(0)=S$ is his initial capital, $\alpha>0$ is a given interest rate, and $u(t)\ge 0$ is his rate of expenditure. The total enjoyment he will obtain is given by
$$\int_0^T e^{-\beta t}\sqrt{u(t)}\,dt.$$
Here $\beta$ is some positive scalar, which serves to discount future enjoyment. Find the optimal $\{u(t)\mid t\in[0,T]\}$.
Solution. We have
$$f(x,u)=\alpha x-u,\qquad g(x,u)=e^{-\beta t}\sqrt{u},$$
giving the Hamiltonian
$$H(x,u,p)=e^{-\beta t}\sqrt{u}+p(\alpha x-u),$$
and the adjoint equation is
$$\dot p(t)=-\alpha p(t),$$
yielding
$$p(t)=C_1e^{-\alpha t}\qquad\text{for some constant }C_1.$$
Notice that here $x(T)=0$ is prescribed, so the usual terminal condition $p(T)=\nabla h(x^*(T))=0$ no longer applies.
The optimal control is obtained by maximizing the Hamiltonian with respect to $u\ge 0$, yielding
$$u^*(t)=\arg\max_{u\ge 0}\left[e^{-\beta t}\sqrt{u}+C_1e^{-\alpha t}(\alpha x^*-u)\right]=\frac{e^{2(\alpha-\beta)t}}{4C_1^2},\qquad(3.2.1)$$
since setting the derivative with respect to $u$ to zero gives $e^{-\beta t}/(2\sqrt{u})=C_1e^{-\alpha t}$. Then, by the differential equation of the system, we get
$$\dot x^*(t)=\alpha x^*(t)-\frac{e^{2(\alpha-\beta)t}}{4C_1^2}.$$
Solving this equation, we obtain
$$x^*(t)=C_2e^{\alpha t}+\frac{e^{2(\alpha-\beta)t}}{4C_1^2(2\beta-\alpha)}\qquad\text{for some constant }C_2,$$
assuming $\alpha\neq 2\beta$. Together with the initial condition $x^*(0)=S$ and the final condition $x^*(T)=0$, we can determine the exact values of $C_1$ and $C_2$. So $u^*(t)$ in (3.2.1) gives the optimal control. $\qquad\Box$
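As a sanity check (not part of the original solution), here is a minimal numerical sketch that determines $C_1$ and $C_2$ from the boundary conditions $x^*(0)=S$ and $x^*(T)=0$ using the formulas above (assuming $\alpha\neq 2\beta$), and then integrates $\dot x=\alpha x-u^*(t)$ forward to confirm that the capital reaches zero at time $T$. The parameter values are purely illustrative.

```python
import numpy as np

# Illustrative parameters (assumptions, not from the exercise statement)
S, T, alpha, beta = 1.0, 10.0, 0.05, 0.10

# Write x*(t) = C2*exp(alpha*t) + D*exp(2*(alpha-beta)*t),
# where D = 1/(4*C1^2*(2*beta - alpha)).  The boundary conditions give:
C2 = S / (1.0 - np.exp((2 * beta - alpha) * T))   # from x*(0)=S and x*(T)=0
D = S - C2
C1 = np.sqrt(1.0 / (4.0 * D * (2 * beta - alpha)))

def u_star(t):
    """Optimal expenditure rate u*(t) from (3.2.1)."""
    return np.exp(2 * (alpha - beta) * t) / (4 * C1**2)

# Forward Euler integration of dx/dt = alpha*x - u*(t)
n = 200_000
dt = T / n
x = S
for k in range(n):
    x += dt * (alpha * x - u_star(k * dt))

print(f"x(T) = {x:.6f}  (should be close to 0)")
```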

3.9 Use the Minimum Principle to solve the linear-quadratic problem of Example 3.2.2.
Solution. The $n$-dimensional linear-quadratic system is given by
$$\dot x(t)=Ax(t)+Bu(t),$$
where $A$ and $B$ are given matrices, and the quadratic cost is
$$x(T)'Q_Tx(T)+\int_0^T\left(x(t)'Qx(t)+u(t)'Ru(t)\right)dt,$$
where the matrices $Q_T$ and $Q$ are symmetric positive semidefinite, and the matrix $R$ is symmetric positive definite.
The Hamiltonian here is
$$H(x,u,p)=x'Qx+u'Ru+p'(Ax+Bu),$$
and the adjoint equation is
$$\dot p(t)=-2Qx^*(t)-A'p(t)\qquad(1)$$
with the terminal condition
$$p(T)=\nabla h(x^*(T))=2Q_Tx^*(T).$$
The optimal control can be obtained by minimizing the Hamiltonian with respect to $u$, yielding
$$u^*(t)=\arg\min_{u}\left\{x^*(t)'Qx^*(t)+u'Ru+p'(Ax^*(t)+Bu)\right\}.$$
Since $\nabla_u\{x^*(t)'Qx^*(t)+u'Ru+p'(Ax^*(t)+Bu)\}=2Ru+B'p$, we get
$$u^*(t)=-\frac{1}{2}R^{-1}B'p(t),\qquad(2)$$
which, together with the system equation, leads to
$$\dot x^*(t)=Ax^*(t)-\frac{1}{2}BR^{-1}B'p(t).\qquad(3)$$
Equations (1) and (3) form a linear two-point boundary value problem in $(x^*,p)$. Postulating $p(t)=2K(t)x^*(t)$ for a symmetric matrix function $K(t)$ and substituting into (1) and (3) yields the Riccati equation
$$\dot K(t)=-K(t)A-A'K(t)+K(t)BR^{-1}B'K(t)-Q,\qquad K(T)=Q_T,$$
so that $u^*(t)=-R^{-1}B'K(t)x^*(t)$, which agrees with Example 3.2.2. $\qquad\Box$
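To make this concrete, here is a minimal sketch (not from the text) that integrates the Riccati equation above backward in time with a simple Euler scheme and forms the time-varying feedback gain $-R^{-1}B'K(t)$. The matrices $A$, $B$, $Q$, $R$, $Q_T$, the horizon, and the step count are illustrative assumptions.

```python
import numpy as np

# Illustrative problem data (assumptions, not taken from Example 3.2.2)
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
Q_T = np.eye(2)
T, n = 5.0, 5000
dt = T / n
Rinv = np.linalg.inv(R)

# Integrate K_dot = -K A - A' K + K B R^{-1} B' K - Q backward from K(T) = Q_T
K = Q_T.copy()
gains = [None] * (n + 1)
gains[n] = -Rinv @ B.T @ K
for i in range(n, 0, -1):
    K_dot = -K @ A - A.T @ K + K @ B @ Rinv @ B.T @ K - Q
    K = K - dt * K_dot               # one Euler step backward in time
    gains[i - 1] = -Rinv @ B.T @ K

# The optimal control at time t = i*dt is u*(t) = gains[i] @ x*(t)
print("K(0) =\n", K)
print("feedback gain at t = 0:", gains[0])
```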

3.11 Use the discrete-time Minimum Principle to solve Exercise 1.14 of Chapter 1, assuming that each $w_k$ is fixed at a known deterministic value.
Solution. Let $w_k=\overline{w}$ for some fixed number $\overline{w}>0$. The system is characterized by
$$x_{k+1}=f_k(x_k,u_k)=x_k+\overline{w}u_kx_k,$$
and the cost function becomes
$$J(u)=x_N+\sum_{k=0}^{N-1}(1-u_k)x_k.$$
Then the Hamiltonian function can be written as
$$H_k(x_k,u_k,p_{k+1})=(1-u_k)x_k+p_{k+1}(x_k+\overline{w}u_kx_k).$$
By the discrete-time Minimum Principle, for $k=0,1,\cdots,N-1$, we have
$$u_k^*=\arg\max_{u_k\in[0,1]}H_k(x_k^*,u_k,p_{k+1})=\arg\max_{u_k\in[0,1]}\left[(p_{k+1}\overline{w}-1)u_kx_k^*+(p_{k+1}+1)x_k^*\right]=\begin{cases}1,&\text{if }p_{k+1}\overline{w}>1,\\0,&\text{if }p_{k+1}\overline{w}\leq 1.\end{cases}\qquad(3.11.1)$$
On the other hand, for $k=0,1,\cdots,N-1$, the adjoint equation reads
$$p_k=\nabla_{x_k}H_k(x_k^*,u_k^*,p_{k+1})=(p_{k+1}\overline{w}-1)u_k^*+p_{k+1}+1,\qquad(3.11.2)$$
with the terminal condition $p_N=\nabla g_N(x_N^*)=1.$
Combining (3.11.1) with (3.11.2), we obtain the following implications:
$$p_{k+1}\overline{w}>1\;\Rightarrow\;u_k^*=1\;\Rightarrow\;p_k=(\overline{w}+1)p_{k+1},\qquad(3.11.3)$$
$$p_{k+1}\overline{w}\leq 1\;\Rightarrow\;u_k^*=0\;\Rightarrow\;p_k=p_{k+1}+1.\qquad(3.11.4)$$
So by induction, starting from $p_N=1$, we conclude the following optimal control results:
(1) If $\overline{w}>1$, then $u_0^*=\cdots=u_{N-1}^*=1$.
(2) If $0<\overline{w}<1/N$, then $u_0^*=\cdots=u_{N-1}^*=0$.
(3) If $1/N\leq\overline{w}\leq 1$, then $u_0^*=\cdots=u_{N-\bar{k}-1}^*=1$ and $u_{N-\bar{k}}^*=\cdots=u_{N-1}^*=0$, where $\bar{k}$ is such that $1/(\bar{k}+1)<\overline{w}\leq 1/\bar{k}$. $\qquad\Box$
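As a check (not part of the original solution), the following sketch runs the backward recursion (3.11.2)-(3.11.4) for $p_k$, extracts $u_k^*$ via (3.11.1), and compares the result with the closed-form threshold rule in cases (1)-(3) above; the values of $N$ and $\overline{w}$ are illustrative.

```python
import math

def minimum_principle_policy(N, w_bar):
    """Backward recursion for p_k, per (3.11.2)-(3.11.4), and the resulting u_k*."""
    p = [0.0] * (N + 1)
    u = [0] * N
    p[N] = 1.0                       # terminal condition p_N = 1
    for k in range(N - 1, -1, -1):
        if p[k + 1] * w_bar > 1:     # (3.11.3): invest everything
            u[k] = 1
            p[k] = (w_bar + 1) * p[k + 1]
        else:                        # (3.11.4): consume everything
            u[k] = 0
            p[k] = p[k + 1] + 1
    return u

def closed_form_policy(N, w_bar):
    """Cases (1)-(3): u_k* = 1 early on, then u_k* = 0 for the last k_bar stages."""
    if w_bar > 1:
        return [1] * N
    k_bar = min(N, math.floor(1.0 / w_bar))   # largest k with w_bar <= 1/k
    return [1] * (N - k_bar) + [0] * k_bar

# Illustrative values
N, w_bar = 10, 0.3
assert minimum_principle_policy(N, w_bar) == closed_form_policy(N, w_bar)
print(minimum_principle_policy(N, w_bar))
```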

3.12 Use the discrete-time Minimum Principle to solve Exercise 1.15 of Chapter 1, assuming that each $\gamma_k$ and $\delta_k$ is fixed at a known deterministic value.
