【Optimal Control (CMU 16-745)】Lecture 7 The Linear Quadratic Regulator Three Ways

Review:

  • Control history
  • Deterministic optimal control
  • Pontryagin’s principle

Lecture 7 The Linear Quadratic Regulator Three Ways

Overview

  • LQR intro
  • LQR via shooting
  • LQR as a QP
  • Riccati recursion

1. LQR problem

min ⁡ x 1 : N , u 1 : N − 1 J = ∑ k = 1 N − 1 ( 1 2 x k T Q k x k + 1 2 u k T R u k ) + 1 2 x N T Q N x N s.t.  x k + 1 = A k x k + B k u k , k = 1 , ⋯   , N − 1 Q k ⪰ 0 , R ≻ 0 \begin{aligned} \min_{\mathbf{x}_{1:N}, \mathbf{u}_{1:N-1}} &J = \sum_{k=1}^{N-1} \left(\frac{1}{2}\mathbf{x}_k^T \mathbf{Q}_k \mathbf{x}_k + \frac{1}{2}\mathbf{u}_k^T \mathbf{R} \mathbf{u}_k \right)+ \frac{1}{2}\mathbf{x}_N^T \mathbf{Q}_N \mathbf{x}_N \\ \text{s.t. } &\mathbf{x}_{k+1} = \mathbf{A}_k \mathbf{x}_k + \mathbf{B}_k \mathbf{u}_k, \quad k = 1, \cdots, N-1 \\ &\mathbf{Q}_k \succeq 0, \quad \mathbf{R} \succ 0 \end{aligned} x1:N,u1:N1mins.t. J=k=1N1(21xkTQkxk+21ukTRuk)+21xNTQNxNxk+1=Akxk+Bkuk,k=1,,N1Qk0,R0

R \mathbf{R} R should be positive definite, otherwise the control input will be infinite. But Q k \mathbf{Q}_k Qk can be positive semidefinite.

  • Can (locally) approximate many nonlinear problems.
  • Computational tractable
  • Many extensions (e.g. infinite horizon, stochastic, etc.)
  • Time invariant: A k = A \mathbf{A}_k = \mathbf{A} Ak=A, B k = B \mathbf{B}_k = \mathbf{B} Bk=B, Q k = Q \mathbf{Q}_k = \mathbf{Q} Qk=Q, R k = R \mathbf{R}_k = \mathbf{R} Rk=R, ∀ k \forall k k.

2. LQR with indirect shooting

x k + 1 = ∇ λ H ( x k , u k , λ k + 1 ) = A x k + B u k λ k = ∇ x H ( x k , u k , λ k + 1 ) = Q x k + A ⊤ λ k + 1 λ N = ∂ ℓ F ∂ x N = Q N x N u k = arg min ⁡ u ~ H ( x k , u ~ , λ k + 1 ) = − R − 1 B ⊤ λ k + 1 \begin{aligned} &\mathbf{x}_{k+1} = \nabla_\lambda \mathcal{H}(\mathbf{x}_k, \mathbf{u}_k, \lambda_{k+1})= \mathbf{A} \mathbf{x}_k + \mathbf{B} \mathbf{u}_k\\ &\lambda_k = \nabla_{\mathbf{x}} \mathcal{H}(\mathbf{x}_k, \mathbf{u}_k, \lambda_{k+1})= \mathbf{Q} \mathbf{x}_k + \mathbf{A}^\top \lambda_{k+1}\\ &\lambda_N = \frac{\partial \ell_F}{\partial \mathbf{x}_N} = \mathbf{Q}_N \mathbf{x}_N\\ &\mathbf{u}_k = \argmin_{\tilde{\mathbf{u}}} \mathcal{H}(\mathbf{x}_k, \tilde{\mathbf{u}}, \lambda_{k+1}) = -\mathbf{R}^{-1} \mathbf{B}^\top \lambda_{k+1} \end{aligned} xk+1=λH(xk,uk,λk+1)=Axk+Bukλk=xH(xk,uk,λk+1)=Qxk+Aλk+1λN=xNF=QNxNuk=u~argminH(xk,u~,λk+1)=R1Bλk+1

Note:
H ( x k , u ~ , λ k + 1 ) = 1 2 x k ⊤ Q x k + 1 2 u ~ ⊤ R u ~ + λ k + 1 ⊤ ( A x k + B u ~ ) \mathcal{H}\left(\mathbf{x}_k, \tilde{\mathbf{u}}, \lambda_{k+1}\right) = \frac{1}{2}\mathbf{x}_k^\top \mathbf{Q} \mathbf{x}_k + \frac{1}{2}\tilde{\mathbf{u}}^\top \mathbf{R} \tilde{\mathbf{u}} + \lambda_{k+1}^\top \left(\mathbf{A} \mathbf{x}_k + \mathbf{B} \tilde{\mathbf{u}} \right) H(xk,u~,λk+1)=21xkQxk+21u~Ru~+λk+1(Axk+Bu~)
To minimize H \mathcal{H} H, we take derivative w.r.t. u ~ \tilde{\mathbf{u}} u~ and set it to zero:
∂ H ∂ u ~ = R u ~ + B ⊤ λ k + 1 = 0 ⇒ u ~ = − R − 1 B ⊤ λ k + 1 \frac{\partial \mathcal{H}}{\partial \tilde{\mathbf{u}}} = \mathbf{R} \tilde{\mathbf{u}} + \mathbf{B}^\top \lambda_{k+1} = 0 \Rightarrow \tilde{\mathbf{u}} = -\mathbf{R}^{-1} \mathbf{B}^\top \lambda_{k+1} u~H=Ru~+Bλk+1=0u~=R1Bλk+1

(1) Procedure

i. Start with an initial guess u 1 : N − 1 \mathbf{u}_{1:N-1} u1:N1 trajectory
ii. Simulate (“rollout”) to get x 1 : N \mathbf{x}_{1:N} x1:N
iii. Backward pass to get λ \lambda λ and Δ u \Delta \mathbf{u} Δu
iv. Rollout with line search on Δ u \Delta \mathbf{u} Δu
v. Go to iii until convergence

(2) Example: double integrator

Dynamics:
x ˙ = [ q ˙ q ¨ ] = [ 0 1 0 0 ] [ q q ˙ ] + [ 0 1 ] u \dot{\mathbf{x}} = \begin{bmatrix} \dot{\mathbf{q}}\\ \ddot{\mathbf{q}} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \mathbf{q}\\ \dot{\mathbf{q}} \end{bmatrix} + \begin{bmatrix} 0\\ 1 \end{bmatrix} \mathbf{u} x˙=[q˙q¨]=[0010][qq˙]+[01]u

Think of this as a brick sliding on ice (no friction).

Discretize with Euler’s method:
x k + 1 = [ 1 h 0 1 ] x k + [ h 2 2 h ] u k = A x k + B u k \mathbf{x}_{k+1} = \begin{bmatrix} 1 & h\\ 0 & 1 \end{bmatrix} \mathbf{x}_k + \begin{bmatrix} \frac{h^2}{2}\\ h \end{bmatrix} \mathbf{u}_k = \mathbf{A} \mathbf{x}_k + \mathbf{B} \mathbf{u}_k xk+1=[10h1]xk+[2h2h]uk=Axk+Buk

Code:

using LinearAlgebra
using PyPlot

# Discrete dynamics
h = 0.1   # time step
A = [1 h; 0 1]
B = [0.5*h*h; h]

n = 2     # number of state
m = 1     # number of controls
Tfinal = 5.0 # final time #try larger values
N = Int(Tfinal/h)+1    # number of time steps
thist = Array(range(0,h*(N-1), step=h));

# Initial conditions
x0 = [1.0; 0] 

# Cost weights
Q = 1.0*I(2)
R = 0.1
Qn = 1.0*I(2)

# Cost function
function J(xhist,uhist)
    cost = 0.5*xhist[:,end]'*Qn*xhist[:,end]
    for k = 1:(N-1)
        cost = cost + 0.5*xhist[:,k]'*Q*xhist[:,k] + 0.5*uhist[k]'*R*uhist[k]
    end
    return cost
end

# Dynamics
function rollout(xhist, uhist)
    xnew = zeros(size(xhist))
    xnew[:,1] = xhist[:,1]
    for k = 1:(N-1)
        xnew[:,k+1] .= A*xnew[:,k] + B*uhist[k]
    end
    return xnew
end

# Initial guess
xhist = repeat(x0, 1, N)
uhist = zeros(N-1)
Δu = ones(N-1)
λhist = zeros(n,N)
xhist = rollout(xhist, uhist) #initial rollout to get state trajectory

# J(xhist,uhist) #Initial cost

b = 1e-2 #line search tolerance
α = 1.0
iter = 0
while maximum(abs.(Δu[:])) > 1e-2 #terminate when the gradient is small
    
    #Backward pass to compute λ and Δu
    λhist[:,N] .= Qn*xhist[:,N]
    for k = N-1:-1:1
        Δu[k] = -(uhist[k]+R\B'*λhist[:,k+1])
        λhist[:,k] .= Q*xhist[:,k] + A'*λhist[:,k+1]
    end
    
    #Forward pass with line search to compute x
    α = 1.0
    unew = uhist + α.*Δu
    xnew = rollout(xhist, unew)
    while J(xnew, unew) > J(xhist, uhist) - b*α*Δu[:]'*Δu[:]
        α = 0.5*α
        unew = uhist + α.*Δu
        xnew = rollout(xhist, unew)
    end
    uhist .= unew;
    xhist .= xnew;
    iter += 1
end

# Plot x1 vs. x2, u vs. t, x vs. t, etc.
clf()
figure(figsize=(12,3.5))
subplot(1,2,1)
plot(thist, xhist[1,:], label="Position")
plot(thist, xhist[2,:], label="Velocity")
xlabel("Time")
title("State Trajectory")
legend()

subplot(1,2,2)
plot(thist[1:end-1], uhist, label="Control Input")
xlabel("Time")
title("Control Input")
legend()

display(gcf())
savefig("lqr-shooting.png", dpi=300, bbox_inches="tight")

Result:
在这里插入图片描述

Set the final time to 10.0s, we get:
在这里插入图片描述

The solutions are similar, but a longer time horizon needs more iterations to converge (2415 vs. 664). Its time complexity is linear to the time horizon.

3. LQR as a QP

(1) Redefine some variables

Assume initial state x 1 \mathbf{x}_1 x1 is given (not a decision variable).
Define z = [ u 1 x 2 u 2 x 3 ⋮ x N ] \mathbf{z} = \begin{bmatrix} \mathbf{u}_1\\ \mathbf{x}_2\\ \mathbf{u}_2\\ \mathbf{x}_3\\ \vdots\\ \mathbf{x}_N \end{bmatrix} z= u1x2u2x3xN .
Define cost matrix:
H = d i a g ( R 1 , Q 2 , R 2 , Q 3 , ⋯   , Q N ) , \mathbf{H} = \mathrm{diag}(\mathbf{R}_1, \mathbf{Q}_2, \mathbf{R}_2, \mathbf{Q}_3, \cdots, \mathbf{Q}_N), H=diag(R1,Q2,R2,Q3,,QN), such that
J = 1 2 z ⊤ H z J = \frac{1}{2}\mathbf{z}^\top \mathbf{H} \mathbf{z} J=21zHz
Do the same for dynamics:
Define C \mathbf{C} C and d \mathbf{d} d:
C = [ B − I 0 0 ⋯ 0 0 0 0 A B − I ⋯ 0 0 0 ⋮ ⋮ ⋮ ⋮ ⋱ ⋮ ⋮ ⋮ 0 0 0 0 ⋯ A B − I ] , d = [ − A x 1 0 ⋮ 0 ] \mathbf{C} = \begin{bmatrix} \mathbf{B} & -\mathbf{I} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0}& \mathbf{0}& \mathbf{0}\\ \mathbf{0} & \mathbf{A} & \mathbf{B} & -\mathbf{I} & \cdots & \mathbf{0}& \mathbf{0}& \mathbf{0}\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{A}& \mathbf{B}& -\mathbf{I} \end{bmatrix}, \mathbf{d} = \begin{bmatrix} -\mathbf{A}\mathbf{x}_1\\ \mathbf{0}\\ \vdots\\ \mathbf{0} \end{bmatrix} C= B00IA00B00I000A00B00I ,d= Ax100

So the dynamics constraint can be written as:
C z = d \mathbf{C}\mathbf{z} = \mathbf{d} Cz=d

(2) QP formulation and its closed-form solution

Now we can write the LQR problem as a QP:
min ⁡ z    J = 1 2 z ⊤ H z s.t.  C z = d \begin{aligned} \min_{\mathbf{z}} \; &J = \frac{1}{2}\mathbf{z}^\top \mathbf{H} \mathbf{z}\\ \text{s.t. } &\mathbf{C}\mathbf{z} = \mathbf{d} \end{aligned} zmins.t. J=21zHzCz=d

The Lagrangian is:
L = 1 2 z ⊤ H z + λ ⊤ ( C z − d ) \mathcal{L} = \frac{1}{2}\mathbf{z}^\top \mathbf{H} \mathbf{z} + \lambda^\top (\mathbf{C}\mathbf{z} - \mathbf{d}) L=21zHz+λ(Czd)

KKT conditions:
∇ z L = H z + C ⊤ λ = 0 ∇ λ L = C z − d = 0 ⇒ [ H C ⊤ C 0 ] [ z λ ] = [ 0 d ] \begin{aligned} \nabla_\mathbf{z} \mathcal{L} &= \mathbf{H}\mathbf{z} + \mathbf{C}^\top \lambda = \mathbf{0}\\ \nabla_\lambda \mathcal{L} &= \mathbf{C}\mathbf{z} - \mathbf{d} = \mathbf{0} \end{aligned} \Rightarrow \begin{bmatrix} \mathbf{H} & \mathbf{C}^\top\\ \mathbf{C} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{z}\\ \lambda \end{bmatrix} = \begin{bmatrix} \mathbf{0}\\ \mathbf{d} \end{bmatrix} zLλL=Hz+Cλ=0=Czd=0[HCC0][zλ]=[0d]

We get the exact solution by solving one linear system.

(3) Example: double integrator

Code:

using LinearAlgebra
using PyPlot
using SparseArrays

# Discrete dynamics
h = 0.1   # time step
A = [1 h; 0 1]
B = [0.5*h*h; h]

n = 2     # number of state
m = 1     # number of controls
Tfinal = 100.0 # final time
N = Int(Tfinal/h)+1    # number of time steps
thist = Array(range(0,h*(N-1), step=h));

# Initial conditions
x0 = [1.0; 0]

# Cost weights
Q = sparse(1.0*I(2))
R = sparse(0.1*I(1))
Qn = sparse(1.0*I(2))

#Cost function
function J(xhist,uhist)
    cost = 0.5*xhist[:,end]'*Qn*xhist[:,end]
    for k = 1:(N-1)
        cost = cost + 0.5*xhist[:,k]'*Q*xhist[:,k] + 0.5*(uhist[k]'*R*uhist[k])[1]
    end
    return cost
end

# Cost
H = blockdiag(R, kron(I(N-2), blockdiag(Q,R)), Qn);

# Constraints
C = kron(I(N-1), [B -I(2)])
for k = 1:N-2
    C[(k*n).+(1:n), (k*(n+m)-n).+(1:n)] .= A
end
d = [-A*x0; zeros(size(C,1)-n)];

# Solve the linear system
y = [H C'; C zeros(size(C,1),size(C,1))]$$zeros(size(H,1)); d]

# Get state history
z = y[1:size(H,1)]   # states and controls [u0,x1,u1,...,xN]
Z = reshape(z,n+m,N-1)
xhist = Z[m+1:end,:]
xhist = [x0 xhist]

# Get control history
uhist = Z[1,:];

# Plot x1 vs. x2, u vs. t, x vs. t, etc.
times = range(0,h*(N-1), step=h)

clf()
figure(figsize=(12,3.5))
subplot(1,2,1)
plot(times, xhist[1,:], label="Position")
plot(times, xhist[2,:], label="Velocity")
xlabel("Time")
title("State Trajectory")
legend()

subplot(1,2,2)
plot(times[1:end-1], uhist, label="Control Input")
xlabel("Time")
title("Control Input")
legend()

display(gcf())
savefig("lqr-qp.png", dpi=300, bbox_inches="tight")

Result:
在这里插入图片描述

Compute time is much shorter than shooting method, and a longer time horizon is available.

4. A closer look at the LQR QP

(1) KKT system

The KKT system for LQR is very sparse (lots of zeros) and has a lot of structure.
[ R B ⊤ Q − I A ⊤ R B ⊤ Q − I A ⊤ R B ⊤ Q N − I B − I A B − I 0 A B − I ] [ u 1 x 2 u 2 x 3 u 3 x 4 λ 2 λ 3 λ 4 ] = [ 0 0 0 0 0 0 − A x 1 0 0 ] \left[ \begin{array}{c} \begin{matrix} \mathbf{R}& & & & & & \mathbf{B}^{\top}& & \\ & \mathbf{Q}& & & & & -\mathbf{I}& \mathbf{A}^{\top}& \\ & & \mathbf{R}& & & & & \mathbf{B}^{\top}& \\ & & & \mathbf{Q}& & & & -\mathbf{I}& \mathbf{A}^{\top}\\ & & & & \mathbf{R}& & & & \mathbf{B}^{\top}\\ & & & & & \mathbf{Q}_N& & & -\mathbf{I}\\ \mathbf{B}& -\mathbf{I}& & & & & & & \\ & \mathbf{A}& \mathbf{B}& -\mathbf{I}& & & & \mathbf{0}& \\ & & & \mathbf{A}& \mathbf{B}& -\mathbf{I}& & & \\ \end{matrix}\\ \end{array} \right] \left[ \begin{array}{c} \begin{matrix} \mathbf{u}_1\\ \mathbf{x}_2\\ \mathbf{u}_2\\ \mathbf{x}_3\\ \mathbf{u}_3\\ \mathbf{x}_4\\ \lambda_2\\ \lambda_3\\ \lambda_4 \end{matrix}\\ \end{array} \right] = \left[ \begin{array}{c} \begin{matrix} \mathbf{0}\\ \mathbf{0}\\ \mathbf{0}\\ \mathbf{0}\\ \mathbf{0}\\ \mathbf{0}\\ -\mathbf{A}\mathbf{x}_1\\ \mathbf{0}\\ \mathbf{0}\\ \end{matrix}\\ \end{array} \right] RBQIARBQIARBQNIBIABI0ABI u1x2u2x3u3x4λ2λ3λ4 = 000000Ax100

(2) Solving procedure (Riccati recursion)

i. Q N x 4 − λ 4 = 0 ⇒ λ 4 = Q N x 4 \mathbf{Q}_N \mathbf{x}_4 - \lambda_4 = \mathbf{0} \Rightarrow \lambda_4 = \mathbf{Q}_N \mathbf{x}_4 QNx4λ4=0λ4=QNx4
ii. Substitute λ 4 \lambda_4 λ4 into the last equation:
R u 3 + B ⊤ λ 4 = R u 3 + B ⊤ Q N x 4 = 0 \mathbf{R}\mathbf{u}_3 + \mathbf{B}^\top \lambda_4 = \mathbf{R}\mathbf{u}_3 + \mathbf{B}^\top \mathbf{Q}_N \mathbf{x}_4 = \mathbf{0} Ru3+Bλ4=Ru3+BQNx4=0
Plug in dynamics for x 4 \mathbf{x}_4 x4:
R u 3 + B ⊤ Q N ( A x 3 + B u 3 ) = 0 \mathbf{R}\mathbf{u}_3 + \mathbf{B}^\top \mathbf{Q}_N \left(\mathbf{A}\mathbf{x}_3 + \mathbf{B}\mathbf{u}_3\right) = \mathbf{0} Ru3+BQN(Ax3+Bu3)=0
⇒ u 3 = − ( R + B ⊤ Q N B ) − 1 B ⊤ Q N A x 3 ≜ − K 3 x 3 \Rightarrow \mathbf{u}_3 = -\left(\mathbf{R} + \mathbf{B}^\top \mathbf{Q}_N \mathbf{B}\right)^{-1} \mathbf{B}^\top \mathbf{Q}_N \mathbf{A}\mathbf{x}_3 \triangleq -\mathbf{K}_3 \mathbf{x}_3 u3=(R+BQNB)1BQNAx3K3x3
iii. Substitute u 3 \mathbf{u}_3 u3 into the fourth equation:
Q x 3 − λ 3 + A ⊤ λ 4 = 0 \mathbf{Q}\mathbf{x}_3 -\lambda_3 + \mathbf{A}^\top \lambda_4 = \mathbf{0} Qx3λ3+Aλ4=0
Plug in λ 4 \lambda_4 λ4: Q x 3 − λ 3 + A ⊤ Q N x 4 = 0 \mathbf{Q}\mathbf{x}_3 -\lambda_3 + \mathbf{A}^\top \mathbf{Q}_N \mathbf{x}_4 = \mathbf{0} Qx3λ3+AQNx4=0
Plug in dynamics: Q x 3 − λ 3 + A ⊤ Q N ( A x 3 + B u 3 ) = 0 \mathbf{Q}\mathbf{x}_3 -\lambda_3 + \mathbf{A}^\top \mathbf{Q}_N \left(\mathbf{A}\mathbf{x}_3 + \mathbf{B}\mathbf{u}_3\right) = \mathbf{0} Qx3λ3+AQN(Ax3+Bu3)=0
Plug in u 3 \mathbf{u}_3 u3: Q x 3 − λ 3 + A ⊤ Q N ( A x 3 − B K 3 x 3 ) = 0 \mathbf{Q}\mathbf{x}_3 -\lambda_3 + \mathbf{A}^\top \mathbf{Q}_N \left(\mathbf{A}\mathbf{x}_3 - \mathbf{B}\mathbf{K}_3 \mathbf{x}_3\right) = \mathbf{0} Qx3λ3+AQN(Ax3BK3x3)=0
⇒ λ 3 = ( Q + A ⊤ Q N ( A − B K 3 ) ) x 3 ≜ P 3 x 2 \Rightarrow \lambda_3 = \left(\mathbf{Q} + \mathbf{A}^\top \mathbf{Q}_N \left(\mathbf{A} - \mathbf{B}\mathbf{K}_3\right)\right) \mathbf{x}_3 \triangleq \mathbf{P}_3 \mathbf{x}_2 λ3=(Q+AQN(ABK3))x3P3x2

Now we have a recursion for K k \mathbf{K}_k Kk and P k \mathbf{P}_k Pk:
P N = Q N , K k = ( R + B ⊤ P k + 1 B ) − 1 B ⊤ P k + 1 A , P k = Q + A ⊤ P k + 1 ( A − B K k ) \begin{aligned} &\mathbf{P}_N = \mathbf{Q}_N, \\ &\mathbf{K}_k = \left(\mathbf{R} + \mathbf{B}^\top \mathbf{P}_{k+1} \mathbf{B}\right)^{-1} \mathbf{B}^\top \mathbf{P}_{k+1} \mathbf{A}, \\ &\mathbf{P}_k = \mathbf{Q} + \mathbf{A}^\top \mathbf{P}_{k+1} \left(\mathbf{A} - \mathbf{B}\mathbf{K}_k\right) \end{aligned} PN=QN,Kk=(R+BPk+1B)1BPk+1A,Pk=Q+APk+1(ABKk)

  • This is called the Riccati recursion/equation. We can solve the QP by doing a backward Riccati recursion followed by a forward rollout to compute x 1 : N \mathbf{x}_{1:N} x1:N and u 1 : N − 1 \mathbf{u}_{1:N-1} u1:N1 form initial conditions.

  • General (dense) QP has complexity O ( N 3 ( n + m ) 3 ) \mathcal{O}(N^3(n+m)^3) O(N3(n+m)3) ( N N N is the time horizon, n n n is the state dimension, m m m is the control dimension).

  • Riccati recursion has complexity O ( N ( n + m ) 3 ) \mathcal{O}(N(n+m)^3) O(N(n+m)3).

  • Even more important: we now have a feedback policy u k = − K k x k \mathbf{u}_k = -\mathbf{K}_k \mathbf{x}_k uk=Kkxk instead of an open-loop trajectory.

(3) Example: double integrator

Code:

P = zeros(n,n,N)
K = zeros(m,n,N-1)

P[:,:,N] .= Qn

#Backward Riccati recursion
for k = (N-1):-1:1
    K[:,:,k] .= (R + B'*P[:,:,k+1]*B)\(B'*P[:,:,k+1]*A)
    P[:,:,k] .= Q + A'*P[:,:,k+1]*(A-B*K[:,:,k])
end

#Forward rollout starting at x0
xhist = zeros(n,N)
xhist[:,1] = x0
uhist = zeros(m,N-1)
for k = 1:(N-1)
    uhist[:,k] .= -K[:,:,k]*xhist[:,k]
    xhist[:,k+1] .= A*xhist[:,k] + B*uhist[k]
end
  • Plot the control feedback, we find that the feedback matrix K \mathbf{K} K converges to a constant value.
    在这里插入图片描述

  • Using Julia’s dlqr function, we get the feedback matrix, which is the same as the converged value.

  • Therefore, when we want to stabilize the system, we can use a constant feedback matrix (treat it as an infinite horizon LQR).

  • 1
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值