Review:
- Control history
- Deterministic optimal control
- Pontryagin’s principle
Lecture 7 The Linear Quadratic Regulator Three Ways
Overview
- LQR intro
- LQR via shooting
- LQR as a QP
- Riccati recursion
1. LQR problem
min x 1 : N , u 1 : N − 1 J = ∑ k = 1 N − 1 ( 1 2 x k T Q k x k + 1 2 u k T R u k ) + 1 2 x N T Q N x N s.t. x k + 1 = A k x k + B k u k , k = 1 , ⋯ , N − 1 Q k ⪰ 0 , R ≻ 0 \begin{aligned} \min_{\mathbf{x}_{1:N}, \mathbf{u}_{1:N-1}} &J = \sum_{k=1}^{N-1} \left(\frac{1}{2}\mathbf{x}_k^T \mathbf{Q}_k \mathbf{x}_k + \frac{1}{2}\mathbf{u}_k^T \mathbf{R} \mathbf{u}_k \right)+ \frac{1}{2}\mathbf{x}_N^T \mathbf{Q}_N \mathbf{x}_N \\ \text{s.t. } &\mathbf{x}_{k+1} = \mathbf{A}_k \mathbf{x}_k + \mathbf{B}_k \mathbf{u}_k, \quad k = 1, \cdots, N-1 \\ &\mathbf{Q}_k \succeq 0, \quad \mathbf{R} \succ 0 \end{aligned} x1:N,u1:N−1mins.t. J=k=1∑N−1(21xkTQkxk+21ukTRuk)+21xNTQNxNxk+1=Akxk+Bkuk,k=1,⋯,N−1Qk⪰0,R≻0
R \mathbf{R} R should be positive definite, otherwise the control input will be infinite. But Q k \mathbf{Q}_k Qk can be positive semidefinite.
- Can (locally) approximate many nonlinear problems.
- Computational tractable
- Many extensions (e.g. infinite horizon, stochastic, etc.)
- Time invariant: A k = A \mathbf{A}_k = \mathbf{A} Ak=A, B k = B \mathbf{B}_k = \mathbf{B} Bk=B, Q k = Q \mathbf{Q}_k = \mathbf{Q} Qk=Q, R k = R \mathbf{R}_k = \mathbf{R} Rk=R, ∀ k \forall k ∀k.
2. LQR with indirect shooting
x k + 1 = ∇ λ H ( x k , u k , λ k + 1 ) = A x k + B u k λ k = ∇ x H ( x k , u k , λ k + 1 ) = Q x k + A ⊤ λ k + 1 λ N = ∂ ℓ F ∂ x N = Q N x N u k = arg min u ~ H ( x k , u ~ , λ k + 1 ) = − R − 1 B ⊤ λ k + 1 \begin{aligned} &\mathbf{x}_{k+1} = \nabla_\lambda \mathcal{H}(\mathbf{x}_k, \mathbf{u}_k, \lambda_{k+1})= \mathbf{A} \mathbf{x}_k + \mathbf{B} \mathbf{u}_k\\ &\lambda_k = \nabla_{\mathbf{x}} \mathcal{H}(\mathbf{x}_k, \mathbf{u}_k, \lambda_{k+1})= \mathbf{Q} \mathbf{x}_k + \mathbf{A}^\top \lambda_{k+1}\\ &\lambda_N = \frac{\partial \ell_F}{\partial \mathbf{x}_N} = \mathbf{Q}_N \mathbf{x}_N\\ &\mathbf{u}_k = \argmin_{\tilde{\mathbf{u}}} \mathcal{H}(\mathbf{x}_k, \tilde{\mathbf{u}}, \lambda_{k+1}) = -\mathbf{R}^{-1} \mathbf{B}^\top \lambda_{k+1} \end{aligned} xk+1=∇λH(xk,uk,λk+1)=Axk+Bukλk=∇xH(xk,uk,λk+1)=Qxk+A⊤λk+1λN=∂xN∂ℓF=QNxNuk=u~argminH(xk,u~,λk+1)=−R−1B⊤λk+1
Note:
H ( x k , u ~ , λ k + 1 ) = 1 2 x k ⊤ Q x k + 1 2 u ~ ⊤ R u ~ + λ k + 1 ⊤ ( A x k + B u ~ ) \mathcal{H}\left(\mathbf{x}_k, \tilde{\mathbf{u}}, \lambda_{k+1}\right) = \frac{1}{2}\mathbf{x}_k^\top \mathbf{Q} \mathbf{x}_k + \frac{1}{2}\tilde{\mathbf{u}}^\top \mathbf{R} \tilde{\mathbf{u}} + \lambda_{k+1}^\top \left(\mathbf{A} \mathbf{x}_k + \mathbf{B} \tilde{\mathbf{u}} \right) H(xk,u~,λk+1)=21xk⊤Qxk+21u~⊤Ru~+λk+1⊤(Axk+Bu~)
To minimize H \mathcal{H} H, we take derivative w.r.t. u ~ \tilde{\mathbf{u}} u~ and set it to zero:
∂ H ∂ u ~ = R u ~ + B ⊤ λ k + 1 = 0 ⇒ u ~ = − R − 1 B ⊤ λ k + 1 \frac{\partial \mathcal{H}}{\partial \tilde{\mathbf{u}}} = \mathbf{R} \tilde{\mathbf{u}} + \mathbf{B}^\top \lambda_{k+1} = 0 \Rightarrow \tilde{\mathbf{u}} = -\mathbf{R}^{-1} \mathbf{B}^\top \lambda_{k+1} ∂u~∂H=Ru~+B⊤λk+1=0⇒u~=−R−1B⊤λk+1
(1) Procedure
i. Start with an initial guess
u
1
:
N
−
1
\mathbf{u}_{1:N-1}
u1:N−1 trajectory
ii. Simulate (“rollout”) to get
x
1
:
N
\mathbf{x}_{1:N}
x1:N
iii. Backward pass to get
λ
\lambda
λ and
Δ
u
\Delta \mathbf{u}
Δu
iv. Rollout with line search on
Δ
u
\Delta \mathbf{u}
Δu
v. Go to iii until convergence
(2) Example: double integrator
Dynamics:
x
˙
=
[
q
˙
q
¨
]
=
[
0
1
0
0
]
[
q
q
˙
]
+
[
0
1
]
u
\dot{\mathbf{x}} = \begin{bmatrix} \dot{\mathbf{q}}\\ \ddot{\mathbf{q}} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \mathbf{q}\\ \dot{\mathbf{q}} \end{bmatrix} + \begin{bmatrix} 0\\ 1 \end{bmatrix} \mathbf{u}
x˙=[q˙q¨]=[0010][qq˙]+[01]u
Think of this as a brick sliding on ice (no friction).
Discretize with Euler’s method:
x
k
+
1
=
[
1
h
0
1
]
x
k
+
[
h
2
2
h
]
u
k
=
A
x
k
+
B
u
k
\mathbf{x}_{k+1} = \begin{bmatrix} 1 & h\\ 0 & 1 \end{bmatrix} \mathbf{x}_k + \begin{bmatrix} \frac{h^2}{2}\\ h \end{bmatrix} \mathbf{u}_k = \mathbf{A} \mathbf{x}_k + \mathbf{B} \mathbf{u}_k
xk+1=[10h1]xk+[2h2h]uk=Axk+Buk
Code:
using LinearAlgebra
using PyPlot
# Discrete dynamics
h = 0.1 # time step
A = [1 h; 0 1]
B = [0.5*h*h; h]
n = 2 # number of state
m = 1 # number of controls
Tfinal = 5.0 # final time #try larger values
N = Int(Tfinal/h)+1 # number of time steps
thist = Array(range(0,h*(N-1), step=h));
# Initial conditions
x0 = [1.0; 0]
# Cost weights
Q = 1.0*I(2)
R = 0.1
Qn = 1.0*I(2)
# Cost function
function J(xhist,uhist)
cost = 0.5*xhist[:,end]'*Qn*xhist[:,end]
for k = 1:(N-1)
cost = cost + 0.5*xhist[:,k]'*Q*xhist[:,k] + 0.5*uhist[k]'*R*uhist[k]
end
return cost
end
# Dynamics
function rollout(xhist, uhist)
xnew = zeros(size(xhist))
xnew[:,1] = xhist[:,1]
for k = 1:(N-1)
xnew[:,k+1] .= A*xnew[:,k] + B*uhist[k]
end
return xnew
end
# Initial guess
xhist = repeat(x0, 1, N)
uhist = zeros(N-1)
Δu = ones(N-1)
λhist = zeros(n,N)
xhist = rollout(xhist, uhist) #initial rollout to get state trajectory
# J(xhist,uhist) #Initial cost
b = 1e-2 #line search tolerance
α = 1.0
iter = 0
while maximum(abs.(Δu[:])) > 1e-2 #terminate when the gradient is small
#Backward pass to compute λ and Δu
λhist[:,N] .= Qn*xhist[:,N]
for k = N-1:-1:1
Δu[k] = -(uhist[k]+R\B'*λhist[:,k+1])
λhist[:,k] .= Q*xhist[:,k] + A'*λhist[:,k+1]
end
#Forward pass with line search to compute x
α = 1.0
unew = uhist + α.*Δu
xnew = rollout(xhist, unew)
while J(xnew, unew) > J(xhist, uhist) - b*α*Δu[:]'*Δu[:]
α = 0.5*α
unew = uhist + α.*Δu
xnew = rollout(xhist, unew)
end
uhist .= unew;
xhist .= xnew;
iter += 1
end
# Plot x1 vs. x2, u vs. t, x vs. t, etc.
clf()
figure(figsize=(12,3.5))
subplot(1,2,1)
plot(thist, xhist[1,:], label="Position")
plot(thist, xhist[2,:], label="Velocity")
xlabel("Time")
title("State Trajectory")
legend()
subplot(1,2,2)
plot(thist[1:end-1], uhist, label="Control Input")
xlabel("Time")
title("Control Input")
legend()
display(gcf())
savefig("lqr-shooting.png", dpi=300, bbox_inches="tight")
Result:
Set the final time to 10.0s, we get:
The solutions are similar, but a longer time horizon needs more iterations to converge (2415 vs. 664). Its time complexity is linear to the time horizon.
3. LQR as a QP
(1) Redefine some variables
Assume initial state
x
1
\mathbf{x}_1
x1 is given (not a decision variable).
Define
z
=
[
u
1
x
2
u
2
x
3
⋮
x
N
]
\mathbf{z} = \begin{bmatrix} \mathbf{u}_1\\ \mathbf{x}_2\\ \mathbf{u}_2\\ \mathbf{x}_3\\ \vdots\\ \mathbf{x}_N \end{bmatrix}
z=
u1x2u2x3⋮xN
.
Define cost matrix:
H
=
d
i
a
g
(
R
1
,
Q
2
,
R
2
,
Q
3
,
⋯
,
Q
N
)
,
\mathbf{H} = \mathrm{diag}(\mathbf{R}_1, \mathbf{Q}_2, \mathbf{R}_2, \mathbf{Q}_3, \cdots, \mathbf{Q}_N),
H=diag(R1,Q2,R2,Q3,⋯,QN), such that
J
=
1
2
z
⊤
H
z
J = \frac{1}{2}\mathbf{z}^\top \mathbf{H} \mathbf{z}
J=21z⊤Hz
Do the same for dynamics:
Define
C
\mathbf{C}
C and
d
\mathbf{d}
d:
C
=
[
B
−
I
0
0
⋯
0
0
0
0
A
B
−
I
⋯
0
0
0
⋮
⋮
⋮
⋮
⋱
⋮
⋮
⋮
0
0
0
0
⋯
A
B
−
I
]
,
d
=
[
−
A
x
1
0
⋮
0
]
\mathbf{C} = \begin{bmatrix} \mathbf{B} & -\mathbf{I} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0}& \mathbf{0}& \mathbf{0}\\ \mathbf{0} & \mathbf{A} & \mathbf{B} & -\mathbf{I} & \cdots & \mathbf{0}& \mathbf{0}& \mathbf{0}\\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots\\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{A}& \mathbf{B}& -\mathbf{I} \end{bmatrix}, \mathbf{d} = \begin{bmatrix} -\mathbf{A}\mathbf{x}_1\\ \mathbf{0}\\ \vdots\\ \mathbf{0} \end{bmatrix}
C=
B0⋮0−IA⋮00B⋮00−I⋮0⋯⋯⋱⋯00⋮A00⋮B00⋮−I
,d=
−Ax10⋮0
So the dynamics constraint can be written as:
C
z
=
d
\mathbf{C}\mathbf{z} = \mathbf{d}
Cz=d
(2) QP formulation and its closed-form solution
Now we can write the LQR problem as a QP:
min
z
J
=
1
2
z
⊤
H
z
s.t.
C
z
=
d
\begin{aligned} \min_{\mathbf{z}} \; &J = \frac{1}{2}\mathbf{z}^\top \mathbf{H} \mathbf{z}\\ \text{s.t. } &\mathbf{C}\mathbf{z} = \mathbf{d} \end{aligned}
zmins.t. J=21z⊤HzCz=d
The Lagrangian is:
L
=
1
2
z
⊤
H
z
+
λ
⊤
(
C
z
−
d
)
\mathcal{L} = \frac{1}{2}\mathbf{z}^\top \mathbf{H} \mathbf{z} + \lambda^\top (\mathbf{C}\mathbf{z} - \mathbf{d})
L=21z⊤Hz+λ⊤(Cz−d)
KKT conditions:
∇
z
L
=
H
z
+
C
⊤
λ
=
0
∇
λ
L
=
C
z
−
d
=
0
⇒
[
H
C
⊤
C
0
]
[
z
λ
]
=
[
0
d
]
\begin{aligned} \nabla_\mathbf{z} \mathcal{L} &= \mathbf{H}\mathbf{z} + \mathbf{C}^\top \lambda = \mathbf{0}\\ \nabla_\lambda \mathcal{L} &= \mathbf{C}\mathbf{z} - \mathbf{d} = \mathbf{0} \end{aligned} \Rightarrow \begin{bmatrix} \mathbf{H} & \mathbf{C}^\top\\ \mathbf{C} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{z}\\ \lambda \end{bmatrix} = \begin{bmatrix} \mathbf{0}\\ \mathbf{d} \end{bmatrix}
∇zL∇λL=Hz+C⊤λ=0=Cz−d=0⇒[HCC⊤0][zλ]=[0d]
We get the exact solution by solving one linear system.
(3) Example: double integrator
Code:
using LinearAlgebra
using PyPlot
using SparseArrays
# Discrete dynamics
h = 0.1 # time step
A = [1 h; 0 1]
B = [0.5*h*h; h]
n = 2 # number of state
m = 1 # number of controls
Tfinal = 100.0 # final time
N = Int(Tfinal/h)+1 # number of time steps
thist = Array(range(0,h*(N-1), step=h));
# Initial conditions
x0 = [1.0; 0]
# Cost weights
Q = sparse(1.0*I(2))
R = sparse(0.1*I(1))
Qn = sparse(1.0*I(2))
#Cost function
function J(xhist,uhist)
cost = 0.5*xhist[:,end]'*Qn*xhist[:,end]
for k = 1:(N-1)
cost = cost + 0.5*xhist[:,k]'*Q*xhist[:,k] + 0.5*(uhist[k]'*R*uhist[k])[1]
end
return cost
end
# Cost
H = blockdiag(R, kron(I(N-2), blockdiag(Q,R)), Qn);
# Constraints
C = kron(I(N-1), [B -I(2)])
for k = 1:N-2
C[(k*n).+(1:n), (k*(n+m)-n).+(1:n)] .= A
end
d = [-A*x0; zeros(size(C,1)-n)];
# Solve the linear system
y = [H C'; C zeros(size(C,1),size(C,1))]$$zeros(size(H,1)); d]
# Get state history
z = y[1:size(H,1)] # states and controls [u0,x1,u1,...,xN]
Z = reshape(z,n+m,N-1)
xhist = Z[m+1:end,:]
xhist = [x0 xhist]
# Get control history
uhist = Z[1,:];
# Plot x1 vs. x2, u vs. t, x vs. t, etc.
times = range(0,h*(N-1), step=h)
clf()
figure(figsize=(12,3.5))
subplot(1,2,1)
plot(times, xhist[1,:], label="Position")
plot(times, xhist[2,:], label="Velocity")
xlabel("Time")
title("State Trajectory")
legend()
subplot(1,2,2)
plot(times[1:end-1], uhist, label="Control Input")
xlabel("Time")
title("Control Input")
legend()
display(gcf())
savefig("lqr-qp.png", dpi=300, bbox_inches="tight")
Result:
Compute time is much shorter than shooting method, and a longer time horizon is available.
4. A closer look at the LQR QP
(1) KKT system
The KKT system for LQR is very sparse (lots of zeros) and has a lot of structure.
[
R
B
⊤
Q
−
I
A
⊤
R
B
⊤
Q
−
I
A
⊤
R
B
⊤
Q
N
−
I
B
−
I
A
B
−
I
0
A
B
−
I
]
[
u
1
x
2
u
2
x
3
u
3
x
4
λ
2
λ
3
λ
4
]
=
[
0
0
0
0
0
0
−
A
x
1
0
0
]
\left[ \begin{array}{c} \begin{matrix} \mathbf{R}& & & & & & \mathbf{B}^{\top}& & \\ & \mathbf{Q}& & & & & -\mathbf{I}& \mathbf{A}^{\top}& \\ & & \mathbf{R}& & & & & \mathbf{B}^{\top}& \\ & & & \mathbf{Q}& & & & -\mathbf{I}& \mathbf{A}^{\top}\\ & & & & \mathbf{R}& & & & \mathbf{B}^{\top}\\ & & & & & \mathbf{Q}_N& & & -\mathbf{I}\\ \mathbf{B}& -\mathbf{I}& & & & & & & \\ & \mathbf{A}& \mathbf{B}& -\mathbf{I}& & & & \mathbf{0}& \\ & & & \mathbf{A}& \mathbf{B}& -\mathbf{I}& & & \\ \end{matrix}\\ \end{array} \right] \left[ \begin{array}{c} \begin{matrix} \mathbf{u}_1\\ \mathbf{x}_2\\ \mathbf{u}_2\\ \mathbf{x}_3\\ \mathbf{u}_3\\ \mathbf{x}_4\\ \lambda_2\\ \lambda_3\\ \lambda_4 \end{matrix}\\ \end{array} \right] = \left[ \begin{array}{c} \begin{matrix} \mathbf{0}\\ \mathbf{0}\\ \mathbf{0}\\ \mathbf{0}\\ \mathbf{0}\\ \mathbf{0}\\ -\mathbf{A}\mathbf{x}_1\\ \mathbf{0}\\ \mathbf{0}\\ \end{matrix}\\ \end{array} \right]
RBQ−IARBQ−IARBQN−IB⊤−IA⊤B⊤−I0A⊤B⊤−I
u1x2u2x3u3x4λ2λ3λ4
=
000000−Ax100
(2) Solving procedure (Riccati recursion)
i.
Q
N
x
4
−
λ
4
=
0
⇒
λ
4
=
Q
N
x
4
\mathbf{Q}_N \mathbf{x}_4 - \lambda_4 = \mathbf{0} \Rightarrow \lambda_4 = \mathbf{Q}_N \mathbf{x}_4
QNx4−λ4=0⇒λ4=QNx4
ii. Substitute
λ
4
\lambda_4
λ4 into the last equation:
R
u
3
+
B
⊤
λ
4
=
R
u
3
+
B
⊤
Q
N
x
4
=
0
\mathbf{R}\mathbf{u}_3 + \mathbf{B}^\top \lambda_4 = \mathbf{R}\mathbf{u}_3 + \mathbf{B}^\top \mathbf{Q}_N \mathbf{x}_4 = \mathbf{0}
Ru3+B⊤λ4=Ru3+B⊤QNx4=0
Plug in dynamics for
x
4
\mathbf{x}_4
x4:
R
u
3
+
B
⊤
Q
N
(
A
x
3
+
B
u
3
)
=
0
\mathbf{R}\mathbf{u}_3 + \mathbf{B}^\top \mathbf{Q}_N \left(\mathbf{A}\mathbf{x}_3 + \mathbf{B}\mathbf{u}_3\right) = \mathbf{0}
Ru3+B⊤QN(Ax3+Bu3)=0
⇒
u
3
=
−
(
R
+
B
⊤
Q
N
B
)
−
1
B
⊤
Q
N
A
x
3
≜
−
K
3
x
3
\Rightarrow \mathbf{u}_3 = -\left(\mathbf{R} + \mathbf{B}^\top \mathbf{Q}_N \mathbf{B}\right)^{-1} \mathbf{B}^\top \mathbf{Q}_N \mathbf{A}\mathbf{x}_3 \triangleq -\mathbf{K}_3 \mathbf{x}_3
⇒u3=−(R+B⊤QNB)−1B⊤QNAx3≜−K3x3
iii. Substitute
u
3
\mathbf{u}_3
u3 into the fourth equation:
Q
x
3
−
λ
3
+
A
⊤
λ
4
=
0
\mathbf{Q}\mathbf{x}_3 -\lambda_3 + \mathbf{A}^\top \lambda_4 = \mathbf{0}
Qx3−λ3+A⊤λ4=0
Plug in
λ
4
\lambda_4
λ4:
Q
x
3
−
λ
3
+
A
⊤
Q
N
x
4
=
0
\mathbf{Q}\mathbf{x}_3 -\lambda_3 + \mathbf{A}^\top \mathbf{Q}_N \mathbf{x}_4 = \mathbf{0}
Qx3−λ3+A⊤QNx4=0
Plug in dynamics:
Q
x
3
−
λ
3
+
A
⊤
Q
N
(
A
x
3
+
B
u
3
)
=
0
\mathbf{Q}\mathbf{x}_3 -\lambda_3 + \mathbf{A}^\top \mathbf{Q}_N \left(\mathbf{A}\mathbf{x}_3 + \mathbf{B}\mathbf{u}_3\right) = \mathbf{0}
Qx3−λ3+A⊤QN(Ax3+Bu3)=0
Plug in
u
3
\mathbf{u}_3
u3:
Q
x
3
−
λ
3
+
A
⊤
Q
N
(
A
x
3
−
B
K
3
x
3
)
=
0
\mathbf{Q}\mathbf{x}_3 -\lambda_3 + \mathbf{A}^\top \mathbf{Q}_N \left(\mathbf{A}\mathbf{x}_3 - \mathbf{B}\mathbf{K}_3 \mathbf{x}_3\right) = \mathbf{0}
Qx3−λ3+A⊤QN(Ax3−BK3x3)=0
⇒
λ
3
=
(
Q
+
A
⊤
Q
N
(
A
−
B
K
3
)
)
x
3
≜
P
3
x
2
\Rightarrow \lambda_3 = \left(\mathbf{Q} + \mathbf{A}^\top \mathbf{Q}_N \left(\mathbf{A} - \mathbf{B}\mathbf{K}_3\right)\right) \mathbf{x}_3 \triangleq \mathbf{P}_3 \mathbf{x}_2
⇒λ3=(Q+A⊤QN(A−BK3))x3≜P3x2
Now we have a recursion for
K
k
\mathbf{K}_k
Kk and
P
k
\mathbf{P}_k
Pk:
P
N
=
Q
N
,
K
k
=
(
R
+
B
⊤
P
k
+
1
B
)
−
1
B
⊤
P
k
+
1
A
,
P
k
=
Q
+
A
⊤
P
k
+
1
(
A
−
B
K
k
)
\begin{aligned} &\mathbf{P}_N = \mathbf{Q}_N, \\ &\mathbf{K}_k = \left(\mathbf{R} + \mathbf{B}^\top \mathbf{P}_{k+1} \mathbf{B}\right)^{-1} \mathbf{B}^\top \mathbf{P}_{k+1} \mathbf{A}, \\ &\mathbf{P}_k = \mathbf{Q} + \mathbf{A}^\top \mathbf{P}_{k+1} \left(\mathbf{A} - \mathbf{B}\mathbf{K}_k\right) \end{aligned}
PN=QN,Kk=(R+B⊤Pk+1B)−1B⊤Pk+1A,Pk=Q+A⊤Pk+1(A−BKk)
-
This is called the Riccati recursion/equation. We can solve the QP by doing a backward Riccati recursion followed by a forward rollout to compute x 1 : N \mathbf{x}_{1:N} x1:N and u 1 : N − 1 \mathbf{u}_{1:N-1} u1:N−1 form initial conditions.
-
General (dense) QP has complexity O ( N 3 ( n + m ) 3 ) \mathcal{O}(N^3(n+m)^3) O(N3(n+m)3) ( N N N is the time horizon, n n n is the state dimension, m m m is the control dimension).
-
Riccati recursion has complexity O ( N ( n + m ) 3 ) \mathcal{O}(N(n+m)^3) O(N(n+m)3).
-
Even more important: we now have a feedback policy u k = − K k x k \mathbf{u}_k = -\mathbf{K}_k \mathbf{x}_k uk=−Kkxk instead of an open-loop trajectory.
(3) Example: double integrator
Code:
P = zeros(n,n,N)
K = zeros(m,n,N-1)
P[:,:,N] .= Qn
#Backward Riccati recursion
for k = (N-1):-1:1
K[:,:,k] .= (R + B'*P[:,:,k+1]*B)\(B'*P[:,:,k+1]*A)
P[:,:,k] .= Q + A'*P[:,:,k+1]*(A-B*K[:,:,k])
end
#Forward rollout starting at x0
xhist = zeros(n,N)
xhist[:,1] = x0
uhist = zeros(m,N-1)
for k = 1:(N-1)
uhist[:,k] .= -K[:,:,k]*xhist[:,k]
xhist[:,k+1] .= A*xhist[:,k] + B*uhist[k]
end
-
Plot the control feedback, we find that the feedback matrix K \mathbf{K} K converges to a constant value.
-
Using Julia’s
dlqr
function, we get the feedback matrix, which is the same as the converged value. -
Therefore, when we want to stabilize the system, we can use a constant feedback matrix (treat it as an infinite horizon LQR).