Review:
- Infinite-horizon LQR
- Controllability
- Dynamic programming
Lecture 9 Convex Model Predictive Control
Overview
- Convexit background
- Convex MPC
1. Finally: what are the Lagrange multipliers?
Recall Riccati derivation from QP:
λ
k
=
P
k
x
k
\lambda_k = \mathbf{P}_k \mathbf{x}_k
λk=Pkxk
From DP:
V
k
(
x
)
=
1
2
x
⊤
P
k
x
V_k(\mathbf{x}) = \frac{1}{2}\mathbf{x}^\top \mathbf{P}_k \mathbf{x}
Vk(x)=21x⊤Pkx
It is easy to see that λ k = ∇ V k ( x ) ∣ x = x k = P k x k \lambda_k = \nabla \left.V_k(\mathbf{x})\right|_{\mathbf{x}=\mathbf{x}_k} = \mathbf{P}_k \mathbf{x}_k λk=∇Vk(x)∣x=xk=Pkxk.
- Dynamics multipliers are cost-to-go gradients.
- Carries over to nonlinear setting (not just LQR).
In the nonlinear case, even if we cannot get the value function, we can still get its gradient.
DP vs. QP:
The state and control input are the same.
Compare matrix
K
\mathbf{K}
K from dlqr
and DP:
#Compute infinite-horizon K matrix using ControlSystems.jl
Kinf = dlqr(A,B,Q,R[1])
#Compare to oursa
K[:,:,1]-Kinf
Result:
1×2 Matrix{Float64}:
-6.72929e-9 -2.28764e-9
The matrix P \mathbf{P} P has a similar result.
2. Convexity Model Predictive Control
(1) Motivation
- LQR is very powerful but we often need to reason about constraints.
- Often these constraints are simple (e.g. torque limits).
- Constraints break the Riccati solution, but we can still solve the QP online.
- Convex MPC has gotten popular as computers have gotten faster.
(1) Background: Convexity
(a) Convex set
X
⊆
R
n
\mathcal{X} \subseteq \mathbb{R}^n
X⊆Rn is convex if for any
x
1
,
x
2
∈
X
\mathbf{x}_1, \mathbf{x}_2 \in \mathcal{X}
x1,x2∈X, the line segment between them is in
X
\mathcal{X}
X. (A line connecting any two points in the set is also contained in the set.)
Standard examples:
- Linear subspace: X = { x ∈ R n ∣ A x = b } \mathcal{X} = \{\mathbf{x} \in \mathbb{R}^n | \mathbf{Ax} = \mathbf{b}\} X={x∈Rn∣Ax=b}
- Half space/Box/Polytope: X = { x ∈ R n ∣ A x ≤ b } \mathcal{X} = \{\mathbf{x} \in \mathbb{R}^n | \mathbf{Ax} \leq \mathbf{b}\} X={x∈Rn∣Ax≤b}
- Ellipsoid: X = { x ∈ R n ∣ x ⊤ P − 1 x ≤ 1 , P ≻ 0 } \mathcal{X} = \{\mathbf{x} \in \mathbb{R}^n | \mathbf{x}^\top \mathbf{P}^{-1} \mathbf{x} \leq 1, \mathbf{P} \succ 0\} X={x∈Rn∣x⊤P−1x≤1,P≻0}
- Cone: X = { x ∈ R n ∣ ∥ x i ∥ 2 ≤ x 1 , i = 2 , … , n } \mathcal{X} = \{\mathbf{x} \in \mathbb{R}^n | \left\|\mathbf{x}_i\right\|_2 \leq \mathbf{x}_1, i = 2, \dots, n\} X={x∈Rn∣∥xi∥2≤x1,i=2,…,n} ( L 2 L_2 L2 norm → \rightarrow → second-order cone or standard ice cream cone)
(b) Convex function
A function
f
:
R
n
→
R
f: \mathbb{R}^n \rightarrow \mathbb{R}
f:Rn→R whose epigraph is a convex set.
(standard definition:
f
(
λ
x
1
+
(
1
−
λ
)
x
2
)
≤
λ
f
(
x
1
)
+
(
1
−
λ
)
f
(
x
2
)
f(\lambda \mathbf{x}_1 + (1-\lambda)\mathbf{x}_2) \leq \lambda f(\mathbf{x}_1) + (1-\lambda)f(\mathbf{x}_2)
f(λx1+(1−λ)x2)≤λf(x1)+(1−λ)f(x2) for all
x
1
,
x
2
∈
R
n
\mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^n
x1,x2∈Rn and
λ
∈
[
0
,
1
]
\lambda \in [0,1]
λ∈[0,1])
The epigraph of a function is the set of points lying on or above its graph. Its definition is:
epi ( f ) = { ( x , y ) ∈ R n + 1 ∣ f ( x ) ≤ y } \text{epi}(f) = \{(\mathbf{x}, y) \in \mathbb{R}^{n+1} | f(\mathbf{x}) \leq y\} epi(f)={(x,y)∈Rn+1∣f(x)≤y}
Standard examples:
- Linear function: f ( x ) = c ⊤ x f(\mathbf{x}) = \mathbf{c}^\top \mathbf{x} f(x)=c⊤x
- Quadratic function: f ( x ) = 1 2 x ⊤ Q x + c ⊤ x , Q ≻ 0 f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^\top \mathbf{Q} \mathbf{x} + \mathbf{c}^\top \mathbf{x}, \mathbf{Q} \succ 0 f(x)=21x⊤Qx+c⊤x,Q≻0
- Norms: f ( x ) = ∥ x ∥ p = ( ∑ i = 1 n ∣ x i ∣ p ) 1 / p , p ≥ 1 f(\mathbf{x}) = \left\|\mathbf{x}\right\|_p = \left(\sum_{i=1}^n \left|x_i\right|^p\right)^{1/p}, p \geq 1 f(x)=∥x∥p=(∑i=1n∣xi∣p)1/p,p≥1
© Convex optimization problem
minimize a convex function over a convex set:
minimize
f
(
x
)
subject to
x
∈
X
\begin{array}{ll} \text{minimize} & f(\mathbf{x}) \\ \text{subject to} & \mathbf{x} \in \mathcal{X} \end{array}
minimizesubject tof(x)x∈X
Standard examples:
- Linear program (LP): f ( x ) = c ⊤ x , X = { x ∈ R n ∣ A x ≤ b } f(\mathbf{x}) = \mathbf{c}^\top \mathbf{x}, \mathcal{X} = \{\mathbf{x} \in \mathbb{R}^n | \mathbf{Ax} \leq \mathbf{b}\} f(x)=c⊤x,X={x∈Rn∣Ax≤b}
- Quadratic program (QP): f ( x ) = 1 2 x ⊤ Q x + c ⊤ x , X = { x ∈ R n ∣ A x ≤ b } f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^\top \mathbf{Q} \mathbf{x} + \mathbf{c}^\top \mathbf{x}, \mathcal{X} = \{\mathbf{x} \in \mathbb{R}^n | \mathbf{Ax} \leq \mathbf{b}\} f(x)=21x⊤Qx+c⊤x,X={x∈Rn∣Ax≤b}
- Quadratically-constrained quadratic program (QCQP): f ( x ) = 1 2 x ⊤ Q 0 x + c ⊤ x , X = { x ∈ R n ∣ x ⊤ Q i x + a i ⊤ x ≤ b i , i = 1 , … , m } f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^\top \mathbf{Q}_0 \mathbf{x} + \mathbf{c}^\top \mathbf{x}, \mathcal{X} = \{\mathbf{x} \in \mathbb{R}^n | \mathbf{x}^\top \mathbf{Q}_i \mathbf{x} + \mathbf{a}_i^\top \mathbf{x} \leq b_i, i = 1, \dots, m\} f(x)=21x⊤Q0x+c⊤x,X={x∈Rn∣x⊤Qix+ai⊤x≤bi,i=1,…,m} (ellipsoide constraints)
- Second-order cone program (SOCP): f ( x ) = c ⊤ x , X = { x ∈ R n ∣ ∥ A i x + b i ∥ 2 ≤ c i ⊤ x + d i , i = 1 , … , m } f(\mathbf{x}) = \mathbf{c}^\top \mathbf{x}, \mathcal{X} = \{\mathbf{x} \in \mathbb{R}^n | \left\|\mathbf{A}_i \mathbf{x} + \mathbf{b}_i\right\|_2 \leq \mathbf{c}_i^\top \mathbf{x} + d_i, i = 1, \dots, m\} f(x)=c⊤x,X={x∈Rn∣∥Aix+bi∥2≤ci⊤x+di,i=1,…,m} (cone constraints)
LP is a special case of QP, QP is a special case of QCQP, QCQP is a special case of SOCP.
LP ⊂ QP ⊂ QCQP ⊂ SOCP \text{LP} \subset \text{QP} \subset \text{QCQP} \subset \text{SOCP} LP⊂QP⊂QCQP⊂SOCP
(2) Properties of convex optimization problems
- Convex problems don’t have nay spurious local optima that satisfy the KKT conditions.
⇒ \Rightarrow ⇒ Any local optimum is a global optimum. - Practically, Newton’s method converges really fast and reliably (5-10 iterations max).
⇒ \Rightarrow ⇒ We can bound solution time for real-time control.
3. Convex MPC
Think about this as “constrainted LQR”.
Remember from QP, if we have a cost-to-go function, we can get
u
\mathbf{u}
u by solving a one-step problem:
u
k
=
arg min
u
ℓ
(
x
k
,
u
)
+
V
k
+
1
(
f
(
x
k
,
u
)
)
=
arg min
u
1
2
u
⊤
R
u
+
1
2
(
A
k
x
k
+
B
k
u
)
⊤
P
k
+
1
(
A
k
x
k
+
B
k
u
)
\begin{aligned} \mathbf{u}_k &= \argmin_{\mathbf{u}} \ell(\mathbf{x}_k, \mathbf{u}) + V_{k+1}(\mathbf{f}(\mathbf{x}_k, \mathbf{u}))\\ &=\argmin_{\mathbf{u}} \frac{1}{2}\mathbf{u}^\top \mathbf{R} \mathbf{u} + \frac{1}{2}\left(\mathbf{A}_k \mathbf{x}_k + \mathbf{B}_k \mathbf{u} \right)^\top \mathbf{P}_{k+1} \left(\mathbf{A}_k \mathbf{x}_k + \mathbf{B}_k \mathbf{u} \right)\\ \end{aligned}
uk=uargminℓ(xk,u)+Vk+1(f(xk,u))=uargmin21u⊤Ru+21(Akxk+Bku)⊤Pk+1(Akxk+Bku)
We can add constraints on u \mathbf{u} u to this one-step problem but this will perform poorly beacuse V ( x ) V(x) V(x) was computed without constraints.
So, there is no reason we can’t add more steps to the one-step problem:
min
x
1
:
H
,
u
1
:
H
−
1
(
∑
k
=
1
H
−
1
1
2
x
k
⊤
Q
x
k
+
1
2
u
k
⊤
R
u
k
)
+
1
2
x
H
⊤
P
x
H
s.t.
x
k
+
1
=
A
k
x
k
+
B
k
u
k
x
k
∈
X
,
u
k
∈
U
\min_{\mathbf{x}_{1:H}, \mathbf{u}_{1:H-1}}\left( \sum_{k=1}^{H-1} \frac{1}{2}\mathbf{x}_k^\top \mathbf{Q} \mathbf{x}_k + \frac{1}{2}\mathbf{u}_k^\top \mathbf{R} \mathbf{u}_k\right) + \frac{1}{2}\mathbf{x}_H^\top \mathbf{P} \mathbf{x}_H\\ \text{s.t.} \quad \mathbf{x}_{k+1} = \mathbf{A}_k \mathbf{x}_k + \mathbf{B}_k \mathbf{u}_k\\ \mathbf{x}_k \in \mathcal{X}, \mathbf{u}_k \in \mathcal{U}
x1:H,u1:H−1min(k=1∑H−121xk⊤Qxk+21uk⊤Ruk)+21xH⊤PxHs.t.xk+1=Akxk+Bkukxk∈X,uk∈U
- The last term 1 2 x H ⊤ P x H \frac{1}{2}\mathbf{x}_H^\top \mathbf{P} \mathbf{x}_H 21xH⊤PxH is the LQR cost-to-go.
- H ≤ N H\leq N H≤N is called the horizon.
- With no additional constraints, MPC (“receeding horizon control”) exactly matches LQR for any H H H.
- Intuition: explicit constrained optimization over first H H H steps gets the state close enough to the reference that the constraints are no longer active and LQR solution/cost-to-go is valid further into the future.
- In general:
- A good approximation of V ( x ) V(x) V(x) is important for good performance.
- Better V ( x ) V(x) V(x) ⇒ \Rightarrow ⇒ shorter horizon H H H.
- Longer horizon H H H ⇒ \Rightarrow ⇒ less reliance on V ( x ) V(x) V(x).
(3) Example: Planar quadrotor
Dynamics:
m
x
¨
=
−
(
u
1
+
u
2
)
sin
θ
m
y
¨
=
(
u
1
+
u
2
)
cos
θ
−
m
g
J
θ
¨
=
1
2
l
(
u
1
−
u
2
)
\begin{aligned} m\ddot{x} &= -(u_1+u_2)\sin\theta\\ m\ddot{y} &= (u_1+u_2)\cos\theta - mg\\ J\ddot{\theta} &= \frac{1}{2}l(u_1-u_2) \end{aligned}
mx¨my¨Jθ¨=−(u1+u2)sinθ=(u1+u2)cosθ−mg=21l(u1−u2)
Linearize about hover:
u
1
=
u
2
=
1
2
m
g
u_1 = u_2 = \frac{1}{2}mg
u1=u2=21mg
⇒ { Δ x ¨ = − g Δ θ Δ y ¨ = 1 m ( Δ u 1 + Δ u 2 ) Δ θ ¨ = l 2 J ( Δ u 2 − Δ u 1 ) \Rightarrow \begin{cases} \Delta\ddot{x} = -g\Delta\theta\\ \Delta\ddot{y} = \frac{1}{m}\left(\Delta u_1 + \Delta u_2\right)\\ \Delta\ddot{\theta} = \frac{l}{2J}\left(\Delta u_2 - \Delta u_1\right) \end{cases} ⇒⎩ ⎨ ⎧Δx¨=−gΔθΔy¨=m1(Δu1+Δu2)Δθ¨=2Jl(Δu2−Δu1)
State equation:
[
Δ
x
˙
Δ
y
˙
Δ
θ
˙
Δ
x
¨
Δ
y
¨
Δ
θ
¨
]
=
[
0
0
0
1
0
0
0
0
0
0
1
0
0
0
0
0
0
1
0
0
−
g
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
]
[
Δ
x
Δ
y
Δ
θ
Δ
x
˙
Δ
y
˙
Δ
θ
˙
]
+
[
0
0
0
0
0
0
0
0
1
m
1
m
l
2
J
−
l
2
J
]
[
Δ
u
1
Δ
u
2
]
\begin{bmatrix} \Delta \dot{x}\\ \Delta \dot{y}\\ \Delta \dot{\theta}\\ \Delta \ddot{x}\\ \Delta \ddot{y}\\ \Delta \ddot{\theta} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 1\\ 0 & 0 & -g & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} \Delta x\\ \Delta y\\ \Delta \theta\\ \Delta \dot{x}\\ \Delta \dot{y}\\ \Delta \dot{\theta} \end{bmatrix} + \begin{bmatrix} 0 & 0\\ 0 & 0\\ 0 & 0\\ 0 & 0\\ \frac{1}{m} & \frac{1}{m}\\ \frac{l}{2J} & -\frac{l}{2J} \end{bmatrix} \begin{bmatrix} \Delta u_1\\ \Delta u_2 \end{bmatrix}
Δx˙Δy˙Δθ˙Δx¨Δy¨Δθ¨
=
000000000000000−g00100000010000001000
ΔxΔyΔθΔx˙Δy˙Δθ˙
+
0000m12Jl0000m1−2Jl
[Δu1Δu2]
Namely, x ˙ = A x + B u \dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u} x˙=Ax+Bu.
MPC cost function:
J
=
∑
k
=
1
H
−
1
(
1
2
(
x
k
−
x
r
e
f
)
⊤
Q
(
x
k
−
x
r
e
f
)
+
1
2
Δ
u
k
⊤
R
Δ
u
k
)
+
1
2
(
x
H
−
x
r
e
f
)
⊤
P
H
(
x
H
−
x
r
e
f
)
J = \sum_{k=1}^{H-1} \left( \frac{1}{2}(\mathbf{x}_k-\mathbf{x}_{ref})^\top \mathbf{Q} (\mathbf{x}_k-\mathbf{x}_{ref}) + \frac{1}{2}\Delta\mathbf{u}_k^\top \mathbf{R} \Delta\mathbf{u}_k \right) \\ + \frac{1}{2}(\mathbf{x}_H-\mathbf{x}_{ref})^\top \mathbf{P}_H (\mathbf{x}_H-\mathbf{x}_{ref})
J=k=1∑H−1(21(xk−xref)⊤Q(xk−xref)+21Δuk⊤RΔuk)+21(xH−xref)⊤PH(xH−xref)
Code:
using LinearAlgebra
using PyPlot
using SparseArrays
using ForwardDiff
using ControlSystems
using OSQP
#Model parameters
g = 9.81 #m/s^2
m = 1.0 #kg
ℓ = 0.3 #meters
J = 0.2*m*ℓ*ℓ
#Thrust limits
umin = [0.2*m*g; 0.2*m*g]
umax = [0.6*m*g; 0.6*m*g]
h = 0.05 #time step (20 Hz)
#Planar Quadrotor Dynamics
function quad_dynamics(x,u)
θ = x[3]
ẍ = (1/m)*(u[1] + u[2])*sin(θ)
ÿ = (1/m)*(u[1] + u[2])*cos(θ) - g
θ̈ = (1/J)*(ℓ/2)*(u[2] - u[1])
return [x[4:6]; ẍ; ÿ; θ̈]
end
function quad_dynamics_rk4(x,u)
#RK4 integration with zero-order hold on u
f1 = quad_dynamics(x, u)
f2 = quad_dynamics(x + 0.5*h*f1, u)
f3 = quad_dynamics(x + 0.5*h*f2, u)
f4 = quad_dynamics(x + h*f3, u)
return x + (h/6.0)*(f1 + 2*f2 + 2*f3 + f4)
end
#Linearized dynamics for hovering
x_hover = zeros(6)
u_hover = [0.5*m*g; 0.5*m*g]
A = ForwardDiff.jacobian(x->quad_dynamics_rk4(x,u_hover),x_hover);
B = ForwardDiff.jacobian(u->quad_dynamics_rk4(x_hover,u),u_hover);
quad_dynamics_rk4(x_hover, u_hover)
Nx = 6 # number of state
Nu = 2 # number of controls
Tfinal = 10.0 # final time
Nt = Int(Tfinal/h)+1 # number of time steps
thist = Array(range(0,h*(Nt-1), step=h));
# Cost weights
Q = Array(1.0*I(Nx));
R = Array(.01*I(Nu));
Qn = Array(1.0*I(Nx));
#Cost function
function cost(xhist,uhist)
cost = 0.5*xhist[:,end]'*Qn*xhist[:,end]
for k = 1:(size(xhist,2)-1)
cost = cost + 0.5*xhist[:,k]'*Q*xhist[:,k] + 0.5*(uhist[k]'*R*uhist[k])[1]
end
return cost
end
#LQR Hover Controller
P = dare(A,B,Q,R)
K = dlqr(A,B,Q,R)
function lqr_controller(t,x,K,xref)
return u_hover - K*(x-xref)
end
#Build QP matrices for OSQP
Nh = 20 #one second horizon at 20Hz
Nx = 6
Nu = 2
U = kron(Diagonal(I,Nh), [I zeros(Nu,Nx)]) #Matrix that picks out all u
Θ = kron(Diagonal(I,Nh), [0 0 0 0 1 0 0 0]) #Matrix that picks out all x3 (θ)
H = sparse([kron(Diagonal(I,Nh-1),[R zeros(Nu,Nx); zeros(Nx,Nu) Q]) zeros((Nx+Nu)*(Nh-1), Nx+Nu); zeros(Nx+Nu,(Nx+Nu)*(Nh-1)) [R zeros(Nu,Nx); zeros(Nx,Nu) P]])
b = zeros(Nh*(Nx+Nu))
C = sparse([[B -I zeros(Nx,(Nh-1)*(Nu+Nx))]; zeros(Nx*(Nh-1),Nu) [kron(Diagonal(I,Nh-1), [A B]) zeros((Nh-1)*Nx,Nx)] + [zeros((Nh-1)*Nx,Nx) kron(Diagonal(I,Nh-1),[zeros(Nx,Nu) Diagonal(-I,Nx)])]])
#Dynamics + Thrust limit constraints
D = [C; U]
lb = [zeros(Nx*Nh); kron(ones(Nh),umin-u_hover)]
ub = [zeros(Nx*Nh); kron(ones(Nh),umax-u_hover)]
#Dynamics + thrust limit + bound constraint on θ to keep the system within small-angle approximation
#D = [C; U; Θ]
#lb = [zeros(Nx*Nh); kron(ones(Nh),umin-u_hover); -0.2*ones(Nh)]
#ub = [zeros(Nx*Nh); kron(ones(Nh),umax-u_hover); 0.2*ones(Nh)]
prob = OSQP.Model()
OSQP.setup!(prob; P=H, q=b, A=D, l=lb, u=ub, verbose=false, eps_abs=1e-8, eps_rel=1e-8, polish=1);
#MPC Controller
function mpc_controller(t,x,xref)
#Update QP problem
lb[1:6] .= -A*x
ub[1:6] .= -A*x
for j = 1:(Nh-1)
b[(Nu+(j-1)*(Nx+Nu)).+(1:Nx)] .= -Q*xref
end
b[(Nu+(Nh-1)*(Nx+Nu)).+(1:Nx)] .= -P*xref
OSQP.update!(prob, q=b, l=lb, u=ub)
#Solve QP
results = OSQP.solve!(prob)
Δu = results.x[1:Nu]
return u_hover + Δu
end
function closed_loop(x0,controller,N)
xhist = zeros(length(x0),N)
u0 = controller(1,x0)
uhist = zeros(length(u0),N-1)
uhist[:,1] .= u0
xhist[:,1] .= x0
for k = 1:(N-1)
uk = controller(k,xhist[:,k])
uhist[:,k] = max.(min.(umax, uk), umin) #enforce control limits
xhist[:,k+1] .= quad_dynamics_rk4(xhist[:,k],uhist[:,k])
end
return xhist, uhist
end
x_ref = [0.0; 1.0; 0; 0; 0; 0]
x0 = [10.0; 2.0; 0.0; 0; 0; 0]
xhist1, uhist1 = closed_loop(x0, (t,x)->lqr_controller(t,x,K,x_ref), Nt);
xhist2, uhist2 = closed_loop(x0, (t,x)->mpc_controller(t,x,x_ref), Nt);
start from
x
0
=
[
1.0
2.0
0.0
0
0
0
]
⊤
\mathbf{x}_0 = \begin{bmatrix}1.0 & 2.0 & 0.0 & 0 & 0 & 0\end{bmatrix}^\top
x0=[1.02.00.0000]⊤:
state trajectory:
control input:
start from
x
0
=
[
10.0
2.0
0.0
0
0
0
]
⊤
\mathbf{x}_0 = \begin{bmatrix}10.0 & 2.0 & 0.0 & 0 & 0 & 0\end{bmatrix}^\top
x0=[10.02.00.0000]⊤:
state trajectory:
control input:
Add bound constraint on
θ
\theta
θ to keep the system within small-angle approximation:
state trajectory:
control input: