Advanced Optimization Theory and Methods (13)

Non-linear Constrained Optimization

min $f(x)$
s.t. $h(x)=0$
$g(x)\leq 0$

$x\in \mathbb{R}^n,\ f: \mathbb{R}^n\rightarrow \mathbb{R},\ h:\mathbb{R}^n\rightarrow \mathbb{R}^m,\ g:\mathbb{R}^n\rightarrow \mathbb{R}^p$

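Note: the key difference between a nonlinear and a linear optimization problem is whether the objective function and the constraint functions are all linear; the problem is nonlinear as soon as one of them is not.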

Case 1

min $f(x)$
s.t. $h(x)=0$

$h:\mathbb{R}^n\rightarrow \mathbb{R}^m,\ h\in C^1$ (continuously differentiable)

Definition

Def: Let $x^*$ satisfy $h_1(x^*)=0,\cdots,h_m(x^*)=0$. $x^*$ is a regular point if $\nabla h_1(x^*),\cdots,\nabla h_m(x^*)$ are linearly independent.

Jacobian: $Dh(x^*)=\begin{bmatrix} Dh_1(x^*)\\ Dh_2(x^*)\\ \vdots\\ Dh_m(x^*) \end{bmatrix}=\begin{bmatrix} \nabla h_1(x^*)^T\\ \nabla h_2(x^*)^T\\ \vdots\\ \nabla h_m(x^*)^T \end{bmatrix}$, so $x^*$ is regular exactly when $\operatorname{rank}Dh(x^*)=m$.

Def: Surface: $S=\{x\in\mathbb{R}^n:h_1(x)=0,\cdots,h_m(x)=0\}$

Example 1

$n=3,\ m=1,\ h(x)=x_2-x_3^2$
$Dh(x)=[0,1,-2x_3]$
$\forall x\in\mathbb{R}^3,\ Dh(x)\neq 0$, so every point of $S$ is regular.
$S=\{x:x_2-x_3^2=0\}$

Example 2

$h_1(x)=x_1,\ h_2(x)=x_2-x_3^2$

$Dh(x)=\begin{bmatrix} 1&0&0\\ 0&1&-2x_3 \end{bmatrix}$, which has rank 2 for every $x$, so every point of $S$ is regular.
$S=\{x:x_1=0,\ x_2-x_3^2=0\}$
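The regularity claim in Example 2 can be checked numerically. The sketch below is only an illustration: the helper names `cross`, `jacobian` and `is_regular` and the sample points are assumptions made here, not part of the notes. It uses the fact that two vectors in $\mathbb{R}^3$ are linearly independent exactly when their cross product is nonzero.

```python
def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def jacobian(x):
    """Rows are the gradients of h_1(x) = x_1 and h_2(x) = x_2 - x_3^2."""
    return [(1.0, 0.0, 0.0), (0.0, 1.0, -2.0 * x[2])]


def is_regular(x, tol=1e-12):
    """x is regular if the two constraint gradients are linearly independent."""
    g1, g2 = jacobian(x)
    return any(abs(c) > tol for c in cross(g1, g2))


# Hypothetical sample points on S = {x : x_1 = 0, x_2 = x_3^2}.
samples = [(0.0, t * t, t) for t in (-2.0, -0.5, 0.0, 1.0, 3.0)]
all_regular = all(is_regular(x) for x in samples)
print(all_regular)
```

The cross product of the two rows is $(0, 2x_3, 1)$, whose last component never vanishes, which is why the check succeeds at every sample.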

Necessary/Sufficient Conditions

Recall the conditions for the unconstrained problem, where $F$ denotes the Hessian of $f$:
FONC: $x^*$ local minimizer $\Rightarrow \nabla f(x^*)=0$
SONC: $x^*$ local minimizer $\Rightarrow \nabla f(x^*)=0$ and $\forall y:y^T F(x^*)y\geq 0$
SOSC: (1) $\nabla f(x^*)=0$ and (2) $\forall y\neq 0:y^T F(x^*)y> 0$ $\Rightarrow x^*$ is a strict local minimizer

Definition

Def: A curve $C$ on a surface $S$ is a set of points $\{x(t)\in S:t\in(a,b)\}$, where $x:(a,b)\rightarrow \mathbb{R}^n$ is a continuous function.

The curve is differentiable if $\dot{x}(t)=\frac{dx}{dt}(t)=\begin{bmatrix} \dot{x}_1(t)\\ \dot{x}_2(t)\\ \vdots\\ \dot{x}_n(t) \end{bmatrix}$ exists for all $t\in (a,b)$.
It is twice differentiable if $\ddot{x}(t)=\frac{d^2x}{dt^2}(t)=\begin{bmatrix} \ddot{x}_1(t)\\ \ddot{x}_2(t)\\ \vdots\\ \ddot{x}_n(t) \end{bmatrix}$ exists for all $t\in (a,b)$.

Def: The tangent space at $x^*\in S=\{x\in\mathbb{R}^n:h(x)=0\}$ is the set $T(x^*)=\{y:Dh(x^*)y=0\}$.

Example

$S=\{x\in \mathbb{R}^3: h_1(x)=x_1=0,\ h_2(x)=x_1-x_2=0\}$

$Dh(x)=\begin{bmatrix} 1&0&0\\ 1&-1&0 \end{bmatrix}$
The rows are linearly independent, so every point of $S$ is regular.
$T(x)=\{y:\nabla h_1(x)^Ty=0,\ \nabla h_2(x)^Ty=0\}=\{[0,0,\alpha]^T:\alpha\in\mathbb{R}\}$, i.e. the $x_3$-axis.
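For two constraints in $\mathbb{R}^3$ the tangent space is a line, and its direction can be obtained as the cross product of the two constraint gradients. The following sketch (helper names are illustrative assumptions) recomputes the tangent direction of this example.

```python
def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


grad_h1 = (1.0, 0.0, 0.0)   # gradient of h_1(x) = x_1
grad_h2 = (1.0, -1.0, 0.0)  # gradient of h_2(x) = x_1 - x_2

# A vector orthogonal to both gradients spans T(x) = {y : Dh(x) y = 0}.
direction = cross(grad_h1, grad_h2)
print(direction)
```

The resulting vector has zero first and second components, so it points along the $x_3$-axis, in agreement with $T(x)$ above.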

Theorem

Thm: Let $x^*$ be a regular point and let $T(x^*)$ be the tangent space at $x^*$. Then $y\in T(x^*)\Leftrightarrow$ there exists a differentiable curve on $S$ passing through $x^*$ with derivative $y$ at $x^*$.

FONC(Lagrange’s Condition)

2-Dimensional

$h: \mathbb{R}^2\rightarrow \mathbb{R}$
Let $x^*=[x_1^*,x_2^*]^T$ with $h(x^*)=0$.
Assume $\nabla h(x^*)\neq 0$.
Let $x(t):\mathbb{R} \rightarrow \mathbb{R}^2$ be a continuously differentiable curve on $S=\{x:h(x)=0\}$,
$x(t)=\begin{bmatrix} x_1(t)\\ x_2(t) \end{bmatrix},\ t\in(a,b),\ x^*=x(t^*)$
$\because \forall t\in (a,b): h(x(t))=0$
$\therefore \forall t: \frac{d}{dt}h(x(t))=\nabla h(x(t))^T\dot{x}(t)=0$
$\therefore \nabla h(x^*)$ is orthogonal to $\dot{x}(t^*)$
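The orthogonality between the constraint gradient and the velocity of a curve on the surface can be illustrated numerically. The unit circle $h(x)=x_1^2+x_2^2-1$ and its parametrization $x(t)=(\cos t,\sin t)$ are assumptions chosen here for illustration; they do not appear in the notes.

```python
import math


def grad_h(x):
    """Gradient of the illustrative constraint h(x) = x_1^2 + x_2^2 - 1."""
    return (2.0 * x[0], 2.0 * x[1])


def curve(t):
    """A curve on S = {x : h(x) = 0}: the unit circle."""
    return (math.cos(t), math.sin(t))


def velocity(t):
    """Derivative of the curve with respect to t."""
    return (-math.sin(t), math.cos(t))


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


# d/dt h(x(t)) = grad_h(x(t))^T x'(t) should vanish for every t.
inner_products = [dot(grad_h(curve(t)), velocity(t)) for t in (0.0, 0.7, 1.5, 3.0)]
max_abs = max(abs(v) for v in inner_products)
print(max_abs)
```

Up to floating-point rounding every inner product is zero, which is the chain-rule identity used in the derivation.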

Assume $x^*=x(t^*)$ is a minimizer of $f(x)$ on $S=\{x:h(x)=0\}$.

Define $\phi(t)=f(x(t))\stackrel{FONC}{\Rightarrow} \frac{d\phi}{dt}(t^*)=0$
$0=\frac{d}{dt}\phi(t^*)=\nabla f(x(t^*))^T\dot{x}(t^*)=\nabla f(x^*)^T\dot{x}(t^*)$
$\Rightarrow \nabla f(x^*)$ is orthogonal to $\dot{x}(t^*)$
In $\mathbb{R}^2$, $\nabla f(x^*)$ and $\nabla h(x^*)\neq 0$ are both orthogonal to the same nonzero tangent vector $\dot{x}(t^*)$, so they are parallel: $\nabla f(x^*)=\lambda \nabla h(x^*)$ for some scalar $\lambda$.

Summary:

If $x^*$ is a minimizer of $f:\mathbb{R}^2\rightarrow \mathbb{R}$ subject to $h(x)=0$, $h:\mathbb{R}^2\rightarrow \mathbb{R}$, then $\nabla h(x^*)$ and $\nabla f(x^*)$ are parallel.
$\Rightarrow$ If $\nabla h(x^*)\neq 0$, then $\exists \lambda^*$ s.t. $\nabla f(x^*)+\lambda^*\nabla h(x^*)=0$

Lagrange’s Theorem[FONC]

Let $x^*$ be a local minimizer of $f:\mathbb{R}^n\rightarrow\mathbb{R}$ subject to $h(x)=0$, $h:\mathbb{R}^n\rightarrow\mathbb{R}^m$, $m\leq n$. Assume $x^*$ is regular. Then $\exists \lambda^*\in \mathbb{R}^m$ s.t. $Df(x^*)+{\lambda^*}^TDh(x^*)=0$

Lagrange’s Function

Lagrange’s function: $l:\mathbb{R}^n\times\mathbb{R}^m\rightarrow \mathbb{R}$
$l(x,\lambda)=f(x)+\lambda^Th(x)$

Applying the unconstrained FONC to $l(x,\lambda)$, viewed as a function of $(x,\lambda)$, recovers Lagrange’s condition together with the constraint:
$Dl(x^*,\lambda^*)=0\Rightarrow \begin{cases} D_xl(x^*,\lambda^*)=Df(x^*)+{\lambda^*}^TDh(x^*)=0\\ D_{\lambda}l(x^*,\lambda^*)=h(x^*)^T=0 \end{cases}$

Example 1

Given a cuboid (rectangular box) with surface area $A$, find its maximum volume.
max $x_1x_2x_3$
s.t. $x_1x_2+x_2x_3+x_1x_3=\frac{A}{2}\ (A>0)$
$f(x)=-x_1x_2x_3,\ h(x)=x_1x_2+x_2x_3+x_1x_3-\frac{A}{2}$
$\nabla f(x)=[-x_2x_3,-x_1x_3,-x_1x_2]^T$
$\nabla h(x)=[x_2+x_3,x_1+x_3,x_1+x_2]^T$
All feasible solutions are regular.
$\lambda\in\mathbb{R}$
$\begin{cases} \nabla f(x)+\lambda \nabla h(x)=0\\ h(x)=0 \end{cases}\Rightarrow \begin{cases} x_2x_3-\lambda(x_2+x_3)=0\\ x_1x_3-\lambda(x_1+x_3)=0\\ x_1x_2-\lambda(x_1+x_2)=0\\ x_1x_2+x_2x_3+x_1x_3-\frac{A}{2}=0 \end{cases}$

For positive side lengths the first three equations force $x_1=x_2=x_3$, and the constraint then gives $3x_1^2=\frac{A}{2}$. The volume attains its maximum at $x_1=x_2=x_3=\sqrt{\frac{A}{6}}$.
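The stationary point of the cuboid example can be verified by plugging it into the Lagrange system. In the sketch below the surface area `A = 24.0`, the function name `lagrange_residuals` and the comparison box are assumptions made for illustration; the multiplier value follows from $x^2=2\lambda x$ at $x_1=x_2=x_3=x$.

```python
import math


def lagrange_residuals(x, lam, A):
    """Left-hand sides of the four equations of the Lagrange system."""
    x1, x2, x3 = x
    return (
        x2 * x3 - lam * (x2 + x3),
        x1 * x3 - lam * (x1 + x3),
        x1 * x2 - lam * (x1 + x2),
        x1 * x2 + x2 * x3 + x1 * x3 - A / 2.0,
    )


A = 24.0                    # hypothetical surface area
side = math.sqrt(A / 6.0)   # candidate x_1 = x_2 = x_3
lam = side / 2.0            # multiplier from x^2 = 2 * lam * x
residuals = lagrange_residuals((side, side, side), lam, A)
volume = side ** 3

# Another feasible box with the same surface area, for comparison:
# x_1 = 1, x_2 = 2 and x_3 solved from the constraint.
x3_other = (A / 2.0 - 1.0 * 2.0) / (1.0 + 2.0)
other_volume = 1.0 * 2.0 * x3_other

print(residuals, volume, other_volume)
```

All four residuals vanish at the cube, and the comparison box has a smaller volume, as the Lagrange condition suggests.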

Example 2

$f(x)=x_1^2+x_2^2,\ h(x)=x_1^2+2x_2^2-1$
$\nabla f(x)=\begin{bmatrix} 2x_1\\ 2x_2 \end{bmatrix},\ \nabla h(x)=\begin{bmatrix} 2x_1\\ 4x_2 \end{bmatrix}$
All feasible solutions are regular.
$\begin{cases} \nabla f(x)+\lambda \nabla h(x)=0\\ h(x)=0 \end{cases}\Rightarrow \begin{cases} 2x_1+2\lambda x_1=0\\ 2x_2+4\lambda x_2=0\\ x_1^2+2x_2^2=1 \end{cases}$

From the first equation, either $x_1=0$ or $\lambda=-1$.

$\lambda=-1\Rightarrow\begin{cases} x_1=\pm 1\\ x_2=0 \end{cases}$

$x_1=0\Rightarrow\begin{cases} \lambda=-\frac{1}{2}\\ x_2=\pm \frac{1}{\sqrt{2}} \end{cases}$

$f(\begin{bmatrix} 1\\ 0 \end{bmatrix})=f(\begin{bmatrix} -1\\ 0 \end{bmatrix})=1$

$f(\begin{bmatrix} 0\\ \frac{1}{\sqrt{2}} \end{bmatrix})=f(\begin{bmatrix} 0\\ -\frac{1}{\sqrt{2}} \end{bmatrix})=\frac{1}{2}$

The minimum value $\frac{1}{2}$ is attained at $x_1=0,\ x_2=\pm \frac{1}{\sqrt{2}}$.
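The four candidate points of this example can be checked against the Lagrange system and then compared by objective value. This is a small verification sketch; the function names and the tolerance are illustrative assumptions.

```python
import math


def f(x):
    """Objective f(x) = x_1^2 + x_2^2."""
    return x[0] ** 2 + x[1] ** 2


def h(x):
    """Constraint h(x) = x_1^2 + 2 x_2^2 - 1."""
    return x[0] ** 2 + 2.0 * x[1] ** 2 - 1.0


def lagrange_gradient(x, lam):
    """grad f(x) + lam * grad h(x)."""
    return (2.0 * x[0] + 2.0 * lam * x[0], 2.0 * x[1] + 4.0 * lam * x[1])


s = 1.0 / math.sqrt(2.0)
candidates = [((1.0, 0.0), -1.0), ((-1.0, 0.0), -1.0),
              ((0.0, s), -0.5), ((0.0, -s), -0.5)]

# Every candidate should be feasible and satisfy the Lagrange condition.
all_stationary = all(
    abs(h(x)) < 1e-12 and all(abs(c) < 1e-12 for c in lagrange_gradient(x, lam))
    for x, lam in candidates
)
values = [f(x) for x, _ in candidates]
print(all_stationary, values)
```

Comparing the values shows that the smallest objective among the Lagrange points is $\frac{1}{2}$ and the largest is $1$.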

Example 3

min $-x^TQx$
s.t. $x^TPx=1$
$P,Q>0,\ P^T=P,\ Q^T=Q$

$f(x)=-x^TQx,\ h(x)=x^TPx-1$
$l(x,\lambda)=-x^TQx+\lambda(x^TPx-1)$
$D_xl(x,\lambda)=-2x^TQ+2\lambda x^TP=0\Rightarrow (\lambda P-Q)x=0\Rightarrow P^{-1}Qx=\lambda x\Rightarrow \lambda,x$ are an eigenvalue and eigenvector of $P^{-1}Q$
$D_{\lambda}l(x,\lambda)=x^TPx-1=0$

$Qx=\lambda Px$
$\Rightarrow x^TQx=\lambda x^TPx$
$\Rightarrow x^TQx=\lambda$
$\Rightarrow \lambda^*$: the maximal eigenvalue of $P^{-1}Q$, since minimizing $-x^TQx=-\lambda$ means maximizing $\lambda$

SONC

Assume $f:\mathbb{R}^n\rightarrow \mathbb{R},\ h:\mathbb{R}^n\rightarrow \mathbb{R}^m$ are twice continuously differentiable.
$l(x,\lambda)=f(x)+\lambda^Th(x)=f(x)+\lambda_1h_1(x)+\cdots+\lambda_mh_m(x)$
$L(x,\lambda)=F(x)+\lambda_1H_1(x)+\cdots+\lambda_mH_m(x)$, where $F$ is the Hessian of $f$ and $H_k$ is the Hessian of $h_k$, so $L$ is the Hessian of $l$ with respect to $x$.

Thm (SONC): Let $x^*$ be a local minimizer of $f:\mathbb{R}^n\rightarrow \mathbb{R}$ subject to $h(x)=0$, $h:\mathbb{R}^n\rightarrow \mathbb{R}^m$, $m\leq n$, $f,h\in C^2$, and assume $x^*$ is regular. Then $\exists \lambda^*\in \mathbb{R}^m$ s.t. $\begin{cases} Df(x^*)+{\lambda^*}^TDh(x^*)=0\\ \forall y\in T(x^*)=\{y:Dh(x^*)y=0\}:y^TL(x^*,\lambda^*)y\geq 0 \end{cases}$

SOSC

Let $f,h\in C^2$. If there exist $x^*\in\mathbb{R}^n$ with $h(x^*)=0$ and $\lambda^*\in \mathbb{R}^m$ s.t.

  1. $Df(x^*)+{\lambda^*}^TDh(x^*)=0$
  2. $\forall y\in T(x^*),\ y\neq 0:\ y^TL(x^*,\lambda^*)y>0$

then $x^*$ is a strict local minimizer of $f(x)$ subject to $h(x)=0$.

Example 1

max $x^TQx$
s.t. $x^TPx=1$

$Q=\begin{bmatrix} 4&0\\ 0&1 \end{bmatrix},\ P=\begin{bmatrix} 2&0\\ 0&1 \end{bmatrix}$

$P^{-1}Q=\begin{bmatrix} 2&0\\ 0&1 \end{bmatrix}$
$\Rightarrow \lambda_1=2,\ \lambda_2=1$
$\Rightarrow \lambda^*=2$
$\Rightarrow x^*=[\frac{1}{\sqrt{2}},0]^T$ or $x^*=[-\frac{1}{\sqrt{2}},0]^T$
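The second-order sufficient condition can be checked at this candidate. The sketch assumes the minimization form $f(x)=-x^TQx$, $h(x)=x^TPx-1$ with Lagrangian $l=f+\lambda h$, so that $L(x,\lambda)=-2Q+2\lambda P$; the helper names are illustrative.

```python
Q = [[4.0, 0.0], [0.0, 1.0]]
P = [[2.0, 0.0], [0.0, 1.0]]
lam = 2.0                     # largest eigenvalue of P^{-1} Q
x_star = (2.0 ** -0.5, 0.0)   # candidate maximizer of x^T Q x


def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


# Feasibility: x*^T P x* should equal 1.
constraint_value = dot(x_star, matvec(P, x_star))

# Stationarity: grad f + lam * grad h = -2 Q x + 2 lam P x.
stationarity = [-2.0 * q + 2.0 * lam * p
                for q, p in zip(matvec(Q, x_star), matvec(P, x_star))]

# Hessian of the Lagrangian: L = -2 Q + 2 lam P.
L = [[-2.0 * Q[i][j] + 2.0 * lam * P[i][j] for j in range(2)] for i in range(2)]

# In R^2 the tangent space is spanned by grad h rotated by 90 degrees.
grad_h = [2.0 * c for c in matvec(P, x_star)]
y = (-grad_h[1], grad_h[0])
curvature = dot(y, matvec(L, y))
print(constraint_value, stationarity, curvature)
```

Here $L=\operatorname{diag}(0,2)$ is only positive semidefinite on all of $\mathbb{R}^2$, but it is positive definite on the tangent space, which is all the SOSC requires.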

Example 2

Consider min $\frac{1}{2}x^TQx$
s.t. $Ax=b$

$Q>0,\ Q=Q^T,\ A\in\mathbb{R}^{m\times n},\ m\leq n,\ b\in\mathbb{R}^m,\ \operatorname{rank}A=m$

$l(x,\lambda)=\frac{1}{2}x^TQx+\lambda^T(b-Ax)$
$D_xl(x,\lambda)=x^TQ-\lambda^TA=0$
$\Rightarrow x=Q^{-1}A^T\lambda$
$\Rightarrow Ax=AQ^{-1}A^T\lambda=b$
$\Rightarrow \lambda=(AQ^{-1}A^T)^{-1}b$
$\Rightarrow x=Q^{-1}A^T(AQ^{-1}A^T)^{-1}b$

$L(x,\lambda)=Q>0$, so the SOSC holds and this $x$ is a strict local minimizer.
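The closed-form solution $x=Q^{-1}A^T(AQ^{-1}A^T)^{-1}b$ can be sketched in plain Python. The Gaussian-elimination helper `solve`, the function name `equality_qp` and the test data are assumptions made for illustration; a numerical library would normally replace the hand-written solver.

```python
def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def equality_qp(Q, A, b):
    """Minimize 0.5 x^T Q x subject to A x = b via the Lagrange condition."""
    m, n = len(A), len(Q)
    # Columns of Y = Q^{-1} A^T: one linear solve per row of A.
    Y_cols = [solve(Q, A[i]) for i in range(m)]
    # M = A Q^{-1} A^T, an m x m matrix.
    M = [[sum(p * q for p, q in zip(A[i], Y_cols[j])) for j in range(m)]
         for i in range(m)]
    lam = solve(M, b)
    # x = Q^{-1} A^T lam = sum_j lam_j * Y_cols[j].
    x = [sum(lam[j] * Y_cols[j][k] for j in range(m)) for k in range(n)]
    return x, lam


# Hypothetical data: Q = diag(2, 1), one constraint x_1 + x_2 = 1.
x, lam = equality_qp([[2.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]], [1.0])
print(x, lam)
```

For this data the formula gives $x=[\frac{1}{3},\frac{2}{3}]^T$ with $\lambda=\frac{2}{3}$, and $Qx=A^T\lambda$ holds as required by $D_xl=0$.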

Case 2

min $f(x)$
s.t. $h(x)=0$
$g(x)\leq 0$

$f:\mathbb{R}^n\rightarrow \mathbb{R}$
$h:\mathbb{R}^n\rightarrow \mathbb{R}^m,\ m\leq n$
$g:\mathbb{R}^n\rightarrow \mathbb{R}^p$

Definition

Def: An inequality constraint $g_j(x)\leq 0$ is called active at $x^*$ if $g_j(x^*)=0$; otherwise it is inactive.

Def: Let $x^*$ satisfy $h(x^*)=0$ and $g(x^*)\leq 0$, and let $J(x^*)=\{j: g_j(x^*)=0\}$ be the index set of active inequality constraints. $x^*$ is called regular if $\nabla h_i(x^*)$ for all $1\leq i\leq m$ and $\nabla g_j(x^*)$ for all $j\in J(x^*)$ are linearly independent.

KKT-Theorem(FONC)

Let $f,h,g\in C^1$, and let $x^*$ be a regular point and a local minimizer of $f(x)$ subject to $h(x)=0$ and $g(x)\leq 0$. Then there exist $\lambda^*\in\mathbb{R}^m$ and $\mu^*\in\mathbb{R}^p$ s.t.

  1. $\mu^*\geq 0$
  2. $Df(x^*)+{\lambda^*}^TDh(x^*)+{\mu^*}^TDg(x^*)=0$
  3. ${\mu^*}^Tg(x^*)=0$

Example 1

min $-\frac{400R}{(10+R)^2}$
s.t. $-R\leq 0$

$\nabla f(R)=-\frac{400(10-R)}{(10+R)^3}$

The KKT conditions together with feasibility read
$\begin{cases} \mu^*\geq 0\\ Df(x^*)+{\lambda^*}^TDh(x^*)+{\mu^*}^TDg(x^*)=0\\ {\mu^*}^T g(x^*)=0\\ g(x^*)\leq 0\\ h(x^*)=0 \end{cases}$

With $g(R)=-R$ and no equality constraint:
$\Rightarrow \begin{cases} \mu\geq 0\\ -\frac{400(10-R)}{(10+R)^3}-\mu=0\\ \mu R=0\\ R\geq 0 \end{cases}$

If $\mu>0$, then $R=0,\ \mu=-4$ (✕, contradicts $\mu>0$)
If $\mu=0\Rightarrow R=10$ (✓)
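The two complementary-slackness cases can be replayed numerically. In this sketch `df` is the derivative stated above, and `kkt_holds` is a hypothetical helper that tests the four conditions of the reduced system for a given pair $(R,\mu)$.

```python
def df(R):
    """Derivative of f(R) = -400 R / (10 + R)^2."""
    return -400.0 * (10.0 - R) / (10.0 + R) ** 3


def kkt_holds(R, mu, tol=1e-9):
    """Check mu >= 0, f'(R) - mu = 0, mu * R = 0 and R >= 0."""
    return (mu >= -tol
            and abs(df(R) - mu) < tol
            and abs(mu * R) < tol
            and R >= -tol)


# Case mu > 0: complementary slackness forces R = 0, stationarity gives mu = f'(0).
mu_active = df(0.0)
case_active = kkt_holds(0.0, mu_active)

# Case mu = 0: stationarity f'(R) = 0 gives R = 10.
case_inactive = kkt_holds(10.0, 0.0)

print(mu_active, case_active, case_inactive)
```

The active case yields a negative multiplier and is rejected, while $R=10$ with $\mu=0$ satisfies every condition.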

Example 2

min $-\frac{4000}{(10+R)^2}$
s.t. $-R\leq 0$

$\nabla f(R)=\frac{8000}{(10+R)^3}$

KKT: $\begin{cases} \mu\geq 0\\ \frac{8000}{(10+R)^3}-\mu=0\\ \mu R=0\\ R\geq 0 \end{cases}$

$\mu=0\Rightarrow \frac{8000}{(10+R)^3}=0$ has no solution (✕)
$\mu>0\Rightarrow R=0,\ \mu=8$ (✓)
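Because the derivative is positive for every $R\geq 0$, the objective increases on the feasible set and the boundary point $R=0$ is the minimizer. The sketch below confirms this with a coarse grid search; the grid range and step are arbitrary assumptions for illustration.

```python
def f(R):
    """Objective f(R) = -4000 / (10 + R)^2."""
    return -4000.0 / (10.0 + R) ** 2


def df(R):
    """Derivative f'(R) = 8000 / (10 + R)^3."""
    return 8000.0 / (10.0 + R) ** 3


# Multiplier of the active constraint -R <= 0 at R = 0: mu = f'(0).
mu = df(0.0)

# Hypothetical grid on the feasible set R >= 0 (0 to 100 in steps of 0.05).
grid = [0.05 * k for k in range(2001)]
best_R = min(grid, key=f)
increasing = all(df(R) > 0.0 for R in grid)

print(mu, best_R, increasing)
```

The grid minimizer coincides with the KKT point $R=0$, and the multiplier $\mu=8$ is positive as required.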

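Summary

This lecture introduced nonlinear constrained optimization. The problem was split into two cases according to the type of constraints: the first case has only equality constraints, and the second has both equality and inequality constraints. For the first case, the focus was on Lagrange's condition, which was derived in the two-dimensional setting. Since Lagrange's condition is a first-order necessary condition (FONC), the method of Lagrange multipliers, which uses this condition to find extrema, was introduced next. The second-order necessary condition (SONC) and the second-order sufficient condition (SOSC) were then covered briefly. Finally, the second case was considered and the KKT conditions were stated.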
