2.1. Find the global minimum and maximum points of the function $f(x,y)=x^2+y^2+2x-3y$ over the unit ball $S=B[0,1]=\left\{(x,y):x^2+y^2\le 1\right\}$.
Solution:
Completing the square gives
$$f(x,y)=(x+1)^2+\left(y-\frac{3}{2}\right)^2-\frac{13}{4},$$
so, up to a constant, $f$ is the squared distance from $(x,y)$ to the point $\left(-1,\frac{3}{2}\right)$. This point lies outside $S$ (its norm is $\frac{\sqrt{13}}{2}>1$), so the minimum and maximum points are the two points where the line through the origin and $\left(-1,\frac{3}{2}\right)$, namely $y=-\frac{3}{2}x$, meets the unit circle:
$$\begin{cases} y=-\frac{3}{2}x\\ x^2+y^2=1 \end{cases}\Rightarrow \begin{cases} x=\frac{2}{\sqrt{13}}\\ y=-\frac{3}{\sqrt{13}} \end{cases}\text{ or } \begin{cases} x=-\frac{2}{\sqrt{13}}\\ y=\frac{3}{\sqrt{13}} \end{cases}$$
Hence the global minimum point is $\left(-\frac{2}{\sqrt{13}},\frac{3}{\sqrt{13}}\right)$ (the point of $S$ closest to $\left(-1,\frac{3}{2}\right)$), and the global maximum point is $\left(\frac{2}{\sqrt{13}},-\frac{3}{\sqrt{13}}\right)$ (the point of $S$ farthest from it).
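As a sanity check rather than a proof, the following minimal Python sketch scans the unit circle and compares the sampled extreme points of $f$ with $\pm\left(\frac{2}{\sqrt{13}},-\frac{3}{\sqrt{13}}\right)$. It assumes, as argued in the solution to 2.1, that both extremes lie on the boundary; the helper name `boundary_extremes` and the sample count are illustrative choices, not part of the exercise.

```python
import math

def f(x, y):
    """Objective of Exercise 2.1."""
    return x**2 + y**2 + 2*x - 3*y

def boundary_extremes(num_samples=200_000):
    """Scan the unit circle and return the sampled minimizer and maximizer of f."""
    points = [
        (math.cos(2 * math.pi * k / num_samples),
         math.sin(2 * math.pi * k / num_samples))
        for k in range(num_samples)
    ]
    best_min = min(points, key=lambda p: f(*p))
    best_max = max(points, key=lambda p: f(*p))
    return best_min, best_max

p_min, p_max = boundary_extremes()
s = math.sqrt(13)
print("sampled min point:", p_min, "claimed:", (-2 / s, 3 / s))
print("sampled max point:", p_max, "claimed:", (2 / s, -3 / s))
```

On the circle the objective reduces to $1+2x-3y$, so the sampled extreme values should be close to $1-\sqrt{13}$ and $1+\sqrt{13}$.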
2.2. Let $\mathbf{a}\in \mathbb{R}^n$ be a nonzero vector. Show that the maximum of $\mathbf{a}^T\mathbf{x}$ over $B[0,1]=\left\{\mathbf{x}\in\mathbb{R}^n:\|\mathbf{x}\|\le 1\right\}$ is attained at $\mathbf{x}^*=\frac{\mathbf{a}}{\|\mathbf{a}\|}$ and that the maximal value is $\|\mathbf{a}\|$.
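Solution:
By the Cauchy-Schwarz inequality, for every $\mathbf{x}\in B[0,1]$,
$$\mathbf{a}^T\mathbf{x}\le \|\mathbf{a}\|\|\mathbf{x}\|\le \|\mathbf{a}\|.$$
The first inequality holds with equality if and only if $\mathbf{x}$ is a nonnegative multiple of $\mathbf{a}$, and the second if and only if $\|\mathbf{x}\|=1$. Both hold exactly when
$$\mathbf{x}^*=\frac{\mathbf{a}}{\|\mathbf{a}\|},$$
and then $\mathbf{a}^T\mathbf{x}^*=\|\mathbf{a}\|$.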
2.3. Find the global minimum and maximum points of the function $f(x,y)=2x-3y$ over the set $S=\left\{(x,y):2x^2+5y^2\le 1\right\}$.
Solution:
Let $z=2x-3y$, so the level sets of $f$ are the lines
$$y=\frac{2}{3}x-\frac{z}{3}.$$
The extreme values of $z$ over $S$ are attained where such a line is tangent to the ellipse $2x^2+5y^2=1$, that is, where the gradient $(4x,10y)$ of $2x^2+5y^2$ is parallel to $(2,-3)$. This gives $-12x=20y$, so the minimum and maximum points lie on the line $y=-\frac{3}{5}x$:
$$\begin{cases} y=-\frac{3}{5}x\\ 2x^2+5y^2=1 \end{cases}\Rightarrow \begin{cases} x=\frac{5}{\sqrt{95}}\\ y=-\frac{3}{\sqrt{95}} \end{cases}\text{ or } \begin{cases} x=-\frac{5}{\sqrt{95}}\\ y=\frac{3}{\sqrt{95}} \end{cases}$$
Hence the global minimum point is $\left(-\frac{5}{\sqrt{95}},\frac{3}{\sqrt{95}}\right)$ with value $-\sqrt{\frac{19}{5}}$, and the global maximum point is $\left(\frac{5}{\sqrt{95}},-\frac{3}{\sqrt{95}}\right)$ with value $\sqrt{\frac{19}{5}}$.
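The extreme points of $2x-3y$ on the ellipse can be cross-checked numerically. The sketch below is only an illustration: it parametrizes the boundary $2x^2+5y^2=1$ as $x=\cos\theta/\sqrt{2}$, $y=\sin\theta/\sqrt{5}$ (a linear function attains its extremes over a compact convex set on the boundary) and scans $\theta$. The names `ellipse_point` and `ellipse_extremes` are hypothetical helpers.

```python
import math

def g(x, y):
    """Objective of Exercise 2.3."""
    return 2 * x - 3 * y

def ellipse_point(theta):
    """Point on the boundary 2x^2 + 5y^2 = 1, parametrized by an angle."""
    return math.cos(theta) / math.sqrt(2), math.sin(theta) / math.sqrt(5)

def ellipse_extremes(num_samples=200_000):
    """Return the sampled minimizer and maximizer of g on the ellipse."""
    points = [ellipse_point(2 * math.pi * k / num_samples)
              for k in range(num_samples)]
    return min(points, key=lambda p: g(*p)), max(points, key=lambda p: g(*p))

q_min, q_max = ellipse_extremes()
print("sampled max point:", q_max, "value:", g(*q_max))
print("5/sqrt(95), -3/sqrt(95):", 5 / math.sqrt(95), -3 / math.sqrt(95))
print("sqrt(19/5):", math.sqrt(19 / 5))
```

On the boundary the objective equals $\sqrt{2}\cos\theta-\frac{3}{\sqrt{5}}\sin\theta$, whose amplitude is $\sqrt{2+\frac{9}{5}}=\sqrt{\frac{19}{5}}$, which is what the scan should reproduce.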
2.4. Show that if $\mathbf{A},\mathbf{B}$ are $n\times n$ positive semidefinite matrices, then their sum $\mathbf{A}+\mathbf{B}$ is also positive semidefinite.
Solution:
Since $\mathbf{A}$ and $\mathbf{B}$ are positive semidefinite,
$$\forall \mathbf{x}\in\mathbb{R}^n,\quad \mathbf{x}^T\mathbf{A}\mathbf{x}\ge 0,\quad \mathbf{x}^T\mathbf{B}\mathbf{x}\ge 0.$$
The sum $\mathbf{A}+\mathbf{B}$ is symmetric, and
$$\forall \mathbf{x}\in\mathbb{R}^n,\quad \mathbf{x}^T\left(\mathbf{A}+\mathbf{B}\right)\mathbf{x}=\mathbf{x}^T\mathbf{A}\mathbf{x}+\mathbf{x}^T\mathbf{B}\mathbf{x}\ge 0\Rightarrow \mathbf{A}+\mathbf{B}\succeq 0.$$
2.5. Let $\mathbf{A}\in \mathbb{R}^{n\times n}$ and $\mathbf{B}\in \mathbb{R}^{m\times m}$ be two symmetric matrices. Prove that the following two claims are equivalent:
(i) $\mathbf{A}$ and $\mathbf{B}$ are positive semidefinite.
(ii) $\begin{pmatrix} \mathbf{A}& 0_{n\times m}\\ 0_{m\times n} & \mathbf{B} \end{pmatrix}$ is positive semidefinite.
Solution:
$(i)\Rightarrow (ii)$: By assumption,
$$\mathbf{x}^T\mathbf{A}\mathbf{x}\ge 0\ \text{ for all } \mathbf{x},\qquad \mathbf{y}^T\mathbf{B}\mathbf{y}\ge 0\ \text{ for all } \mathbf{y}.$$
Hence, for every $\mathbf{z}=\begin{pmatrix} \mathbf{x}\\ \mathbf{y} \end{pmatrix}$,
$$\mathbf{z}^T\begin{pmatrix} \mathbf{A}& 0_{n\times m}\\ 0_{m\times n} & \mathbf{B} \end{pmatrix}\mathbf{z}=\mathbf{x}^T\mathbf{A}\mathbf{x}+\mathbf{y}^T\mathbf{B}\mathbf{y}\ge 0,$$
so
$$\begin{pmatrix} \mathbf{A}& 0_{n\times m}\\ 0_{m\times n} & \mathbf{B} \end{pmatrix}\succeq 0.$$
$(ii)\Rightarrow (i)$: Taking vectors whose second or first block is zero gives
$$\forall \mathbf{z}=\begin{pmatrix} \mathbf{x}\\ 0 \end{pmatrix},\quad \mathbf{z}^T\begin{pmatrix} \mathbf{A}& 0_{n\times m}\\ 0_{m\times n} & \mathbf{B} \end{pmatrix}\mathbf{z}=\mathbf{x}^T\mathbf{A}\mathbf{x}\ge 0\Rightarrow \mathbf{A}\succeq 0,$$
$$\forall \mathbf{z}=\begin{pmatrix} 0\\ \mathbf{y} \end{pmatrix},\quad \mathbf{z}^T\begin{pmatrix} \mathbf{A}& 0_{n\times m}\\ 0_{m\times n} & \mathbf{B} \end{pmatrix}\mathbf{z}=\mathbf{y}^T\mathbf{B}\mathbf{y}\ge 0\Rightarrow \mathbf{B}\succeq 0.$$
2.6. Let $\mathbf{B}\in\mathbb{R}^{n\times k}$ and let $\mathbf{A}=\mathbf{B}\mathbf{B}^T$.
(i) Prove that $\mathbf{A}$ is positive semidefinite.
(ii) Prove that $\mathbf{A}$ is positive definite if and only if $\mathbf{B}$ has full row rank.
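Solution:
(i) $\mathbf{A}$ is symmetric, and
$$\forall \mathbf{x}\in\mathbb{R}^n,\quad \mathbf{x}^T\mathbf{Ax}=\mathbf{x}^T\mathbf{B}\mathbf{B}^T\mathbf{x}=\|\mathbf{B}^T\mathbf{x}\|^2\ge 0\Rightarrow \mathbf{A}\succeq 0.$$
(ii) Use the rank identity
$$r(\mathbf{M})=r(\mathbf{M}^T)=r(\mathbf{M}^T\mathbf{M})$$
with $\mathbf{M}=\mathbf{B}^T$ to get $r(\mathbf{A})=r(\mathbf{B}\mathbf{B}^T)=r(\mathbf{B})$. Since $\mathbf{A}\succeq 0$ by (i), $\mathbf{A}$ is positive definite if and only if it is nonsingular, that is, if and only if $r(\mathbf{A})=n$, which holds exactly when $r(\mathbf{B})=n$, i.e. when $\mathbf{B}$ has full row rank. Equivalently, $\mathbf{x}^T\mathbf{Ax}=\|\mathbf{B}^T\mathbf{x}\|^2>0$ for all $\mathbf{x}\neq 0$ if and only if $\mathbf{B}^T\mathbf{x}\neq 0$ for all $\mathbf{x}\neq 0$.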
2.7.
(i) Let $\mathbf{A}$ be an $n\times n$ symmetric matrix. Show that $\mathbf{A}$ is positive semidefinite if and only if there exists a matrix $\mathbf{B}\in\mathbb{R}^{n\times n}$ such that $\mathbf{A}=\mathbf{B}\mathbf{B}^T$.
(ii) Let $\mathbf{x}\in \mathbb{R}^n$ and let $\mathbf{A}$ be defined as
$$\mathbf{A}_{ij}=x_i x_j,\quad i,j=1,2,\cdots,n.$$
Show that $\mathbf{A}$ is positive semidefinite and that it is not a positive definite matrix when $n>1$.
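Solution:
(i) A real symmetric matrix is orthogonally diagonalizable: there exists an orthogonal matrix $\mathbf{P}$ such that
$$\mathbf{A}=\mathbf{P}\Lambda \mathbf{P}^T,$$
where $\Lambda$ is the diagonal matrix whose diagonal entries are the eigenvalues of $\mathbf{A}$.
If $\mathbf{A}\succeq 0$, all eigenvalues are nonnegative, so $\Lambda^{\frac{1}{2}}$ is well defined and
$$\mathbf{A}=\mathbf{P}\Lambda \mathbf{P}^T=\mathbf{P}\Lambda^{\frac{1}{2}}\Lambda^{\frac{1}{2}} \mathbf{P}^T=\left(\mathbf{P}\Lambda^{\frac{1}{2}}\right)\left(\mathbf{P}\Lambda^{\frac{1}{2}}\right)^T.$$
Taking $\mathbf{B}=\mathbf{P}\Lambda^{\frac{1}{2}}$ gives $\mathbf{A}=\mathbf{B}\mathbf{B}^T$.
Conversely, if $\mathbf{A}=\mathbf{B}\mathbf{B}^T$, then $\mathbf{A}\succeq 0$ by Exercise 2.6(i).
(ii) Here $\mathbf{A}=\mathbf{x}\mathbf{x}^T$, so
$$\forall \mathbf{y}\in\mathbb{R}^n,\quad \mathbf{y}^T\mathbf{Ay}=\left(\mathbf{x}^T\mathbf{y}\right)^2\ge 0\Rightarrow\mathbf{A}\succeq 0.$$
When $n>1$ there is a nonzero $\mathbf{y}$ with $\mathbf{x}^T\mathbf{y}=0$ (the orthogonal complement of $\mathbf{x}$ has dimension at least $n-1\ge 1$), and then $\mathbf{y}^T\mathbf{Ay}=0$, so $\mathbf{A}$ is not positive definite. For example, with $\mathbf{x}=\left(1,0\right)^T$,
$$\mathbf{A}=\begin{pmatrix} 1&0\\ 0&0 \end{pmatrix}\succeq 0,$$
which is not positive definite because $\mathbf{e}_2^T\mathbf{A}\mathbf{e}_2=0$.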
2.8. Let $\mathbf{Q}\in\mathbb{R}^{n\times n}$ be a positive definite matrix. Show that the “Q-norm” defined by
$$\|\mathbf{x}\|_{\mathbf{Q}}=\sqrt{\mathbf{x}^T\mathbf{Q}\mathbf{x}}$$
is indeed a norm.
Solution:
Nonnegativity: since $\mathbf{Q}\succ 0$,
$$\forall \mathbf{x}\neq 0,\quad \mathbf{x}^T\mathbf{Qx}>0,$$
so $\|\mathbf{x}\|_\mathbf{Q}>0$ for all $\mathbf{x}\neq 0$, and $\|\mathbf{x}\|_\mathbf{Q}=0$ only for $\mathbf{x}=0$.
Homogeneity: $\forall k\in\mathbb{R}$, $\|k\mathbf{x}\|_\mathbf{Q}=\sqrt{k^2\mathbf{x}^T\mathbf{Q}\mathbf{x}}=\left|k\right|\|\mathbf{x}\|_{\mathbf{Q}}$.
Triangle inequality: by Exercise 2.7(i) there is a matrix $\mathbf{B}$ with $\mathbf{Q}=\mathbf{B}\mathbf{B}^T$, so $\|\mathbf{x}\|_\mathbf{Q}=\sqrt{\mathbf{x}^T\mathbf{B}\mathbf{B}^T\mathbf{x}}=\|\mathbf{B}^T\mathbf{x}\|$, where $\|\cdot\|$ is the Euclidean norm. The triangle inequality for the Euclidean norm then gives
$$\|\mathbf{x}+\mathbf{y}\|_\mathbf{Q}=\|\mathbf{B}^T\mathbf{x}+\mathbf{B}^T\mathbf{y}\|\le \|\mathbf{B}^T\mathbf{x}\|+\|\mathbf{B}^T\mathbf{y}\|=\|\mathbf{x}\|_{\mathbf{Q}}+\|\mathbf{y}\|_{\mathbf{Q}}.$$
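A quick numerical experiment can illustrate the triangle inequality for the Q-norm. The sketch below is an assumption-laden example, not a proof: the matrix `Q` is a hypothetical $2\times 2$ positive definite matrix chosen for illustration, and `max_triangle_violation` is a made-up helper that samples random pairs of vectors.

```python
import math
import random

# Hypothetical positive definite matrix (trace 5 > 0, determinant 5 > 0).
Q = [[2.0, 1.0], [1.0, 3.0]]

def q_norm(x, Q=Q):
    """Return sqrt(x^T Q x) for a 2-vector x."""
    quad = sum(x[i] * Q[i][j] * x[j] for i in range(2) for j in range(2))
    return math.sqrt(quad)

def max_triangle_violation(trials=10_000, seed=0):
    """Largest sampled value of ||x+y||_Q - ||x||_Q - ||y||_Q."""
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(2)]
        y = [rng.uniform(-5, 5) for _ in range(2)]
        s = [x[0] + y[0], x[1] + y[1]]
        worst = max(worst, q_norm(s) - q_norm(x) - q_norm(y))
    return worst

# The triangle inequality predicts a value that is not positive
# (up to floating-point rounding).
print(max_triangle_violation())
```

Note that the cross term $2\mathbf{x}^T\mathbf{Q}\mathbf{y}$ in $(\mathbf{x}+\mathbf{y})^T\mathbf{Q}(\mathbf{x}+\mathbf{y})$ is what makes the inequality nontrivial; `q_norm` evaluates the full quadratic form, including that term.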
2.9. Let $\mathbf{A}$ be an $n\times n$ positive semidefinite matrix.
(i) Show that for any $i\neq j$,
$$\mathbf{A}_{ii}\mathbf{A}_{jj}\ge \mathbf{A}_{ij}^2.$$
(ii) Show that if $\mathbf{A}_{ii}=0$ for some $i\in\left\{1,2,\cdots,n\right\}$, then the $i$th row of $\mathbf{A}$ consists of zeros.
Solution:
(i) Let $\mathbf{x}=x_1\mathbf{e}_i+x_2\mathbf{e}_j$ with arbitrary $x_1,x_2\in\mathbb{R}$. Then
$$\begin{aligned} \mathbf{x}^T\mathbf{A}\mathbf{x}&=\mathbf{A}_{ii}x_1^2+2\mathbf{A}_{ij}x_1 x_2+\mathbf{A}_{jj}x_2^2\\ &=\begin{pmatrix}x_1\\x_2\end{pmatrix}^T\begin{pmatrix}\mathbf{A}_{ii}&\mathbf{A}_{ij}\\ \mathbf{A}_{ij}&\mathbf{A}_{jj}\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}\\ &\ge0. \end{aligned}$$
Hence the $2\times 2$ matrix is positive semidefinite, and in particular its determinant is nonnegative:
$$\begin{pmatrix}\mathbf{A}_{ii}&\mathbf{A}_{ij}\\ \mathbf{A}_{ij}&\mathbf{A}_{jj}\end{pmatrix}\succeq 0\Rightarrow \mathbf{A}_{ii}\mathbf{A}_{jj}\ge \mathbf{A}_{ij}^2.$$
(ii) If $\mathbf{A}_{ii}=0$, then for every $j\neq i$,
$$\mathbf{A}_{ij}^2\le \mathbf{A}_{ii}\mathbf{A}_{jj}=0\Rightarrow \mathbf{A}_{ij}=0.$$
Together with $\mathbf{A}_{ii}=0$, this shows that the $i$th row is zero.
2.10. Let $\mathbf{A}^{\alpha}$ be the $n\times n$ matrix $\left(n>1\right)$ defined by
$$\mathbf{A}_{ij}^{\alpha}=\begin{cases} \alpha,&i=j,\\ 1,&i\neq j. \end{cases}$$
Show that $\mathbf{A}^{\alpha}$ is positive semidefinite if and only if $\alpha\ge 1$.
Solution:
Compute the characteristic polynomial: add all columns to the first column, factor out $\lambda-\alpha-n+1$, and then subtract the first row from every other row.
$$\begin{aligned} \left|\lambda \mathbf{I}-\mathbf{A}^{\alpha}\right| &=\begin{vmatrix} \lambda -\alpha & -1 & -1&\cdots& -1\\ -1 & \lambda -\alpha& -1 & \cdots & -1\\ -1 & -1 & \lambda -\alpha & \cdots& -1\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ -1 & -1 & -1 & \cdots &\lambda -\alpha \end{vmatrix}\\ &=\begin{vmatrix} \lambda-\alpha-n+1 & -1 & -1&\cdots& -1\\ \lambda-\alpha-n+1 & \lambda -\alpha& -1 & \cdots & -1\\ \lambda-\alpha-n+1 & -1 & \lambda -\alpha & \cdots& -1\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ \lambda-\alpha-n+1 & -1 & -1 & \cdots &\lambda -\alpha \end{vmatrix}\\ &=(\lambda-\alpha-n+1)\begin{vmatrix} 1 & -1 & -1&\cdots& -1\\ 1 & \lambda -\alpha& -1 & \cdots & -1\\ 1 & -1 & \lambda -\alpha & \cdots& -1\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & -1 & -1 & \cdots &\lambda -\alpha \end{vmatrix}\\ &=(\lambda-\alpha-n+1)\begin{vmatrix} 1 & -1 & -1&\cdots& -1\\ 0 & \lambda -\alpha+1& 0 & \cdots & 0\\ 0 & 0 & \lambda -\alpha +1& \cdots&0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \cdots &\lambda -\alpha+1 \end{vmatrix}\\ &=\left(\lambda-\alpha-n+1\right)\left(\lambda-\alpha+1\right)^{n-1}. \end{aligned}$$
Setting this to zero, the eigenvalues are $\alpha+n-1$ (simple) and $\alpha-1$ (with multiplicity $n-1$). Since $n>1$,
$$\begin{cases} \alpha+n-1\ge0\\ \alpha-1 \ge 0 \end{cases}\Leftrightarrow\alpha\ge 1,$$
and therefore
$$\mathbf{A}^\alpha \succeq 0\Leftrightarrow \alpha\ge 1.$$
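The two eigenvalue families of $\mathbf{A}^{\alpha}$ can also be confirmed by exhibiting eigenvectors directly: the all-ones vector for $\alpha+n-1$, and the differences $\mathbf{e}_1-\mathbf{e}_{k}$ for $\alpha-1$. The following is a small illustrative sketch; `build_matrix`, `matvec` and `check_eigenpairs` are hypothetical helper names, and the values of `alpha` and `n` are arbitrary test choices.

```python
def build_matrix(alpha, n):
    """A^alpha: alpha on the diagonal, 1 everywhere else."""
    return [[alpha if i == j else 1.0 for j in range(n)] for i in range(n)]

def matvec(M, v):
    """Dense matrix-vector product for nested lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def check_eigenpairs(alpha, n):
    """Return the residuals max|A v - lambda v| for the claimed eigenpairs."""
    A = build_matrix(alpha, n)
    ones = [1.0] * n
    residuals = [max(abs(a - (alpha + n - 1) * v)
                     for a, v in zip(matvec(A, ones), ones))]
    for k in range(1, n):
        v = [0.0] * n
        v[0], v[k] = 1.0, -1.0  # e_1 - e_{k+1}, orthogonal to the ones vector
        Av = matvec(A, v)
        residuals.append(max(abs(a - (alpha - 1) * b) for a, b in zip(Av, v)))
    return residuals

print(check_eigenpairs(alpha=0.5, n=4))

# For alpha < 1 the vector e_1 - e_2 gives a negative quadratic form, 2(alpha - 1).
A = build_matrix(0.5, 4)
v = [1.0, -1.0, 0.0, 0.0]
print(sum(vi * wi for vi, wi in zip(v, matvec(A, v))))
```

The $n-1$ difference vectors are linearly independent, which matches the multiplicity $n-1$ of the eigenvalue $\alpha-1$.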
2.11. Let $\mathbf{d}\in\Delta_n$ ($\Delta_n$ being the unit simplex). Show that the $n\times n$ matrix $\mathbf{A}$ defined by
$$\mathbf{A}_{ij}=\begin{cases} d_i-d_i^2,&i=j,\\ -d_i d_j, &i\neq j, \end{cases}$$
is positive semidefinite.
Solution:
Since $\mathbf{d}\in\Delta_n$, we have $d_i\ge 0$ and $\sum_{j=1}^n d_j=1$, so in particular $0\le d_i\le 1$. For each $i$,
$$\begin{aligned} \left|\mathbf{A}_{ii}\right|-\sum_{j\neq i}\left|\mathbf{A}_{ij}\right| &=d_i-d_i^2-\sum_{j\neq i}d_i d_j\\ &=d_i-\sum_{j=1}^{n} d_i d_j\\ &=d_i-d_i \sum_{j=1}^{n} d_j\\ &=d_i-d_i\\ &=0. \end{aligned}$$
Hence
$$\left|\mathbf{A}_{ii}\right|\ge \sum_{j\neq i}\left|\mathbf{A}_{ij}\right|$$
for every $i$, so $\mathbf{A}$ is a diagonally dominant matrix. Moreover,
$$\mathbf{A}_{ii}=d_i\left(1-d_i\right)\ge 0.$$
A symmetric diagonally dominant matrix with nonnegative diagonal entries is positive semidefinite, so
$$\mathbf{A}\succeq 0.$$
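To see the positive semidefiniteness of this matrix in action, one can sample points of the unit simplex and evaluate the quadratic form on random vectors. This is only an empirical illustration under stated assumptions: `random_simplex_point` draws nonnegative weights and normalizes them (one of many ways to sample $\Delta_n$), and all function names are hypothetical.

```python
import random

def quad_form(d, x):
    """x^T A x for A_ii = d_i - d_i^2 and A_ij = -d_i d_j (i != j)."""
    n = len(d)
    total = 0.0
    for i in range(n):
        for j in range(n):
            a_ij = d[i] - d[i] ** 2 if i == j else -d[i] * d[j]
            total += x[i] * a_ij * x[j]
    return total

def random_simplex_point(n, rng):
    """Nonnegative weights normalized to sum to 1."""
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    return [wi / s for wi in w]

def smallest_sampled_value(n=5, trials=5_000, seed=1):
    """Smallest value of the quadratic form over random d and x."""
    rng = random.Random(seed)
    smallest = float("inf")
    for _ in range(trials):
        d = random_simplex_point(n, rng)
        x = [rng.uniform(-3, 3) for _ in range(n)]
        smallest = min(smallest, quad_form(d, x))
    return smallest

print(smallest_sampled_value())          # should not be negative
print(quad_form([0.2, 0.3, 0.5], [1.0, 1.0, 1.0]))  # ones vector: form vanishes
```

The quadratic form equals $\sum_i d_i x_i^2-\left(\sum_i d_i x_i\right)^2$, which vanishes for the all-ones vector, so the matrix is never positive definite.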
2.12. Prove that a $2\times 2$ matrix $\mathbf{A}$ is negative semidefinite if and only if $\operatorname{Tr}(\mathbf{A})\le 0$ and $\det(\mathbf{A})\ge 0$.
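Solution:
Let $\lambda_1,\lambda_2$ be the eigenvalues of the symmetric matrix $\mathbf{A}$. Then
$$\begin{cases} \operatorname{Tr}(\mathbf{A})=\lambda_1+\lambda_2\le 0\\ \det(\mathbf{A})=\lambda_1\lambda_2\ge 0 \end{cases}\Leftrightarrow \lambda_1,\lambda_2\le0 \Leftrightarrow \mathbf{A}\preceq 0.$$
For the first equivalence: $\lambda_1\lambda_2\ge 0$ means the eigenvalues do not have strictly opposite signs, and $\lambda_1+\lambda_2\le 0$ then rules out a positive eigenvalue. The converse direction is immediate.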
2.13. For each of the following matrices determine whether they are positive/negative semidefinite/definite or indefinite:
(i) $\begin{pmatrix} 2&2&0&0\\ 2&2&0&0\\ 0&0&3&1\\ 0&0&1&3 \end{pmatrix}$
(ii) $\begin{pmatrix} 2&2&2\\ 2&3&3\\ 2&3&3 \end{pmatrix}$
(iii) $\begin{pmatrix} 2&1&3\\ 1&2&1\\ 3&1&2 \end{pmatrix}$
(iv) $\begin{pmatrix} -5&1&1\\ 1&-7&1\\ 1&1&-5 \end{pmatrix}$
Solution:
Denote the matrices in (i)-(iv) by $\mathbf{A},\mathbf{B},\mathbf{C},\mathbf{D}$ respectively.
(i) $\mathbf{A}$ is diagonally dominant with nonnegative diagonal entries, so $\mathbf{A}\succeq 0$. Its first two rows coincide, so $\mathbf{A}$ is singular: it is positive semidefinite but not positive definite.
(ii) Subtract the second row from the third, then add the third column to the second:
$$\begin{aligned} \left|\lambda \mathbf{I}-\mathbf{B}\right| &=\begin{vmatrix} \lambda-2&-2&-2\\ -2 &\lambda-3 & -3\\ -2 & -3 &\lambda-3 \end{vmatrix} =\begin{vmatrix} \lambda-2&-2&-2\\ -2 &\lambda-3 & -3\\ 0 & -\lambda &\lambda \end{vmatrix} =\begin{vmatrix} \lambda-2&-4&-2\\ -2 &\lambda-6 & -3\\ 0 & 0 &\lambda \end{vmatrix}\\ &=\lambda(\lambda^2-8\lambda+12-8)\\ &=\lambda(\lambda^2-8\lambda+4)\\ &=\lambda\left(\lambda-(4+2\sqrt{3})\right)\left(\lambda-(4-2\sqrt{3})\right). \end{aligned}$$
The eigenvalues $0$, $4+2\sqrt{3}$ and $4-2\sqrt{3}$ are all nonnegative and one of them is zero, so $\mathbf{B}\succeq 0$ but $\mathbf{B}$ is not positive definite.
(iii) Subtract the third column from the first, then add the first row to the third:
$$\begin{aligned} \left|\lambda \mathbf{I}-\mathbf{C}\right| &=\begin{vmatrix} \lambda-2&-1&-3\\ -1 &\lambda-2 & -1\\ -3 & -1 &\lambda-2 \end{vmatrix} =\begin{vmatrix} \lambda+1&-1&-3\\ 0 &\lambda-2 & -1\\ -\lambda-1 & -1 &\lambda-2 \end{vmatrix} =\begin{vmatrix} \lambda+1&-1&-3\\ 0 &\lambda-2 & -1\\ 0 & -2 &\lambda-5 \end{vmatrix}\\ &=\left(\lambda+1\right)\left(\lambda^2-7\lambda+8\right)\\ &=\left(\lambda+1\right)\left(\lambda-\frac{7+\sqrt{17}}{2}\right)\left(\lambda-\frac{7-\sqrt{17}}{2}\right). \end{aligned}$$
The eigenvalue $-1$ is negative and $\frac{7\pm\sqrt{17}}{2}$ are positive, so $\mathbf{C}$ is indefinite.
(iv) $-\mathbf{D}$ is strictly diagonally dominant with positive diagonal entries, so
$$-\mathbf{D}\succ 0\Rightarrow \mathbf{D}\prec 0,$$
that is, $\mathbf{D}$ is negative definite.
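The claimed eigenvalues for the $3\times 3$ matrices in parts (ii) and (iii) can be verified by plugging them into the characteristic polynomial. The sketch below is a plain-Python check under the assumption that cofactor expansion is accurate enough at this size; `det3` and `char_poly` are hypothetical helper names.

```python
import math

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    a, b, c = M[0]
    d, e, f = M[1]
    g, h, i = M[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly(M, lam):
    """Evaluate det(lam * I - M) for a 3x3 matrix M."""
    shifted = [[(lam if r == c else 0.0) - M[r][c] for c in range(3)]
               for r in range(3)]
    return det3(shifted)

B = [[2, 2, 2], [2, 3, 3], [2, 3, 3]]
C = [[2, 1, 3], [1, 2, 1], [3, 1, 2]]

roots_B = [0.0, 4 + 2 * math.sqrt(3), 4 - 2 * math.sqrt(3)]
roots_C = [-1.0, (7 + math.sqrt(17)) / 2, (7 - math.sqrt(17)) / 2]

# Each printed value should be zero up to floating-point rounding.
for lam in roots_B:
    print("B", lam, char_poly(B, lam))
for lam in roots_C:
    print("C", lam, char_poly(C, lam))
```

A useful side check is the trace: the eigenvalues of the matrix in (ii) sum to $8$ and those of the matrix in (iii) sum to $6$, matching the diagonal sums.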
2.14. (Schur complement lemma) Let
$$\mathbf{D}=\begin{pmatrix} \mathbf{A}&\mathbf{b}\\ \mathbf{b}^T& c \end{pmatrix}$$
where $\mathbf{A}\in\mathbb{R}^{n\times n},\mathbf{b}\in\mathbb{R}^n,c\in\mathbb{R}$. Suppose that $\mathbf{A}\succ 0$. Prove that $\mathbf{D}\succeq 0$ if and only if $c-\mathbf{b}^T\mathbf{A}^{-1}\mathbf{b}\ge 0$.
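Solution:
Let
$$\mathbf{T}=\begin{pmatrix} \mathbf{A}&0\\ 0&c-\mathbf{b}^T\mathbf{A}^{-1}\mathbf{b} \end{pmatrix},\qquad \mathbf{N}=\begin{pmatrix} \mathbf{I}&0\\ \mathbf{b}^{T}\mathbf{A}^{-1}&1 \end{pmatrix}.$$
Direct multiplication, using the symmetry of $\mathbf{A}^{-1}$, gives
$$\mathbf{D}=\mathbf{N}\mathbf{T}\mathbf{N}^T.$$
The matrix $\mathbf{N}$ is lower triangular with unit diagonal, hence nonsingular. Therefore $\mathbf{z}^T\mathbf{D}\mathbf{z}=\left(\mathbf{N}^T\mathbf{z}\right)^T\mathbf{T}\left(\mathbf{N}^T\mathbf{z}\right)$, and since $\mathbf{z}\mapsto\mathbf{N}^T\mathbf{z}$ is a bijection, $\mathbf{D}\succeq 0$ if and only if $\mathbf{T}\succeq 0$. By Exercise 2.5, $\mathbf{T}\succeq 0$ if and only if $\mathbf{A}\succeq 0$ and $c-\mathbf{b}^T\mathbf{A}^{-1}\mathbf{b}\ge 0$. Since $\mathbf{A}\succ 0$ by assumption, $\mathbf{D}\succeq 0$ if and only if $c-\mathbf{b}^T\mathbf{A}^{-1}\mathbf{b}\ge 0$.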
2.15. For each of the following functions, determine whether it is coercive or not:
(i) $f\left(x_1,x_2\right)=x_1^4+x_2^4$
(ii) $f\left(x_1,x_2\right)=e^{x_1^2}+e^{x_2^2}-x_1^{200}-x_2^{200}$
(iii) $f\left(x_1,x_2\right)=2x_1^2-8x_1 x_2+x_2^2$
(iv) $f\left(x_1,x_2\right)=4x_1^2+2x_1 x_2+2x_2^2$
(v) $f\left(x_1,x_2,x_3\right)=x_1^3+x_2^3+x_3^3$
(vi) $f\left(x_1,x_2\right)=x_1^2-2x_1 x_2^2+x_2^4$
(vii) $f\left(\mathbf{x}\right)=\frac{\mathbf{x}^T\mathbf{Ax}}{\|\mathbf{x}\|+1}$, where $\mathbf{A}\in\mathbb{R}^{n\times n}$ is positive definite.
Solution:
(i) Since $x_1^4+x_2^4\ge \frac{1}{2}\left(x_1^2+x_2^2\right)^2$, as $\|\mathbf{x}\|\to \infty$ we have
$$f\left(x_1,x_2\right)\ge \frac{1}{2}\|\mathbf{x}\|^4\to\infty,$$
so $f$ is coercive.
(ii) Let $g(t)=e^{t^2}-t^{200}$. The exponential term dominates, so $g(t)\to\infty$ as $\left|t\right|\to\infty$, and since $g$ is continuous it is bounded below by some constant $m$. Then $f\left(x_1,x_2\right)=g(x_1)+g(x_2)\ge \max\left\{g(x_1),g(x_2)\right\}+m$. When $\|\mathbf{x}\|\to\infty$, $\max\left\{\left|x_1\right|,\left|x_2\right|\right\}\ge \frac{\|\mathbf{x}\|}{\sqrt{2}}\to\infty$, hence $f\to\infty$ and $f$ is coercive.
(iii) Here $f(\mathbf{x})=\mathbf{x}^T\mathbf{A}\mathbf{x}$ with
$$\mathbf{A}=\begin{pmatrix} 2&-4\\ -4&1 \end{pmatrix},$$
which is not positive definite ($\det\mathbf{A}=-14<0$). For example, $f(t,t)=-5t^2\to-\infty$ as $t\to\infty$, so $f$ is not coercive.
(iv) Here $f(\mathbf{x})=\mathbf{x}^T\mathbf{A}\mathbf{x}$ with
$$\mathbf{A}=\begin{pmatrix} 4&1\\ 1&2 \end{pmatrix}\succ0,$$
so $f(\mathbf{x})\ge\lambda_{\min}(\mathbf{A})\|\mathbf{x}\|^2\to\infty$ and $f$ is coercive.
(v) When $x_1\to -\infty,x_2\to -\infty,x_3\to -\infty$,
$$f\left(x_1,x_2,x_3\right)\to-\infty,$$
so $f$ is not coercive.
(vi) We have
$$f\left(x_1,x_2\right)=\left(x_1-x_2^2\right)^2.$$
Take $\mathbf{v}=\left(t,\sqrt{t}\right)^T$. As $t\to \infty$, $\|\mathbf{v}\|\to \infty$, but $f\left(\mathbf{v}\right)=0$ for every $t$, so $f$ is not coercive.
(vii) Let $\lambda_{\min}>0$ be the smallest eigenvalue of $\mathbf{A}$. Then
$$f\left(\mathbf{x}\right)\ge\frac{\lambda_{\min}\|\mathbf{x}\|^2}{\|\mathbf{x}\|+1}\ge \lambda_{\min}\left(\|\mathbf{x}\|-1\right).$$
As $\|\mathbf{x}\|\to \infty$ the right-hand side tends to $\infty$, so $f$ is coercive.
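The three non-coercive cases (iii), (v) and (vi) can be illustrated by evaluating the functions along the escape paths used in the arguments. The sketch below is a small illustration; the names `f3`, `f5`, `f6` are hypothetical labels for parts (iii), (v) and (vi), and the path for (vi) is written as $(s^2,s)$, which is the path $(t,\sqrt{t})$ with $t=s^2$, to avoid rounding in the square root.

```python
def f3(x1, x2):
    """Part (iii): indefinite quadratic form."""
    return 2 * x1**2 - 8 * x1 * x2 + x2**2

def f5(x1, x2, x3):
    """Part (v): sum of cubes."""
    return x1**3 + x2**3 + x3**3

def f6(x1, x2):
    """Part (vi): equals (x1 - x2^2)^2."""
    return x1**2 - 2 * x1 * x2**2 + x2**4

# Along each path the norm of the argument grows without bound,
# yet the function value does not tend to +infinity.
for s in (10.0, 100.0, 1000.0):
    print(s, f3(s, s), f5(-s, -s, -s), f6(s**2, s))
```

Along these paths `f3` behaves like $-5s^2$, `f5` like $-3s^3$, and `f6` stays at zero.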
2.16. Find a function $f:\mathbb{R}^2\to \mathbb{R}$ which is not coercive and satisfies that for any $\alpha \in\mathbb{R}$
$$\lim\limits_{\left|x_1\right|\to\infty}f\left(x_1,\alpha x_1\right)=\lim\limits_{\left|x_2\right|\to\infty}f\left(\alpha x_2,x_2\right)=\infty.$$
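Solution:
Take
$$f\left(x_1,x_2\right)=\left(x_1-x_2^2\right)^2.$$
By Exercise 2.15(vi), $f$ is not coercive. For any $\alpha\in\mathbb{R}$, $f\left(x_1,\alpha x_1\right)=x_1^2\left(1-\alpha^2x_1\right)^2\to\infty$ as $\left|x_1\right|\to\infty$, and $f\left(\alpha x_2,x_2\right)=x_2^2\left(\alpha-x_2\right)^2\to\infty$ as $\left|x_2\right|\to\infty$.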
2.17. For each of the following functions, find all the stationary points and classify them according to whether they are saddle points, strict/nonstrict local/global minimum/maximum points:
(i) $f\left(x_1,x_2\right)=\left(4x_1^2-x_2\right)^2$
(ii) $f\left(x_1,x_2,x_3\right)=x_1^4-2x_1^2+x_2^2+2x_2x_3+2x_3^2$
(iii) $f\left(x_1,x_2\right)=2x_2^3-6x_2^2+3x_1^2x_2$
(iv) $f\left(x_1,x_2\right)=x_1^4+2x_1^2x_2+x_2^2-4x_1^2-8x_1-8x_2$
(v) $f\left(x_1,x_2\right)=\left(x_1-2x_2\right)^4+64x_1x_2$
(vi) $f\left(x_1,x_2\right)=2x_1^2+3x_2^2-2x_1x_2+2x_1-3x_2$
(vii) $f\left(x_1,x_2\right)=x_1^2+4x_1x_2+x_2^2+x_1-x_2$
Solution:
(i)
$$\nabla f= \begin{pmatrix} 16x_1\left(4x_1^2-x_2\right)\\ -2\left(4x_1^2-x_2\right) \end{pmatrix}=0\Rightarrow4x_1^2=x_2,$$
so the stationary points are exactly the points of the parabola $x_2=4x_1^2$. Since
$$f\left(x_1,x_2\right)\ge 0=f\left(x_1,4x_1^2\right),$$
every point $\left(x_1,4x_1^2\right)$ is a global minimum point. None of them is strict, because $f$ vanishes on the whole parabola.
The second-order information is consistent with this but inconclusive on its own:
$$\nabla^2f\left(x_1,4x_1^2\right)= \begin{pmatrix} 16\left(12x_1^2-x_2\right)&-16x_1\\ -16x_1&2 \end{pmatrix}= \begin{pmatrix} 128x_1^2&-16x_1\\ -16x_1&2 \end{pmatrix}\succeq 0,$$
a positive semidefinite matrix with zero determinant.
(ii)
$$\nabla f= \begin{pmatrix} 4x_1^3-4x_1\\ 2x_2+2x_3\\ 2x_2+4x_3 \end{pmatrix}=0\Rightarrow \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}= \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix} \text{ or } \begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix} \text{ or } \begin{pmatrix} -1\\ 0\\ 0 \end{pmatrix}$$
$$\nabla^2f= \begin{pmatrix} 12x_1^2-4&0&0\\ 0&2&2\\ 0&2&4 \end{pmatrix}$$
The lower-right block $\begin{pmatrix} 2&2\\ 2&4 \end{pmatrix}$ is positive definite, so the definiteness of $\nabla^2 f$ is decided by the sign of $12x_1^2-4$.
$\nabla^2 f\left(0,0,0\right)$ is indefinite, so $\left(0,0,0\right)$ is a saddle point.
$\nabla^2 f\left(1,0,0\right)\succ 0$, so $\left(1,0,0\right)$ is a strict local minimum point.
$\nabla^2 f\left(-1,0,0\right)\succ 0$, so $\left(-1,0,0\right)$ is a strict local minimum point.
Moreover, $f\left(x_1,x_2,x_3\right)=\left(x_1^2-1\right)^2+\left(x_2+x_3\right)^2+x_3^2-1\ge -1=f\left(\pm 1,0,0\right)$, so both points are global minimum points, nonstrict as global minimizers because the minimal value is attained at two points.
(iii)
$$\nabla f= \begin{pmatrix} 6x_1x_2\\ 6x_2^2-12x_2+3x_1^2 \end{pmatrix}=0\Rightarrow \begin{pmatrix} x_1\\ x_2 \end{pmatrix}= \begin{pmatrix} 0\\ 0 \end{pmatrix} \text{ or } \begin{pmatrix} 0\\ 2 \end{pmatrix}$$
$$\nabla^2f=6 \begin{pmatrix} x_2&x_1\\ x_1&2\left(x_2-1\right) \end{pmatrix}$$
$\nabla^2 f\left(0,0\right)=6\begin{pmatrix} 0&0\\ 0&-2 \end{pmatrix}$ is negative semidefinite, so the second-order test is inconclusive at $\left(0,0\right)$. However, $f\left(t,t^2\right)=t^4\left(2t^2-3\right)<0$ and $f\left(t,\frac{t^2}{4}\right)=\frac{t^6}{32}+\frac{3t^4}{8}>0$ for small $t\neq 0$, while $f\left(0,0\right)=0$. Hence $\left(0,0\right)$ is neither a local minimum nor a local maximum point, i.e. it is a saddle point.
$\nabla^2 f\left(0,2\right)\succ 0$, so $\left(0,2\right)$ is a strict local minimum point. It is not a global minimum point, since $f\left(0,x_2\right)=2x_2^3-6x_2^2\to-\infty$ as $x_2\to-\infty$.
(iv)
$$\nabla f= \begin{pmatrix} 4x_1^3+4x_1x_2-8x_1-8\\ 2x_1^2+2x_2-8 \end{pmatrix}=0\Rightarrow \begin{pmatrix} x_1\\ x_2 \end{pmatrix}= \begin{pmatrix} 1\\ 3 \end{pmatrix}$$
(The second equation gives $x_2=4-x_1^2$; substituting into the first leaves $8x_1-8=0$.)
$$\nabla^2f\left(x_1,x_2\right)= 2\begin{pmatrix} 6x_1^2+2x_2-4&2x_1\\ 2x_1&1 \end{pmatrix},\qquad \nabla^2f\left(1,3\right)= 2\begin{pmatrix} 8&2\\ 2&1 \end{pmatrix}\succ 0$$
Hence $\left(1,3\right)$ is a strict local minimum point.
Furthermore,
$$f\left(x_1,x_2\right)=\left(x_1^2+x_2-4\right)^2+4\left(x_1-1\right)^2-20,$$
and both squares vanish only at $\left(1,3\right)$, so it is a strict global minimum point, with minimal value $-20$.
(v)
$$\nabla f= \begin{pmatrix} 4\left(x_1-2x_2\right)^3+64x_2\\ -8\left(x_1-2x_2\right)^3+64x_1 \end{pmatrix}=0\Rightarrow \begin{pmatrix} x_1\\ x_2 \end{pmatrix}= \begin{pmatrix} 0\\ 0 \end{pmatrix}\text{ or } \begin{pmatrix} -1\\ \frac{1}{2} \end{pmatrix}\text{ or } \begin{pmatrix} 1\\ -\frac{1}{2} \end{pmatrix}$$
(With $u=x_1-2x_2$, the two equations give $x_2=-\frac{u^3}{16}$ and $x_1=\frac{u^3}{8}$, hence $u=\frac{u^3}{4}$ and $u\in\left\{0,2,-2\right\}$.)
$$\nabla^2f=4 \begin{pmatrix} 3\left(x_1-2x_2\right)^2&-6\left(x_1-2x_2\right)^2+16\\ -6\left(x_1-2x_2\right)^2+16&12\left(x_1-2x_2\right)^2 \end{pmatrix}$$
$\nabla^2 f\left(0,0\right)$ is indefinite, so $\left(0,0\right)$ is a saddle point.
$\nabla^2 f\left(-1,\frac{1}{2}\right)\succ 0$, so $\left(-1,\frac{1}{2}\right)$ is a strict local minimum point.
$\nabla^2 f\left(1,-\frac{1}{2}\right)\succ 0$, so $\left(1,-\frac{1}{2}\right)$ is a strict local minimum point.
Moreover, with $u=x_1-2x_2$ one can write $f=\left(u^2-4\right)^2+128\left(x_2+\frac{u}{4}\right)^2-16\ge -16$, and the value $-16$ is attained at both points, so they are global minimum points (nonstrict as global minimizers).
(vi)
$$\nabla f= \begin{pmatrix} 4x_1-2x_2+2\\ 6x_2-2x_1-3 \end{pmatrix}=0\Rightarrow \begin{pmatrix} x_1\\ x_2 \end{pmatrix}= \begin{pmatrix} -\frac{3}{10}\\ \frac{2}{5} \end{pmatrix}$$
$$\nabla^2f= \begin{pmatrix} 4&-2\\ -2&6 \end{pmatrix}\succ0$$
The Hessian is positive definite everywhere, so $\left(-\frac{3}{10},\frac{2}{5}\right)$ is a strict global minimum point.
(vii)
$$\nabla f= \begin{pmatrix} 2x_1+4x_2+1\\ 4x_1+2x_2-1 \end{pmatrix}=0\Rightarrow \begin{pmatrix} x_1\\ x_2 \end{pmatrix}= \begin{pmatrix} \frac{1}{2}\\ -\frac{1}{2} \end{pmatrix}$$
$$\nabla^2f= \begin{pmatrix} 2&4\\ 4&2 \end{pmatrix}$$
$\nabla^2f$ is indefinite (its eigenvalues are $6$ and $-2$), so $\left(\frac{1}{2},-\frac{1}{2}\right)$ is a saddle point.
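The stationary points listed for parts (ii) through (vii) can be double-checked with a finite-difference gradient. The sketch below is only a numerical sanity check, not a classification: `numerical_gradient` is a hypothetical central-difference helper, and the step size `h` is an arbitrary choice.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at the point x."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad

# (function, claimed stationary points) for parts (ii)-(vii) of Exercise 2.17
cases = [
    (lambda x: x[0]**4 - 2*x[0]**2 + x[1]**2 + 2*x[1]*x[2] + 2*x[2]**2,
     [(0, 0, 0), (1, 0, 0), (-1, 0, 0)]),
    (lambda x: 2*x[1]**3 - 6*x[1]**2 + 3*x[0]**2*x[1],
     [(0, 0), (0, 2)]),
    (lambda x: x[0]**4 + 2*x[0]**2*x[1] + x[1]**2 - 4*x[0]**2 - 8*x[0] - 8*x[1],
     [(1, 3)]),
    (lambda x: (x[0] - 2*x[1])**4 + 64*x[0]*x[1],
     [(0, 0), (-1, 0.5), (1, -0.5)]),
    (lambda x: 2*x[0]**2 + 3*x[1]**2 - 2*x[0]*x[1] + 2*x[0] - 3*x[1],
     [(-0.3, 0.4)]),
    (lambda x: x[0]**2 + 4*x[0]*x[1] + x[1]**2 + x[0] - x[1],
     [(0.5, -0.5)]),
]

def max_gradient_norm(cases):
    """Largest absolute partial derivative over all claimed stationary points."""
    return max(abs(g)
               for f, pts in cases
               for p in pts
               for g in numerical_gradient(f, p))

# Should be zero up to finite-difference and rounding error.
print(max_gradient_norm(cases))
```

Part (i) is left out because its stationary points form a whole curve, $x_2=4x_1^2$, rather than a finite list.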
2.18. Let $f$ be a twice continuously differentiable function over $\mathbb{R}^n$. Suppose that $\nabla^2 f\left(\mathbf{x}\right)\succ 0$ for any $\mathbf{x}\in\mathbb{R}^n$. Prove that a stationary point of $f$ is necessarily a strict global minimum point.
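Solution:
(The claim presumably means: if a stationary point exists, then it is a strict global minimum point.)
Let $\mathbf{x}^*$ be a stationary point, so $\nabla f\left(\mathbf{x}^*\right)=0$. By the second-order Taylor formula, for any $\mathbf{x}\neq \mathbf{x}^*$ there is a point $\mathbf{z}$ on the segment between $\mathbf{x}$ and $\mathbf{x}^*$ such that
$$f\left(\mathbf{x}\right)-f\left(\mathbf{x}^*\right)=\nabla f\left(\mathbf{x}^*\right)^T\left(\mathbf{x}-\mathbf{x}^*\right)+\frac{1}{2}\left(\mathbf{x}-\mathbf{x}^*\right)^T\nabla^2 f\left(\mathbf{z}\right)\left(\mathbf{x}-\mathbf{x}^*\right)=\frac{1}{2}\left(\mathbf{x}-\mathbf{x}^*\right)^T\nabla^2 f\left(\mathbf{z}\right)\left(\mathbf{x}-\mathbf{x}^*\right)>0,$$
where the inequality uses $\nabla^2 f\left(\mathbf{z}\right)\succ 0$ and $\mathbf{x}\neq \mathbf{x}^*$.
Hence the stationary point is a strict global minimum point.
It is also unique: a second stationary point would be a strict global minimum point as well, which is impossible.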
2.19. Let $f\left(\mathbf{x}\right)=\mathbf{x}^T\mathbf{Ax}+2\mathbf{b}^T\mathbf{x}+c$, where $\mathbf{A}\in\mathbb{R}^{n\times n}$ is symmetric, $\mathbf{b}\in\mathbb{R}^n$, and $c\in \mathbb{R}$. Suppose that $\mathbf{A}\succeq 0$. Show that $f$ is bounded below over $\mathbb{R}^n$ if and only if $\mathbf{b}\in Range\left(\mathbf{A}\right)=\left\{\mathbf{Ay}:\mathbf{y}\in\mathbb{R}^n\right\}$.
(A function $f$ is bounded below over a set $C$ if there exists a constant $\alpha$ such that $f\left(\mathbf{x}\right)\ge \alpha$ for any $\mathbf{x}\in C$.)
Solution:
$$f\left(\mathbf{x}\right)=\mathbf{x}^T\mathbf{Ax}+2\mathbf{b}^T\mathbf{x}+c,\qquad \nabla f\left(\mathbf{x}\right)=2\mathbf{Ax}+2\mathbf{b},\qquad \nabla^2 f\left(\mathbf{x}\right)=2\mathbf{A}.$$
Suppose first that $\mathbf{b}\in Range\left(\mathbf{A}\right)=\left\{\mathbf{Ay}:\mathbf{y}\in\mathbb{R}^n\right\}$, say $\mathbf{b}=\mathbf{Ay}$. Then $\nabla f\left(\mathbf{x}\right)=0$ has the solution $\mathbf{x}=-\mathbf{y}$, and since $\mathbf{A}\succeq 0$,
$$f\left(\mathbf{x}\right)=\left(\mathbf{x}+\mathbf{y}\right)^T\mathbf{A}\left(\mathbf{x}+\mathbf{y}\right)-\mathbf{y}^T\mathbf{Ay}+c\ge c-\mathbf{y}^T\mathbf{Ay}\quad\text{for all }\mathbf{x}.$$
So $f$ attains a global minimum at $-\mathbf{y}$ and is therefore bounded below.
Conversely, suppose that $f$ is bounded below and assume, by contradiction, that $\mathbf{b}\notin Range\left(\mathbf{A}\right)$. Since $\mathbf{A}$ is symmetric, $Range\left(\mathbf{A}\right)=N\left(\mathbf{A}^T\right)^{\perp}=N\left(\mathbf{A}\right)^{\perp}$, so
$$\mathbf{b}\not\perp N\left(\mathbf{A}^T\right)=N\left(\mathbf{A}\right).$$
Hence there exists $\mathbf{y}\in N\left(\mathbf{A}\right)$ with $\mathbf{b}^T\mathbf{y}\neq 0$, and replacing $\mathbf{y}$ by $-\mathbf{y}$ if necessary,
$$\mathbf{y}^T\mathbf{Ay}=0,\quad\mathbf{b}^T\mathbf{y}<0.$$
Then $f\left(\lambda \mathbf{y}\right)=2\lambda\mathbf{b}^T\mathbf{y}+c\to -\infty$ as $\lambda \to +\infty$, a contradiction.
Therefore $\mathbf{b}\in Range\left(\mathbf{A}\right)=\left\{\mathbf{Ay}:\mathbf{y}\in\mathbb{R}^n\right\}$.
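A tiny concrete instance can make the dichotomy of Exercise 2.19 visible. The example below is a sketch under stated assumptions: the matrix `A` and the vectors `b_out`, `b_in` are hypothetical choices, with `A` positive semidefinite and singular so that its range is a proper subspace.

```python
def f(x, A, b, c=0.0):
    """Evaluate x^T A x + 2 b^T x + c for small dense data given as lists."""
    n = len(x)
    quad = sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))
    return quad + 2 * sum(b[i] * x[i] for i in range(n)) + c

A = [[1.0, 0.0], [0.0, 0.0]]   # positive semidefinite, Range(A) = span{(1, 0)}
b_out = [0.0, -1.0]            # not in Range(A)
b_in = [1.0, 0.0]              # in Range(A): b_in = A y with y = (1, 0)

# y = (0, 1) lies in the null space of A and satisfies b_out^T y < 0,
# so f(lam * y) = -2 * lam is unbounded below.
y = [0.0, 1.0]
for lam in (1.0, 10.0, 100.0):
    print(lam, f([lam * y[0], lam * y[1]], A, b_out))

# With b_in, f(x) = (x1 + 1)^2 - 1, so the values never drop below -1.
grid = [f([x1 / 10, x2], A, b_in)
        for x1 in range(-50, 51) for x2 in (-3.0, 0.0, 3.0)]
print(min(grid))
```

In the bounded case the minimum is attained at $-\mathbf{y}=(-1,0)$, in line with the completed-square bound $c-\mathbf{y}^T\mathbf{Ay}$.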