Prove: the maximum and minimum of $f(\boldsymbol x)=\boldsymbol x^{\text T}\boldsymbol{\Psi x}$ on the unit sphere $|\boldsymbol x|=1$ are the largest and smallest eigenvalues of $\boldsymbol\Psi$, respectively.
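Here $\boldsymbol x$ is an $n$-dimensional real column vector, $\boldsymbol\Psi$ is an $n\times n$ real symmetric matrix, and $|\boldsymbol x|$ denotes the Euclidean norm of $\boldsymbol x$.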
Let $\boldsymbol x=[x_1,x_2,\cdots,x_n]^{\text T}$, and let $\boldsymbol e_i$ be the column vector whose $i$-th entry is $1$ and whose other entries are $0$. Taking the partial derivative of $f(\boldsymbol x)$ with respect to $x_i$ gives

$$
\begin{aligned}
\dfrac{\partial}{\partial x_i}f(\boldsymbol x)&=\left(\dfrac{\partial}{\partial x_i}\boldsymbol x^{\text T}\right)\boldsymbol{\Psi x}+\boldsymbol x^{\text T}\boldsymbol{\Psi}\left(\dfrac{\partial}{\partial x_i}\boldsymbol x\right)\\
&=\boldsymbol e_i^{\text T}\boldsymbol{\Psi x}+\boldsymbol x^{\text T}\boldsymbol{\Psi}\boldsymbol e_i\\
&=2\boldsymbol e_i^{\text T}\boldsymbol{\Psi x}
\end{aligned}
$$

The last step uses the symmetry of $\boldsymbol\Psi$: the scalar $\boldsymbol x^{\text T}\boldsymbol{\Psi}\boldsymbol e_i$ equals its own transpose $\boldsymbol e_i^{\text T}\boldsymbol{\Psi}^{\text T}\boldsymbol x=\boldsymbol e_i^{\text T}\boldsymbol{\Psi x}$.
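The derivative formula can be sanity-checked numerically. The following is a minimal sketch, not part of the original proof: the random test matrix, the dimension `n = 4`, the step size `h`, and all variable names are assumptions chosen for illustration. It compares the analytic gradient $2\boldsymbol{\Psi x}$ (whose $i$-th entry is $2\boldsymbol e_i^{\text T}\boldsymbol{\Psi x}$) with central finite differences.

```python
import numpy as np

# Illustrative setup (assumed, not from the original text).
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
Psi = (A + A.T) / 2  # symmetrize so that Psi is a real symmetric matrix


def f(x):
    """Quadratic form f(x) = x^T Psi x."""
    return x @ Psi @ x


x = rng.standard_normal(n)

# Analytic gradient: the i-th entry is 2 * e_i^T Psi x.
analytic = 2 * Psi @ x

# Central finite differences along each standard basis vector e_i.
h = 1e-6
numeric = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(n)])

max_abs_error = np.max(np.abs(analytic - numeric))
print("max abs error:", max_abs_error)
```

Because $f$ is quadratic, the central difference is exact up to floating-point rounding, so the reported error should be very small.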
Construct the Lagrangian $L(\boldsymbol x,\lambda)=f(\boldsymbol x)-\lambda(|\boldsymbol x|^2-1)$. Since $|\boldsymbol x|^2=x_1^2+x_2^2+\cdots+x_n^2$, taking the partial derivative of $L(\boldsymbol x,\lambda)$ with respect to $x_i$ gives

$$
\dfrac{\partial}{\partial x_i}L(\boldsymbol x,\lambda)=\dfrac{\partial}{\partial x_i}f(\boldsymbol x)-\lambda\cdot 2x_i=2(\boldsymbol e_i^{\text T}\boldsymbol{\Psi x}-\lambda x_i)
$$
Since $\boldsymbol e_i^{\text T}\boldsymbol{\Psi}$ is the $i$-th row of $\boldsymbol\Psi$, stacking the $n$ partial derivatives into a column vector gives

$$
\dfrac 12\begin{bmatrix}
\dfrac{\partial}{\partial x_1}L(\boldsymbol x,\lambda)\\
\dfrac{\partial}{\partial x_2}L(\boldsymbol x,\lambda)\\
\vdots\\
\dfrac{\partial}{\partial x_n}L(\boldsymbol x,\lambda)
\end{bmatrix}=\begin{bmatrix}
\boldsymbol e_1^{\text T}\boldsymbol{\Psi}\\
\boldsymbol e_2^{\text T}\boldsymbol{\Psi}\\
\vdots\\
\boldsymbol e_n^{\text T}\boldsymbol{\Psi}
\end{bmatrix}\boldsymbol x-\lambda\begin{bmatrix}
x_1\\
x_2\\
\vdots\\
x_n
\end{bmatrix}=(\boldsymbol\Psi-\lambda\boldsymbol E)\boldsymbol x
$$

where $\boldsymbol E$ is the $n\times n$ identity matrix.
Therefore, a necessary condition for $f(\boldsymbol x)$ to attain an extremum at $\boldsymbol x$ subject to $|\boldsymbol x|=1$ is

$$
(\boldsymbol\Psi-\lambda\boldsymbol E)\boldsymbol x=\boldsymbol 0
$$
If the determinant $|\boldsymbol\Psi-\lambda\boldsymbol E|\neq 0$, this equation has only the solution $\boldsymbol x=\boldsymbol 0$, which contradicts $|\boldsymbol x|=1$. Hence $|\boldsymbol\Psi-\lambda\boldsymbol E|=0$, that is, $\lambda$ is an eigenvalue of $\boldsymbol\Psi$ and $\boldsymbol x$ is a unit eigenvector associated with $\lambda$. Therefore

$$
f(\boldsymbol x)=\boldsymbol x^{\text T}\boldsymbol{\Psi x}=\boldsymbol x^{\text T}\lambda\boldsymbol x=\lambda|\boldsymbol x|^2=\lambda
$$
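As a small worked illustration (the $2\times 2$ matrix and the parametrization by $\theta$ are chosen here for the example and do not appear in the original argument), take

$$
\boldsymbol\Psi=\begin{bmatrix} 2 & 1\\ 1 & 2 \end{bmatrix},\qquad
\boldsymbol x=\begin{bmatrix} \cos\theta\\ \sin\theta \end{bmatrix},\qquad
f(\boldsymbol x)=2\cos^2\theta+2\sin\theta\cos\theta+2\sin^2\theta=2+\sin 2\theta
$$

The eigenvalues of this $\boldsymbol\Psi$ are $3$ and $1$, with unit eigenvectors $\frac{1}{\sqrt 2}[1,1]^{\text T}$ (at $\theta=\pi/4$) and $\frac{1}{\sqrt 2}[1,-1]^{\text T}$ (at $\theta=-\pi/4$). At these points $f=2+\sin(\pi/2)=3$ and $f=2+\sin(-\pi/2)=1$, matching $f(\boldsymbol x)=\lambda$.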
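Thus every point satisfying the necessary condition gives $f(\boldsymbol x)$ a value equal to an eigenvalue of $\boldsymbol\Psi$, and conversely every eigenvalue $\lambda$ of $\boldsymbol\Psi$ is a value of $f(\boldsymbol x)$, attained at a unit eigenvector of $\lambda$. Because $f$ is continuous and the unit sphere $|\boldsymbol x|=1$ is closed and bounded, $f$ attains both a maximum and a minimum on it, and both are attained at points satisfying the necessary condition. Hence the maximum and minimum of $f(\boldsymbol x)$ on $|\boldsymbol x|=1$ are the largest and smallest eigenvalues of $\boldsymbol\Psi$.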
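The result can also be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: the random symmetric matrix, the dimension `n = 5`, the sample count, and all variable names are hypothetical choices. It samples unit vectors, evaluates $f$ on them, and compares the values with the extreme eigenvalues returned by `numpy.linalg.eigh`.

```python
import numpy as np

# Illustrative setup (assumed, not from the original text).
rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
Psi = (A + A.T) / 2  # real symmetric test matrix

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(Psi)
lam_min, lam_max = eigvals[0], eigvals[-1]

# Sample random points on the unit sphere |x| = 1.
X = rng.standard_normal((20000, n))
X /= np.linalg.norm(X, axis=1, keepdims=True)

# Evaluate f(x) = x^T Psi x for every sampled row of X.
values = np.einsum("ij,jk,ik->i", X, Psi, X)

# Evaluate f at the unit eigenvectors of the extreme eigenvalues.
f_at_vmin = eigvecs[:, 0] @ Psi @ eigvecs[:, 0]
f_at_vmax = eigvecs[:, -1] @ Psi @ eigvecs[:, -1]

print("sampled range of f:", values.min(), values.max())
print("extreme eigenvalues:", lam_min, lam_max)
```

Every sampled value should lie between `lam_min` and `lam_max`, while `f_at_vmin` and `f_at_vmax` should reproduce those two eigenvalues up to rounding. Random sampling only approaches the bounds; the eigenvectors attain them.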