Goal: minimize the total error between the original points and their projections (minimum reconstruction error).
Take the two-dimensional case as an example. Given that $\vec{u}$ is a unit vector, we have:
$$\vec{e}=\vec{x}-p_{ij}\vec{x}=\vec{x}-(\vec{x},\vec{u})\vec{u}=\vec{x}-(\vec{x}^T\vec{u})\vec{u}$$
where $\vec{x}^T\vec{u}$ is a scalar.
Extending to a higher-dimensional space, we have
$$\begin{aligned}
J &= ||\vec{e}||^2 = [\vec{x}-(\vec{x}^T\vec{u})\vec{u}]^T[\vec{x}-(\vec{x}^T\vec{u})\vec{u}]\\
&= \vec{x}^T\vec{x} - (\vec{x}^T\vec{u})^2 - (\vec{x}^T\vec{u})\vec{u}^T\vec{x} + (\vec{x}^T\vec{u})^2\vec{u}^T\vec{u}\\
&= ||\vec{x}||^2 - (\vec{x}^T\vec{u})^2
\end{aligned}$$
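The identity above can be checked numerically. The following is a minimal sketch using NumPy; the dimension, the random seed, and the variable names are illustrative assumptions, not part of the original derivation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample point and direction in 5 dimensions (illustrative values).
x = rng.normal(size=5)
u = rng.normal(size=5)
u /= np.linalg.norm(u)  # enforce the unit-length assumption ||u|| = 1

# Residual after removing the projection of x onto u.
e = x - (x @ u) * u

lhs = e @ e                 # ||e||^2
rhs = x @ x - (x @ u) ** 2  # ||x||^2 - (x^T u)^2

print(np.isclose(lhs, rhs))
```

The residual `e` is also orthogonal to `u`, which is why the cross terms collapse once `u` has unit length.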
Since $\vec{x}$ is known, $\min J$ is equivalent to $\max(\vec{x}^T\vec{u})^2=\max[\vec{u}^T(\vec{x}\vec{x}^T)\vec{u}]$ (because $\vec{x}^T\vec{u}$ is a scalar, so $(\vec{x}^T\vec{u})^2=(\vec{u}^T\vec{x})(\vec{x}^T\vec{u})$).
Now suppose there are $N$ samples in total:
$$\max\sum_{i=1}^{N}\vec{u}^T(\vec{x_i}\vec{x_i}^T)\vec{u}=\max\,\vec{u}^T\Big(\sum_{i=1}^{N}\vec{x_i}\vec{x_i}^T\Big)\vec{u}$$
Let $X=\sum_{i=1}^{N}\vec{x_i}\vec{x_i}^T$; the problem then becomes $\max(\vec{u}^TX\vec{u})\ \text{s.t.}\ ||\vec{u}||=1$.
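As a rough illustration of this step, the sketch below builds $X$ from a set of samples and checks that the quadratic form $\vec{u}^TX\vec{u}$ equals the sum of squared projections. The sample data, the array name `samples` (one sample per row), and the chosen unit vector are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# N = 100 hypothetical samples in 3 dimensions, one sample per row.
samples = rng.normal(size=(100, 3))

# An arbitrary unit vector: (1, 2, 2) has length 3.
u = np.array([1.0, 2.0, 2.0]) / 3.0

# X = sum_i x_i x_i^T, written as a single matrix product.
X = samples.T @ samples

# The same matrix built explicitly from outer products, for comparison.
X_loop = sum(np.outer(x, x) for x in samples)

sum_sq_proj = np.sum((samples @ u) ** 2)  # sum_i (x_i^T u)^2
quad_form = u @ X @ u                     # u^T X u

print(np.allclose(X, X_loop), np.isclose(sum_sq_proj, quad_form))
```

Note that $X$ coincides with the (unnormalized) covariance matrix only if the samples have been centered beforehand; the derivation here does not state that step, so the sketch does not perform it either.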
Define the Lagrangian: $L(\vec{u},\lambda)=\vec{u}^TX\vec{u}+\lambda(1-\vec{u}^T\vec{u})$
Setting the partial derivatives to zero (using the symmetry of $X$) gives

$$\begin{cases}\frac{\partial L}{\partial \vec{u}}=2X\vec{u}-2\lambda\vec{u}=0\\\\\frac{\partial L}{\partial \lambda}=1-\vec{u}^T\vec{u}=0\end{cases}$$
$$\begin{cases}X\vec{u}=\lambda\vec{u}\\\\\vec{u}^T\vec{u}=1\end{cases}$$
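The eigenvector corresponding to the largest eigenvalue $\lambda$ is the first principal component: for any unit vector satisfying $X\vec{u}=\lambda\vec{u}$, the objective equals $\vec{u}^TX\vec{u}=\lambda\vec{u}^T\vec{u}=\lambda$, so the maximum is attained at the largest eigenvalue. In essence, the problem reduces to finding the eigenvalues and eigenvectors of a symmetric matrix.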
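The conclusion can be sketched in a few lines of NumPy. The example below is an illustration under stated assumptions rather than a reference implementation: the helper names `first_principal_direction` and `total_error`, the synthetic 2-D data stretched along the direction $(1, 1)$, and the seed are all hypothetical. It uses `numpy.linalg.eigh`, which handles symmetric matrices and returns eigenvalues in ascending order.

```python
import numpy as np

def first_principal_direction(samples):
    """Return (lambda_max, u) for X = sum_i x_i x_i^T; rows of `samples` are the x_i."""
    X = samples.T @ samples
    eigvals, eigvecs = np.linalg.eigh(X)  # ascending eigenvalues, orthonormal columns
    return eigvals[-1], eigvecs[:, -1]

def total_error(samples, u):
    """Sum over samples of ||x_i - (x_i^T u) u||^2 for a unit vector u."""
    residual = samples - np.outer(samples @ u, u)
    return np.sum(residual ** 2)

rng = np.random.default_rng(2)

# Hypothetical data: wide along one axis, narrow along the other,
# then rotated by 45 degrees so the long axis points along (1, 1).
base = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])
rot = np.array([[1.0, -1.0], [1.0, 1.0]]) / np.sqrt(2.0)
samples = base @ rot.T

lam, u = first_principal_direction(samples)
best_err = total_error(samples, u)

# Compare against random unit directions: none should give a smaller error.
directions = rng.normal(size=(1000, 2))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
gaps = np.array([total_error(samples, v) - best_err for v in directions])

print("u =", u, "lambda =", lam)
print("no random direction beats u:", gaps.min() >= -1e-9)
```

The sign of the returned eigenvector is arbitrary, so `u` may point along $(1, 1)$ or $(-1, -1)$; both describe the same principal axis.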