Kernel Method: 3. Linear Discriminant Analysis and Generalized Discriminant Analysis

3. LDA and GDA

3.1 Linear Discriminant Analysis

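Find a direction vector such that:

  • the distances between the projected class means are as large as possible
  • the distance between each projected sample and its own class mean is as small as possible

That is, push the class means apart while making each class more tightly clustered. The goal is to reduce the overlap between the projections of different classes and so improve separability.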


$L$: number of classes; $N_i$: number of samples in class $i$; $N$: total number of samples; $\boldsymbol x^{(i)}_j$: the $j$-th sample of class $i$.

Project all samples onto the direction vector $\boldsymbol v$:
$$
\boldsymbol v^T\boldsymbol x^{(1)}_1,\cdots,\boldsymbol v^T\boldsymbol x^{(1)}_{N_1};\ \boldsymbol v^T\boldsymbol x^{(2)}_1,\cdots,\boldsymbol v^T\boldsymbol x^{(2)}_{N_2};\ \cdots;\ \boldsymbol v^T\boldsymbol x^{(L)}_1,\cdots,\boldsymbol v^T\boldsymbol x^{(L)}_{N_L}
$$

The mean of each projected class is
$$
\overline{\boldsymbol m}_i=\frac{1}{N_i}\sum_{j=1}^{N_i}\boldsymbol v^T\boldsymbol x^{(i)}_j=\boldsymbol v^T\left(\frac{1}{N_i}\sum_{j=1}^{N_i}\boldsymbol x^{(i)}_j\right)=\boldsymbol v^T\boldsymbol m_i
$$
where $\boldsymbol m_i$ is the mean of class $i$ in the original space.
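As a quick numerical illustration of the identity above, the following sketch checks that the mean of the projections equals the projection of the mean. The random data, the array names, and the choice of NumPy are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
X_i = rng.normal(size=(5, 3))   # 5 samples of one class, 3 features (rows are samples)
v = rng.normal(size=3)          # an arbitrary direction vector

# Left-hand side: (1/N_i) * sum_j v^T x_j
mean_of_projections = (X_i @ v).mean()
# Right-hand side: v^T m_i, with m_i the class mean in the original space
projection_of_mean = v @ X_i.mean(axis=0)

print(np.isclose(mean_of_projections, projection_of_mean))
```

The equality holds for any $\boldsymbol v$ because projection is linear.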

Next, compute the weighted sum of squared distances between the projected class means:
$$
\begin{aligned}
\sum^{L-1}_{i=1}\sum^{L}_{j=i+1}\frac{N_i}{N}\frac{N_j}{N}(\overline{\boldsymbol m}_i-\overline{\boldsymbol m}_j)^2
&= \sum^{L-1}_{i=1}\sum^{L}_{j=i+1}\frac{N_i}{N}\frac{N_j}{N}(\overline{\boldsymbol m}_i-\overline{\boldsymbol m}_j)(\overline{\boldsymbol m}_i-\overline{\boldsymbol m}_j)^T \\
&= \sum^{L-1}_{i=1}\sum^{L}_{j=i+1}\frac{N_i}{N}\frac{N_j}{N}(\boldsymbol v^T\boldsymbol m_i-\boldsymbol v^T\boldsymbol m_j)(\boldsymbol v^T\boldsymbol m_i-\boldsymbol v^T\boldsymbol m_j)^T \\
&= \sum^{L-1}_{i=1}\sum^{L}_{j=i+1}\frac{N_i}{N}\frac{N_j}{N}\boldsymbol v^T(\boldsymbol m_i-\boldsymbol m_j)(\boldsymbol m_i-\boldsymbol m_j)^T\boldsymbol v \\
&= \boldsymbol v^T\left(\sum^{L-1}_{i=1}\sum^{L}_{j=i+1}\frac{N_i}{N}\frac{N_j}{N}(\boldsymbol m_i-\boldsymbol m_j)(\boldsymbol m_i-\boldsymbol m_j)^T\right)\boldsymbol v \\
&= \boldsymbol v^TS^{LDA}_{b}\boldsymbol v
\end{aligned}
$$

$$
\begin{aligned}
S^{LDA}_b
&= \sum^{L-1}_{i=1}\sum^{L}_{j=i+1}\frac{N_i}{N}\frac{N_j}{N}(\boldsymbol m_i-\boldsymbol m_j)(\boldsymbol m_i-\boldsymbol m_j)^T\\
&= \frac{1}{2}\sum^{L}_{i=1}\sum^{L}_{j=1}\frac{N_i}{N}\frac{N_j}{N}(\boldsymbol m_i-\boldsymbol m_j)(\boldsymbol m_i-\boldsymbol m_j)^T\\
&= \frac{1}{2}\sum^{L}_{i=1}\sum^{L}_{j=1}\frac{N_i}{N}\frac{N_j}{N}\left(\boldsymbol m_i\boldsymbol m_i^T-\boldsymbol m_i\boldsymbol m_j^T-\boldsymbol m_j\boldsymbol m_i^T+\boldsymbol m_j\boldsymbol m_j^T\right)\\
&= \frac{1}{2}\left(\sum_{i=1}^{L}\sum_{j=1}^{L}\frac{N_i}{N}\frac{N_j}{N}\boldsymbol m_i\boldsymbol m_i^T
-\sum_{i=1}^{L}\sum_{j=1}^{L}\frac{N_i}{N}\frac{N_j}{N}\boldsymbol m_i\boldsymbol m_j^T
-\sum_{i=1}^{L}\sum_{j=1}^{L}\frac{N_i}{N}\frac{N_j}{N}\boldsymbol m_j\boldsymbol m_i^T
+\sum_{i=1}^{L}\sum_{j=1}^{L}\frac{N_i}{N}\frac{N_j}{N}\boldsymbol m_j\boldsymbol m_j^T\right)\\
&= \frac{1}{2}\left(
\sum_{i=1}^{L}\frac{N_i}{N}\boldsymbol m_i\boldsymbol m_i^T\sum_{j=1}^{L}\frac{N_j}{N}
-\sum_{i=1}^{L}\frac{N_i}{N}\boldsymbol m_i\sum_{j=1}^{L}\frac{N_j}{N}\boldsymbol m_j^T
-\sum_{j=1}^{L}\frac{N_j}{N}\boldsymbol m_j\sum_{i=1}^{L}\frac{N_i}{N}\boldsymbol m_i^T
+\sum_{i=1}^{L}\frac{N_i}{N}\sum_{j=1}^{L}\frac{N_j}{N}\boldsymbol m_j\boldsymbol m_j^T
\right)\\
&= \frac{1}{2}\left(
\sum_{i=1}^{L}\frac{N_i}{N}\boldsymbol m_i\boldsymbol m_i^T-\boldsymbol m_0\boldsymbol m_0^T-\boldsymbol m_0\boldsymbol m_0^T+\sum_{j=1}^{L}\frac{N_j}{N}\boldsymbol m_j\boldsymbol m_j^T
\right)\\
&= \sum_{i=1}^{L}\frac{N_i}{N}\boldsymbol m_i\boldsymbol m_i^T-\boldsymbol m_0\boldsymbol m_0^T\\
&= \sum_{i=1}^{L}\frac{N_i}{N}(\boldsymbol m_i-\boldsymbol m_0)(\boldsymbol m_i-\boldsymbol m_0)^T
\quad \left(\text{analogous to } E[(x-\bar x)^2]=E[x^2]-\bar x^2\right)
\end{aligned}
$$

where
$$
\boldsymbol m_0=\sum_{i=1}^{L}\frac{N_i}{N}\boldsymbol m_i=\sum_{i=1}^{L}\frac{N_i}{N}\sum_{k=1}^{N_i}\frac{1}{N_i}\boldsymbol x_k^{(i)}=\sum_{i=1}^{L}\sum_{k=1}^{N_i}\frac{1}{N}\boldsymbol x_k^{(i)}
$$
is the mean of all samples. In summary, the between-class scatter matrix is
$$
S^{LDA}_b=\sum_{i=1}^{L-1}\sum^{L}_{j=i+1}\frac{N_i}{N}\frac{N_j}{N}(\boldsymbol m_i-\boldsymbol m_j)(\boldsymbol m_i-\boldsymbol m_j)^T=\sum_{i=1}^{L}\frac{N_i}{N}(\boldsymbol m_i-\boldsymbol m_0)(\boldsymbol m_i-\boldsymbol m_0)^T
$$
This can be read as the squared distance from each cluster's centroid to the centroid of the whole data set, weighted by the cluster's mass $N_i/N$.
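The equality of the pairwise form and the centroid form of $S^{LDA}_b$ can be checked numerically. The sketch below assumes NumPy, random class means, and arbitrary class sizes; the function names `between_scatter_pairwise` and `between_scatter_centered` are hypothetical helpers, not part of the original text.

```python
import numpy as np

def between_scatter_pairwise(means, counts):
    """Sum over class pairs (i, j) with j after i of (N_i/N)(N_j/N)(m_i-m_j)(m_i-m_j)^T."""
    w = counts / counts.sum()            # class weights N_i / N
    d = means.shape[1]
    S = np.zeros((d, d))
    for i in range(len(means) - 1):
        for j in range(i + 1, len(means)):
            diff = means[i] - means[j]
            S += w[i] * w[j] * np.outer(diff, diff)
    return S

def between_scatter_centered(means, counts):
    """Sum over classes of (N_i/N)(m_i-m_0)(m_i-m_0)^T with m_0 the weighted mean."""
    w = counts / counts.sum()
    m0 = w @ means                       # overall mean as a weighted mean of class means
    centered = means - m0
    return (centered * w[:, None]).T @ centered

rng = np.random.default_rng(0)
means = rng.normal(size=(4, 3))          # 4 class means in 3 dimensions (made-up data)
counts = np.array([10, 20, 5, 15])       # made-up class sizes N_i

S_pair = between_scatter_pairwise(means, counts)
S_cent = between_scatter_centered(means, counts)
print(np.allclose(S_pair, S_cent))
```

The centroid form is usually preferred in practice because it needs one pass over the classes rather than a loop over all pairs.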


The sum of the within-class variances of the projections is
$$
\begin{aligned}
\sum_{i=1}^{L}\sum_{j=1}^{N_i}\frac{1}{N}(\boldsymbol v^T\boldsymbol x_j^{(i)}-\overline{\boldsymbol m}_i)^2
&= \sum_{i=1}^{L}\sum_{j=1}^{N_i}\frac{1}{N}(\boldsymbol v^T\boldsymbol x_j^{(i)}-\boldsymbol v^T\boldsymbol m_i)(\boldsymbol v^T\boldsymbol x_j^{(i)}-\boldsymbol v^T\boldsymbol m_i)^T\\
&= \boldsymbol v^T\left(\sum_{i=1}^{L}\sum_{j=1}^{N_i}\frac{1}{N}(\boldsymbol x_j^{(i)}-\boldsymbol m_i)(\boldsymbol x_j^{(i)}-\boldsymbol m_i)^T\right)\boldsymbol v\\
&= \boldsymbol v^TS^{LDA}_w\boldsymbol v
\end{aligned}
$$
Hence the within-class scatter matrix is
$$
S^{LDA}_w=\sum_{i=1}^{L}\sum_{j=1}^{N_i}\frac{1}{N}(\boldsymbol x_j^{(i)}-\boldsymbol m_i)(\boldsymbol x_j^{(i)}-\boldsymbol m_i)^T
$$
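The within-class scatter matrix can be computed directly from labelled data, and the quadratic form $\boldsymbol v^TS^{LDA}_w\boldsymbol v$ compared with the variance of the projected samples. This is a minimal sketch assuming NumPy, samples stored as rows, and integer class labels; `within_scatter` is a hypothetical helper name.

```python
import numpy as np

def within_scatter(X, y):
    """S_w = (1/N) * sum over classes of sum_j (x_j - m_i)(x_j - m_i)^T; rows of X are samples."""
    N, d = X.shape
    S = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        centered = Xc - Xc.mean(axis=0)  # subtract the class mean m_i
        S += centered.T @ centered
    return S / N

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))             # made-up data: 30 samples, 3 features
y = np.repeat([0, 1, 2], 10)             # three classes of 10 samples each
v = rng.normal(size=3)                   # arbitrary direction

S_w = within_scatter(X, y)

# Same quantity computed from the 1-D projections v^T x
z = X @ v
projected = sum(((z[y == c] - z[y == c].mean()) ** 2).sum() for c in np.unique(y)) / len(z)

print(np.isclose(v @ S_w @ v, projected))
```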
The first principal vector can be computed from
$$
{\color{red} \boldsymbol v=\mathop{\arg\max}_{\boldsymbol v\in\mathbb{R}^d}\frac{\boldsymbol v^TS^{LDA}_b\boldsymbol v}{\boldsymbol v^TS^{LDA}_w\boldsymbol v}=\mathop{\arg\max}_{\boldsymbol v^TS_w^{LDA}\boldsymbol v=1}\boldsymbol v^TS_b^{LDA}\boldsymbol v}.
$$
The second form fixes the scale of $\boldsymbol v$ through the constraint $\boldsymbol v^TS_w^{LDA}\boldsymbol v=1$, which does not change the ratio.
Using the method of Lagrange multipliers,
$$
f(\boldsymbol v,\lambda)=\boldsymbol v^TS_b^{LDA}\boldsymbol v-\lambda(\boldsymbol v^TS_w^{LDA}\boldsymbol v-1)
$$

$$
\begin{aligned}
\frac{\partial f}{\partial \boldsymbol v}&=2S_b^{LDA}\boldsymbol v-2\lambda S_w^{LDA}\boldsymbol v=0 \Leftrightarrow {\color{red}(S_w^{LDA})^{-1}S_b^{LDA}\boldsymbol v=\lambda \boldsymbol v}\\
\frac{\partial f}{\partial \lambda}&=-(\boldsymbol v^TS_w^{LDA}\boldsymbol v-1)=0 \Leftrightarrow \boldsymbol v^TS_w^{LDA}\boldsymbol v=1
\end{aligned}
$$
where the first equivalence assumes that $S_w^{LDA}$ is invertible.

When these conditions hold, $\boldsymbol v^TS^{LDA}_b\boldsymbol v=\lambda \boldsymbol v^TS^{LDA}_w\boldsymbol v=\lambda$, so the objective value equals the eigenvalue.

In summary, finding the first principal vector is equivalent to solving the following generalized eigenvalue problem for its largest eigenvalue:
$$
S^{LDA}_b\boldsymbol u=\lambda S^{LDA}_w\boldsymbol u,\qquad \boldsymbol v=\frac{1}{\sqrt{\boldsymbol u^TS^{LDA}_w\boldsymbol u}}\boldsymbol u
$$
where the normalization in the second expression ensures $\boldsymbol v^TS_w^{LDA}\boldsymbol v=1$.
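A minimal NumPy sketch of this procedure follows. The reduction of the generalized problem to an ordinary symmetric one via a Cholesky factor of $S^{LDA}_w$ is one common numerical route and is an assumption here, not something the derivation above prescribes; the function name `lda_first_direction` and the synthetic Gaussian data are likewise hypothetical.

```python
import numpy as np

def lda_first_direction(X, y):
    """Return (v, lam, S_b, S_w) for the largest generalized eigenvalue of S_b u = lam S_w u."""
    N, d = X.shape
    m0 = X.mean(axis=0)
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_b += (len(Xc) / N) * np.outer(mc - m0, mc - m0)
        S_w += (Xc - mc).T @ (Xc - mc) / N
    # With S_w = C C^T, the problem becomes (C^-1 S_b C^-T) w = lam w, where u = C^-T w.
    C = np.linalg.cholesky(S_w)          # requires S_w to be positive definite
    C_inv = np.linalg.inv(C)
    lams, W = np.linalg.eigh(C_inv @ S_b @ C_inv.T)   # eigenvalues in ascending order
    u = C_inv.T @ W[:, -1]               # eigenvector of the largest eigenvalue
    v = u / np.sqrt(u @ S_w @ u)         # normalize so that v^T S_w v = 1
    return v, lams[-1], S_b, S_w

rng = np.random.default_rng(2)
centers = ([0, 0], [3, 1], [6, 2])       # made-up class centers
X = np.vstack([rng.normal(loc=mu, size=(40, 2)) for mu in centers])
y = np.repeat([0, 1, 2], 40)

v, lam, S_b, S_w = lda_first_direction(X, y)
print(np.isclose(v @ S_w @ v, 1.0), np.isclose(v @ S_b @ v, lam))
```

If $S^{LDA}_w$ is singular (for example, more features than samples), the Cholesky step fails and some form of regularization or dimensionality reduction would be needed first.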

3.2 Generalized Discriminant Analysis

$L$: number of classes;

$N_i$: number of samples in class $i$;

$N$: total number of samples;

$\phi(\boldsymbol x^{(i)}_j)$: the $j$-th sample of class $i$ after mapping by $\phi$;

$$
X^T_i=[\phi(\boldsymbol x^{(i)}_1),\cdots,\phi(\boldsymbol x^{(i)}_{N_i})]
$$

$$
X^T=[X^T_1,\cdots,X^T_L]
$$

Assume the mapped samples have zero mean in the space $H$: $\boldsymbol m_0=0$.

Then the between-class scatter matrix is
$$
S^{GDA}_b=\sum_{i=1}^L\frac{N_i}{N}(\boldsymbol m_i-\boldsymbol m_0)(\boldsymbol m_i-\boldsymbol m_0)^T=\sum_{i=1}^L\frac{N_i}{N}\boldsymbol m_i\boldsymbol m_i^T
$$
and the within-class scatter matrix is
$$
S^{GDA}_w=\sum_{i=1}^L\sum_{j=1}^{N_i}\frac{1}{N}\phi(\boldsymbol x^{(i)}_j)\phi(\boldsymbol x^{(i)}_j)^T
$$

The class mean in $H$ can be written in matrix form:
$$
\boldsymbol m_i=\frac{1}{N_i}\sum_{j=1}^{N_i}\phi(\boldsymbol x^{(i)}_j)=\frac{1}{N_i}[\phi(\boldsymbol x^{(i)}_1),\cdots,\phi(\boldsymbol x^{(i)}_{N_i})]\begin{bmatrix}1\\\vdots\\1\end{bmatrix}=\frac{1}{N_i}X^T_i1_{N_i\times1}
$$

$$
\boldsymbol m_i\boldsymbol m_i^T=\frac{1}{N_i^2}X^T_i1_{N_i\times1}1_{1\times N_i}X_i=\frac{1}{N_i}X^T_iB_iX_i
$$

where $B_i=\frac{1}{N_i}1_{N_i\times N_i}$. The between-class scatter matrix is therefore
$$
{\color{red}S^{GDA}_b}=\sum_{i=1}^L\frac{N_i}{N}\boldsymbol m_i\boldsymbol m_i^T=\frac{1}{N}\sum_{i=1}^LX^T_iB_iX_i=\frac{1}{N}
\begin{bmatrix} X^T_1 & \cdots & X^T_L \end{bmatrix}
\begin{bmatrix} B_1 & & 0\\ & \ddots &\\ 0 & & B_L \end{bmatrix}
\begin{bmatrix} X_1 \\ \vdots \\ X_L \end{bmatrix}
=\frac{1}{N}X^TBX
$$
with $B=\mathrm{diag}(B_1,\cdots,B_L)$.
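The matrix form $\frac{1}{N}X^TBX$ can be checked against the direct sum over class means. The sketch below assumes NumPy and uses random finite-dimensional rows as stand-ins for $\phi(\boldsymbol x)^T$, grouped by class; the class sizes and dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
counts = [4, 3, 5]                       # made-up class sizes N_i
N = sum(counts)
X = rng.normal(size=(N, 6))              # rows stand in for phi(x)^T, grouped by class

B = np.zeros((N, N))                     # block-diagonal matrix diag(B_1, ..., B_L)
S_b_direct = np.zeros((6, 6))
start = 0
for n_i in counts:
    block = slice(start, start + n_i)
    B[block, block] = 1.0 / n_i          # B_i = (1/N_i) * all-ones block
    m_i = X[block].mean(axis=0)          # class mean in feature space
    S_b_direct += (n_i / N) * np.outer(m_i, m_i)
    start += n_i

S_b_matrix = X.T @ B @ X / N
print(np.allclose(S_b_direct, S_b_matrix))
```

Each $B_i$ is an averaging (projection) matrix, so $B$ satisfies $BB=B$.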
The within-class scatter matrix is
$$
\begin{aligned}
{\color{red}S^{GDA}_w}&=\sum_{i=1}^L\sum_{j=1}^{N_i}\frac{1}{N}\phi(\boldsymbol x^{(i)}_j)\phi(\boldsymbol x^{(i)}_j)^T\\
&=\frac{1}{N}\sum_{i=1}^L
\begin{bmatrix} \phi(\boldsymbol x^{(i)}_1) & \cdots & \phi(\boldsymbol x^{(i)}_{N_i}) \end{bmatrix}
\begin{bmatrix} \phi(\boldsymbol x^{(i)}_1)^T \\ \vdots \\ \phi(\boldsymbol x^{(i)}_{N_i})^T \end{bmatrix}\\
&=\frac{1}{N}\sum_{i=1}^LX^T_iX_i\\
&=\frac{1}{N}
\begin{bmatrix} X_1^T & \cdots & X^T_L \end{bmatrix}
\begin{bmatrix} X_1 \\ \vdots \\ X_L \end{bmatrix}\\
&=\frac{1}{N}X^TX
\end{aligned}
$$
As in LDA, the direction $\boldsymbol v$ solves
$$
S^{GDA}_b\boldsymbol v=\lambda S^{GDA}_w \boldsymbol v,
\quad\text{i.e.}\quad
\left(\frac{1}{N}X^TBX\right)\boldsymbol v=\lambda \left(\frac{1}{N}X^TX\right)\boldsymbol v
\quad (X \text{ unknown})
$$
Assume $\boldsymbol v$ can be written as a linear combination of the mapped samples, i.e.
$$
\boldsymbol v=\sum_{i=1}^L\sum_{j=1}^{N_i}\alpha_j^{(i)}\phi(\boldsymbol x^{(i)}_j)=X^T\boldsymbol \alpha.
$$
Substituting this into the equation above and multiplying on the left by $X$,
$$
\begin{aligned}
&\Rightarrow X^TBXX^T\boldsymbol \alpha=\lambda X^TXX^T\boldsymbol \alpha\\
&\Rightarrow XX^TBXX^T\boldsymbol \alpha=\lambda XX^TXX^T\boldsymbol \alpha\\
&\Rightarrow (KBK)\boldsymbol \alpha=\lambda(KK)\boldsymbol \alpha
\end{aligned}
$$
where $K=XX^T$ is the kernel matrix, whose entries are $\kappa(\boldsymbol x_k,\boldsymbol x_l)=\phi(\boldsymbol x_k)^T\phi(\boldsymbol x_l)$, so $X$ itself is never needed.
Solving this equation yields $\boldsymbol \alpha$. A test sample is then projected onto $\boldsymbol v=X^T\boldsymbol \alpha$ as
$$
\boldsymbol v^T\phi(\boldsymbol x)=(X^T\boldsymbol \alpha)^T\phi(\boldsymbol x)=\boldsymbol \alpha^T
\begin{bmatrix} \phi(\boldsymbol x_1)^T \\ \vdots \\ \phi(\boldsymbol x_N)^T \end{bmatrix}\phi(\boldsymbol x)
=\boldsymbol \alpha^T
\begin{bmatrix} \kappa(\boldsymbol x_1,\boldsymbol x) \\ \vdots \\ \kappa(\boldsymbol x_N,\boldsymbol x) \end{bmatrix}
$$
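The kernel eigenproblem and the projection formula can be sketched in NumPy as follows. Several choices here are assumptions that go beyond the derivation: the RBF kernel and its `gamma`, the explicit centering of $K$ (one way to enforce $\boldsymbol m_0=0$), and the small ridge `eps` added to $KK$, which is singular after centering. The function names `rbf_kernel`, `fit_gda`, and `project` are hypothetical.

```python
import numpy as np

def rbf_kernel(A, C, gamma=0.5):
    """kappa(a, c) = exp(-gamma * ||a - c||^2) for every pair of rows of A and C."""
    sq = (A ** 2).sum(1)[:, None] + (C ** 2).sum(1)[None, :] - 2.0 * A @ C.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_gda(X, y, gamma=0.5, eps=1e-3):
    """Leading solution of (K B K) alpha = lam (K K + eps I) alpha with a centered K."""
    N = len(X)
    H = np.eye(N) - np.ones((N, N)) / N      # centering matrix
    K_raw = rbf_kernel(X, X, gamma)
    K = H @ K_raw @ H                        # centered kernel: zero mean in feature space
    B = np.zeros((N, N))
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        B[np.ix_(idx, idx)] = 1.0 / len(idx) # B_kl = 1/N_i when k and l are both in class i
    A = K @ B @ K
    M = K @ K + eps * np.eye(N)              # eps * I is an added regularizer (assumption)
    C = np.linalg.cholesky(M)
    C_inv = np.linalg.inv(C)
    lams, W = np.linalg.eigh(C_inv @ A @ C_inv.T)
    alpha = C_inv.T @ W[:, -1]               # coefficients of v = X^T alpha
    return alpha, lams[-1], K_raw

def project(alpha, X_train, K_raw, X_new, gamma=0.5):
    """alpha^T [kappa(x_1, x), ..., kappa(x_N, x)]^T, with the same centering as in training."""
    N = len(X_train)
    k = rbf_kernel(X_train, np.atleast_2d(X_new), gamma)   # shape (N, m)
    H = np.eye(N) - np.ones((N, N)) / N
    k_c = H @ (k - K_raw.mean(axis=1, keepdims=True))
    return alpha @ k_c

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(loc=0.0, scale=0.3, size=(30, 2)),
               rng.normal(loc=2.0, scale=0.3, size=(30, 2))])   # two made-up blobs
y = np.repeat([0, 1], 30)

alpha, lam, K_raw = fit_gda(X, y)
z_train = project(alpha, X, K_raw, X)        # projections of the training samples
print(lam, z_train.shape)
```

Because $B$ is a projection matrix, $\boldsymbol\alpha^TKBK\boldsymbol\alpha\le\boldsymbol\alpha^TKK\boldsymbol\alpha$, so the eigenvalue returned here lies between 0 and 1.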
Ex: In GDA, the within-class scatter matrix is
$$
S^{GDA}_w=\sum_{i=1}^L\sum_{j=1}^{N_i}\frac{1}{N}\phi(\boldsymbol x^{(i)}_j)\phi(\boldsymbol x^{(i)}_j)^T
$$
whereas in LDA (applied to the mapped samples),
$$
S^{LDA}_w=\sum_{i=1}^{L}\sum_{j=1}^{N_i}\frac{1}{N}(\phi(\boldsymbol x_j^{(i)})-\boldsymbol m_i)(\phi(\boldsymbol x_j^{(i)})-\boldsymbol m_i)^T,\qquad
\boldsymbol m_i=\frac{1}{N_i}\sum_{j=1}^{N_i}\phi(\boldsymbol x^{(i)}_j)
$$
Can the GDA within-class scatter matrix be derived from the LDA one, and can a matrix $W$ be found such that $S^{LDA}_w=X^TWX$?
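One possible line of attack, offered as a sketch rather than the intended solution: expanding the LDA sum gives $\sum_j(\phi_j-\boldsymbol m_i)(\phi_j-\boldsymbol m_i)^T=X_i^TX_i-N_i\boldsymbol m_i\boldsymbol m_i^T=X_i^T(I-B_i)X_i$, which suggests the candidate $W=\frac{1}{N}(I-B)$ and the relation $S^{GDA}_w=S^{LDA}_w+S^{GDA}_b$. The NumPy check below tests this candidate on random stand-in features; the data and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
counts = [4, 6, 5]                       # made-up class sizes N_i
N = sum(counts)
Phi = rng.normal(size=(N, 3))            # rows stand in for phi(x)^T, grouped by class
Phi -= Phi.mean(axis=0)                  # enforce m_0 = 0 as assumed in the GDA section

B = np.zeros((N, N))
S_w_lda = np.zeros((3, 3))
start = 0
for n_i in counts:
    block = slice(start, start + n_i)
    B[block, block] = 1.0 / n_i          # B_i = (1/N_i) * all-ones block
    centered = Phi[block] - Phi[block].mean(axis=0)
    S_w_lda += centered.T @ centered / N # LDA within-class scatter, class by class
    start += n_i

W = (np.eye(N) - B) / N                  # candidate answer to the exercise
S_w_gda = Phi.T @ Phi / N
S_b_gda = Phi.T @ B @ Phi / N

print(np.allclose(S_w_lda, Phi.T @ W @ Phi), np.allclose(S_w_gda, S_w_lda + S_b_gda))
```

If this relation holds, then $S_b\boldsymbol v=\mu S^{LDA}_w\boldsymbol v$ implies $S_b\boldsymbol v=\frac{\mu}{1+\mu}S^{GDA}_w\boldsymbol v$, so both formulations share the same directions and differ only in how the eigenvalues are scaled.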
