LLE Proofs

a. Solving for the LLE weights

$x_i \in \mathbb{R}^{d \times 1}$, $w_i \in \mathbb{R}^{1 \times n}$, $X = [x_1 \dots x_n]$, $w_i = [w_{i1} \dots w_{in}]$.

$$
\begin{aligned}
\min_{w_i}\quad & \sum_{i=1}^{n} \left\| x_i - \sum_{j=1}^{n} w_{ij} x_j \right\|_2^2 \\
{\rm s.t.}\quad & \sum_{j=1}^{n} w_{ij} = 1
\end{aligned}
$$

Using the constraint $\sum_{j=1}^{n} w_{ij} = 1$ to rewrite $x_i = \sum_{j=1}^{n} w_{ij} x_i$, the objective becomes

$$
\begin{aligned}
\sum_{i=1}^{n} \left\| x_i - \sum_{j=1}^{n} w_{ij} x_j \right\|_2^2
&= \sum_{i=1}^{n} \left\| \sum_{j=1}^{n} w_{ij} x_i - \sum_{j=1}^{n} w_{ij} x_j \right\|_2^2 \\
&= \sum_{i=1}^{n} \left\| \sum_{j=1}^{n} w_{ij} (x_i - x_j) \right\|_2^2 \\
&= \sum_{i=1}^{n} \left\| (x_i1^T-X)w_i^T \right\|_2^2 \\
&= \sum_{i=1}^{n} w_i(x_i1^T-X)^T(x_i1^T-X)w_i^T
\end{aligned}
$$

$$
\sum_{j=1}^{n} w_{ij} = 1 \quad\Leftrightarrow\quad w_i 1 = 1
$$

Form the Lagrangian with multipliers $\mu_i$:

$$
L = \sum_{i=1}^{n} \left( w_i(x_i1^T-X)^T(x_i1^T-X)w_i^T + \mu_i (w_i1-1) \right)
$$

$$
\begin{aligned}
0 = \frac{\partial L}{\partial w_i}
&= \frac{\partial}{\partial w_i} \sum_{i'=1}^{n} \left( w_{i'}(x_{i'}1^T-X)^T(x_{i'}1^T-X)w_{i'}^T + \mu_{i'} (w_{i'}1-1) \right) \\
&= \frac{\partial}{\partial w_i} \left( w_i(x_i1^T-X)^T(x_i1^T-X)w_i^T + \mu_i (w_i1-1) \right) \qquad \text{(only the $i$-th summand depends on $w_i$)} \\
&= 2w_i(x_i1^T-X)^T(x_i1^T-X) + \mu_i 1^T
\end{aligned}
$$

$$
w_i = -\frac{1}{2} \mu_i 1^T (x_i1^T-X)^{-1}(x_i1^T-X)^{-T}
$$

Applying the constraint $w_i1 = 1$:

$$
1 = w_i1 = -\frac{1}{2} \mu_i 1^T (x_i1^T-X)^{-1}(x_i1^T-X)^{-T} 1
$$

$$
-\frac{1}{2} \mu_i = \frac{1}{1^T (x_i1^T-X)^{-1}(x_i1^T-X)^{-T} 1}
$$

$$
\begin{aligned}
w_i &= -\frac{1}{2} \mu_i 1^T (x_i1^T-X)^{-1}(x_i1^T-X)^{-T} \\
&= \frac{1^T (x_i1^T-X)^{-1}(x_i1^T-X)^{-T}}{1^T (x_i1^T-X)^{-1}(x_i1^T-X)^{-T} 1} \\
&= \frac{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1}}{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1} 1}
\end{aligned}
$$

$$
\begin{aligned}
w_{ij} &= \left( \frac{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1}}{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1} 1} \right)_j \\
&= \frac{\left( 1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1} \right)_j}{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1} 1} \\
&= \frac{\left( 1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1} \right)_j}{\sum\limits_{j'=1}^{n} \left( 1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1} \right)_{j'}} \\
&= \frac{\sum\limits_{k=1}^{n} \left( [(x_i1^T-X)^T(x_i1^T-X)]^{-1} \right)_{kj}}{\sum\limits_{j'=1}^{n} \sum\limits_{k=1}^{n} \left( [(x_i1^T-X)^T(x_i1^T-X)]^{-1} \right)_{kj'}}
\end{aligned}
$$

In practice $[(x_i1^T-X)^T(x_i1^T-X)]$ is not invertible, because the $i$-th column of $(x_i1^T-X)$ is $0$, so what we actually use is

$$
w_i = \frac{1^T [(x_i1^T-X^{(i)})^T(x_i1^T-X^{(i)})]^{-1}}{1^T [(x_i1^T-X^{(i)})^T(x_i1^T-X^{(i)})]^{-1} 1}
$$

$$
w_{ij} = \frac{\sum\limits_{k=1}^{n^{(i)}} \left( [(x_i1^T-X^{(i)})^T(x_i1^T-X^{(i)})]^{-1} \right)_{kj}}{\sum\limits_{j'=1}^{n^{(i)}} \sum\limits_{k=1}^{n^{(i)}} \left( [(x_i1^T-X^{(i)})^T(x_i1^T-X^{(i)})]^{-1} \right)_{kj'}}
$$

where $X^{(i)}$ contains only the neighbors of $x_i$ (excluding $x_i$ itself), $n^{(i)}$ neighbors in total.
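
As a minimal numerical sketch of the closed form above (assuming NumPy; the function name `lle_weights` and the ridge term `reg` are additions of mine, not part of the derivation):

```python
import numpy as np

def lle_weights(x_i, X_nbrs, reg=1e-3):
    """Reconstruction weights of x_i from its neighbors (the columns of X_nbrs).

    Implements w_i = (G^{-1} 1) / (1^T G^{-1} 1) with
    G = (x_i 1^T - X^{(i)})^T (x_i 1^T - X^{(i)}).
    `reg` is an extra ridge term (an assumption, not in the derivation above),
    useful when G is near-singular, e.g. more neighbors than ambient dimensions.
    """
    D = x_i[:, None] - X_nbrs                      # d x n_i matrix  x_i 1^T - X^{(i)}
    G = D.T @ D                                    # n_i x n_i local Gram matrix
    if reg > 0:
        G = G + reg * np.trace(G) * np.eye(G.shape[0])
    w = np.linalg.solve(G, np.ones(G.shape[0]))    # G w = 1, i.e. w ∝ G^{-1} 1
    return w / w.sum()                             # normalize so that sum_j w_ij = 1
```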

b. Invariance of the LLE weights under rotation/translation/scaling

$$
w_i(X) = \frac{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1}}{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1} 1}
$$

Denote the transformation applied to $x_i$ by $f(x_i)$ and the transformation applied to $X$ by $F(X)$. Sufficient conditions for $w_i(F(X)) = w_i(X)$ are given by the implication chain

$$
\begin{aligned}
& \frac{1^T [(f(x_i)1^T-F(X))^T(f(x_i)1^T-F(X))]^{-1}}{1^T [(f(x_i)1^T-F(X))^T(f(x_i)1^T-F(X))]^{-1} 1} = \frac{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1}}{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1} 1} \\
\Leftarrow\ & (f(x_i)1^T-F(X))^T(f(x_i)1^T-F(X)) = (x_i1^T-X)^T(x_i1^T-X) \\
\Leftarrow\ & f(x_i)1^T-F(X) = x_i1^T-X
\end{aligned}
$$

Rotation: $f(x_i)=Qx_i$, $F(X)=QX$ with $Q^TQ=I$. We verify the sufficient condition:

$$
\begin{aligned}
(f(x_i)1^T-F(X))^T(f(x_i)1^T-F(X)) &= (Qx_i1^T-QX)^T(Qx_i1^T-QX) \\
&= [Q(x_i1^T-X)]^T[Q(x_i1^T-X)] \\
&= (x_i1^T-X)^TQ^TQ(x_i1^T-X) \\
&= (x_i1^T-X)^TI(x_i1^T-X) \\
&= (x_i1^T-X)^T(x_i1^T-X)
\end{aligned}
$$

Translation: $f(x_i)=x_i+v$, $F(X)=X+v1^T$. We verify the sufficient condition:

$$
\begin{aligned}
f(x_i)1^T-F(X) &= (x_i+v)1^T-(X+v1^T) \\
&= x_i1^T+v1^T-X-v1^T \\
&= x_i1^T-X
\end{aligned}
$$

Scaling: $f(x_i)=ax_i$, $F(X)=aX$ with $a \neq 0$. Here we verify the invariance directly (the sufficient conditions above do not hold for $a \neq 1$):

$$
\begin{aligned}
& \frac{1^T [(f(x_i)1^T-F(X))^T(f(x_i)1^T-F(X))]^{-1}}{1^T [(f(x_i)1^T-F(X))^T(f(x_i)1^T-F(X))]^{-1} 1} \\
&= \frac{1^T [(ax_i1^T-aX)^T(ax_i1^T-aX)]^{-1}}{1^T [(ax_i1^T-aX)^T(ax_i1^T-aX)]^{-1} 1} \\
&= \frac{1^T [a^2(x_i1^T-X)^T(x_i1^T-X)]^{-1}}{1^T [a^2(x_i1^T-X)^T(x_i1^T-X)]^{-1} 1} \\
&= \frac{a^{-2}\, 1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1}}{a^{-2}\, 1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1} 1} \\
&= \frac{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1}}{1^T [(x_i1^T-X)^T(x_i1^T-X)]^{-1} 1}
\end{aligned}
$$
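
These three invariances are easy to check numerically. A sketch reusing the hypothetical `lle_weights` helper from section a, with arbitrary test transforms `Q`, `v`, `a`:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 5, 3                                     # more dimensions than neighbors, so G is invertible
x_i = rng.normal(size=d)
X_nbrs = rng.normal(size=(d, k))

Q, _ = np.linalg.qr(rng.normal(size=(d, d)))    # random orthogonal matrix, Q^T Q = I
v = rng.normal(size=d)                          # random translation vector
a = 2.7                                         # random nonzero scale

w0 = lle_weights(x_i, X_nbrs, reg=0.0)
assert np.allclose(w0, lle_weights(Q @ x_i, Q @ X_nbrs, reg=0.0))           # rotation
assert np.allclose(w0, lle_weights(x_i + v, X_nbrs + v[:, None], reg=0.0))  # translation
assert np.allclose(w0, lle_weights(a * x_i, a * X_nbrs, reg=0.0))           # scaling
```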

c. Meaning of the LLE low-dimensional representation

$$
\begin{aligned}
\min_{y_i}\quad & \sum_{i=1}^{n} \left\| y_i - \sum_{j=1}^{n} w_{ij} y_j \right\|_2^2 \\
{\rm s.t.}\quad & \sum\limits_{i=1}^n y_i=0 \quad\Leftrightarrow\quad Y1=0 \\
& \sum\limits_{i=1}^n y_iy_i^T=I \quad\Leftrightarrow\quad YY^T=I
\end{aligned}
$$

Why does this preserve local geometry? The two objectives use the same weights $w_{ij}$:

$$
\sum_{i=1}^{n} \left\| x_i - \sum_{j=1}^{n} w_{ij} x_j \right\|_2^2
\qquad\text{and}\qquad
\sum_{i=1}^{n} \left\| y_i - \sum_{j=1}^{n} w_{ij} y_j \right\|_2^2
$$

$x_i$ and $y_i$ share the same local weights $w_i$, which preserves the linear relationships among the samples within each neighborhood. The coordinates of a sample $x_i$ can be reconstructed as a linear combination of its neighbors $x_j$, and after dimensionality reduction the coordinates of $y_i$ are reconstructed from its neighbors $y_j$ by the same linear combination. Hence the linear relationships within each neighborhood of the original space are preserved in the low-dimensional space.

From a statistical point of view:

  • $Y1=0$ removes the indeterminacy in the mean of each dimension.
  • $YY^T=I$ removes the indeterminacy in the variance of each dimension and the indeterminacy of linear correlations between dimensions.

From a geometric point of view:

  • $Y1=0$ removes the translation indeterminacy.
  • $YY^T=I$ removes the scaling indeterminacy.

Proof:

  • Does $Y1=0$ remove the translation indeterminacy?
    Let $Y'=Y+v1^T$ and require $Y'1=0$.
    $Y'1 = Y1 + v1^T1 = 0 + nv = nv \stackrel{\rm set}{=} 0 \Rightarrow v=0$
    $v$ has no remaining degrees of freedom, so the translation indeterminacy is removed.
  • Does $YY^T=I$ remove the rotation indeterminacy?
    Let $Y'=QY$ with $Q^TQ=I$, and require $Y'1=0$, $Y'Y'^T=I$.
    $Y'1 = QY1 = 0 \stackrel{\rm set}{\equiv} 0$
    $Y'Y'^T = QYY^TQ^T = QQ^T = I \stackrel{\rm set}{\equiv} I$
    $Q$ keeps all of its degrees of freedom, so the rotation indeterminacy is not removed (see the numerical check after this list).
  • Does $YY^T=I$ remove the scaling indeterminacy?
    Let $Y'=aY$ with $a \neq 0$, and require $Y'Y'^T=I$.
    $Y'Y'^T = (aY)(aY)^T = a^2YY^T = a^2I \stackrel{\rm set}{=} I \Rightarrow a=\pm 1$
    Up to the sign (a reflection, already covered by the orthogonal $Q$ above), $a$ has no degrees of freedom, so the scaling indeterminacy is removed.
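
A minimal numerical illustration of the three points above (a sketch with arbitrary test values; `Y` is constructed to satisfy both constraints):

```python
import numpy as np

rng = np.random.default_rng(1)
d_prime, n = 2, 6

# Build a feasible Y: orthonormal rows, each orthogonal to the all-ones vector.
A = rng.normal(size=(n, d_prime))
A -= A.mean(axis=0)                  # columns orthogonal to 1
Y = np.linalg.qr(A)[0].T             # d' x n with Y Y^T = I and Y 1 = 0

Q, _ = np.linalg.qr(rng.normal(size=(d_prime, d_prime)))   # orthogonal Q
a = 3.0
ones = np.ones(n)

assert np.allclose(Y @ ones, 0) and np.allclose(Y @ Y.T, np.eye(d_prime))
# Q Y still satisfies both constraints: the rotation indeterminacy remains.
assert np.allclose((Q @ Y) @ ones, 0) and np.allclose((Q @ Y) @ (Q @ Y).T, np.eye(d_prime))
# a Y with |a| != 1 violates Y Y^T = I: the scaling indeterminacy is removed.
assert not np.allclose((a * Y) @ (a * Y).T, np.eye(d_prime))
```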

The rotation indeterminacy that is not removed can have negative effects on the new representation.

  • The components of $y_i$ could be independent, but because the rotation indeterminacy is not removed, the components of the $y_i$ actually obtained need not be independent.
  • $Y$ may admit a sparse solution, but because the rotation indeterminacy is not removed, a dense solution is obtained in practice.
    In fact, since only $Q^TQ=I$ is required, $Q$ can be not only a rotation but also a reflection or a permutation.
    From a signal-processing point of view, removing the rotation indeterminacy amounts to blind source separation (which usually assumes the sources are independent).
    From a machine-learning point of view, we can add a suitable regularization term, e.g. $\|Y\|_1$, to push as many components as possible to align with the coordinate axes; this removes part of the rotation indeterminacy and yields a sparser solution.

d. Optimizing the LLE low-dimensional representation
Lemma (properties of the trace):
$$ {\rm tr}(AB) = {\rm tr}(BA), \qquad {\rm tr}(ABC) = {\rm tr}(CAB) = {\rm tr}(BCA) $$

$y_i \in \mathbb{R}^{d' \times 1}$, $w_i \in \mathbb{R}^{1 \times n}$, $Y = [y_1 \dots y_n]$, $w_i = [w_{i1} \dots w_{in}]$.
$(e_i)_k = \begin{cases} 1 & {\rm if} ~ k=i \\ 0 & {\rm otherwise} \end{cases}$, $I = [e_1 \dots e_n]$.

Note that $(I-W^T)^T$, $Y^TY$ and $(I-W^T)$ are all $n \times n$ square matrices, so the lemma applies.

$$
\begin{aligned}
\sum_{i=1}^{n} \left\| y_i - \sum_{j=1}^{n} w_{ij} y_j \right\|_2^2 &= \sum_{i=1}^{n} \left\| Ye_i - Yw_i^T \right\|_2^2 \\
&= \left\| YI - YW^T \right\|_F^2 \\
&= \left\| Y(I-W^T) \right\|_F^2 \\
&= {\rm tr}\{[Y(I-W^T)]^T[Y(I-W^T)]\} \\
&= {\rm tr}[(I-W^T)^TY^TY(I-W^T)] \\
&= {\rm tr}[(I-W^T)(I-W^T)^TY^TY] \\
&= {\rm tr}[(I-W)^T(I-W)Y^TY] \\
&= {\rm tr}[MY^TY] \\
&= \sum_{k=1}^{n}\sum_{i=1}^{n} M_{ki} \left(Y^TY\right)_{ik} \\
&= \sum_{k=1}^{n}\sum_{i=1}^{n} M_{ki}\, y_i^Ty_k
\end{aligned}
$$
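
This identity can be checked numerically; a quick sketch with a random `W` whose rows sum to 1 and a random `Y`:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d_prime = 7, 2
W = rng.normal(size=(n, n))
W /= W.sum(axis=1, keepdims=True)     # each row w_i sums to 1
Y = rng.normal(size=(d_prime, n))     # columns are the y_i

lhs = sum(np.sum((Y[:, i] - Y @ W[i]) ** 2) for i in range(n))
M = (np.eye(n) - W).T @ (np.eye(n) - W)
rhs = np.trace(M @ Y.T @ Y)
assert np.isclose(lhs, rhs)
```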

e. Solving the optimization for the LLE low-dimensional representation

$$
\begin{aligned}
M &= (I-W)^T(I-W) \\
&= (I-W^T)(I-W) \\
&= I-W-W^T+W^TW
\end{aligned}
$$

e.1 Positive semidefiniteness of $M$

To prove that $M$ is positive semidefinite, it suffices to show that $v^TMv \geqslant 0$ for every $n$-dimensional vector $v \neq 0$.

  • $v^TMv = v^T(I-W)^T(I-W)v = [(I-W)v]^T[(I-W)v] = \|(I-W)v\|_2^2 \geqslant 0$
  • In summary, $v^TMv \geqslant 0$ holds for every $n$-dimensional vector $v \neq 0$, i.e. $M$ is positive semidefinite.

e.2 $1$ is an eigenvector of $M$

Note that $w_i1=1$, hence $W1=1$.

$$
\begin{aligned}
M1 &= (I-W-W^T+W^TW)1 \\
&= I1-W1-W^T1+W^TW1 \\
&= 1-1-W^T1+W^T1 \\
&= 0
\end{aligned}
$$

Since $M1 = 0 \cdot 1$, $1$ is an eigenvector of $M$ with eigenvalue $0$.
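
A numerical sanity check of e.1 and e.2 (a sketch; `W` is a random matrix whose rows are normalized to sum to 1, standing in for real LLE weights):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 7
W = rng.normal(size=(n, n))
W /= W.sum(axis=1, keepdims=True)          # enforce w_i 1 = 1 for every row

M = (np.eye(n) - W).T @ (np.eye(n) - W)
eigvals = np.linalg.eigvalsh(M)            # M is symmetric, so eigvalsh applies
assert np.all(eigvals >= -1e-10)           # all eigenvalues >= 0: M is PSD (e.1)
assert np.allclose(M @ np.ones(n), 0)      # M 1 = 0: 1 is an eigenvector for eigenvalue 0 (e.2)
```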

f. In the actual computation, we perform an eigendecomposition of $M$, sort the eigenvalues in ascending order, discard the eigenvector $\xi_1$ of the smallest eigenvalue, and take $Y^T=[y_1 \dots y_n]^T = [\xi_2 \dots \xi_{d'+1}]$ as the final low-dimensional representation.

f.1 Discarding $\xi_1$ is in fact discarding $1$

  • Since $M$ is positive semidefinite, its eigenvalues $\sigma_k$ (with corresponding eigenvectors $\xi_k$) satisfy
    $$0 \leqslant \sigma_1 \leqslant \sigma_2 \leqslant \dots \leqslant \sigma_n$$
  • Since $M$ has the eigenvalue $0$ with eigenvector $1$, the sorted eigenvalues and eigenvectors satisfy
    $$0 = \sigma_1 \leqslant \sigma_2 \leqslant \dots \leqslant \sigma_n, \qquad \xi_1 = 1,\ \xi_2,\ \dots,\ \xi_n$$

f.2 Why $1$ is discarded

  • Intuitively, this component of every $y_i$ equals $1$, so it carries no information and can be discarded.
  • From the optimization point of view, discarding $1$ guarantees the constraint $\sum\limits_{i=1}^n y_i=0$ (see the code sketch after this list).
    • If $1$ were not discarded, this component of every $y_i$ would equal $1$, so the corresponding component of $\left(\sum\limits_{i=1}^n y_i\right)$ would equal $n$, not $0$.
    • If $1$ is discarded: $M$ is a normal matrix (real symmetric / Hermitian matrices are a subset of normal matrices), so its eigenvectors are mutually orthogonal.
      First note that $Y^T=[y_1 \dots y_n]^T = [\xi_2 \dots \xi_{d'+1}]$.
      Then note that $\xi_1^T\xi_i=0$ for $i \neq 1$, hence
      $\xi_1^T[\xi_2 \dots \xi_{d'+1}]=0$
      $1^T[\xi_2 \dots \xi_{d'+1}]=0$
      $1^TY^T=0$
      $Y1=0$
      $\sum\limits_{i=1}^n y_i=0$
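
A minimal sketch of step f (assuming a precomputed full weight matrix `W`; `lle_embedding` is a hypothetical helper name, and the zero eigenvalue is assumed simple so that the first eigenvector is the constant one):

```python
import numpy as np

def lle_embedding(W, d_prime):
    """Low-dimensional representation Y (d' x n) from the n x n weight matrix W."""
    n = W.shape[0]
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    eigvals, eigvecs = np.linalg.eigh(M)      # ascending eigenvalues, orthonormal eigenvectors
    # eigvecs[:, 0] is (proportional to) the all-ones vector xi_1; discard it
    # and keep xi_2 ... xi_{d'+1} as the columns of Y^T.
    Y = eigvecs[:, 1:d_prime + 1].T
    return Y

# By the orthogonality argument in f.2, the rows of Y are orthogonal to 1:
# Y = lle_embedding(W, 2); np.allclose(Y @ np.ones(W.shape[0]), 0)  -> True
```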