3D Point Cloud Course: Introduction to Kernel PCA
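The PCA introduction above shows that PCA is a useful tool, but a closer look reveals that it is a linear method, so it cannot handle data whose structure is nonlinear.
In that case the data is first mapped into a new space through a kernel function (a dimension-raising step), and PCA is then applied in that space for dimensionality reduction.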
1. Derivation
1. The input data is $x_i \in R^{n_0}$, and there is a nonlinear mapping $\phi: R^{n_0} \to R^{n_1}$, which is the dimension-raising step.
2. Following the linear PCA procedure, apply PCA in $R^{n_1}$:
- 2.1 Assume that $\phi(x_i)$ is zero-centered:
$$\frac{1}{N}\sum_{i=1}^{N}\phi(x_i) = 0$$
- 2.2 Compute the covariance matrix:
$$\widetilde H = \frac{1}{N}\sum_{i=1}^{N}\phi(x_i)\phi^T(x_i)$$
- 2.3 Compute the eigenvalues and eigenvectors of the covariance matrix:
$$\widetilde H \widetilde z = \widetilde\lambda \widetilde z$$
The procedure looks simple, but two problems remain: first, how to determine the nonlinear function $\phi$; second, how to avoid computing in the high-dimensional space.
Combining 2.2 and 2.3 gives
$$\widetilde H \widetilde z = \frac{1}{N}\sum_{i=1}^{N}\phi(x_i)\phi^T(x_i)\widetilde z = \widetilde\lambda \widetilde z$$
Because $\frac{\phi^T(x_i)\widetilde z}{\widetilde\lambda N}$ is a scalar (assuming $\widetilde\lambda \neq 0$), we have
$$\widetilde z = \sum_{i=1}^{N}\phi(x_i)\frac{\phi^T(x_i)\widetilde z}{\widetilde\lambda N} = \sum_{j=1}^{N}\alpha_j\phi(x_j)$$
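The claim that the eigenvector is a linear combination of the mapped points can be checked numerically. The sketch below is only an illustration: it assumes a small explicit feature map `phi` (the one belonging to the kernel $(x^Ty)^2$, chosen here for convenience and not taken from the course), random data, and centered features so that the zero-mean assumption holds.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
X = rng.normal(size=(N, 2))

def phi(x):
    """Hypothetical explicit feature map R^2 -> R^3, used only for this check."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

Phi = np.array([phi(x) for x in X])   # row i holds phi(x_i)
Phi = Phi - Phi.mean(axis=0)          # enforce the zero-mean assumption
H = Phi.T @ Phi / N                   # covariance matrix in feature space

eigvals, eigvecs = np.linalg.eigh(H)  # eigenvalues in ascending order
lam, z = eigvals[-1], eigvecs[:, -1]  # leading eigenpair

alpha = Phi @ z / (lam * N)           # alpha_i = phi^T(x_i) z / (lambda N)
z_rebuilt = Phi.T @ alpha             # sum_i alpha_i phi(x_i)
```

Here `z_rebuilt` should match `z` up to floating-point error, which is exactly the expansion $\widetilde z = \sum_j \alpha_j\phi(x_j)$.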
So finding the eigenvector $\widetilde z$ reduces to finding a set of coefficients $\alpha_j$ and the nonlinear function $\phi(x)$.
1.1 Solving for $\alpha_j$
Using $\widetilde H \widetilde z = \widetilde\lambda \widetilde z$ again, together with the expression for $\widetilde z$ above, we obtain
$$\frac{1}{N}\sum_{i=1}^{N}\phi(x_i)\phi^T(x_i)\left(\sum_{j=1}^{N}\alpha_j\phi(x_j)\right) = \widetilde\lambda\sum_{j=1}^{N}\alpha_j\phi(x_j)$$
$$\frac{1}{N}\sum_{i=1}^{N}\phi(x_i)\left(\sum_{j=1}^{N}\alpha_j\phi^T(x_i)\phi(x_j)\right) = \widetilde\lambda\sum_{j=1}^{N}\alpha_j\phi(x_j)$$
Define the kernel function $k(x_i,x_j)=\phi^T(x_i)\phi(x_j)$. The equation above then simplifies to
$$\frac{1}{N}\sum_{i=1}^{N}\phi(x_i)\left(\sum_{j=1}^{N}\alpha_j k(x_i,x_j)\right) = \widetilde\lambda\sum_{j=1}^{N}\alpha_j\phi(x_j)$$
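To make the kernel definition concrete, here is a minimal sketch showing that a kernel evaluated in the input space can equal an inner product in the lifted space. The feature map `phi` and the kernel $(x^Ty)^2$ are assumptions picked for illustration; they are not the ones used in the course.

```python
import numpy as np

def phi(x):
    """Hypothetical explicit feature map R^2 -> R^3 for k(x, y) = (x^T y)^2."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def kernel(x, y):
    """The same inner product, computed without ever forming phi."""
    return float(np.dot(x, y)) ** 2

x_i = np.array([1.0, 2.0])
x_j = np.array([3.0, -1.0])

explicit = float(phi(x_i) @ phi(x_j))  # inner product taken in R^3
implicit = kernel(x_i, x_j)            # evaluated directly in R^2
```

Both values agree, so every $\phi^T(x_i)\phi(x_j)$ in the derivation can be replaced by a kernel evaluation, and $\phi$ never has to be computed.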
Multiplying both sides on the left by $\phi^T(x_k)$, $k = 1, 2, \dots, N$, and simplifying gives
$$\frac{1}{N}\sum_{i=1}^{N}\phi^T(x_k)\phi(x_i)\left(\sum_{j=1}^{N}\alpha_j k(x_i,x_j)\right) = \widetilde\lambda\sum_{j=1}^{N}\alpha_j\phi^T(x_k)\phi(x_j)$$
$$\frac{1}{N}\sum_{i=1}^{N}k(x_k,x_i)\left(\sum_{j=1}^{N}\alpha_j k(x_i,x_j)\right) = \widetilde\lambda\sum_{j=1}^{N}\alpha_j k(x_k,x_j),\quad k=1,\dots,N$$
which leads to the algebraic form
$$\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_j k(x_k,x_i)k(x_i,x_j) = N\widetilde\lambda\sum_{j=1}^{N}\alpha_j k(x_k,x_j),\quad k=1,\dots,N$$
Now define the kernel matrix $K \in R^{N \times N}$ with $K(i,j)=k(x_i,x_j)$. $K$ is a symmetric matrix.
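The algebraic form can now be rewritten in matrix form. The key is the index $k = 1, 2, \dots, N$: for a fixed $k$, the left-hand side is the $k$-th entry of $K^2\alpha$ and the right-hand side is $N\widetilde\lambda$ times the $k$-th entry of $K\alpha$, where $\alpha = [\alpha_1, \dots, \alpha_N]^T$. Writing out the equation for every $k$ and stacking the results gives
$$K^2\alpha = N\widetilde\lambda K\alpha$$
which simplifies to
$$K\alpha = N\widetilde\lambda\alpha$$
Letting $N\widetilde\lambda = \lambda$, we get
$$K\alpha = \lambda\alpha$$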
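The relation $\lambda = N\widetilde\lambda$ between the eigenvalues of the kernel matrix and those of the feature-space covariance can be verified on a toy problem. This is a sketch under assumptions: the explicit features below belong to the kernel $(x^Ty)^2$, which was chosen only because its feature map is small enough to write down.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8
X = rng.normal(size=(N, 2))

# Hypothetical explicit features for k(x, y) = (x^T y)^2, one row per point.
Phi = np.column_stack([X[:, 0] ** 2,
                       np.sqrt(2.0) * X[:, 0] * X[:, 1],
                       X[:, 1] ** 2])
Phi = Phi - Phi.mean(axis=0)   # zero-mean assumption of the derivation

H = Phi.T @ Phi / N            # 3 x 3 covariance matrix in feature space
K = Phi @ Phi.T                # N x N kernel matrix, K[i, j] = phi^T(x_i) phi(x_j)

lam_H = np.linalg.eigvalsh(H)[::-1]       # eigenvalues of H, descending
lam_K = np.linalg.eigvalsh(K)[::-1][:3]   # three largest eigenvalues of K
```

The three largest eigenvalues of `K` equal `N * lam_H`; the remaining eigenvalues of `K` are zero up to rounding, since the features live in only three dimensions.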
An eigendecomposition of this equation yields the eigenvectors $\alpha_\gamma$ and eigenvalues $\lambda_\gamma$, $\gamma = 1, \dots, l$.
However, $\widetilde z$ must be a unit vector, and taking the coefficients $\alpha_j$ directly from the entries of an eigenvector $\alpha_\gamma$ does not guarantee this. We therefore impose the unit-norm condition on $\widetilde z_\gamma$, writing $\alpha_{\gamma i}$ for the $i$-th entry of $\alpha_\gamma$:
$$1 = \widetilde z_\gamma^T\widetilde z_\gamma$$
$$1 = \sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_{\gamma i}\alpha_{\gamma j}\phi^T(x_i)\phi(x_j) = \sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_{\gamma i}\alpha_{\gamma j}k(x_i,x_j)$$
In matrix form (see the appendix for the proof):
$$1 = \alpha_\gamma^T K\alpha_\gamma$$
Because $K\alpha_\gamma = \lambda_\gamma\alpha_\gamma$, this condition becomes $\lambda_\gamma\alpha_\gamma^T\alpha_\gamma = 1$, that is
$$\alpha_\gamma^T\alpha_\gamma = \frac{1}{\lambda_\gamma}$$
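A quick numerical check of this scaling, as a sketch only: the explicit features below (for the kernel $(x^Ty)^2$) are an assumption made so that $\widetilde z$ can be formed and its norm inspected, which is not possible for a general kernel.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 8
X = rng.normal(size=(N, 2))

# Hypothetical explicit features, centered so the zero-mean assumption holds.
Phi = np.column_stack([X[:, 0] ** 2,
                       np.sqrt(2.0) * X[:, 0] * X[:, 1],
                       X[:, 1] ** 2])
Phi = Phi - Phi.mean(axis=0)
K = Phi @ Phi.T

eigvals, eigvecs = np.linalg.eigh(K)
lam = eigvals[-1]
a_unit = eigvecs[:, -1]          # eigh returns eigenvectors with unit norm
alpha = a_unit / np.sqrt(lam)    # rescaled so that alpha^T alpha = 1 / lambda

z = Phi.T @ alpha                # z = sum_j alpha_j phi(x_j)
z_norm = np.linalg.norm(z)
```

With this rescaling `alpha @ K @ alpha` is 1 and `z_norm` is 1, so the implied feature-space direction is a unit vector.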
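So the computed $\alpha_\gamma$ only needs a normalization-like rescaling to serve as the coefficients $\alpha_j$. The appendix shows how this rescaling is done.
1.2 Handling the nonlinear function $\phi(x)$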
Because the nonlinear function $\phi(x)$ cannot be obtained explicitly, neither can the eigenvector $\widetilde z$. But we never need the eigenvector itself: in the end we only project data onto $\widetilde z$, so
$$y_r = \phi^T(x)\widetilde z_\gamma = \sum_{j=1}^{N}\alpha_{\gamma j}\phi^T(x)\phi(x_j) = \sum_{j=1}^{N}\alpha_{\gamma j}k(x,x_j)$$
From this expression, knowing $\alpha$ and the kernel function $k$ is enough to compute the projected value $y_r$.
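The projection formula can be illustrated by comparing the kernel-only computation with an explicit one. This is a sketch with assumptions: the kernel $(x^Ty)^2$ and its feature map `phi` are chosen for illustration, and centering is skipped, so the mapped data is treated as if it were already zero-mean (the same assumption the derivation makes).

```python
import numpy as np

def kernel(x, y):
    """Illustrative kernel k(x, y) = (x^T y)^2."""
    return float(np.dot(x, y)) ** 2

def phi(x):
    """Hypothetical explicit feature map matching the kernel above."""
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

rng = np.random.default_rng(3)
X = rng.normal(size=(7, 2))
N = len(X)

K = np.array([[kernel(xi, xj) for xj in X] for xi in X])
eigvals, eigvecs = np.linalg.eigh(K)
alpha = eigvecs[:, -1] / np.sqrt(eigvals[-1])   # alpha^T alpha = 1 / lambda

x_new = np.array([0.5, -1.5])

# Kernel-only projection: y = sum_j alpha_j k(x, x_j)
y_kernel = sum(alpha[j] * kernel(x_new, X[j]) for j in range(N))

# Reference: build z explicitly, which is only possible because phi is known here.
z = sum(alpha[j] * phi(X[j]) for j in range(N))
y_explicit = float(phi(x_new) @ z)
```

The two values agree, which is why the eigenvector $\widetilde z$ never has to be formed in practice.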
After so much talk about kernel functions, here are some common choices:
- Linear kernel: $k(x_i,x_j)=x_i^T x_j$
- Polynomial kernel: $k(x_i,x_j)=(1+x_i^T x_j)^p$
- Gaussian kernel: $k(x_i,x_j)=e^{-\beta\|x_i-x_j\|_2^2}$
- Laplacian kernel: $k(x_i,x_j)=e^{-\beta\|x_i-x_j\|_1}$
There is no definitive rule for choosing a kernel; it has to be settled by experiment.
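The kernels listed above can be sketched in a few lines of NumPy. The function names and default parameters are illustrative assumptions, and the Gaussian kernel uses the squared Euclidean distance, which is the usual convention.

```python
import numpy as np

def linear_kernel(x, y):
    return float(np.dot(x, y))

def polynomial_kernel(x, y, p=2):
    return (1.0 + float(np.dot(x, y))) ** p

def gaussian_kernel(x, y, beta=1.0):
    diff = np.asarray(x) - np.asarray(y)
    return float(np.exp(-beta * np.dot(diff, diff)))      # squared L2 distance

def laplacian_kernel(x, y, beta=1.0):
    diff = np.asarray(x) - np.asarray(y)
    return float(np.exp(-beta * np.sum(np.abs(diff))))    # L1 distance

def kernel_matrix(X, kernel):
    """Build K with K[i, j] = kernel(X[i], X[j])."""
    n = len(X)
    return np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])

a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])
K_demo = kernel_matrix(np.array([a, b, a + b]), gaussian_kernel)
```

Any of these functions can be passed to `kernel_matrix`; the resulting matrix is symmetric, as the derivation requires.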
1.3 A caveat
Before deriving Kernel PCA we assumed that $\phi(x_i)$ is zero-centered, but in practice it is not. We therefore define the centered feature
$$\widetilde\phi(x_i) = \phi(x_i) - \frac{1}{N}\sum_{j=1}^{N}\phi(x_j)$$
The centered kernel $\widetilde k(x_i,x_j)$ is then
$$\begin{aligned}
\widetilde k(x_i,x_j) &= \widetilde\phi^T(x_i)\widetilde\phi(x_j)\\
&= \left(\phi(x_i) - \frac{1}{N}\sum_{k=1}^{N}\phi(x_k)\right)^T\left(\phi(x_j) - \frac{1}{N}\sum_{l=1}^{N}\phi(x_l)\right)\\
&= k(x_i,x_j) - \frac{1}{N}\sum_{k=1}^{N}k(x_i,x_k) - \frac{1}{N}\sum_{k=1}^{N}k(x_j,x_k) + \frac{1}{N^2}\sum_{k=1}^{N}\sum_{l=1}^{N}k(x_k,x_l)
\end{aligned}$$
In matrix form (see the appendix for the proof):
$$\widetilde K = K - T_{\frac{1}{N}}K - KT_{\frac{1}{N}} + T_{\frac{1}{N}}KT_{\frac{1}{N}}$$
where $T_{\frac{1}{N}}$ is the $N \times N$ matrix whose entries all equal $\frac{1}{N}$. Since $K$ is symmetric, $T_{\frac{1}{N}}K = (KT_{\frac{1}{N}})^T$: one term subtracts the column means of $K$ and the other subtracts the row means.
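The centering step can be checked against explicitly centered features. The sketch below uses the symmetric form $K - TK - KT + TKT$, which is what the element-wise expression gives entry by entry; the explicit feature map (for the kernel $(x^Ty)^2$) is an assumption used only to produce a reference result.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 6
X = rng.normal(size=(N, 2))

# Hypothetical explicit features, deliberately NOT centered.
Phi = np.column_stack([X[:, 0] ** 2,
                       np.sqrt(2.0) * X[:, 0] * X[:, 1],
                       X[:, 1] ** 2])

K = Phi @ Phi.T                    # uncentered kernel matrix
T = np.full((N, N), 1.0 / N)       # every entry equals 1/N
K_centered = K - T @ K - K @ T + T @ K @ T

# Reference: center the features first, then take inner products.
Phi_c = Phi - Phi.mean(axis=0)
K_reference = Phi_c @ Phi_c.T
```

`K_centered` matches `K_reference`, and its rows and columns sum to zero, so the kernel matrix can be centered without access to $\phi$.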
2. Kernel PCA summary
- Choose a kernel function $k(x_i,x_j)$ and compute the kernel matrix $K(i,j)=k(x_i,x_j)$.
- Center $K$:
$$\widetilde K = K - T_{\frac{1}{N}}K - KT_{\frac{1}{N}} + T_{\frac{1}{N}}KT_{\frac{1}{N}}$$
where $T_{\frac{1}{N}}$ is the $N \times N$ matrix whose entries all equal $\frac{1}{N}$.
- Solve for the eigenvalues and eigenvectors of $\widetilde K$:
$$\widetilde K\alpha_\gamma = \lambda_\gamma\alpha_\gamma$$
- Rescale each $\alpha_\gamma$ so that $\alpha_\gamma^T\alpha_\gamma = \frac{1}{\lambda_\gamma}$.
- For any point $x \in R^{n_0}$, compute its projection $y_r \in R$ onto the $r$-th principal component (the one with index $\gamma = r$):
$$y_r = \phi^T(x)\widetilde z_\gamma = \sum_{j=1}^{N}\alpha_{\gamma j}\phi^T(x)\phi(x_j) = \sum_{j=1}^{N}\alpha_{\gamma j}k(x,x_j)$$
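The steps above can be put together in a short sketch. It is not a reference implementation: the function names, the Gaussian kernel, and the value of `beta` are assumptions, and only the training points are projected (using the rows of the centered kernel matrix).

```python
import numpy as np

def gaussian_kernel_matrix(A, B, beta):
    """K[i, j] = exp(-beta * ||A[i] - B[j]||^2)."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-beta * np.maximum(sq, 0.0))   # clip tiny negative rounding errors

def kernel_pca_fit(X, n_components, beta=1.0):
    N = X.shape[0]
    K = gaussian_kernel_matrix(X, X, beta)       # step 1: kernel matrix
    T = np.full((N, N), 1.0 / N)
    K_c = K - T @ K - K @ T + T @ K @ T          # step 2: center K
    eigvals, eigvecs = np.linalg.eigh(K_c)       # step 3: eigendecomposition
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = eigvals[order]
    alpha = eigvecs[:, order] / np.sqrt(lam)     # step 4: alpha^T alpha = 1 / lambda
    return K_c, lam, alpha

rng = np.random.default_rng(5)
X = rng.normal(size=(20, 3))
K_c, lam, alpha = kernel_pca_fit(X, n_components=2, beta=0.5)

# step 5: y = sum_j alpha_j k(x, x_j), evaluated for every training point at once
Y = K_c @ alpha
```

`Y` has one row per training point and one column per principal component. Projecting a new point would additionally require centering its kernel vector with the training statistics, which this sketch leaves out.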
Appendix
1. Proof that the algebraic form equals the matrix form:
$$\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_{\gamma i}\alpha_{\gamma j}k(x_i,x_j) = \alpha_\gamma^T K\alpha_\gamma$$
The proof is as follows.
First expand the double sum:
$$\begin{aligned}
\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_{\gamma i}\alpha_{\gamma j}k(x_i,x_j) = {}& \alpha_{\gamma 1}\left(\alpha_{\gamma 1}k(x_1,x_1) + \alpha_{\gamma 2}k(x_1,x_2) + \dots + \alpha_{\gamma N}k(x_1,x_N)\right)\\
&+ \alpha_{\gamma 2}\left(\alpha_{\gamma 1}k(x_2,x_1) + \alpha_{\gamma 2}k(x_2,x_2) + \dots + \alpha_{\gamma N}k(x_2,x_N)\right)\\
&+ \dots\\
&+ \alpha_{\gamma N}\left(\alpha_{\gamma 1}k(x_N,x_1) + \alpha_{\gamma 2}k(x_N,x_2) + \dots + \alpha_{\gamma N}k(x_N,x_N)\right)
\end{aligned}$$
Then write the sums on the right-hand side in matrix form:
$$\begin{aligned}
\text{RHS} &= \begin{bmatrix}\alpha_{\gamma 1} & \alpha_{\gamma 2} & \dots & \alpha_{\gamma N}\end{bmatrix}
\begin{bmatrix}
\begin{bmatrix}\alpha_{\gamma 1} & \alpha_{\gamma 2} & \dots & \alpha_{\gamma N}\end{bmatrix}\begin{bmatrix}k(x_1,x_1)\\ k(x_1,x_2)\\ \dots\\ k(x_1,x_N)\end{bmatrix}\\
\begin{bmatrix}\alpha_{\gamma 1} & \alpha_{\gamma 2} & \dots & \alpha_{\gamma N}\end{bmatrix}\begin{bmatrix}k(x_2,x_1)\\ k(x_2,x_2)\\ \dots\\ k(x_2,x_N)\end{bmatrix}\\
\dots\\
\begin{bmatrix}\alpha_{\gamma 1} & \alpha_{\gamma 2} & \dots & \alpha_{\gamma N}\end{bmatrix}\begin{bmatrix}k(x_N,x_1)\\ k(x_N,x_2)\\ \dots\\ k(x_N,x_N)\end{bmatrix}
\end{bmatrix}\\
&= \begin{bmatrix}\alpha_{\gamma 1} & \alpha_{\gamma 2} & \dots & \alpha_{\gamma N}\end{bmatrix}
\begin{bmatrix}
k(x_1,x_1) & k(x_1,x_2) & \dots & k(x_1,x_N)\\
k(x_2,x_1) & k(x_2,x_2) & \dots & k(x_2,x_N)\\
\dots & \dots & \dots & \dots\\
k(x_N,x_1) & k(x_N,x_2) & \dots & k(x_N,x_N)
\end{bmatrix}
\begin{bmatrix}\alpha_{\gamma 1}\\ \alpha_{\gamma 2}\\ \dots\\ \alpha_{\gamma N}\end{bmatrix}\\
&= \alpha_\gamma^T K\alpha_\gamma
\end{aligned}$$
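The identity can also be confirmed numerically. In this small sketch the matrix `K` is just a random symmetric stand-in for a kernel matrix and `alpha` is an arbitrary vector; both are assumptions for the check only.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 5
A = rng.normal(size=(N, N))
K = A @ A.T                      # symmetric stand-in for a kernel matrix
alpha = rng.normal(size=N)

# Algebraic form: explicit double sum over i and j.
double_sum = sum(alpha[i] * alpha[j] * K[i, j]
                 for i in range(N) for j in range(N))

# Matrix form: alpha^T K alpha.
quadratic_form = float(alpha @ K @ alpha)
```

Both expressions give the same number up to floating-point error.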
2. Derivation of the normalization-like rescaling
We need $\alpha_\gamma^T\alpha_\gamma = \frac{1}{\lambda_\gamma}$.
Write the eigenvector returned by the eigendecomposition as $\alpha_\gamma = [x_1, x_2, \dots, x_N]^T$ (here $x_i$ denotes its entries, not data points). It is not scaled as required, that is, in general $\alpha_\gamma^T\alpha_\gamma \ne \frac{1}{\lambda_\gamma}$. We therefore look for a new vector $\beta_\gamma = [x'_1, x'_2, \dots, x'_N]^T$ that satisfies $\beta_\gamma^T\beta_\gamma = \frac{1}{\lambda_\gamma}$ and is parallel to $\alpha_\gamma$, with a common scale factor $k$ (a scalar here, not the kernel function):
$$\frac{x'_1}{x_1} = \frac{x'_2}{x_2} = \dots = \frac{x'_N}{x_N} = k$$
In summary:
$$\left\{\begin{array}{l}
\beta_\gamma^T\beta_\gamma = {x'_1}^2 + {x'_2}^2 + \dots + {x'_N}^2 = \frac{1}{\lambda_\gamma}\\
x'_1 = kx_1\\
x'_2 = kx_2\\
\dots\\
x'_N = kx_N
\end{array}\right.$$
Substituting $x'_i = kx_i$ into the first equation gives $k^2(x_1^2 + x_2^2 + \dots + x_N^2) = \frac{1}{\lambda_\gamma}$. Taking the positive root for $k$, the solution is
$$\left\{\begin{array}{l}
x'_1 = \dfrac{x_1}{\sqrt{\lambda_\gamma(x_1^2 + x_2^2 + \dots + x_N^2)}}\\
x'_2 = \dfrac{x_2}{\sqrt{\lambda_\gamma(x_1^2 + x_2^2 + \dots + x_N^2)}}\\
\dots\\
x'_N = \dfrac{x_N}{\sqrt{\lambda_\gamma(x_1^2 + x_2^2 + \dots + x_N^2)}}
\end{array}\right.$$
The vector $\beta_\gamma = [x'_1, x'_2, \dots, x'_N]^T$ is then the rescaled version of $\alpha_\gamma$.
3. Proof that the element-wise centering formula
$$k(x_i,x_j) - \frac{1}{N}\sum_{k=1}^{N}k(x_i,x_k) - \frac{1}{N}\sum_{k=1}^{N}k(x_j,x_k) + \frac{1}{N^2}\sum_{k=1}^{N}\sum_{l=1}^{N}k(x_k,x_l)$$
is the $(i,j)$ entry of
$$K - T_{\frac{1}{N}}K - KT_{\frac{1}{N}} + T_{\frac{1}{N}}KT_{\frac{1}{N}}$$
First, a useful property, shown for the $3 \times 3$ case:
$$\begin{bmatrix}
k_{11} & k_{12} & k_{13}\\
k_{21} & k_{22} & k_{23}\\
k_{31} & k_{32} & k_{33}
\end{bmatrix}
\begin{bmatrix}
1 & 1 & 1\\
1 & 1 & 1\\
1 & 1 & 1
\end{bmatrix}
=
\begin{bmatrix}
k_{11}+k_{12}+k_{13} & k_{11}+k_{12}+k_{13} & k_{11}+k_{12}+k_{13}\\
k_{21}+k_{22}+k_{23} & k_{21}+k_{22}+k_{23} & k_{21}+k_{22}+k_{23}\\
k_{31}+k_{32}+k_{33} & k_{31}+k_{32}+k_{33} & k_{31}+k_{32}+k_{33}
\end{bmatrix}$$
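That is, when a matrix $K$ is multiplied on the right by the all-ones matrix, $K * 1$, every entry in a given row of the result equals the sum of the corresponding row of $K$. Likewise, for left multiplication, $1 * K$, every entry in a given column of the result equals the sum of the corresponding column of $K$. For $1 * K * 1$, every entry of the result equals the sum of all entries of $K$.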
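The three cases are easy to see on a concrete matrix. The values below are arbitrary and serve only as an illustration.

```python
import numpy as np

K = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
ones = np.ones((3, 3))

right = K @ ones          # each row repeats that row's sum: 6, 15, 24
left = ones @ K           # each column repeats that column's sum: 12, 15, 18
both = ones @ K @ ones    # every entry is the total sum: 45
```

Dividing the all-ones matrix by $N$ turns these sums into the row means, column means, and overall mean that appear in the centering formula.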
Using this property, each sum in the element-wise formula corresponds to a matrix product (the second line also uses the symmetry $k(x_j,x_k) = k(x_k,x_j)$):
$$\frac{1}{N}\sum_{k=1}^{N}k(x_i,x_k) \to K * T_{\frac{1}{N}}$$
$$\frac{1}{N}\sum_{k=1}^{N}k(x_j,x_k) \to T_{\frac{1}{N}} * K$$
$$\frac{1}{N^2}\sum_{k=1}^{N}\sum_{l=1}^{N}k(x_k,x_l) \to T_{\frac{1}{N}} * K * T_{\frac{1}{N}}$$
Finally, combine the terms. Here $T_{\frac{1}{N}}$ (the $\frac{1}{N}$ matrix) denotes the $N \times N$ matrix whose entries all equal $\frac{1}{N}$, that is, $\frac{1}{N}$ times the all-ones matrix.