The classical NIPALS procedure is as follows:
for $a=1:A$ ($A$ is the number of components to be extracted)
- $\mathbf{v}_{a}=\mathbf{X}_{a-1}^{t} \mathbf{y}$
- $\mathbf{w}_{a}=\mathbf{v}_{a} /\left\|\mathbf{v}_{a}\right\|$
- $\boldsymbol{\tau}_{a}=\mathbf{X}_{a-1} \mathbf{w}_{a}$
- $\mathbf{t}_{a}=\boldsymbol{\tau}_{a} /\left\|\boldsymbol{\tau}_{a}\right\|$
- $\mathbf{p}_{a}=\mathbf{X}_{a-1}^{t} \mathbf{t}_{a}$
- $\mathbf{X}_{a}=\mathbf{X}_{a-1}-\mathbf{t}_{a} \mathbf{p}_{a}^{t}$ (the deflation step)
- $q_{a}=\mathbf{t}_{a}^{t} \mathbf{y}$
end
$\mathbf{T}=\left[\mathbf{t}_{1}\ \mathbf{t}_{2}\ \ldots\ \mathbf{t}_{A}\right]$ (the orthonormal scores)
$\mathbf{W}=\left[\mathbf{w}_{1}\ \mathbf{w}_{2}\ \ldots\ \mathbf{w}_{A}\right]$ (the orthonormal weights)
$\mathbf{P}=\left[\mathbf{p}_{1}\ \mathbf{p}_{2}\ \ldots\ \mathbf{p}_{A}\right]$ (the matrix of $\mathbf{X}$-loadings)
$\mathbf{q}^{t}=\left[q_{1}\ q_{2}\ \ldots\ q_{A}\right]$ (the vector of $\mathbf{y}$-loadings)
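The loop above translates almost line by line into NumPy. The following is a minimal sketch, not a reference implementation: the function name `nipals_pls1`, the random test data, and the choice to skip centering are all assumptions made for illustration.

```python
import numpy as np

def nipals_pls1(X, y, A):
    """NIPALS for PLS1 with normalized scores (illustrative sketch)."""
    Xa = np.array(X, dtype=float)          # working copy, starts as X_0
    y = np.asarray(y, dtype=float)
    n, m = Xa.shape
    T, W = np.zeros((n, A)), np.zeros((m, A))
    P, q = np.zeros((m, A)), np.zeros(A)
    for a in range(A):
        v = Xa.T @ y                       # v_a = X_{a-1}^t y
        w = v / np.linalg.norm(v)          # w_a = v_a / ||v_a||
        tau = Xa @ w                       # tau_a = X_{a-1} w_a
        t = tau / np.linalg.norm(tau)      # t_a = tau_a / ||tau_a||
        p = Xa.T @ t                       # p_a = X_{a-1}^t t_a
        Xa = Xa - np.outer(t, p)           # deflation: X_a = X_{a-1} - t_a p_a^t
        q[a] = t @ y                       # q_a = t_a^t y
        T[:, a], W[:, a], P[:, a] = t, w, p
    return T, W, P, q

# Hypothetical data, only used to exercise the function.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
y = rng.standard_normal(20)
T, W, P, q = nipals_pls1(X, y, A=3)

# Both T and W should have orthonormal columns (up to rounding).
print(np.allclose(T.T @ T, np.eye(3)), np.allclose(W.T @ W, np.eye(3)))
```

The sketch stores every component as a column, so `T`, `W`, `P` and `q` match the matrices and vector defined above.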
Computing the PLS1 coefficients
The model is $y = Tq$, and we want to know how $T$ can be expressed in terms of $X_0$.
Suppose $T = X_0W^*$. Clearly, $w_a^*$ can be obtained by making $w_a$ conjugate to $\{w_1,\cdots,w_{a-1}\}$ with respect to $X_0^TX_0$.
Consider the following commonly used formula:
$$\boldsymbol{\beta}_{PLS}=\mathbf{W}\left(\mathbf{P}^{t} \mathbf{W}\right)^{-1} \mathbf{q}$$
From it we can conclude that
$$\mathbf{W^*}=\mathbf{W}\left(\mathbf{P}^{t} \mathbf{W}\right)^{-1}$$
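The two formulas can be checked numerically. The sketch below is illustrative only: the helper `nipals_pls1`, the variable names `W_star` and `beta`, and the random data are assumptions. It computes $\mathbf{W}^*$ and $\boldsymbol{\beta}_{PLS}$ and compares $\mathbf{X}_0\mathbf{W}^*$ with the scores $\mathbf{T}$ produced by the deflating loop.

```python
import numpy as np

def nipals_pls1(X, y, A):
    """Compact NIPALS PLS1 with normalized scores (helper for this check)."""
    Xa = np.array(X, dtype=float)
    T, W, P, q = [], [], [], []
    for _ in range(A):
        v = Xa.T @ y
        w = v / np.linalg.norm(v)
        tau = Xa @ w
        t = tau / np.linalg.norm(tau)
        p = Xa.T @ t
        Xa = Xa - np.outer(t, p)           # deflation step
        T.append(t); W.append(w); P.append(p); q.append(t @ y)
    return np.column_stack(T), np.column_stack(W), np.column_stack(P), np.array(q)

# Hypothetical data for the check.
rng = np.random.default_rng(1)
X0 = rng.standard_normal((30, 6))
y = rng.standard_normal(30)
T, W, P, q = nipals_pls1(X0, y, A=3)

W_star = W @ np.linalg.inv(P.T @ W)        # W* = W (P^t W)^{-1}
beta = W_star @ q                          # beta_PLS = W (P^t W)^{-1} q

print(np.allclose(X0 @ W_star, T))         # T = X_0 W*
print(np.allclose(X0 @ beta, T @ q))       # same fitted values either way
```

For larger problems, solving a linear system with `np.linalg.solve` would usually be preferred over forming the inverse explicitly; `np.linalg.inv` is used here only to mirror the formula.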
The vectors $\mathbf{v}$ (or $\mathbf{w}$) are pairwise orthogonal, but they are not conjugate with respect to $\mathbf{X}_0^T\mathbf{X}_0$; that is, the vectors $\mathbf{X}_0\mathbf{w}_i$ are not mutually orthogonal.
Let $\boldsymbol{\tau}_{a}^*=\mathbf{X}_{0} \mathbf{w}_{a}$. To obtain $t_{a}$, $\boldsymbol{\tau}_{a}^*$ has to be Gram–Schmidt orthogonalized against $\operatorname{span}\{t_1,\cdots,t_{a-1}\}$ and then normalized.
$$\mathbf{y}^T\mathbf{X}_a =\mathbf{y}^T\mathbf{X}_{a-1}-\mathbf{y}^T\mathbf{t}_a\mathbf{p}_a^T \;\rightarrow\; \mathbf{v}_{a+1}^T =\mathbf{v}_{a}^T-\mathbf{v}_{a}^T\mathbf{w}_a\mathbf{p}_a^T/\|\boldsymbol{\tau}_{a}\| \;\rightarrow$$
$$\mathbf{v}_{a+1} =\mathbf{v}_{a} -\frac{\|\mathbf{v}_a\|}{\|\boldsymbol{\tau}_a\|}\mathbf{p}_a$$
We can see that the residual matrix no longer has to be computed: the weights can be obtained by iteration alone.
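The recurrence for $\mathbf{v}_{a+1}$, combined with the Gram–Schmidt step for the scores, gives a variant that never deflates $\mathbf{X}$. The sketch below is one way to write it, under a stated assumption: it uses $\mathbf{p}_a=\mathbf{X}_0^T\mathbf{t}_a$ in place of $\mathbf{X}_{a-1}^T\mathbf{t}_a$, which should be equivalent because $\mathbf{t}_a$ is orthogonal to the earlier scores. The function names and test data are hypothetical.

```python
import numpy as np

def pls1_deflated(X0, y, A):
    """Reference: NIPALS with explicit deflation of X."""
    Xa, W, T = X0.copy(), [], []
    for _ in range(A):
        v = Xa.T @ y
        w = v / np.linalg.norm(v)
        tau = Xa @ w
        t = tau / np.linalg.norm(tau)
        Xa = Xa - np.outer(t, Xa.T @ t)
        W.append(w); T.append(t)
    return np.column_stack(W), np.column_stack(T)

def pls1_no_deflation(X0, y, A):
    """Weights and scores from X_0 only, via the recurrence for v."""
    n, m = X0.shape
    W, T = np.zeros((m, A)), np.zeros((n, A))
    v = X0.T @ y                                   # v_1 = X_0^T y
    for a in range(A):
        w = v / np.linalg.norm(v)
        tau = X0 @ w                               # tau_a^* = X_0 w_a
        tau = tau - T[:, :a] @ (T[:, :a].T @ tau)  # Gram-Schmidt against t_1..t_{a-1}
        t = tau / np.linalg.norm(tau)
        p = X0.T @ t                               # assumed equal to X_{a-1}^T t_a
        # v_{a+1} = v_a - (||v_a|| / ||tau_a||) p_a
        v = v - (np.linalg.norm(v) / np.linalg.norm(tau)) * p
        W[:, a], T[:, a] = w, t
    return W, T

# Hypothetical data for the comparison.
rng = np.random.default_rng(2)
X0 = rng.standard_normal((25, 6))
y = rng.standard_normal(25)
W_ref, T_ref = pls1_deflated(X0, y, 3)
W_new, T_new = pls1_no_deflation(X0, y, 3)
print(np.allclose(W_ref, W_new), np.allclose(T_ref, T_new))
```

In exact arithmetic the two routines should agree; for many components the recurrence may lose orthogonality, so the comparison here is limited to a small number of components.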
A few simple manipulations give
$$\mathbf{p}_{a}=\left\|\boldsymbol{\tau}_{a}\right\|\left(\mathbf{w}_{a}-\frac{\left\|\mathbf{v}_{a+1}\right\|}{\left\|\mathbf{v}_{a}\right\|} \mathbf{w}_{a+1}\right), \quad a=1, \ldots, A<m$$
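In matrix form,
$$\mathbf{P}=\mathbf{W}_{+} \mathbf{B}_{2}, \qquad \mathbf{W}_{+} = [\mathbf{W}\ \ \mathbf{w}_{A+1}]$$
where $\mathbf{B}_{2}$ is the $(A+1)\times A$ lower bidiagonal matrix whose columns hold the two coefficients from the formula for $\mathbf{p}_a$.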
$$W^*(P^TW) = W$$
From this we obtain the relation $W^*\rightarrow W$, which is a Lanczos orthogonalization method based on bidiagonalization.
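The bidiagonal structure can be inspected directly. Because $\mathbf{W}$ is orthonormal, the formula for $\mathbf{p}_a$ implies that $\mathbf{P}^T\mathbf{W}$ is upper bidiagonal, with diagonal $\|\boldsymbol{\tau}_a\|$ and superdiagonal $-\|\boldsymbol{\tau}_a\|\,\|\mathbf{v}_{a+1}\|/\|\mathbf{v}_a\|$. The sketch below checks this on random data and then recovers $\mathbf{W}^*$ from $\mathbf{W}$ with a two-term recurrence. The helper, the variable names, and the data are assumptions made for illustration.

```python
import numpy as np

def nipals_pls1(X, y, A):
    """Compact NIPALS PLS1 that also records ||tau_a|| and ||v_a||."""
    Xa = np.array(X, dtype=float)
    T, W, P, tau_n, v_n = [], [], [], [], []
    for _ in range(A):
        v = Xa.T @ y
        w = v / np.linalg.norm(v)
        tau = Xa @ w
        t = tau / np.linalg.norm(tau)
        p = Xa.T @ t
        Xa = Xa - np.outer(t, p)
        T.append(t); W.append(w); P.append(p)
        tau_n.append(np.linalg.norm(tau)); v_n.append(np.linalg.norm(v))
    return (np.column_stack(T), np.column_stack(W), np.column_stack(P),
            np.array(tau_n), np.array(v_n))

# Hypothetical data for the check.
rng = np.random.default_rng(3)
X0 = rng.standard_normal((30, 7))
y = rng.standard_normal(30)
A = 4
T, W, P, tau_n, v_n = nipals_pls1(X0, y, A)

R = P.T @ W
# Keep only the main diagonal and first superdiagonal, then compare.
is_bidiag = np.allclose(R, np.triu(np.tril(R, 1)))
diag_ok = np.allclose(np.diag(R), tau_n)
sup_ok = np.allclose(np.diag(R, 1), -tau_n[:-1] * v_n[1:] / v_n[:-1])

# Solve W* R = W column by column; each column needs only the previous one.
W_star = np.zeros_like(W)
W_star[:, 0] = W[:, 0] / R[0, 0]
for j in range(1, A):
    W_star[:, j] = (W[:, j] - R[j - 1, j] * W_star[:, j - 1]) / R[j, j]
scores_ok = np.allclose(X0 @ W_star, T)

print(is_bidiag, diag_ok, sup_ok, scores_ok)
```

The recurrence avoids forming $(\mathbf{P}^T\mathbf{W})^{-1}$ explicitly, which is the practical benefit of the bidiagonal structure.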
Reference: The geometry of PLS1 explained properly