Supplementary Derivation for Quasi-Newton Methods
Suppose we have already obtained

$$
\begin{aligned}
\Delta \mathbf{B}^{(t)} &= \alpha \mathbf{u} \mathbf{u}^T + \beta \mathbf{v} \mathbf{v}^T \\
&= \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T + \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T
\end{aligned}
$$
where

$$
\begin{aligned}
\alpha &= \frac{1}{(\mathbf{y}^{(t)})^T \mathbf{s}^{(t)}} \\
\beta &= - \frac{1}{(\mathbf{s}^{(t)})^T (\mathbf{B}^{(t)})^T \mathbf{s}^{(t)}}
\end{aligned}
$$
Then

$$
\begin{aligned}
\mathbf{B}^{(t+1)} &= \mathbf{B}^{(t)} + \Delta \mathbf{B}^{(t)} \\
&= \mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T + \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T \\
\Rightarrow \quad (\mathbf{B}^{(t+1)})^{-1} &= (\mathbf{B}^{(t)} + \Delta \mathbf{B}^{(t)})^{-1} \\
&= [\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T + \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T]^{-1}
\end{aligned}
$$
By the Sherman-Morrison formula: let $\mathbf{A} \in \mathbb{R}^{n \times n}$ be an invertible (nonsingular) matrix and let $\mathbf{p}, \mathbf{q} \in \mathbb{R}^n$ be column vectors. If $1 + \mathbf{q}^T \mathbf{A}^{-1} \mathbf{p} \ne 0$, then

$$
(\mathbf{A} + \mathbf{p} \mathbf{q}^T)^{-1} = \mathbf{A}^{-1} - \frac{\mathbf{A}^{-1} \mathbf{p} \mathbf{q}^T \mathbf{A}^{-1}}{1 + \mathbf{q}^T \mathbf{A}^{-1} \mathbf{p}}
$$
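As a quick sanity check, the formula can be verified on a small example. The sketch below uses plain Python with exact rational arithmetic; the helper name `sherman_morrison` and the 2×2 test data are illustrative assumptions, not part of the original derivation.

```python
from fractions import Fraction

def sherman_morrison(A_inv, p, q):
    """Return (A + p q^T)^{-1} given A^{-1}, via the Sherman-Morrison formula."""
    n = len(p)
    # A^{-1} p (column vector) and q^T A^{-1} (row vector)
    Ainv_p = [sum(A_inv[i][k] * p[k] for k in range(n)) for i in range(n)]
    qT_Ainv = [sum(q[k] * A_inv[k][j] for k in range(n)) for j in range(n)]
    denom = 1 + sum(q[i] * Ainv_p[i] for i in range(n))
    if denom == 0:
        raise ZeroDivisionError("1 + q^T A^{-1} p = 0: the formula does not apply")
    return [[A_inv[i][j] - Ainv_p[i] * qT_Ainv[j] / denom for j in range(n)]
            for i in range(n)]

# Assumed test data: A = [[2, 1], [1, 1]], whose inverse is [[1, -1], [-1, 2]]
A_inv = [[Fraction(1), Fraction(-1)], [Fraction(-1), Fraction(2)]]
p = [Fraction(1), Fraction(1)]
q = [Fraction(1), Fraction(1)]

updated_inv = sherman_morrison(A_inv, p, q)

# A + p q^T = [[3, 2], [2, 2]]; multiplying it by the result should give I
A_plus = [[Fraction(3), Fraction(2)], [Fraction(2), Fraction(2)]]
product = [[sum(A_plus[i][k] * updated_inv[k][j] for k in range(2))
            for j in range(2)] for i in range(2)]
```

Using `Fraction` keeps the comparison exact, so the check does not depend on a floating-point tolerance.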
Let

$$
\begin{aligned}
\mathbf{A} &= \mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T \\
\mathbf{p} &= \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} \\
\mathbf{q} &= \mathbf{B}^{(t)} \mathbf{s}^{(t)}
\end{aligned}
$$
which gives

$$
\begin{aligned}
&[\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T + \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T]^{-1} \\
=& [\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T]^{-1} - [\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T]^{-1} \cdot \\
& \frac{ \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T }{1 + (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T [\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T]^{-1} \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)}} [\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T]^{-1}
\end{aligned}
$$
Applying the Sherman-Morrison formula once more to the factor $[\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T]^{-1}$ in this expression:

$$
[\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T]^{-1} = (\mathbf{B}^{(t)})^{-1} - \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}
$$
Then

$$
\begin{aligned}
& 1 + (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T [\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T]^{-1} \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} \\
=& 1 + (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T [(\mathbf{B}^{(t)})^{-1} - \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}] \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} \\
=& 1 + (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} - (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}} \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)}
\end{aligned}
$$
Since $\beta = - \frac{1}{(\mathbf{s}^{(t)})^T (\mathbf{B}^{(t)})^T \mathbf{s}^{(t)}}$, we have

$$
\begin{aligned}
& (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} \\
=& \beta (\mathbf{s}^{(t)})^T (\mathbf{B}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{B}^{(t)} \mathbf{s}^{(t)} \\
=& \beta (\mathbf{s}^{(t)})^T (\mathbf{B}^{(t)})^T \mathbf{s}^{(t)} = -1
\end{aligned}
$$
Note that $\mathbf{B}^{(t)}$ is a real symmetric matrix, so $(\mathbf{B}^{(t)})^T = \mathbf{B}^{(t)}$. Hence

$$
\begin{aligned}
& (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}} \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} \\
=& \alpha \beta (\mathbf{s}^{(t)})^T (\mathbf{B}^{(t)})^T \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}} \mathbf{B}^{(t)} \mathbf{s}^{(t)} \\
=& \alpha \beta (\mathbf{s}^{(t)})^T \mathbf{B}^{(t)} \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}} \mathbf{B}^{(t)} \mathbf{s}^{(t)} \\
=& \frac{\alpha \beta (\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T \mathbf{s}^{(t)}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}
\end{aligned}
$$
Substituting the two results above back into $1 + (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T [\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T]^{-1} \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)}$ gives

$$
\begin{aligned}
& 1 + (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T [\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T]^{-1} \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} \\
=& 1 - 1 - \frac{\alpha \beta (\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T \mathbf{s}^{(t)}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}} \\
=& -\frac{\alpha \beta (\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T \mathbf{s}^{(t)}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}
\end{aligned}
$$
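This scalar identity is easy to get wrong by a sign, so a numerical check may help. The sketch below evaluates both sides exactly on a small symmetric example; the matrices `B`, `H` and the vectors `s`, `y` are assumed test data chosen only for illustration.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

# Assumed test data: symmetric B with known inverse H = B^{-1}
B = [[F(2), F(1)], [F(1), F(1)]]
H = [[F(1), F(-1)], [F(-1), F(2)]]
s = [F(1), F(2)]
y = [F(3), F(1)]

Bs = matvec(B, s)
alpha = 1 / dot(y, s)       # alpha = 1 / (y^T s)
beta = -1 / dot(s, Bs)      # beta = -1 / (s^T B s)

# [B + alpha y y^T]^{-1} from the second Sherman-Morrison application.
# H is symmetric, so H y y^T H equals the outer product (H y)(H y)^T.
Hy = matvec(H, y)
k = 1 + alpha * dot(y, Hy)  # 1 + alpha y^T B^{-1} y
A_inv = [[H[i][j] - alpha * Hy[i] * Hy[j] / k for j in range(2)]
         for i in range(2)]

# Left-hand side: 1 + (Bs)^T [B + alpha y y^T]^{-1} beta Bs
lhs = 1 + beta * dot(Bs, matvec(A_inv, Bs))
# Right-hand side: -alpha beta (s^T y)(y^T s) / (1 + alpha y^T B^{-1} y)
rhs = -alpha * beta * dot(s, y) * dot(y, s) / k
```

On this data both sides come out to the same nonzero rational number, which is also the condition the Sherman-Morrison formula needs in order to apply at this step.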
Substituting this result, together with the expression for $[\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T]^{-1}$, back into $[\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T + \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T]^{-1}$ gives

$$
\begin{aligned}
&[\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T + \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T]^{-1} \\
=&[ (\mathbf{B}^{(t)})^{-1} - \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}] - [(\mathbf{B}^{(t)})^{-1} - \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}] \cdot \\
& \frac{\beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T}{-\frac{\alpha \beta (\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T \mathbf{s}^{(t)}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}} [(\mathbf{B}^{(t)})^{-1} - \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}] \\
=&[ (\mathbf{B}^{(t)})^{-1} - \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}] + [(\mathbf{B}^{(t)})^{-1} - \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}] \cdot \\
& \frac{\mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T}{\frac{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} }{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}} [(\mathbf{B}^{(t)})^{-1} - \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}]
\end{aligned}
$$
These expressions are unwieldy, so to simplify the derivation we introduce shorthand for the two most complicated terms. Let

$$
\begin{aligned}
\mathbf{C}^{(t)} &= \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}} \\
\mathbf{D}^{(t)} &= \frac{\mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T}{\frac{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} }{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}}
\end{aligned}
$$
Then

$$
\begin{aligned}
&[\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T + \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T]^{-1} \\
=&[ (\mathbf{B}^{(t)})^{-1} - \mathbf{C}^{(t)}] + [(\mathbf{B}^{(t)})^{-1} - \mathbf{C}^{(t)}] \mathbf{D}^{(t)} [(\mathbf{B}^{(t)})^{-1} - \mathbf{C}^{(t)}] \\
=&[ (\mathbf{B}^{(t)})^{-1} - \mathbf{C}^{(t)}] + (\mathbf{B}^{(t)})^{-1} \mathbf{D}^{(t)} (\mathbf{B}^{(t)})^{-1} - (\mathbf{B}^{(t)})^{-1} \mathbf{D}^{(t)} \mathbf{C}^{(t)} - \mathbf{C}^{(t)} \mathbf{D}^{(t)} (\mathbf{B}^{(t)})^{-1} + \mathbf{C}^{(t)} \mathbf{D}^{(t)} \mathbf{C}^{(t)}
\end{aligned}
$$
We derive each product in turn:

$$
\begin{aligned}
& (\mathbf{B}^{(t)})^{-1} \mathbf{D}^{(t)} (\mathbf{B}^{(t)})^{-1} \\
=& (\mathbf{B}^{(t)})^{-1} \frac{\mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T}{\frac{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} }{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}} (\mathbf{B}^{(t)})^{-1} \\
=& \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T}{\frac{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} }{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}} \\
=& \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T (1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)})}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} \\
=& \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T [(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)}]}{[(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}]^2}
\end{aligned}
$$
$$
\begin{aligned}
& (\mathbf{B}^{(t)})^{-1} \mathbf{D}^{(t)} \mathbf{C}^{(t)} \\
=& (\mathbf{B}^{(t)})^{-1} \cdot \frac{\mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T}{\frac{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} }{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}} \cdot \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}} \\
=& \frac{\alpha \mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} \\
=& \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{[(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}]^2} \\
=& \frac{\mathbf{s}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}}
\end{aligned}
$$

$$
\begin{aligned}
& \mathbf{C}^{(t)} \mathbf{D}^{(t)} (\mathbf{B}^{(t)})^{-1} \\
=& \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}} \cdot \frac{\mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T}{\frac{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} }{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}} (\mathbf{B}^{(t)})^{-1} \\
=& \frac{\alpha (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T \mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} \\
=& \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T \mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T}{[(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}]^2} \\
=& \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}}
\end{aligned}
$$
$$
\begin{aligned}
& \mathbf{C}^{(t)} \mathbf{D}^{(t)} \mathbf{C}^{(t)} \\
=& \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}} \cdot \frac{\mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T}{\frac{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} }{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}} \cdot \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}} \\
=& \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T \mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{y}^{(t)})^T \mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} [(\mathbf{y}^{(t)})^T \mathbf{s}^{(t)} + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)}]} \\
=& \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{y}^{(t)})^T \mathbf{s}^{(t)} + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)}}
\end{aligned}
$$
Substituting these back into the original expression, and writing $\mathbf{I}$ for the identity matrix:

$$
\begin{aligned}
& [\mathbf{B}^{(t)} + \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T + \beta \mathbf{B}^{(t)} \mathbf{s}^{(t)} (\mathbf{B}^{(t)} \mathbf{s}^{(t)})^T]^{-1} \\
= & [ (\mathbf{B}^{(t)})^{-1} - \mathbf{C}^{(t)}] + (\mathbf{B}^{(t)})^{-1} \mathbf{D}^{(t)} (\mathbf{B}^{(t)})^{-1} - (\mathbf{B}^{(t)})^{-1} \mathbf{D}^{(t)} \mathbf{C}^{(t)} - \mathbf{C}^{(t)} \mathbf{D}^{(t)} (\mathbf{B}^{(t)})^{-1} + \mathbf{C}^{(t)} \mathbf{D}^{(t)} \mathbf{C}^{(t)} \\
= & [ (\mathbf{B}^{(t)})^{-1} - \frac{(\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{1 + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \alpha \mathbf{y}^{(t)}}] + \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T [(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)}]}{[(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}]^2} \\
& - \frac{\mathbf{s}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} - \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} + \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{y}^{(t)})^T \mathbf{s}^{(t)} + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)}} \\
= & [ (\mathbf{B}^{(t)})^{-1} - \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{y}^{(t)})^T \mathbf{s}^{(t)} + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)}}] + \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T [(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)}]}{[(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}]^2} \\
& - \frac{\mathbf{s}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} - \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} + \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{y}^{(t)})^T \mathbf{s}^{(t)} + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)}} \\
= & (\mathbf{B}^{(t)})^{-1} + \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T [(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)} + (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)}]}{[(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}]^2} \\
& - \frac{\mathbf{s}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} - \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} \\
= & (\mathbf{B}^{(t)})^{-1} + \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T [(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}]}{[(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}]^2} + \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T [(\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)}]}{[(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}]^2} \\
& - \frac{\mathbf{s}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} - \frac{(\mathbf{B}^{(t)})^{-1} \mathbf{y}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} \\
= & (\mathbf{B}^{(t)})^{-1} [\mathbf{I} - \frac{\mathbf{y}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}}] + \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} - \frac{\mathbf{s}^{(t)} (\mathbf{y}^{(t)})^T (\mathbf{B}^{(t)})^{-1}}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}} [\mathbf{I} - \frac{\mathbf{y}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}}] \\
= & [\mathbf{I} - \frac{\mathbf{s}^{(t)} (\mathbf{y}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}}] (\mathbf{B}^{(t)})^{-1} [\mathbf{I} - \frac{\mathbf{y}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}}] + \frac{\mathbf{s}^{(t)} (\mathbf{s}^{(t)})^T}{(\mathbf{s}^{(t)})^T \mathbf{y}^{(t)}}
\end{aligned}
$$
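To cross-check the final closed form, the sketch below computes the updated inverse two ways on a small example: by two successive rank-one Sherman-Morrison updates (the route taken in this derivation) and by the closed-form expression. All helper names and the test data `B`, `H`, `s`, `y` are illustrative assumptions; exact rationals are used so the comparison needs no tolerance.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def add_scaled(X, c, Y):
    """Return X + c * Y for matrices stored as nested lists."""
    return [[x + c * yv for x, yv in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sherman_morrison(A_inv, p, q):
    """Return (A + p q^T)^{-1} given A^{-1}."""
    n = len(p)
    Ainv_p = matvec(A_inv, p)
    qT_Ainv = [sum(q[k] * A_inv[k][j] for k in range(n)) for j in range(n)]
    denom = 1 + dot(q, Ainv_p)
    return add_scaled(A_inv, -1 / denom, outer(Ainv_p, qT_Ainv))

# Assumed test data: symmetric B with known inverse H, and s^T y > 0
B = [[F(2), F(1)], [F(1), F(1)]]
H = [[F(1), F(-1)], [F(-1), F(2)]]
s = [F(1), F(2)]
y = [F(3), F(1)]

Bs = matvec(B, s)
alpha = 1 / dot(y, s)
beta = -1 / dot(s, Bs)
rho = 1 / dot(s, y)

# Updated matrix: B + alpha y y^T + beta (Bs)(Bs)^T
B_new = add_scaled(add_scaled(B, alpha, outer(y, y)), beta, outer(Bs, Bs))

# Route 1: two successive rank-one inverse updates
step1 = sherman_morrison(H, [alpha * yi for yi in y], y)
H_two_step = sherman_morrison(step1, [beta * b for b in Bs], Bs)

# Route 2: closed form (I - rho s y^T) H (I - rho y s^T) + rho s s^T
I = [[F(int(i == j)) for j in range(2)] for i in range(2)]
left = add_scaled(I, -rho, outer(s, y))
right = add_scaled(I, -rho, outer(y, s))
H_closed = add_scaled(matmul(matmul(left, H), right), rho, outer(s, s))
```

The closed form needs only matrix-vector and outer products with the previous inverse, which is why implementations typically store and update the inverse directly instead of inverting $\mathbf{B}^{(t+1)}$ at each iteration.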
Derivation reference: https://liuxiaofei.com.cn/blog/lbfgs%E6%96%B9%E6%B3%95%E6%8E%A8%E5%AF%BC/
Proof of the Sherman-Morrison formula: https://www.cnblogs.com/jclian91/p/9132834.html