A derivation of (7.23) from (7.22) (rigor not guaranteed).
Given: $H_{i,i}>0$, $\alpha>0$, and $w_i$ lies in a neighborhood $U(w_i^*,\delta)$ of $w_i^*$, where $\delta$ is small enough that $\mathrm{sign}(w_i)=\mathrm{sign}(w_i^*)$.
Proof:
Let
$$f_i(w_i)=\dfrac{1}{2}H_{i,i}(w_i-w_i^*)^2+\alpha|w_i|$$
Then
$$\hat{J}(w;X,y)=J(w^*;X,y)+\sum_i f_i(w_i) \qquad(1)$$
Setting
$$\frac{\partial \hat{J}}{\partial w_i}=\frac{\partial f_i}{\partial w_i}=0$$
gives
$$H_{i,i}(w_i-w_i^*)+\alpha\,\mathrm{sign}(w_i)=0\qquad(2)$$
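As a quick sanity check on the left-hand side of (2), the sketch below compares it with a central finite difference of $f_i$ away from $w_i=0$. The function names (`f_i`, `df_i`, `central_difference`) and the numeric values of `h`, `alpha`, and `w_star` are illustrative assumptions, not taken from the text.

```python
def f_i(w, w_star, h, alpha):
    """Per-coordinate objective: 0.5 * h * (w - w_star)**2 + alpha * |w|."""
    return 0.5 * h * (w - w_star) ** 2 + alpha * abs(w)


def df_i(w, w_star, h, alpha):
    """Derivative of f_i for w != 0, i.e. the left-hand side of (2)."""
    if w == 0:
        raise ValueError("f_i is not differentiable at w = 0")
    sign = 1.0 if w > 0 else -1.0
    return h * (w - w_star) + alpha * sign


def central_difference(w, w_star, h, alpha, eps=1e-6):
    """Numerical derivative of f_i, used only as a cross-check."""
    upper = f_i(w + eps, w_star, h, alpha)
    lower = f_i(w - eps, w_star, h, alpha)
    return (upper - lower) / (2 * eps)


# Illustrative values (assumptions): H_ii = 2, alpha = 1, w_i^* = -2.
h, alpha, w_star = 2.0, 1.0, -2.0
for w in (-3.0, -1.5, -0.5, 0.5, 1.0):
    analytic = df_i(w, w_star, h, alpha)
    numeric = central_difference(w, w_star, h, alpha)
    assert abs(analytic - numeric) < 1e-5
```

With these values the derivative vanishes at `w = -1.5`, which is the stationary point on the negative half-axis.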
(i) When $w_i<0$, $\mathrm{sign}(w_i)=-1$. Substituting into (2) and solving gives
$$w_i=\frac{H_{i,i}w^*_i+\alpha}{H_{i,i}}=w^*_i+\frac{\alpha}{H_{i,i}}=\mathrm{sign}(w_i^*)\left(|w_i^*|-\frac{\alpha}{H_{i,i}}\right) \qquad(3)$$
Since $w_i<0$, this requires $w_i^*<-\dfrac{\alpha}{H_{i,i}}$, that is, $|w_i^*|>\dfrac{\alpha}{H_{i,i}}$.
If instead $-\dfrac{\alpha}{H_{i,i}}<w_i^*<0$, that is, $|w_i^*|<\dfrac{\alpha}{H_{i,i}}$, then for every $w_i<0$
$$\frac{\partial f_i}{\partial w_i}=H_{i,i}w_i-H_{i,i}w_i^*-\alpha<-H_{i,i}w_i^*-\alpha<0$$
so $f_i$ has no stationary point on the negative half-axis.
(ii) When $w_i>0$, $\mathrm{sign}(w_i)=1$. Substituting into (2) and solving gives
$$w_i=\frac{H_{i,i}w^*_i-\alpha}{H_{i,i}}=w^*_i-\frac{\alpha}{H_{i,i}}=\mathrm{sign}(w_i^*)\left(|w_i^*|-\frac{\alpha}{H_{i,i}}\right)\qquad(4)$$
Since $w_i>0$, this requires $w_i^*>\dfrac{\alpha}{H_{i,i}}$, that is, $|w_i^*|>\dfrac{\alpha}{H_{i,i}}$.
If instead $0<w_i^*<\dfrac{\alpha}{H_{i,i}}$, that is, $|w_i^*|<\dfrac{\alpha}{H_{i,i}}$, then for every $w_i>0$
$$\frac{\partial f_i}{\partial w_i}=H_{i,i}w_i-H_{i,i}w_i^*+\alpha>-H_{i,i}w_i^*+\alpha>0$$
so $f_i$ has no stationary point on the positive half-axis.
From (i) and (ii), when $|w_i^*|\ge\dfrac{\alpha}{H_{i,i}}$ the closed-form solution is
$$w_i=\mathrm{sign}(w_i^*)\left(|w_i^*|-\dfrac{\alpha}{H_{i,i}}\right)$$
When $|w_i^*|\le\dfrac{\alpha}{H_{i,i}}$, we have
$$\frac{\partial f_i}{\partial w_i}\; \begin{cases} <-H_{i,i} w_i^*-\alpha\le 0 & w_i<0 \\ >-H_{i,i} w_i^*+\alpha\ge 0 & w_i>0 \end{cases}$$
Hence $f_i$ is strictly decreasing on the negative half-axis and strictly increasing on the positive half-axis, so $f_i$ attains its minimum at $w_i=0$.
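The sign pattern of the derivative in the small-$|w_i^*|$ case can be checked by sampling. This is only a minimal sketch: `df_i`, the sample grid, and the values `h = 2`, `alpha = 1` (so the threshold $\alpha/H_{i,i}$ is 0.5) are assumptions chosen for illustration.

```python
def df_i(w, w_star, h, alpha):
    """Derivative of f_i at w != 0: h * (w - w_star) + alpha * sign(w)."""
    sign = 1.0 if w > 0 else -1.0
    return h * (w - w_star) + alpha * sign


# Illustrative values (assumptions): H_ii = 2, alpha = 1, threshold 0.5.
h, alpha = 2.0, 1.0

# Positive sample points 0.01, 0.02, ..., 3.00; negated for the left half-axis.
samples = [k / 100 for k in range(1, 301)]

for w_star in (-0.5, -0.3, 0.0, 0.2, 0.5):
    # Negative derivative everywhere on w < 0: f_i is decreasing there.
    assert all(df_i(-w, w_star, h, alpha) < 0 for w in samples)
    # Positive derivative everywhere on w > 0: f_i is increasing there.
    assert all(df_i(w, w_star, h, alpha) > 0 for w in samples)
```

Every tested `w_star` satisfies $|w_i^*|\le 0.5$, so the derivative never changes sign within either half-axis and the minimizer sits at zero.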
In summary,
$$w_i=\mathrm{sign}(w_i^*)\max\left\{|w_i^*|-\dfrac{\alpha}{H_{i,i}},\,0\right\}$$
Q.E.D.
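The final formula is the soft-thresholding rule, and it can be compared against a brute-force minimization of $f_i$. The sketch below assumes illustrative values for `h` and `alpha`; the helper names `soft_threshold` and `argmin_on_grid` are hypothetical, and the grid search is only a numerical cross-check, not part of the derivation.

```python
import math


def f_i(w, w_star, h, alpha):
    """Per-coordinate objective: 0.5 * h * (w - w_star)**2 + alpha * |w|."""
    return 0.5 * h * (w - w_star) ** 2 + alpha * abs(w)


def soft_threshold(w_star, h, alpha):
    """Closed form: sign(w_star) * max(|w_star| - alpha / h, 0)."""
    magnitude = max(abs(w_star) - alpha / h, 0.0)
    return math.copysign(magnitude, w_star)


def argmin_on_grid(w_star, h, alpha, lo=-3.0, hi=3.0, steps=6001):
    """Brute-force minimizer of f_i over an evenly spaced grid (step 0.001)."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return min(grid, key=lambda w: f_i(w, w_star, h, alpha))


# Illustrative values (assumptions): H_ii = 2, alpha = 1, threshold 0.5.
h, alpha = 2.0, 1.0
for w_star in (-2.0, -0.4, 0.0, 0.3, 1.25):
    closed_form = soft_threshold(w_star, h, alpha)
    numeric = argmin_on_grid(w_star, h, alpha)
    # The grid minimizer should agree to within one grid step.
    assert abs(closed_form - numeric) < 1e-3
```

Values of `w_star` inside the threshold map to zero, while those outside are shrunk toward zero by `alpha / h`.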