Given the training sample $\lbrace x_i, d_i\rbrace_{i=1}^{N}$, the regularized cost function of the least-squares estimator is defined by:
$$
\varepsilon(w) = \frac{1}{2} \sum_{i=1}^{N} (d_i - w^T X_i)^2 + \frac{1}{2} \lambda \|w\|^2
$$
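As a minimal sketch, the cost function can be evaluated with NumPy. The function name `regularized_cost`, the toy data, and the convention that each row of `X` holds one sample $X_i$ are assumptions made for illustration only.

```python
import numpy as np

def regularized_cost(w, X, d, lam):
    """Return 0.5 * sum_i (d_i - w^T x_i)^2 + 0.5 * lam * ||w||^2."""
    residuals = d - X @ w              # one residual d_i - w^T x_i per sample
    return 0.5 * residuals @ residuals + 0.5 * lam * (w @ w)

# Hypothetical toy data: 3 samples with 2 features each (one sample per row).
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
d = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0])               # fits the data exactly, so only the penalty remains

cost = regularized_cost(w, X, d, lam=0.1)
print(cost)
```

Because this `w` reproduces every $d_i$, the data term vanishes and the cost is just the penalty $\frac{1}{2}\lambda\|w\|^2$.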
The regularization term is defined simply in terms of the weight vector, written $w$ above and $W$ from here on:

$$
\|DF\|^2 = \|W\|^2 = W^T W
$$
The regularized solution for the weight vector, $\hat{W}$, expressed through the desired responses $d_i$, is:

$$
\hat{W} = (R_{xx} + \lambda I)^{-1} r_{dx}
$$
where

$$
R_{xx} = \sum_{i=1}^{N} X_i X_i^T
$$
$$
r_{dx} = \sum_{i=1}^{N} X_i d_i
$$
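For readers who want the intermediate step, here is a sketch of why the solution takes this form, assuming $R_{xx}$ denotes the sum of the outer products $X_i X_i^T$. Setting the gradient of the cost function with respect to $w$ to zero gives:

$$
\nabla_w \varepsilon(w) = -\sum_{i=1}^{N} X_i \,(d_i - X_i^T w) + \lambda w = 0
\quad\Longrightarrow\quad
\Big(\sum_{i=1}^{N} X_i X_i^T + \lambda I\Big)\, w = \sum_{i=1}^{N} X_i d_i
$$

Solving this linear system for $w$ yields $\hat{W}$. For $\lambda > 0$ the matrix on the left is positive definite, so the inverse always exists.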
Restating $\hat{W}$ in terms of the training sample $\lbrace x_i, d_i\rbrace_{i=1}^{N}$, we have:

$$
\hat{W} = (X^T X + \lambda I)^{-1} X^T d
$$
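where $X$ is the input data matrix, whose $i$-th row is $X_i^T$, and $d$ is the vector of desired responses.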
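The following sketch checks numerically that the summation form and the matrix form give the same weights. It takes $R_{xx}$ as the sum of the outer products $X_i X_i^T$, which is what makes the two forms agree. The synthetic data, the variable names, and the use of `np.linalg.solve` in place of an explicit inverse are illustrative choices, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 50, 3                                   # hypothetical sample count and input dimension
X = rng.normal(size=(N, m))                    # row i holds the sample X_i
d = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=N)
lam = 0.5                                      # regularization parameter lambda

# Summation form: R_xx = sum_i X_i X_i^T, r_dx = sum_i X_i d_i
R_xx = sum(np.outer(x_i, x_i) for x_i in X)
r_dx = sum(x_i * d_i for x_i, d_i in zip(X, d))
w_sum = np.linalg.solve(R_xx + lam * np.eye(m), r_dx)

# Matrix form: (X^T X + lambda I)^{-1} X^T d
w_mat = np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ d)

agree = np.allclose(w_sum, w_mat)
print(agree)
```

Solving the linear system directly is generally preferable to forming the inverse, since it is cheaper and numerically more stable.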
Viewing the least-squares estimator as a "kernel machine", we express its kernel in inner-product form:

$$
k(X, X_i) = \langle X, X_i \rangle = X^T X_i, \quad i = 1, 2, \ldots, N
$$
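A small sketch of this linear kernel, evaluated on every pair of training inputs to form the kernel (Gram) matrix. The helper name `linear_kernel` and the toy inputs are hypothetical.

```python
import numpy as np

def linear_kernel(x, x_i):
    """Inner-product kernel k(x, x_i) = x^T x_i."""
    return float(x @ x_i)

# Hypothetical training inputs, one sample per row.
X_train = np.array([[1.0, 2.0],
                    [3.0, 4.0],
                    [5.0, 6.0]])

# Gram matrix with entries K[i, j] = k(X_i, X_j)
K = np.array([[linear_kernel(a, b) for b in X_train] for a in X_train])

# For the linear kernel this is the same as the matrix product X X^T.
same_as_matrix_product = np.allclose(K, X_train @ X_train.T)
print(K)
```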
The approximating function realized by the regularized least-squares estimator is then defined as:

$$
F_{\lambda}(X) = \sum_{i=1}^{N} a_i \, k(X, X_i)
$$
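The text does not state how the coefficients $a_i$ are obtained. The sketch below assumes the standard dual form of ridge regression, $a = (K + \lambda I)^{-1} d$ with $K = X X^T$, and checks that the kernel expansion then reproduces the prediction $\hat{W}^T X$ of the weight-space solution. All names and data here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
N, m = 20, 4                                   # hypothetical sample count and input dimension
X = rng.normal(size=(N, m))                    # row i holds the training input X_i
d = rng.normal(size=N)                         # desired responses
lam = 0.3

# Assumed dual coefficients: a = (K + lambda I)^{-1} d, with K[i, j] = X_i^T X_j
K = X @ X.T
a = np.linalg.solve(K + lam * np.eye(N), d)

def F_lambda(x):
    """Kernel expansion F(x) = sum_i a_i * k(x, X_i) with the linear kernel."""
    return float(sum(a_i * (x @ x_i) for a_i, x_i in zip(a, X)))

# Weight-space solution for comparison: W_hat = (X^T X + lambda I)^{-1} X^T d
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ d)

x_new = rng.normal(size=m)                     # an arbitrary test input
primal = float(w_hat @ x_new)
dual = F_lambda(x_new)
print(abs(primal - dual))
```

The two predictions agree because $X^T (X X^T + \lambda I)^{-1} = (X^T X + \lambda I)^{-1} X^T$. The kernel form works with an $N \times N$ system instead of an $m \times m$ one, which is the price paid for only ever touching the inputs through inner products.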