In regression, to avoid overfitting, a regularization term is added to the cost function; this technique is called regularization.
Regularization in Linear Regression
Cost function
$$J(\theta)=\frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2+\lambda\sum_{j=1}^{n}\theta_j^2\right]$$
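As a quick sketch (not part of the original notes), the regularized cost above can be evaluated with NumPy. The function name is my own, and the penalty excludes $\theta_0$, consistent with the sum starting at $j=1$:

```python
import numpy as np

def linear_cost(theta, X, y, lam):
    """Regularized linear-regression cost J(theta).

    X is the m x (n+1) design matrix with a leading column of ones;
    theta_0 is excluded from the penalty, matching the sum from j = 1.
    """
    m = len(y)
    residual = X @ theta - y                           # h_theta(x^(i)) - y^(i)
    fit = (residual @ residual) / (2 * m)              # squared-error term
    penalty = lam * (theta[1:] @ theta[1:]) / (2 * m)  # (lambda / 2m) * sum theta_j^2
    return fit + penalty
```

With $\lambda = 0$ this reduces to the ordinary least-squares cost.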
Solution methods
- Gradient descent: iterate

$$\theta = \theta\left(1-\alpha\frac{\lambda}{m}\right)-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x^{(i)}$$

to determine the parameters $\theta_1,\theta_2,\dots,\theta_n$.
- Normal equation: use the closed-form solution

$$\theta=\left(X^TX+\lambda\begin{bmatrix}0&&&&\\&1&&&\\&&1&&\\&&&\ddots&\\&&&&1\end{bmatrix}\right)^{-1}X^Ty$$

to determine the parameters $\theta_1,\theta_2,\dots,\theta_n$.
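Both solvers can be sketched in a few lines of NumPy. These are illustrative implementations, not from the original notes; following the zero in the top-left of the regularization matrix above, $\theta_0$ is left unregularized in both:

```python
import numpy as np

def gradient_descent(X, y, lam, alpha=0.1, iters=1000):
    """Iterate theta <- theta * (1 - alpha*lam/m) - (alpha/m) * X^T (X theta - y).

    The shrink factor for theta_0 is set to 1 so the bias is not penalized.
    """
    m, n1 = X.shape
    theta = np.zeros(n1)
    shrink = np.full(n1, 1 - alpha * lam / m)
    shrink[0] = 1.0                      # do not shrink the bias term
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / m
        theta = shrink * theta - alpha * grad
    return theta

def normal_equation(X, y, lam):
    """Closed form: theta = (X^T X + lam * D)^(-1) X^T y, D = diag(0, 1, ..., 1)."""
    n1 = X.shape[1]
    D = np.eye(n1)
    D[0, 0] = 0.0                        # zero in the top-left, as in the matrix above
    return np.linalg.solve(X.T @ X + lam * D, X.T @ y)
```

Run long enough, gradient descent converges to the same $\theta$ the normal equation returns; the closed form is preferable when $n$ is small, since solving the $(n+1)\times(n+1)$ system costs $O(n^3)$.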
Regularization in Logistic Regression
Cost function
$$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)})+(1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right]+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$
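As with the linear case, this cost is easy to sketch in NumPy (an illustration, not from the original notes; the function names are my own and $\theta_0$ is again excluded from the penalty):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y, lam):
    """Regularized logistic-regression cost: cross-entropy plus
    (lambda / 2m) * sum_{j>=1} theta_j^2, matching the formula above."""
    m = len(y)
    h = sigmoid(X @ theta)                             # h_theta(x^(i)) for every example
    ce = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    penalty = lam * (theta[1:] @ theta[1:]) / (2 * m)
    return ce + penalty
```

At $\theta = 0$ every prediction is $0.5$, so the cost is $\log 2 \approx 0.693$ regardless of the labels.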
Solution methods
- Gradient descent: iterate

$$\theta = \theta\left(1-\alpha\frac{\lambda}{m}\right)-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x^{(i)}$$

to determine the parameters $\theta_1,\theta_2,\dots,\theta_n$. The update rule is formally identical to the linear case; only the hypothesis changes, to $h_\theta(x)=\frac{1}{1+e^{-\theta^Tx}}$.
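The update above can be sketched by swapping the sigmoid hypothesis into the linear-regression loop (again an illustrative sketch; the name is my own and $\theta_0$ is left unregularized, as with the linear solvers):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_gd(X, y, lam, alpha=0.5, iters=5000):
    """Same shrink-then-step update as the linear case, but with
    h_theta(x) = sigmoid(theta^T x); theta_0 is not shrunk."""
    m, n1 = X.shape
    theta = np.zeros(n1)
    shrink = np.full(n1, 1 - alpha * lam / m)
    shrink[0] = 1.0                      # leave the bias term unregularized
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m
        theta = shrink * theta - alpha * grad
    return theta
```

Unlike linear regression, there is no closed-form normal-equation counterpart here: the sigmoid makes the first-order conditions nonlinear in $\theta$, so an iterative method is required.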