Regularization
The problem of overfitting
- Underfitting -> high bias
- Overfitting -> high variance
- Overfitting: if we have too many features, the learned hypothesis may fit the training set very well, but fail to generalize to new examples (e.g. predicting prices on new examples); that is, the model generalizes poorly.
Addressing overfitting
Options:
1) Reduce the number of features
- Manually select which features to keep
- Model selection algorithm
2) Regularization
- Keep all the features, but reduce the magnitude/values of the parameters.
- Works well when we have a lot of features, each of which contributes a bit to predicting y.
Cost function (regularized cost function)
Penalizing two of the parameter values for being large drives those parameters toward zero and effectively simplifies the hypothesis.
Adding a penalty term on $\theta_j$: in regularized linear regression, we choose $\theta$ to minimize the regularized cost.
The regularized linear regression cost function:

J(\theta)=\frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2+\lambda\sum_{j=1}^{n}\theta_j^2\right]

Note that the penalty sum runs over $j=1,\dots,n$ (the features); by convention the intercept $\theta_0$ is not penalized.
Goal:

\underset{\theta}{\min}\,J(\theta)
$\lambda$: the regularization parameter.
- What happens if $\lambda$ is very large? All the parameters $\theta_1,\dots,\theta_n$ are penalized heavily toward zero, so $h_\theta(x)\approx\theta_0$ (a flat line) and the model underfits (high bias).
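As a concrete illustration, here is a minimal NumPy sketch of this cost function. The function name `regularized_cost` and the layout of `X` (an m×(n+1) design matrix with a leading column of ones) are my own assumptions; the formula is the one above, with $\theta_0$ excluded from the penalty.

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized linear-regression cost J(theta).

    X is m x (n+1) with a leading column of ones; theta has n+1 entries.
    theta[0] (the intercept) is not regularized.
    """
    m = len(y)
    residual = X @ theta - y                            # h_theta(x^(i)) - y^(i)
    fit_term = (residual @ residual) / (2 * m)          # (1/2m) * sum of squared errors
    reg_term = lam * (theta[1:] @ theta[1:]) / (2 * m)  # penalty skips theta_0
    return fit_term + reg_term
```

With `lam = 0` this reduces to the ordinary least-squares cost; increasing `lam` raises the cost of large parameter values.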
Regularized linear regression
Gradient descent
The gradient descent algorithm:
repeat:
\theta_0 := \theta_0-\alpha\frac{1}{m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)x_0^{(i)}
\theta_j := \theta_j-\alpha\frac{1}{m}\left[\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}+\lambda\theta_j\right]
Equivalently:
\theta_0 := \theta_0-\alpha\frac{1}{m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)x_0^{(i)}
θ
j
:
=
θ
j
(
1
−
α
1
m
)
−
α
1
m
∑
i
=
1
m
(
h
θ
(
x
(
i
)
)
−
y
(
i
)
)
x
j
(
i
)
\theta_j:= \theta_j(1-\alpha\frac{1}{m})-\alpha\frac{1}{m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)}
θj:=θj(1−αm1)−αm1i=1∑m(hθ(x(i))−y(i))xj(i)
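A minimal NumPy sketch of one such update step (the function name `gradient_descent_step` is my own; it implements the shrink-then-subtract form, leaving the intercept unshrunk):

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent step:
    theta_j := theta_j * (1 - alpha*lam/m) - alpha * (1/m) * sum(...) * x_j^(i),
    with theta_0 excluded from the shrink factor."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m                 # unregularized gradient, all j
    shrink = np.full_like(theta, 1 - alpha * lam / m)
    shrink[0] = 1.0                                  # theta_0 is not regularized
    return theta * shrink - alpha * grad
```

Iterating this step from `theta = 0` converges to the minimizer of the regularized cost for a suitably small learning rate `alpha`.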
Normal equation
The normal equation. Suppose:

m\leq n\ (\text{examples}\leq\text{features})
\theta=(X^TX)^{-1}X^Ty
If $\lambda>0$:

\theta=\left(X^TX+\lambda\underbrace{\begin{bmatrix} 0 \\ & 1 \\ && 1 \\ &&& \ddots \\ &&&& 1 \end{bmatrix}}_{(n+1)\times(n+1)}\right)^{-1}X^Ty
As long as $\lambda>0$, the matrix in parentheses is guaranteed to be non-singular, i.e., invertible (even when $m\leq n$).
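A minimal NumPy sketch of this closed-form solution (the function name `regularized_normal_equation` is my own; the matrix `L` is the identity with its (0,0) entry zeroed, matching the formula above):

```python
import numpy as np

def regularized_normal_equation(X, y, lam):
    """Closed-form theta = (X'X + lam*L)^(-1) X'y, where L is the
    (n+1)x(n+1) identity with the (0,0) entry zeroed so theta_0
    is not penalized."""
    n_plus_1 = X.shape[1]
    L = np.eye(n_plus_1)
    L[0, 0] = 0.0                                  # do not regularize the intercept
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```

Using `np.linalg.solve` rather than an explicit matrix inverse is the standard numerically preferable choice for this kind of linear system.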
Regularized logistic regression
The regularized logistic regression cost function:
J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right)+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2

As before, the penalty sums over $j=1,\dots,n$ and excludes $\theta_0$.
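A minimal NumPy sketch of this cost (the names `sigmoid` and `regularized_logistic_cost` are my own; `X` again carries a leading column of ones, and $\theta_0$ is excluded from the penalty):

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def regularized_logistic_cost(theta, X, y, lam):
    """Regularized logistic-regression cost from the formula above."""
    m = len(y)
    h = sigmoid(X @ theta)                                   # h_theta(x^(i)) for all i
    cross_entropy = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    penalty = lam * (theta[1:] @ theta[1:]) / (2 * m)        # skips theta_0
    return cross_entropy + penalty
```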