Study Notes on Hsuan-Tien Lin's "Machine Learning Foundations" (National Taiwan University), Lecture 14: Regularization

Latest recommended article published on 2020-05-27 01:54:48

1021stones



Article link: https://blog.csdn.net/Stoneeeee/article/details/82930761


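This article explains regularization, a technique for addressing overfitting in machine learning. By constraining the optimization problem over the objective function, for example with L1 or L2 regularization, overfitting can be avoided. Regularization helps reduce model complexity and improve generalization, and it is connected to VC-dimension theory. The choice of λ is crucial: it affects both how closely the model fits the data and the generalization error.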


The previous lecture covered the causes of overfitting (excessive power, stochastic/deterministic noise, and limited data) along with some simple remedies. This lecture introduces another very important remedy for overfitting: regularization.

In general, regularization is a technique that applies to objective functions in ill-posed optimization problems.
The mathematical term well-posed problem stems from a definition given by Jacques Hadamard. He believed that mathematical models of physical phenomena should have the properties that:
1. a solution exists,
2. the solution is unique,
3. the solution's behavior changes continuously with the initial conditions.
Problems that are not well-posed in the sense of Hadamard are termed ill-posed problems.
(from Wikipedia)

1. Regularized Hypothesis Set
[Figure: regularized fit (left) and overfit (right); target function in blue, fitted hypothesis in red]

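As the figure on the right shows, when the data set is not large enough, fitting the target function (blue curve) with a high-order polynomial (red curve), say of order 10, produces a curve that oscillates wildly. Ein is very small but Eout is very large, which is exactly overfitting.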
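The contrast between a small Ein and a large Eout can be reproduced with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the lecture: the target function `cos(2x)`, the noise level, the sample size of 15, and the random seed.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Hypothetical smooth target, chosen only for illustration.
    return np.cos(2 * x)

# A small, noisy training set and a dense noise-free test grid on [-1, 1].
x_train = rng.uniform(-1, 1, size=15)
y_train = target(x_train) + rng.normal(0, 0.3, size=15)
x_test = np.linspace(-1, 1, 1000)
y_test = target(x_test)

def fit_and_score(degree):
    # Least-squares polynomial fit of the given degree.
    coeffs = np.polyfit(x_train, y_train, degree)
    e_in = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    e_out = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return e_in, e_out

e_in_2, e_out_2 = fit_and_score(2)
e_in_10, e_out_10 = fit_and_score(10)
print(f"degree  2: E_in = {e_in_2:.4f}, E_out = {e_out_2:.4f}")
print(f"degree 10: E_in = {e_in_10:.4f}, E_out = {e_out_10:.4f}")
```

Because the order-2 polynomials are contained in the order-10 polynomials, the order-10 fit can never have a larger Ein. Its Eout is usually worse on a sample this small, though the exact numbers depend on the seed and the assumed target.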
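How can we correct this overfitting so that the hypothesis gets closer to the target function? One approach is the regularized fit shown in the figure on the left.
The red curve on the left is smoother than the one on the right because it uses a lower-order polynomial, say of order 2. So how do we turn a 10th-order hypothesis set into a 2nd-order one?
Hypothesis sets of different orders are nested. If we impose constraints on the weight vector w of $H_{10}$, for example requiring $w_3=w_4=\dots=w_{10}=0$, then $H_{10}$ reduces to $H_2$. We call this operation a constraint.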
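Written out as optimization problems, the constraint view looks roughly as follows. This is a sketch that assumes the weight vector has 11 components $w_0,\dots,w_{10}$ and writes the in-sample error as $E_{in}(w)$:

$$
\text{fitting in } H_{10}: \quad \min_{w \in \mathbb{R}^{11}} E_{in}(w)
$$

$$
\text{fitting in } H_{2}: \quad \min_{w \in \mathbb{R}^{11}} E_{in}(w) \quad \text{s.t.} \quad w_3 = w_4 = \dots = w_{10} = 0
$$

The two problems share the same objective and the same variable; only the feasible set differs, which is why $H_2$ can be seen as a constrained version of $H_{10}$.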
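But why not simply use $H_2$ directly?
The figure above relaxes that constraint: instead of requiring specific weights $w_q=0$, it only requires that at least 8 of the components of w equal 0. This yields a new hypothesis set $H_2'$, and we call this operation a looser constraint.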
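With 11 weights $w_0,\dots,w_{10}$, "at least 8 zeros" is the same as "at most 3 nonzero weights", i.e. $\sum_{q=0}^{10} [w_q \neq 0] \le 3$. The sketch below compares $H_2$ with $H_2'$ by brute force. The exhaustive search over all 165 three-element subsets is an illustrative assumption rather than the lecture's method, and the data, target, and seed are made up.

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=15)
# Hypothetical noisy data, chosen only for illustration.
y = np.cos(2 * x) + rng.normal(0, 0.3, size=15)

# Design matrix for H_10: columns are x^0, x^1, ..., x^10.
Z = np.vander(x, 11, increasing=True)

def fit_on(columns):
    # Least squares using only the given columns; all other weights stay 0.
    cols = list(columns)
    w = np.zeros(Z.shape[1])
    w[cols] = np.linalg.lstsq(Z[:, cols], y, rcond=None)[0]
    e_in = np.mean((Z @ w - y) ** 2)
    return w, e_in

# H_2: only w_0, w_1, w_2 may be nonzero.
w_h2, e_in_h2 = fit_on([0, 1, 2])

# H_2': any 3 of the 11 weights may be nonzero (at least 8 zeros).
w_sparse, e_in_sparse = min(
    (fit_on(c) for c in combinations(range(11), 3)),
    key=lambda pair: pair[1],
)

print("H_2  E_in:", e_in_h2)
print("H_2' E_in:", e_in_sparse, "nonzero indices:", np.flatnonzero(w_sparse))
```

Since the subset {0, 1, 2} is one of the candidates, the best fit in $H_2'$ can never have a larger Ein than the fit in $H_2$. Note that this kind of exhaustive search grows combinatorially with the number of weights.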
This has two benefits: on the one hand, $H_2'$ is more flexible than $H_2$; on the other hand, $H_2'$