Hung-yi Lee (李宏毅) 2020 Machine Learning: Regression

達某

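Copyright notice: This is an original article by the blogger, released under the CC 4.0 BY-SA license. Reposts must include a link to the original source and this notice.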

Article link: https://blog.csdn.net/weixin_42102452/article/details/107775002

Model

Multiple linear regression
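As a minimal sketch of what such a model computes, assuming the usual form y = b + Σ wᵢxᵢ (the function name `predict` and the sample numbers are illustrative, not from the original notes):

```python
def predict(x, w, b):
    """Linear model: y = b + sum_i w_i * x_i."""
    return b + sum(w_i * x_i for w_i, x_i in zip(w, x))

# Two features, two weights, one bias: 3.0 + 0.5 * 1.0 + (-1.0) * 2.0
y_hat = predict([1.0, 2.0], w=[0.5, -1.0], b=3.0)
```

Each choice of (w, b) is one candidate function; training is about choosing among them.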

Loss function

How do we judge whether a function is good or bad? This is what the loss function is for: it lets us pick the best fit. Input: a function; output: how bad it is.
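A loss of this kind might look like the sketch below. It assumes a one-feature model y = w * x + b and squared error averaged over the examples; the averaging, the name `mse_loss`, and the toy data are assumptions made for illustration.

```python
def mse_loss(xs, ys, w, b):
    """Mean squared error of the line y = w * x + b on the data (xs, ys)."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
good = mse_loss(xs, ys, w=2.0, b=0.0)  # fits the data exactly
bad = mse_loss(xs, ys, w=1.0, b=0.0)   # fits worse, so the loss is larger
```

The function (here, the pair w and b) is the input, and a single number saying how bad it is comes out.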

Finding the function that minimizes the loss leads to gradient descent.

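Gradient descent

Gradient: at a given point, the directional derivative of a function is largest along the gradient direction. In other words, the function changes fastest along the gradient direction at that point, and the maximum rate of change equals the norm of the gradient.

The available data are the inputs x and the outputs y. Substituting them into the cost function gives a surface over all possible values of w and b. Gradient descent then searches this surface for a minimum, i.e. the values of w and b at which the cost function is smallest. These values give the best model.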
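The procedure can be sketched for a one-feature model as follows. This is an illustrative sketch, not the lecture's code: it assumes a mean-squared-error cost, a fixed learning rate, and toy data, and the name `fit_line` is hypothetical.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Gradient descent on the mean squared error of y = w * x + b."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the cost with respect to w and b.
        grad_w = sum(-2 * x * (y - (w * x + b)) for x, y in zip(xs, ys)) / n
        grad_b = sum(-2 * (y - (w * x + b)) for x, y in zip(xs, ys)) / n
        # Step against the gradient, the direction of fastest decrease.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data generated from y = 2x + 1.
w, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

On this noise-free data the returned w and b approach 2 and 1.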

Gradient descent with multiple parameters:
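For two parameters w and b, the standard update can be written as below. The superscript t (iteration index) is notation introduced here for illustration; η is the learning rate and L the loss.

```latex
\nabla L =
\begin{bmatrix} \partial L / \partial w \\ \partial L / \partial b \end{bmatrix},
\qquad
\begin{bmatrix} w^{t+1} \\ b^{t+1} \end{bmatrix}
=
\begin{bmatrix} w^{t} \\ b^{t} \end{bmatrix}
- \eta \, \nabla L(w^{t}, b^{t})
```

Every parameter is updated at the same time, each using its own partial derivative evaluated at the current point.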

Note: in linear regression the loss function is convex, so there are no local optima to get stuck in.

Tips 1: Tuning your learning rates. Set the learning rate η carefully.

1. Popular & simple idea: reduce the learning rate by some factor every few epochs (adaptive learning rates).

At the beginning, we are far from the destination, so we use a larger learning rate.

After several epochs, we are close to the destination, so we reduce the learning rate.

E.g. 1/t decay: η^t = η / √(t + 1)
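One common schedule of this kind is 1/t decay, where the rate at step t is η / √(t + 1). The sketch below assumes that schedule; the name `decayed_lr` and the sample values are illustrative.

```python
import math

def decayed_lr(eta, t):
    """1/t decay: learning rate at step t (t = 0, 1, 2, ...)."""
    return eta / math.sqrt(t + 1)

# The rate shrinks as training proceeds: roughly 0.1, 0.05, 0.01.
rates = [decayed_lr(0.1, t) for t in (0, 3, 99)]
```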

2. Learning rate cannot be one-size-fits-all (Adagrad).

Divide the learning rate of each parameter by the root mean square of its previous derivatives.
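A minimal one-parameter sketch of this idea follows. It assumes the rule is combined with 1/t decay, in which case the √(t + 1) factors cancel and the step becomes η divided by the square root of the sum of all squared derivatives so far. The small constant `eps` and the name `adagrad_minimize` are assumptions added for illustration.

```python
import math

def adagrad_minimize(grad, w0, eta=1.0, steps=100, eps=1e-8):
    """Minimize a 1-D function with Adagrad, given its derivative `grad`."""
    w, sum_sq = w0, 0.0
    for _ in range(steps):
        g = grad(w)
        sum_sq += g * g  # running sum of squared derivatives
        # Parameter-specific step size: large past gradients shrink it.
        w -= eta / (math.sqrt(sum_sq) + eps) * g
    return w

# Minimize (w - 3)^2, whose derivative is 2 * (w - 3).
w_min = adagrad_minimize(lambda w: 2 * (w - 3), w0=0.0)
```

With several parameters, each one keeps its own `sum_sq`, so each gets its own effective learning rate.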

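Tips 2: Stochastic Gradient Descent makes the training faster.

In ordinary gradient descent, the loss is the summation over all training examples. Stochastic gradient descent instead picks one example at a time and updates the parameters using the loss on that example only.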
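The contrast with summing over all examples can be sketched as below: the parameters are updated after every single example rather than once per pass. The model y = w * x + b, the fixed learning rate, the seed, and the name `sgd_fit` are assumptions for illustration.

```python
import random

def sgd_fit(xs, ys, lr=0.02, epochs=500, seed=0):
    """Fit y = w * x + b, updating on one example at a time."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in random order
        for x, y in data:
            err = y - (w * x + b)  # error on this single example
            # Gradient step for the squared error of this example only.
            w += lr * 2 * err * x
            b += lr * 2 * err
    return w, b

# Toy data generated from y = 2x + 1.
w, b = sgd_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

With four examples, one pass over the data makes four updates instead of one.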

Tips 3: Feature Scaling (normalization)
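One common way to scale a feature is standardization: subtract the mean and divide by the standard deviation. The sketch below assumes that variant; the name `standardize` and the sample column are illustrative.

```python
import statistics

def standardize(column):
    """Rescale one feature to mean 0 and standard deviation 1."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # population standard deviation
    return [(v - mean) / std for v in column]

# A feature with large raw values is brought onto a unit scale.
scaled = standardize([100.0, 200.0, 300.0])
```

When all features share a similar range, no single parameter dominates the loss surface, which tends to make gradient descent easier to tune.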

Formal Derivation (optional)

Given a point, we can easily find the point with the smallest value nearby.

Taylor Series: Let h(x) be any function infinitely differentiable around x = x0.
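For reference, the standard Taylor expansion of h around x0, together with the first-order approximation that holds when x is close to x0, reads:

```latex
h(x) = \sum_{k=0}^{\infty} \frac{h^{(k)}(x_0)}{k!} (x - x_0)^k
     = h(x_0) + h'(x_0)(x - x_0) + \frac{h''(x_0)}{2!}(x - x_0)^2 + \cdots

h(x) \approx h(x_0) + h'(x_0)(x - x_0) \quad \text{when } x \text{ is close to } x_0
```

Near the current point the loss therefore behaves like a linear function, and that linear function decreases fastest in the direction opposite to the gradient.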

Error

Error comes from two sources: bias and variance.

Bias → Underfitting

If your model cannot even fit the training examples, then you have large bias. Bias is smaller with a more complex function.

Redesign your model:

1. Add more features as input

2. Use a more complex model

Variance → Overfitting

If you can fit the training data but get a large error on the testing data, then you probably have large variance. A simpler model is less influenced by the sampled data.

More data:

Very effective, but not always practical.

Regularization:

Select the λ that gives the best model.

Adding a term that pushes the parameter values toward 0 gives a smoother function, whose output is less affected by noise in the input.
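The added term can be sketched as an L2 penalty on the weights, weighted by λ. The sketch assumes a sum-of-squares data term and leaves the bias b unpenalized; the name `regularized_loss` and the toy data are illustrative.

```python
def regularized_loss(xs, ys, w, b, lam):
    """Squared error plus lam * sum of squared weights (b is not penalized)."""
    data_term = 0.0
    for x, y in zip(xs, ys):
        pred = b + sum(wi * xi for wi, xi in zip(w, x))
        data_term += (y - pred) ** 2
    penalty = lam * sum(wi ** 2 for wi in w)
    return data_term + penalty

xs, ys = [[1.0, 2.0], [3.0, 4.0]], [5.0, 11.0]
plain = regularized_loss(xs, ys, w=[1.0, 2.0], b=0.0, lam=0.0)
penalized = regularized_loss(xs, ys, w=[1.0, 2.0], b=0.0, lam=0.1)
```

A larger λ favors smaller weights and hence a smoother function, at the cost of a worse fit to the training data.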

Data selection

We have three data sets in total: the training set, the validation set, and the testing set.

training set

Every candidate model has unknown parameters θ. The training data are used to compute the value of θ for each model.

validation set

Evaluate each candidate on the validation data and pick the model with the smallest error.

testing set

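The testing data are used only to observe the final performance. In theory, this error is what we should expect when the model is put into practice.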

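We usually use K-fold cross-validation: the original sample is split into K subsamples, one subsample is held out as validation data, and the other K-1 subsamples are used for training. The process is repeated K times so that each subsample is used for validation exactly once. The K results are then averaged, or combined in some other way, to give a single estimate.

The advantage of this method is that every randomly generated subsample is used for both training and validation, and each one is used for validation exactly once. 10-fold cross-validation is the most common choice.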
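The splitting step can be sketched as follows. This shows only how the K train/validation index sets are formed; training and scoring a model on each split is left out. The name `k_fold_splits` and the fixed seed are assumptions for illustration.

```python
import random

def k_fold_splits(n, k, seed=0):
    """Yield (train_indices, val_indices) for each of the k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # random assignment to subsamples
    folds = [idx[i::k] for i in range(k)]  # k nearly equal subsamples
    for i in range(k):
        # Fold i is held out; the other k - 1 folds form the training set.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

splits = list(k_fold_splits(n=10, k=5))
```

For each candidate model, one would train on `train`, measure the error on the held-out fold, and average the K errors before comparing models.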
