Andrew Ng Machine Learning Notes (Part 3)

哇哈哈哈哈呀哇哈哈哈

Category: Machine Learning. Tags: machine learning, logistic regression, artificial intelligence

Article link: https://blog.csdn.net/weixin_43818397/article/details/122293250

Multivariate linear regression

Hypothesis: $h_\theta(x)=\theta_0+\theta_1x_1+\theta_2x_2+\dots+\theta_nx_n$. Define $x_0=1$,
so that $x=[x_0,x_1,x_2,\dots,x_n]^T$, $x\in\mathbb{R}^{n+1}$, and $\theta=[\theta_0,\theta_1,\theta_2,\dots,\theta_n]^T$, $\theta\in\mathbb{R}^{n+1}$.
The hypothesis can then be written as $h_\theta(x)=\theta^Tx$.
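
As a minimal sketch of the vector form $h_\theta(x)=\theta^Tx$, the function below prepends $x_0=1$ and takes the inner product. The name `predict` and the numbers in the usage lines are illustrative assumptions, not values from the course.

```python
def predict(theta, features):
    """Evaluate h_theta(x) = theta^T x for a single example.

    `features` holds x_1..x_n; the constant x_0 = 1 is prepended here,
    so `theta` must have n + 1 entries.
    """
    x = [1.0] + list(features)
    if len(theta) != len(x):
        raise ValueError("theta must have one more entry than features")
    return sum(t * xi for t, xi in zip(theta, x))


# Hypothetical parameters and one hypothetical example (not fitted values).
theta = [1.0, 2.0, 3.0]
example = [4.0, 5.0]
print(predict(theta, example))
```

Folding the intercept into $\theta_0$ through $x_0=1$ is what lets the whole hypothesis be a single inner product.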

Gradient descent in practice I: Feature scaling

Idea: make sure features are on a similar scale.
When the features have similar ranges, gradient descent runs faster and converges to the global minimum in fewer iterations.
Method 1, mean normalization: $x_i:=\frac{x_i-\mu}{\sigma}$
where $\mu$ is the mean of the feature and $\sigma$ is its standard deviation.
Method 2, min-max scaling: $x_i:=\frac{x_i-\min}{\max-\min}$
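
The two scaling formulas above can be sketched as follows for a single feature column. This is an illustrative sketch using only the standard library; the function names and sample values are assumptions, and the population standard deviation (`statistics.pstdev`) is one reasonable choice for $\sigma$.

```python
import statistics


def mean_normalize(values):
    """Return (v - mu) / sigma for every value of one feature."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("feature is constant; cannot scale by sigma")
    return [(v - mu) / sigma for v in values]


def min_max_scale(values):
    """Return (v - min) / (max - min), mapping the feature into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("feature is constant; cannot scale by range")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical house sizes in square feet.
sizes = [850.0, 1200.0, 1600.0, 2100.0, 3000.0]
print(mean_normalize(sizes))
print(min_max_scale(sizes))
```

Whichever method is used, the same $\mu$, $\sigma$, min and max computed on the training set should be reused when scaling new inputs at prediction time.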

Gradient descent in practice II: Learning rate

Gradient descent update rule: $\theta_j:=\theta_j-\alpha\frac{\partial}{\partial\theta_j}J(\theta)$

"Debugging": how to make sure gradient descent is working correctly.
How to choose the learning rate $\alpha$.

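Summary:
1) If α is too small: convergence is slow.
2) If α is too large: J(θ) can overshoot the minimum on every iteration and fail to converge.
3) To choose α, try: … , 0.001 , 0.003 , 0.01 , 0.03 , 0.1 , 0.3 , 1 …
Find one value that is too small and one that is too large, then pick the largest value that still converges reliably, or a reasonable value slightly below it.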
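
One way to "debug" gradient descent is to record J(θ) after every iteration and compare runs with different α. The sketch below does this with batch gradient descent on a tiny made-up dataset (y = 1 + 2x). The function names, the data, and the three α values are assumptions chosen for illustration.

```python
def hypothesis(theta, row):
    """h_theta(x) = theta^T x, where row already includes x_0 = 1."""
    return sum(t * x for t, x in zip(theta, row))


def cost(theta, X, y):
    """J(theta) = 1/(2m) * sum of squared errors."""
    m = len(y)
    return sum((hypothesis(theta, row) - yi) ** 2 for row, yi in zip(X, y)) / (2 * m)


def gradient_descent(X, y, alpha, iterations):
    """Batch gradient descent; returns final theta and the cost history."""
    m, n = len(y), len(X[0])
    theta = [0.0] * n
    history = [cost(theta, X, y)]
    for _ in range(iterations):
        errors = [hypothesis(theta, row) - yi for row, yi in zip(X, y)]
        gradient = [sum(e * row[j] for e, row in zip(errors, X)) / m for j in range(n)]
        # Simultaneous update of every theta_j.
        theta = [t - alpha * g for t, g in zip(theta, gradient)]
        history.append(cost(theta, X, y))
    return theta, history


# Toy data generated from y = 1 + 2x; the first column is x_0 = 1.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [3.0, 5.0, 7.0, 9.0]

for alpha in (0.001, 0.1, 0.3):
    _, history = gradient_descent(X, y, alpha, 50)
    print(f"alpha={alpha}: J start={history[0]:.4g}, J end={history[-1]:.4g}")
```

On this toy data the smallest α barely reduces the cost in 50 iterations, the middle value decreases it steadily, and the largest makes it grow, which is the pattern the summary describes.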

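Features and polynomial regression

Suppose there are two features: $x_1$ is the frontage (street-facing width) and $x_2$ is the depth. One possible hypothesis is:
$h_\theta(x)=\theta_0+\theta_1x_1+\theta_2x_2$
Alternatively, construct a new feature, the area $x=x_1\times x_2$,
so that the hypothesis becomes: $h_\theta(x)=\theta_0+\theta_1x$
Polynomial regression

Quadratic model: $h_\theta(x)=\theta_0+\theta_1x+\theta_2x^2$
Cubic model: $h_\theta(x)=\theta_0+\theta_1x+\theta_2x^2+\theta_3x^3$
Choose a model that fits the data reasonably.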
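
Polynomial regression can reuse the linear-regression machinery by treating powers of a feature as new features. The sketch below builds the area feature from frontage and depth, then expands it into $x, x^2, x^3$. The function name and the numbers are hypothetical.

```python
def polynomial_features(x, degree):
    """Return [x, x^2, ..., x^degree] for a single scalar feature."""
    if degree < 1:
        raise ValueError("degree must be at least 1")
    return [x ** p for p in range(1, degree + 1)]


# Hypothetical lot: frontage and depth in the same length unit.
frontage, depth = 20.0, 30.0
area = frontage * depth  # constructed feature x = x_1 * x_2

print(polynomial_features(area, 2))  # features for a quadratic model
print(polynomial_features(area, 3))  # features for a cubic model
```

The powers of a feature span very different ranges, so feature scaling matters even more once polynomial terms are added.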

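Normal equation

Normal equation: a method to solve for θ analytically.
Cost function: $J(\theta)=\frac{1}{2m}\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})^2$
Taking the partial derivatives of J(θ) and setting them to zero gives $\theta=(X^TX)^{-1}X^Ty$, where $X$ is the $m\times(n+1)$ matrix whose rows are $(x^{(i)})^T$ and $y$ is the vector of targets. In outline:
$\frac{\partial}{\partial\theta}J(\theta)=\frac{1}{m}\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})x^{(i)}=0$
In matrix form this is $\frac{1}{m}X^T(X\theta-y)=0$, so $X^TX\theta=X^Ty$ and, when $X^TX$ is invertible, $\theta=(X^TX)^{-1}X^Ty$.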
Original article link: https://blog.csdn.net/qq_29317617/article/details/86312154
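Summary: gradient descent vs. the normal equation
Gradient descent: requires choosing the learning rate α; needs many iterations; still works well when the number of features n is large.
Normal equation: no need to choose α; no iterations; requires computing $(X^TX)^{-1}$; slow when n is large.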
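
The closed-form solution $\theta=(X^TX)^{-1}X^Ty$ can be sketched in plain Python as below. Note the swap: instead of forming the inverse explicitly, this sketch solves the equivalent linear system $X^TX\theta=X^Ty$ by Gaussian elimination with partial pivoting. The function name, the singularity tolerance, and the toy data are assumptions.

```python
def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y; each row of X includes x_0 = 1."""
    n = len(X[0])
    # Build A = X^T X and b = X^T y.
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(n)]
    # Augmented matrix [A | b].
    M = [A[i] + [b[i]] for i in range(n)]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:  # assumed tolerance
            raise ValueError("X^T X is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]

    # Back substitution.
    theta = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * theta[j] for j in range(i + 1, n))
        theta[i] = (M[i][n] - known) / M[i][i]
    return theta


# Toy data generated from y = 1 + 2x; the first column is x_0 = 1.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [3.0, 5.0, 7.0, 9.0]
print(normal_equation(X, y))
```

No learning rate and no iteration are involved, but the elimination step operates on an $(n+1)\times(n+1)$ system, which is why this route becomes slow when n is large.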
————————————————
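Copyright notice: this is an original article by CSDN blogger 「阳阳yyx」, licensed under CC 4.0 BY-SA. When reposting, please include the original source link and this notice.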