[HowTo ML] Regression Problems

Latest recommended article published on 2024-01-02 22:16:04

Hivoodoo

Category: Machine Learning

Article link: https://blog.csdn.net/Hivoodoo/article/details/50909913

tips:

Notation for an example
$x^{(1)} = \begin{bmatrix} x_1\\ \vdots \\ x_n \end{bmatrix}$
Notation for the parameters
$\theta = \begin{bmatrix} \theta _ 0\\ \vdots \\ \theta _ n \end{bmatrix}$
$(x,y)$ denotes a training example (input $x$, target $y$).
$(x^{(1)},y^{(1)})$ is the first training example.
function hypothesis: the hypothesis function
- Form:
- $h_\theta(x) = \theta_0+\theta_1 x$
- $\theta_i$ : Parameter

Regression Problem

In machine learning terminology, regression means predicting a continuous-valued output.

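Cost function

The cost function measures how well the hypothesis fits the data. It can also be read as a penalty: the less accurate the model, the larger the penalty.

Squared error function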

$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})^2$
To make the gradient descent update easier to write later, we take the partial derivative with respect to each parameter $\theta_j$.

$\frac {\partial J(\theta)}{\partial \theta _ j} = \frac{1}{m}\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)}$
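The cost and its partial derivatives can be sketched in a few lines of plain Python for the single-feature hypothesis $h_\theta(x) = \theta_0+\theta_1 x$. This is only an illustrative sketch: the function names (`hypothesis`, `cost`, `gradient`) and the toy data are assumptions, not part of the original post.

```python
def hypothesis(theta, x):
    """h_theta(x) = theta_0 + theta_1 * x for a single feature x."""
    return theta[0] + theta[1] * x


def cost(theta, xs, ys):
    """Squared error cost J(theta) = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    return sum((hypothesis(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient(theta, xs, ys):
    """Partial derivatives of J with respect to theta_0 and theta_1.

    For theta_0 the feature is the constant x_0 = 1; for theta_1 it is x.
    """
    m = len(xs)
    errors = [hypothesis(theta, x) - y for x, y in zip(xs, ys)]
    d_theta0 = sum(errors) / m
    d_theta1 = sum(e * x for e, x in zip(errors, xs)) / m
    return [d_theta0, d_theta1]


# Hypothetical toy data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print(cost([1.0, 2.0], xs, ys))      # perfect fit, so the cost is 0.0
print(gradient([0.0, 0.0], xs, ys))  # [-4.0, -8.5]
```

Both derivatives are negative at $\theta = (0, 0)$, so increasing either parameter from there lowers the cost.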

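contour plots or contour figures
A way to visualize parameter estimation: each contour joins the values of $(\theta_0, \theta_1)$ that give the same cost $J$.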

Goal of regression:

$\underset{\theta}{\arg\min}\ J(\theta)$
that is, the parameters $\theta$ that minimize $J$.
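To make the argmin concrete, the sketch below evaluates $J$ on a coarse grid of $(\theta_0, \theta_1)$ values, which is the same table of numbers a contour plot would draw, and picks the grid point with the lowest cost. The grid, the toy data, and the brute-force search are assumptions chosen for illustration; this is not how the minimum is normally found in practice.

```python
def cost(theta0, theta1, xs, ys):
    """Squared error cost J for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Hypothetical toy data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Candidate values -3.0, -2.5, ..., 3.0 for each parameter.
grid = [i * 0.5 for i in range(-6, 7)]

# argmin over the grid: the (theta0, theta1) pair with the smallest cost.
best = min(
    ((t0, t1) for t0 in grid for t1 in grid),
    key=lambda t: cost(t[0], t[1], xs, ys),
)
print(best)  # (1.0, 2.0)
```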

Multivariate Linear Regression

When we have more than one feature, the model is called multivariate linear regression.

$h_\theta(x)=\theta^T x$
$x = [x_0\ x_1\ x_2\ \cdots\ x_n]^T , x_0 = 1$
$\theta = [\theta_0\ \theta_1\ \theta_2\ \cdots\ \theta_n]^T$
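As a minimal sketch, $\theta^T x$ is just a dot product once the constant $x_0 = 1$ is prepended to the raw features. The function name and the parameter values below are hypothetical.

```python
def hypothesis(theta, features):
    """h_theta(x) = theta^T x, with x_0 = 1 prepended to the raw features."""
    x = [1.0] + list(features)
    if len(theta) != len(x):
        raise ValueError("theta needs one more entry than the feature list")
    return sum(t * xi for t, xi in zip(theta, x))


theta = [1.0, 2.0, 3.0]                  # hypothetical theta_0, theta_1, theta_2
print(hypothesis(theta, [10.0, 100.0]))  # 1 + 2*10 + 3*100 = 321.0
```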

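Polynomial regression

When a single feature used linearly does not fit the data well, we use polynomial terms of that feature. The parameters and the hypothesis function are written as above, but the feature variables are now the polynomial terms we chose.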
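One way to read this: polynomial regression is still linear regression, only on derived features $x, x^2, \dots, x^d$. The sketch below assumes a single raw feature and hypothetical helper names.

```python
def polynomial_features(x, degree):
    """Map a single feature x to [x, x^2, ..., x^degree]."""
    return [x ** d for d in range(1, degree + 1)]


def hypothesis(theta, x):
    """theta^T applied to [1, x, x^2, ...]; the degree is len(theta) - 1."""
    features = [1.0] + polynomial_features(x, len(theta) - 1)
    return sum(t * f for t, f in zip(theta, features))


print(polynomial_features(2.0, 3))       # [2.0, 4.0, 8.0]
print(hypothesis([1.0, 0.0, 0.5], 4.0))  # 1 + 0*4 + 0.5*16 = 9.0
```

Because powers of a feature grow quickly in magnitude, feature scaling matters even more here than in the plain linear case.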

Gradient descent algorithm

hint:

$:=$ assignment
$\alpha$ learning rate
In gradient descent, $\theta_0$ and $\theta_1$ must be updated simultaneously.

Gradient descent pseudocode:

$\begin{array}{l} \text{repeat until convergence}\ \{ \\ \qquad \theta_j := \theta_j - \alpha \frac{\partial}{\partial\theta_j}J(\theta_0,\theta_1) \quad (\text{for}\ j = 0,\ 1) \\ \} \end{array}$
Substituting the cost function and simplifying gives

$\begin{array}{l} \text{repeat until convergence}\ \{ \\ \qquad \theta_j := \theta_j - \alpha\frac{1}{m} \sum^m_{i=1}( h_\theta(x^{(i)})-y^{(i)})x_j^{(i)} \quad (\text{for}\ j = 0, \dots, n) \\ \} \end{array}$
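The update rule can be sketched as batch gradient descent in plain Python. Several things here are assumptions made for illustration: the names `predict` and `gradient_descent`, the toy data, the values of `alpha` and `iterations`, and the use of a fixed iteration count in place of a real convergence test.

```python
def predict(theta, x):
    """h_theta(x) = theta^T x, where x already contains x_0 = 1."""
    return sum(t * xi for t, xi in zip(theta, x))


def gradient_descent(X, y, alpha=0.1, iterations=5000):
    """Batch gradient descent for linear regression.

    X is a list of rows, each starting with the constant feature x_0 = 1.
    A fixed iteration count stands in for "repeat until convergence".
    """
    m = len(X)
    n = len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        errors = [predict(theta, row) - target for row, target in zip(X, y)]
        # Compute every partial derivative from the old theta first...
        grads = [sum(e * row[j] for e, row in zip(errors, X)) / m for j in range(n)]
        # ...then update all theta_j at once (simultaneous update).
        theta = [t - alpha * g for t, g in zip(theta, grads)]
    return theta


# Hypothetical toy data lying exactly on y = 1 + 2x.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

theta = gradient_descent(X, y)
print([round(t, 3) for t in theta])  # approximately [1.0, 2.0]
```

Building `grads` before touching `theta` is what makes the update simultaneous. Updating $\theta_0$ first and then using the new value to compute the gradient for $\theta_1$ would be a different algorithm.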

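Gradient descent tips

Feature Scaling

Normalization is useful.
Idea: putting the features on similar scales speeds up the convergence of gradient descent.
As a rule of thumb, scaling each feature to roughly $x_i \in [-3,3]$ is good enough.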

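Mean normalization

(My understanding: this is scaling with an offset, so that roughly $x_i \in [-1,1]$.)
Replace $x_i$ with $x_i-\mu_i$ so that each feature has a mean close to 0 (except $x_0$, which is defined as $1$).
General formula: $x_i \gets (x_i-\mu_i)/s_i$
$\mu_i$ : the mean of $x_i$ over the training set
$s_i$ : the range of $x_i$ ( $max-min$ )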
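A minimal sketch of the formula for one feature column, using the range $max-min$ as $s_i$. The function name and the sample values are hypothetical.

```python
def mean_normalize(values):
    """Return ((v - mu) / s for each v, mu, s) with s = max - min."""
    mu = sum(values) / len(values)
    s = max(values) - min(values)
    if s == 0:
        raise ValueError("feature is constant, so its range is zero")
    return [(v - mu) / s for v in values], mu, s


sizes = [1000.0, 1500.0, 2000.0, 2500.0]  # hypothetical feature values
scaled, mu, s = mean_normalize(sizes)
print(mu, s)                     # 1750.0 1500.0
print(min(scaled), max(scaled))  # -0.5 0.5
```

The function returns `mu` and `s` so that the same shift and scale can be applied to new inputs at prediction time. The constant feature $x_0 = 1$ is left out of this step.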