Machine Learning - Linear Regression with One Variable

jefferyqjy

Article link: https://blog.csdn.net/jefferyqjy/article/details/54091104

Model and Cost Function

Linear regression predicts a real-valued output based on an input value. We discuss the application of linear regression to housing price prediction, present the notion of a cost function, and introduce the gradient descent method for learning.

supervised learning: given the ‘right answer’ for each example in the data
regression problem: predict real-valued output

Notation:
m = number of training examples
x's = "input" variable / features
y's = "output" variable / "target" variable
(x, y) = one training example
[Figure: diagram of the model; image not available]

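The model in the figure above is called a linear regression model. More specifically, it is linear regression with one variable (univariate linear regression), whose hypothesis takes the form hΘ(x) = Θ0 + Θ1x.

This model is a convenient way to describe the supervised learning problem. Here h is short for hypothesis, the function that maps an input x to a predicted y. When the target value y is continuous, we have a regression problem; when it is discrete, we have a classification problem.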
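
As a minimal sketch of the hypothesis for one variable, the function below evaluates a straight line for a given input. The name `hypothesis` and the parameter values are assumptions chosen for illustration only; they are not fitted to any real data.

```python
def hypothesis(theta0, theta1, x):
    """Predict y for input x with the straight-line model theta0 + theta1 * x."""
    return theta0 + theta1 * x


# Illustrative parameters (assumed, not learned from data):
# intercept 1.0 and slope 2.0.
predictions = [hypothesis(1.0, 2.0, x) for x in [0.0, 1.0, 2.5]]
print(predictions)
```

Changing `theta0` shifts the line up or down, and changing `theta1` changes its slope; learning means picking the pair that fits the training examples best.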

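Cost Function
$$J(\Theta_0, \Theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\Theta(x^{(i)}) - y^{(i)} \right)^2$$
This function is also called the squared error function.

Minimizing the cost function makes the gap between the predicted values and the true values of the target variable as small as possible.
Idea: Choose Θ0, Θ1 so that hΘ(x) is close to y for our training examples (x, y)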
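
The squared error cost can be sketched in a few lines of Python. The function name `cost` and the three-point training set are assumptions made up for illustration; the set lies exactly on the line y = x so that a perfect fit is easy to recognise.

```python
def cost(theta0, theta1, xs, ys):
    """Squared error cost: sum of squared prediction errors divided by 2m."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        error = (theta0 + theta1 * x) - y  # prediction minus true value
        total += error ** 2
    return total / (2 * m)


# Toy training set (assumed): three points on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

perfect = cost(0.0, 1.0, xs, ys)  # the line y = x passes through every point
worse = cost(0.0, 0.5, xs, ys)    # a flatter line misses every point
print(perfect, worse)
```

A parameter pair that passes through every training point gives a cost of zero; any other pair gives a strictly positive cost.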

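Cost Function - Intuition Ⅰ
Note that hΘ(x) is a function of x, whereas J(Θ1) is a function of the parameter Θ1. Our goal is to find the value of Θ1 that minimizes J(Θ1); that value is the parameter for which the gap between our predictions and the true values is smallest.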
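
To make the distinction concrete, the sketch below treats the data as fixed and evaluates J for several candidate values of Θ1. It assumes Θ0 is held at 0 so that J depends on Θ1 alone, and it uses a made-up three-point data set; `cost_theta1` and `candidates` are illustrative names.

```python
# Toy training set (assumed): three points on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]


def cost_theta1(theta1):
    """J as a function of theta1 only, with theta0 fixed at 0 (an assumption)."""
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Evaluate J at a handful of candidate slopes and keep the best one.
candidates = [0.0, 0.5, 1.0, 1.5, 2.0]
costs = {t: cost_theta1(t) for t in candidates}
best_theta1 = min(costs, key=costs.get)

for t in candidates:
    print(f"theta1={t:.1f}  J={costs[t]:.4f}")
print("best theta1:", best_theta1)
```

Each candidate Θ1 corresponds to one line through the data and to one point on the curve of J; the candidate with the lowest J is the slope that fits this toy data best.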

Cost Function - Intuition Ⅱ
This lecture introduces the contour plot (contour graph) as a way to build better intuition for what the function J means.
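
A contour plot is drawn from values of J on a grid of (Θ0, Θ1) pairs. The sketch below only builds such a grid and locates its lowest cell; it does not draw anything. The data set (generated from y = 1 + 2x) and the grid ranges are assumptions for illustration.

```python
# Toy training set (assumed), generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]


def cost(theta0, theta1):
    """Squared error cost for the toy data above."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Grid of parameter values: theta0 in 0.0..2.0 and theta1 in 0.0..3.0, step 0.5.
theta0_values = [i * 0.5 for i in range(5)]
theta1_values = [i * 0.5 for i in range(7)]

# grid[i][j] holds J(theta0_values[i], theta1_values[j]); a contour plot
# connects grid positions that share the same J value.
grid = [[cost(t0, t1) for t1 in theta1_values] for t0 in theta0_values]

# The lowest cell approximates the centre of the innermost contour.
best_cost, best_t0, best_t1 = min(
    (grid[i][j], t0, t1)
    for i, t0 in enumerate(theta0_values)
    for j, t1 in enumerate(theta1_values)
)
print(best_t0, best_t1, best_cost)
```

The same grid could be handed to a plotting library to draw the contour lines; the cell with the lowest J sits at the centre of the innermost ring.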

Parameter Learning

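Gradient Descent
Original text: Now we need to estimate the parameters in the hypothesis function. That's where gradient descent comes in.
Gradient descent is used here to minimize the function J.
Repeat until convergence:
$$\Theta_j := \Theta_j - \alpha \frac{\partial}{\partial \Theta_j} J(\Theta_0, \Theta_1) \quad (j = 0, 1)$$
In the definition above, := denotes assignment and α is the learning rate, which controls how large a step each update of Θ0 and Θ1 takes.
Simultaneous update means that the new values of both Θ0 and Θ1 are computed from the current values before either one is assigned; otherwise the second update would be computed from an already changed Θ0 and would be incorrect.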
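
The update rule with simultaneous assignment might look like the following sketch. The partial derivatives are the ones obtained by differentiating the squared error cost for a straight-line hypothesis; the function name, the learning rate of 0.1, the iteration count, and the toy data are all assumptions for illustration.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=1000):
    """Batch gradient descent for the line theta0 + theta1 * x (a sketch)."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction errors under the CURRENT parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the squared error cost.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: compute both new values first, then assign.
        temp0 = theta0 - alpha * grad0
        temp1 = theta1 - alpha * grad1
        theta0, theta1 = temp0, temp1
    return theta0, theta1


# Toy training set (assumed), generated from y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)
```

Both gradients are computed from the same `errors` list before either parameter changes, which is what makes the update simultaneous. Assigning `theta0` first and then recomputing the errors for `theta1` would mix old and new parameters.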

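Gradient Descent Intuition
On the right-hand side of the gradient descent update, α is the learning rate and the remaining factor is the derivative term.
[Figure: J(Θ1) plotted against Θ1, illustrating updates where the slope is positive and where it is negative; image not available]
The figure shows that when the derivative term is positive (the slope in the figure is positive), the update decreases Θ1, moving it toward the minimum; the larger the slope, the larger the decrease. Conversely, when the derivative term is negative, the update increases Θ1, which again moves it toward the minimum.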
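
The effect of the derivative's sign can be checked numerically with a single update step from each side of the minimum. This sketch assumes Θ0 is fixed at 0, a learning rate of 0.1, and a made-up data set whose minimum is at Θ1 = 1; `derivative` and `step` are illustrative names.

```python
# Toy training set (assumed): three points on y = x, so J is minimal at theta1 = 1.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]


def derivative(theta1):
    """Derivative of J with respect to theta1, with theta0 fixed at 0."""
    m = len(xs)
    return sum((theta1 * x - y) * x for x, y in zip(xs, ys)) / m


def step(theta1, alpha=0.1):
    """One gradient descent update of theta1."""
    return theta1 - alpha * derivative(theta1)


# Starting to the right of the minimum: positive slope, so theta1 decreases.
right = step(2.0)
# Starting to the left of the minimum: negative slope, so theta1 increases.
left = step(0.0)
print(derivative(2.0), right)
print(derivative(0.0), left)
```

In both cases the step moves Θ1 toward 1, because subtracting a positive number lowers Θ1 and subtracting a negative number raises it.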