21Fall · Univariate Linear Regression

Notes taken in the course Machine Learning by Andrew Ng.


  • Intuition about Gradient Descent: how the algorithm works & why the updating step makes sense

To understand how the formula works, we again reduce the original problem to a simplified one with only a single parameter $a_0$.

$$a_0 := a_0 - \alpha \frac{dJ(a_0)}{da_0} \qquad (j = 0)$$

$\frac{dJ(a_0)}{da_0}$ is the derivative of $J$ at the point $a_0$, and it has a geometrical meaning: the slope of the tangent line to the curve at that point.
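The update rule can be sketched in a few lines of Python. The cost $J(a_0) = (a_0 - 3)^2$ below is a hypothetical example chosen only so the derivative is easy to write by hand; the minimizer is $a_0 = 3$.

```python
# A minimal sketch of the update rule a0 := a0 - alpha * dJ/da0,
# on the hypothetical cost J(a0) = (a0 - 3)**2 with derivative 2*(a0 - 3).
def gradient_descent_1d(a0, alpha, steps):
    for _ in range(steps):
        grad = 2 * (a0 - 3)     # dJ(a0)/da0: the slope of the tangent at a0
        a0 = a0 - alpha * grad  # step downhill, scaled by the learning rate
    return a0

a0 = gradient_descent_1d(a0=0.0, alpha=0.1, steps=100)
# a0 converges toward the minimizer a0 = 3
```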

If alpha is too small, gradient descent may be slow; if alpha is too large, it may overshoot the minimum, fail to converge, or even diverge.

If you have already reached the local optimum, the derivative term will be 0, so you won't take any more steps.

Remember that the magnitude of each step depends both on the learning rate and on the derivative at the current point. So as you step closer to the minimum, you automatically take smaller steps.
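This shrinking-step behaviour is easy to check numerically. Reusing the same hypothetical cost $J(a_0) = (a_0 - 3)^2$, the step magnitude $|\alpha \cdot dJ/da_0|$ decreases on every iteration even though alpha itself never changes:

```python
# Record the size of each step |alpha * grad| while descending
# the hypothetical cost J(a0) = (a0 - 3)**2.
def step_sizes(a0, alpha, steps):
    sizes = []
    for _ in range(steps):
        grad = 2 * (a0 - 3)
        sizes.append(abs(alpha * grad))  # step shrinks as the slope flattens
        a0 -= alpha * grad
    return sizes

sizes = step_sizes(a0=0.0, alpha=0.1, steps=10)
# each step is smaller than the previous one, with alpha held fixed
```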

Finally, if we put the cost function and gradient descent together, we obtain our first learning algorithm - Linear Regression.

That is, finally, how the algorithm is realized.
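Putting the two pieces together can be sketched as follows. This is a plain-Python sketch, not the course's official implementation: the hypothesis is $h(x) = a_0 + a_1 x$ and the cost is the usual squared-error $J(a_0, a_1) = \frac{1}{2m}\sum (h(x^{(i)}) - y^{(i)})^2$.

```python
# Univariate linear regression trained by gradient descent (a sketch).
# Hypothesis: h(x) = a0 + a1 * x; cost: J = (1 / (2m)) * sum((h(x) - y)**2).
def train_linear_regression(xs, ys, alpha=0.05, steps=5000):
    a0, a1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        preds = [a0 + a1 * x for x in xs]
        # Partial derivatives of J with respect to a0 and a1
        d_a0 = sum(p - y for p, y in zip(preds, ys)) / m
        d_a1 = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / m
        # Simultaneous update of both parameters
        a0, a1 = a0 - alpha * d_a0, a1 - alpha * d_a1
    return a0, a1

# Toy data generated from y = 2x + 1; the fit recovers a0 ≈ 1, a1 ≈ 2
a0, a1 = train_linear_regression([0, 1, 2, 3], [1, 3, 5, 7])
```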

Now we look back to the original problem.

For a one-parameter function f(x_1), the graph of f is just a 2-D curve. For a two-parameter function, however, the graph is a 3-D curved surface with three axes, which we call the x-, y-, and z-axes.

We assume that the coordinates of the point P are $(x_0, y_0, z_0)$, where the vertical axis z represents the value of the function J(x, y). Then the meaning of a partial derivative is not hard to see: the partial derivative with respect to x is the slope of the curve cut out by the plane $y = y_0$, and the partial derivative with respect to y is the slope of the curve cut out by the plane $x = x_0$.
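These slope-along-a-slice interpretations can be checked numerically with central differences. The function $J(x, y) = x^2 + 3y^2$ below is a hypothetical example; its analytic partials at $(1, 2)$ are $\partial J/\partial x = 2$ and $\partial J/\partial y = 12$.

```python
# Numerical partial derivatives: each one holds the other coordinate fixed,
# i.e. it measures the slope inside the plane y = y0 (or x = x0).
def partial_x(J, x0, y0, h=1e-6):
    return (J(x0 + h, y0) - J(x0 - h, y0)) / (2 * h)  # slope in the plane y = y0

def partial_y(J, x0, y0, h=1e-6):
    return (J(x0, y0 + h) - J(x0, y0 - h)) / (2 * h)  # slope in the plane x = x0

J = lambda x, y: x**2 + 3 * y**2  # hypothetical cost surface
# At (1, 2): partial_x ≈ 2 and partial_y ≈ 12, matching the analytic slopes
```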

On this 3-D surface, you can imagine you are walking down a real hill and must decide which direction to go.

The parameters $a_1$ and $a_2$ should be updated on every iteration so that you find the point $(a_1, a_2)$ in the horizontal plane corresponding to the minimum of J.

That's why $a_1$ and $a_2$ must be updated simultaneously rather than separately.
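The difference between the two update orders is concrete. On the hypothetical cost $J(a_1, a_2) = a_1^2 + a_1 a_2 + a_2^2$ below, a sequential update computes the second gradient at a point that has already moved, so it lands somewhere slightly different:

```python
# Gradient of the hypothetical cost J(a1, a2) = a1**2 + a1*a2 + a2**2.
def grad(a1, a2):
    return 2 * a1 + a2, a1 + 2 * a2

def simultaneous_step(a1, a2, alpha):
    g1, g2 = grad(a1, a2)           # both partials evaluated at the SAME point
    return a1 - alpha * g1, a2 - alpha * g2

def sequential_step(a1, a2, alpha):
    g1, _ = grad(a1, a2)
    a1 = a1 - alpha * g1            # a1 has already moved...
    _, g2 = grad(a1, a2)            # ...so a2's partial is taken at the wrong point
    return a1, a2 - alpha * g2

# Starting from (1, 1) with alpha = 0.1, the two orders disagree on a2
sim = simultaneous_step(1.0, 1.0, 0.1)
seq = sequential_step(1.0, 1.0, 0.1)
```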

So congrats on finishing the first Machine Learning Algorithm.

  • Two-parameter Linear Regression Implementation