The relationship between Taylor series and gradient descent

A Taylor series approximates the value of a function at x+h when we know the value of the function and its derivatives at x.

f(x+h) = f(x) + f'(x)*h + f''(x)*h^2/2! + f'''(x)*h^3/3! + ...
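To see the expansion at work, here is a minimal sketch that truncates the series at increasing orders. The choice of exp as the test function, the helper name taylor_exp, and the values of x and h are assumptions made for illustration only; exp is convenient because every derivative of exp at x is again exp(x).

```python
import math

def taylor_exp(x, h, order):
    """Approximate exp(x + h) using derivatives of exp at x, up to `order`."""
    # Every derivative of exp at x equals exp(x), so term k is exp(x) * h**k / k!
    return sum(math.exp(x) * h**k / math.factorial(k) for k in range(order + 1))

x, h = 1.0, 0.1
exact = math.exp(x + h)

# Absolute error of the truncated series for orders 0, 1, 2, 3
errors = [abs(taylor_exp(x, h, n) - exact) for n in range(4)]
for n, err in enumerate(errors):
    print(f"order {n}: error {err:.2e}")
```

For a small h like this one, each extra term shrinks the error, which is why a first-order truncation is already a usable local model.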

Gradient descent is an iterative algorithm that looks for a minimum of a function by repeatedly moving in a direction in which the function value decreases.

In terms of the equation above: we are currently at x, and we need to choose a step h so that f(x+h) is less than f(x).

If you assume the function is locally linear, that is, you keep only the first-order term, the Taylor series becomes

f(x+h) ≈ f(x) + f'(x)*h.

So we need to find the h that makes f(x+h) as small as possible. Since f(x) is fixed, this is the same as making f'(x)*h as small as possible.

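The derivative of f'(x)*h with respect to h is f'(x), so the linear term grows fastest when h has the same sign as f'(x) and shrinks fastest when h has the opposite sign. Because f'(x)*h is linear in h it has no minimum over all h, and the first-order approximation only holds for small h, so what we really choose is the direction of a small step. Taking h = f'(x) gives f'(x)*h = f'(x)^2 >= 0, so to first order the function increases. Taking h = -f'(x) gives f'(x)*h = -f'(x)^2 <= 0, so to first order the function decreases.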
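The sign argument above can be checked numerically. The sketch below is only an illustration: the test function f(x) = (x - 3)^2, the starting point, and the small step size eta are all assumptions that do not come from the original answer.

```python
def f(x):
    return (x - 3.0) ** 2

def df(x):
    # Derivative of (x - 3)^2
    return 2.0 * (x - 3.0)

x = 0.0
eta = 0.1  # hypothetical small step size, keeps h in the range where the linear model is useful

h_down = -eta * df(x)  # step against the derivative
h_up = eta * df(x)     # step along the derivative

# First-order predicted change f'(x) * h
linear_change_down = df(x) * h_down  # equals -eta * f'(x)^2
linear_change_up = df(x) * h_up      # equals +eta * f'(x)^2

# Actual change in f
true_down = f(x + h_down) - f(x)
true_up = f(x + h_up) - f(x)

print(linear_change_down, true_down)
print(linear_change_up, true_up)
```

Both the predicted and the actual change are negative for the step against the derivative and positive for the step along it.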

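So h = -f'(x).

The new point is x + h = x - f'(x).

The update rule is x_new = x - f'(x). In practice the step is scaled by a small positive step size (learning rate) eta, giving x_new = x - eta*f'(x), so that h stays small enough for the first-order approximation to hold.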

For gradient ascent, which looks for a maximum instead, flip the sign: x_new = x + f'(x).
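The two update rules can be sketched as a single loop. The function name gradient_search, the step size eta, the iteration count, and the two test functions are hypothetical choices for illustration; the original answer writes the update without a step size, which corresponds to eta = 1.

```python
def gradient_search(df, x0, eta=0.1, steps=100, ascent=False):
    """Repeat x_new = x - eta * f'(x) (descent) or x + eta * f'(x) (ascent)."""
    sign = 1.0 if ascent else -1.0
    x = x0
    for _ in range(steps):
        x = x + sign * eta * df(x)
    return x

# f(x) = (x - 3)^2 has f'(x) = 2(x - 3) and its minimum at x = 3
x_min = gradient_search(lambda x: 2.0 * (x - 3.0), x0=0.0)

# g(x) = -(x + 1)^2 has g'(x) = -2(x + 1) and its maximum at x = -1
x_max = gradient_search(lambda x: -2.0 * (x + 1.0), x0=0.0, ascent=True)

print(x_min, x_max)
```

On this particular f, setting eta = 1 makes the iterate jump from 0 to 6 and back to 0 without ever settling, which is one reason a smaller step size is normally used.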

If you also keep the second-order term of the Taylor series, you get Newton's method. In more than one dimension the matrix of second derivatives it uses is called the Hessian.
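As a rough one-dimensional sketch of the second-order idea: minimizing the quadratic model f(x) + f'(x)*h + f''(x)*h^2/2 over h gives h = -f'(x)/f''(x), assuming f''(x) > 0. The function name newton_minimize and the test function f(x) = e^x - 2x (minimum at x = ln 2) are assumptions chosen for illustration.

```python
import math

def newton_minimize(df, d2f, x0, steps=10):
    """Repeat x_new = x - f'(x) / f''(x); assumes f''(x) > 0 along the way."""
    x = x0
    for _ in range(steps):
        x = x - df(x) / d2f(x)
    return x

# f(x) = exp(x) - 2x, so f'(x) = exp(x) - 2 and f''(x) = exp(x) > 0
x_star = newton_minimize(lambda x: math.exp(x) - 2.0, math.exp, x0=0.0)

# For a quadratic such as (x - 3)^2 the second-order model is exact,
# so a single step lands on the minimum
x_quad = newton_minimize(lambda x: 2.0 * (x - 3.0), lambda x: 2.0, x0=0.0, steps=1)

print(x_star, math.log(2.0), x_quad)
```

In one dimension the second derivative plays the role of the Hessian; dividing by it replaces the hand-picked step size of gradient descent.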

Copied from https://www.quora.com/What-is-the-relation-between-a-Taylor-series-approximation-and-gradient-descent-
