The relationship between Taylor series and gradient descent

Michael_yan2015


Category: Mathematics. Tags: Math


A Taylor series approximates the value of a function at x+h when we know the value of the function and its derivatives at x.

f(x+h) = f(x) + f'(x)h + f''(x)h^2/2! + f'''(x)h^3/3! + ...
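To see how the truncated series behaves, here is a minimal sketch in Python. It assumes f(x) = exp(x), chosen only because every derivative of exp is exp itself; the function name `taylor_exp` and the values of x and h are illustrative, not part of the original answer.

```python
import math

def taylor_exp(x, h, order):
    """Approximate exp(x + h) using derivatives of exp evaluated at x.

    Every derivative of exp at x equals exp(x), so the k-th term of the
    Taylor series is exp(x) * h**k / k!.
    """
    fx = math.exp(x)
    return sum(fx * h**k / math.factorial(k) for k in range(order + 1))

x, h = 0.0, 0.5
exact = math.exp(x + h)
for order in range(4):
    approx = taylor_exp(x, h, order)
    # The absolute error shrinks as more terms of the series are kept.
    print(order, approx, abs(approx - exact))
```

Order 1 is the linear approximation f(x) + f'(x)h that the rest of the argument relies on.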

Gradient descent is an iterative algorithm that finds a minimum of a function by repeatedly moving in a direction in which the function value decreases.

In terms of the equation above, we are currently at x and need to choose a step h such that f(x+h) is less than f(x).

If you assume the function is linear, or you only want a first-order approximation, the Taylor series reduces to

f(x+h) ≈ f(x) + f'(x)h.

So we need to find the h that minimizes f(x+h). Since f(x) is fixed, this amounts to minimizing f'(x)h.

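The derivative of f'(x)h with respect to h is f'(x), so this term grows fastest when h has the same sign as f'(x) and shrinks fastest when h has the opposite sign. Choosing h = f'(x) therefore increases the function, while choosing h = -f'(x) decreases it, provided the step is small enough for the first-order approximation to hold.

So h = -f'(x).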
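One way to check this choice is to substitute it into the first-order approximation. The sketch below uses a scaled step h = -α f'(x), where the step size α > 0 is an assumption added here (the original answer corresponds to α = 1):

```latex
f\bigl(x - \alpha f'(x)\bigr)
  \approx f(x) + f'(x)\cdot\bigl(-\alpha f'(x)\bigr)
  = f(x) - \alpha\, f'(x)^2
  \le f(x)
```

The subtracted term α f'(x)^2 is never negative, so to first order the step cannot increase f, and it strictly decreases f wherever f'(x) is nonzero.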

The update rule, i.e. the new point to move to, is

xnew = x + h = x - f'(x).

For gradient ascent, flip the sign: xnew = x + f'(x).
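The update rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original author's code: the test function f(x) = (x - 3)^2, the names `gradient_descent` and `grad_f`, and the `step_size` parameter are all introduced here. The step size scales the update (x - step_size * f'(x)); setting it to 1 recovers the rule exactly as written above.

```python
def gradient_descent(grad, x0, step_size=0.1, num_steps=100):
    """Repeatedly apply x_new = x - step_size * f'(x), starting from x0."""
    x = x0
    for _ in range(num_steps):
        x = x - step_size * grad(x)
    return x

def grad_f(x):
    # Hypothetical example: f(x) = (x - 3)^2, so f'(x) = 2 * (x - 3).
    return 2.0 * (x - 3.0)

x_min = gradient_descent(grad_f, x0=0.0)
print(x_min)  # close to 3.0, the minimizer of f
```

For this particular f, a step size of 1 makes the iterate jump from 0 to 6 and back to 0 without ever settling, which is why practical implementations scale the step so that the first-order approximation stays reasonable. Gradient ascent is the same loop with the minus sign replaced by a plus.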

If you want to use second-order derivatives as well, keep the quadratic term of the Taylor series. This is what Newton's method does, and the matrix of second derivatives it uses is called the Hessian.
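As a rough sketch of the second-order idea in one dimension: minimizing the quadratic approximation f(x) + f'(x)h + f''(x)h^2/2 over h gives h = -f'(x)/f''(x), assuming f''(x) > 0. In one dimension the Hessian is just f''(x). The example function f(x) = exp(x) - 2x and the names below are assumptions chosen for illustration.

```python
import math

def newton_minimize(grad, hess, x0, num_steps=10):
    """Apply the Newton update x_new = x - f'(x) / f''(x) in one dimension."""
    x = x0
    for _ in range(num_steps):
        x = x - grad(x) / hess(x)
    return x

# Hypothetical example: f(x) = exp(x) - 2x, minimized where exp(x) = 2.
def grad_f(x):
    return math.exp(x) - 2.0   # f'(x)

def hess_f(x):
    return math.exp(x)         # f''(x), always positive for this f

x_star = newton_minimize(grad_f, hess_f, x0=0.0)
print(x_star, math.log(2.0))  # the two values should agree closely
```

Compared with the first-order update, the step is rescaled by the curvature, so no separate step size is needed in this example.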

Copied from https://www.quora.com/What-is-the-relation-between-a-Taylor-series-approximation-and-gradient-descent-
