The Taylor series is used to approximate the value of a function at x+h when we know the value of the function (and its derivatives) at x:

f(x+h) = f(x) + f'(x)*h + f''(x)*h^2/2! + f'''(x)*h^3/3! + ...
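As a quick numerical check of the expansion above, here is a small sketch (the helper and the e^x example are illustrative, not from the original answer):

```python
import math

def taylor_approx(derivs, x, h):
    """Approximate f(x+h) by summing Taylor terms d_k(x) * h^k / k!,
    where derivs = [f, f', f'', ...] evaluated as functions."""
    return sum(d(x) * h**k / math.factorial(k) for k, d in enumerate(derivs))

# Example: f(x) = e^x, whose derivatives are all e^x.
derivs = [math.exp] * 6  # f, f', ..., f''''' (truncate after 5th order)
approx = taylor_approx(derivs, x=1.0, h=0.1)
exact = math.exp(1.1)
print(approx, exact)  # the two values agree to several decimal places
```

With a small h, even a few terms track the true function closely; the error shrinks as more terms are kept.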
Gradient descent is an iterative algorithm for finding a minimum of a function by repeatedly moving in a direction where the function value decreases. In terms of the expansion above: we are currently at x and need to choose h so that f(x+h) is less than f(x).
If you assume the function is locally linear, i.e. you keep only the first-order term, the Taylor series becomes

f(x+h) ≈ f(x) + f'(x)*h.
So we need to find an h that minimizes f(x+h), which under this approximation means minimizing f'(x)*h. For a step of fixed small size, f'(x)*h is most negative when h points opposite to f'(x): choosing h = -f'(x) gives f'(x)*h = -(f'(x))^2 ≤ 0, so the function value decreases. (Choosing h = +f'(x) would instead make the term positive and increase the function.) Hence we take h = -f'(x).
The update, i.e. the new point to move to, is x + h = x - f'(x). In practice the step is scaled by a learning rate η, because the first-order approximation only holds for small h:

x_new = x - η*f'(x).

If you want gradient ascent (maximizing), flip the sign: x_new = x + η*f'(x).
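The update rule above can be sketched as a short loop (the function, learning rate, and step count here are illustrative choices, not from the original answer):

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly apply x_new = x - lr * f'(x) to walk downhill."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Example: minimize f(x) = (x - 3)^2, whose derivative is f'(x) = 2*(x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # converges toward the minimizer x = 3
```

Each iteration moves against the derivative, so f decreases as long as the learning rate keeps the step within the region where the linear approximation is reasonable.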
If you also use the second-order term, you get Newton's method; the matrix of second partial derivatives it relies on is called the Hessian.
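For intuition, here is a one-dimensional sketch of Newton's method, where the "Hessian" reduces to the scalar second derivative (the quadratic test function is an illustrative choice, not from the original answer):

```python
def newton_minimize(grad, hess, x0, steps=20):
    """Newton update: x_new = x - f'(x) / f''(x) (1-D case of
    multiplying the gradient by the inverse Hessian)."""
    x = x0
    for _ in range(steps):
        x = x - grad(x) / hess(x)
    return x

# Example: minimize f(x) = (x - 3)^2; f'(x) = 2*(x - 3), f''(x) = 2.
x_min = newton_minimize(lambda x: 2 * (x - 3), lambda x: 2.0, x0=0.0)
print(x_min)  # lands on x = 3 in a single step for a quadratic
```

Because a quadratic is its own second-order Taylor expansion, Newton's method reaches the minimizer in one step here; for general functions it typically converges in far fewer iterations than plain gradient descent, at the cost of computing second derivatives.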
Adapted from: https://www.quora.com/What-is-the-relation-between-a-Taylor-series-approximation-and-gradient-descent-