The normal equation is a different way to solve for the $\theta$ that minimizes the cost function: a closed-form solution rather than an iterative one. The normal equation is

$$\theta = (X^{T}X)^{-1}X^{T}y$$

where $X$ is the matrix containing all feature values (the design matrix) and $y$ is the output vector.
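As a minimal sketch, the equation can be evaluated directly with NumPy. The data here is a made-up toy set (points on the line $y = 1 + 2x$, with a column of ones for the intercept term), so the exact values are an assumption for illustration:

```python
import numpy as np

# Toy design matrix: first column of ones for the intercept,
# second column holds the single feature x.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])  # y = 1 + 2*x (toy data)

# Normal equation: theta = (X^T X)^(-1) X^T y
theta = np.linalg.inv(X.T @ X) @ X.T @ y
print(theta)  # -> [1. 2.], the intercept and slope
```

In practice `np.linalg.solve(X.T @ X, X.T @ y)` (or a least-squares routine) is preferred over forming the explicit inverse, which is slower and less numerically stable.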
Gradient descent vs. Normal equation
| Gradient Descent | Normal Equation |
| --- | --- |
| Must choose the learning rate $\alpha$ by hand | No need to choose $\alpha$ |
| Needs many iterations | No iteration |
| $O(n^{2})$ per iteration | $O(n^{3})$, due to computing the inverse of $X^{T}X$ |
| Works well even with lots of features (large $n$) | Good only for fewer features; slow when $n$ is large |
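The trade-off in the table can be seen by solving the same toy problem with batch gradient descent. The data, the learning rate $\alpha = 0.1$, and the iteration count are all illustrative assumptions; note that, unlike the normal equation, both must be chosen by hand:

```python
import numpy as np

# Same toy data as for the normal equation: y = 1 + 2*x.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
m = len(y)

alpha = 0.1          # learning rate: must be picked by hand (assumption)
theta = np.zeros(2)  # start from theta = 0
for _ in range(5000):  # many iterations needed to converge
    grad = (X.T @ (X @ theta - y)) / m  # gradient of the mean squared error cost
    theta -= alpha * grad

print(theta)  # converges toward [1., 2.], matching the normal equation
```

Each iteration costs roughly $O(n^{2})$ in the number of features, but the loop may need thousands of steps, whereas the normal equation pays a one-time $O(n^{3})$ for the inverse.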