Normal equation:
A method to solve for θ analytically. It reaches the optimal value in one step, with no iteration.
θ = (X^T X)^{-1} X^T y
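A short sketch of where this formula comes from. The notes do not state the cost function, so the usual least-squares cost J(θ) is assumed here, with X the design matrix (one row per training example) and m the number of examples. Setting the gradient to zero gives the closed form:

```latex
\begin{align*}
J(\theta) &= \frac{1}{2m} (X\theta - y)^T (X\theta - y) \\
\nabla_\theta J(\theta) &= \frac{1}{m} X^T (X\theta - y) = 0 \\
X^T X \,\theta &= X^T y \\
\theta &= (X^T X)^{-1} X^T y
\end{align*}
```

The last step is only valid when X^T X is invertible.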
Octave:
pinv(X'*X)*X'*y
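A minimal sketch of that one-liner in context. The data below is made up for illustration (it follows y = 1 + 2x exactly), and the variable names are assumptions, not part of the notes:

```octave
% Toy data, invented for this example: y = 1 + 2*x exactly.
x = [1; 2; 3; 4];
y = [3; 5; 7; 9];
m = length(y);                   % number of training examples

X = [ones(m, 1), x];             % design matrix with an intercept column of ones

theta = pinv(X' * X) * X' * y;   % normal equation: one step, no alpha, no loop

disp(theta);                     % approximately [1; 2]
```

Because pinv computes a pseudo-inverse, this line still returns a value when X' * X is singular.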
Gradient Descent
- Need to choose α
- Needs many iterations
- Works well even when n (the number of features) is large
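For comparison, a rough sketch of batch gradient descent. It assumes the usual least-squares cost, which these notes do not state. The data, the learning rate alpha = 0.1 and the iteration count are arbitrary choices for illustration:

```octave
% Toy data, invented for this example: y = 1 + 2*x exactly.
x = [1; 2; 3; 4];
y = [3; 5; 7; 9];
m = length(y);
X = [ones(m, 1), x];             % design matrix with an intercept column

alpha = 0.1;                     % learning rate: has to be chosen by hand
num_iters = 2000;                % many iterations are needed
theta = zeros(2, 1);             % start from zero

for iter = 1:num_iters
  grad = (1 / m) * X' * (X * theta - y);   % gradient of the least-squares cost
  theta = theta - alpha * grad;            % simultaneous update of all theta_j
end

disp(theta);                     % approaches [1; 2]
```

A much larger alpha makes the updates diverge on this data, while a much smaller one needs far more iterations. That is the tuning cost listed above.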
Normal Equation
- No need to choose α
- No need to iterate
- Need to compute (X^T X)^{-1}
- Slow if n is very large (beyond about 10^4), because inverting X^T X costs roughly O(n^3)
What if X^T X is non-invertible?
- Redundant features (e.g. x1 = size in feet, x2 = size in m), which are linearly dependent
- Too many features (e.g. m <= n, no more training examples than features)
Fix: delete some features, or use regularization.
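Both fixes can be sketched on a made-up redundant-feature case. The data, the value lambda = 1, and the particular regularized form (adding lambda times an identity matrix whose top-left entry is zeroed, so the intercept is not penalized) are assumptions for illustration. The notes only say "use regularization":

```octave
% Invented data: x2 is x1 converted from feet to m, so the two
% feature columns are linearly dependent.
x1 = [10; 20; 30; 40];           % size in feet
x2 = 0.3048 * x1;                % the same size in m
y  = [15; 25; 35; 45];           % y = 5 + 1*x1
m  = length(y);
X  = [ones(m, 1), x1, x2];

r = rank(X);                     % 2, not 3: X' * X is therefore singular

% Fix 1: delete the redundant feature.
X_small = [ones(m, 1), x1];
theta_small = pinv(X_small' * X_small) * X_small' * y;   % approximately [5; 1]

% Fix 2: regularization (assumed form, lambda chosen arbitrarily).
lambda = 1;
L = eye(3);
L(1, 1) = 0;                     % leave the intercept term unpenalized
theta_reg = (X' * X + lambda * L) \ (X' * y);

disp(r);
disp(theta_small);
disp(theta_reg);
```

With lambda > 0 the matrix X' * X + lambda * L is invertible here even though X' * X is not, so the backslash solve has a unique answer.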